I’ve created a Browsertrix account, crawled a website with it, and then uploaded the snapshot online. Since the WARC files are publicly accessible, I’m wondering whether these files could contain my personal information, such as the email used for Browsertrix, payment details, etc., or if the WARC files completely exclude such private information?
The WACZ files do not contain any of your payment information. That is all handled through Stripe, so Webrecorder never actually sees it!
The name of your org and the crawl workflow name are part of the logs that are saved inside the WACZ. Additionally, if a Browser Profile is used, information sent by the website could reveal personal details of the logged-in account (this depends heavily on the site being crawled), which is part of why Webrecorder always recommends using dedicated archiving accounts.
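If you want to check a specific file before publishing it, a WACZ is a ZIP archive, so you can search its contents yourself. The following is only a rough sketch and not an official Webrecorder tool: `find_in_wacz` is a hypothetical helper name, and it makes no assumption about the folder layout, it simply scans every entry and decompresses any `.gz` entries (such as compressed WARCs) first. It can miss strings inside response bodies that the site itself served compressed or encoded, so treat an empty result as a hint rather than proof.

```python
import gzip
import zipfile
import zlib


def find_in_wacz(wacz_path, needles):
    """Return sorted (entry_name, needle) pairs for every needle found in the WACZ.

    Hypothetical helper for illustration; matching is case-insensitive.
    """
    hits = set()
    with zipfile.ZipFile(wacz_path) as wacz:
        for name in wacz.namelist():
            data = wacz.read(name)
            # Entries such as *.warc.gz are gzip-compressed, so decompress
            # them before searching; fall back to the raw bytes on failure.
            if name.endswith(".gz"):
                try:
                    data = gzip.decompress(data)
                except (OSError, EOFError, zlib.error):
                    pass
            lowered = data.lower()
            for needle in needles:
                if needle.lower().encode("utf-8") in lowered:
                    hits.add((name, needle))
    return sorted(hits)
```

For example, `find_in_wacz("my-crawl.wacz", ["me@example.com", "My Org Name"])` (file name and search strings are placeholders) returns the entries where each string appears. Based on the above, you should expect your org and workflow names to show up in the log entries.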