I’m working on a 10-year-old batch of downloaded websites that have been living on a local server. They consist of directories of .htm or .html files, images, and other web data that existed at that time. I wanted to try warcit to see if I could package them into a warc.gz file for replay, since this seemed to fit one of warcit's use cases. I’m running macOS 10.15.7 and the latest version of Chrome, if that helps.
After getting warcit up and running, I got success messages when packaging the web files from a local test directory into warc.gz files (e.g. “Wrote 12 resources to blogtest.warc.gz”).
I’m guessing some of this is user error, and these files are fairly old. When I loaded the warc.gz into ReplayWeb.page (the online browser version), the URLs seemed to index correctly, and the MIME types and dates looked accurate. But when I clicked on any of them to view it, I got the following error, and this happened for every URL:
“The webpage at https://replayweb.page/w/id-dcbbecbfcedd/20120620152050mp_//Users/lw2cd/Desktop/Working/Blog_Examples/ameliaAbreuim-blowing-up-serious-ladies-with-uva-folks-its.htm might be temporarily down or it may have moved permanently to a new web address.”
I also tried loading the same warc.gz package into Webrecorder Player, just to see if it would load, but none of the pages even indexed there (probably not a big surprise).
My questions are:
- Is this user error? Am I trying to use warcit or ReplayWeb.page in a way that wasn’t intended (web files that were downloaded, have been sitting offline for years, and are now repackaged into a warc.gz for replay)?
- If it’s not user error, what might be going missing between the warc.gz package that warcit produces and ReplayWeb.page? Which is the more likely source of the error: warcit or ReplayWeb.page?
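One detail that may help narrow this down: the replay URL in the error above contains a local filesystem path (`//Users/lw2cd/Desktop/...`) where a site address would normally appear, so it may be worth checking which URLs were actually recorded in the WARC. Below is a minimal sketch for doing that with only the Python standard library. The function names `list_target_uris` and `looks_like_web_url` are hypothetical helpers made up for this illustration, and the scan is deliberately naive: it assumes a gzipped WARC and simply looks for lines beginning with `WARC-Target-URI:`, so a payload line that happens to start with the same text would also be picked up.

```python
import gzip


def list_target_uris(warc_path):
    """Return the WARC-Target-URI header values found in a gzipped WARC file.

    Naive line scan, not a full WARC parser: any line that begins with
    'WARC-Target-URI:' is treated as a record header.
    """
    uris = []
    # gzip.open reads concatenated gzip members as one continuous stream.
    with gzip.open(warc_path, "rb") as fh:
        for line in fh:
            if line.lower().startswith(b"warc-target-uri:"):
                value = line.split(b":", 1)[1].strip()
                uris.append(value.decode("utf-8", "replace"))
    return uris


def looks_like_web_url(uri):
    """Hypothetical check: does the recorded URI start with http:// or https://?"""
    return uri.startswith(("http://", "https://"))
```

Calling `list_target_uris("blogtest.warc.gz")` and printing each result alongside `looks_like_web_url(...)` would show whether the recorded URLs are web addresses or local paths, which should help tell apart a packaging issue from a replay issue.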
Thanks so much for taking a look. I’m happy to supply more details if needed.