Rebuilding an old wacz?

I have a large wacz from 2020 that no longer works in replayweb.page, because it is detected as containing no pages. If I unzip it, I can load and browse the individual warc.gz files it contains just fine, so I assume this is an issue with the top-level index. How can I rebuild this wacz so it’s loadable in replayweb.page?

You could take the unpacked WARC files and package them up again with py-wacz. It’s unfortunate that it once worked and no longer does. Is there any way you can share the WACZ? Are there any interesting errors in the JavaScript console when you try to replay it? Maybe a backwards-incompatible change was introduced into ReplayWeb.page at some point. It would be nice to fix that if that’s the case.
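In case it helps, here is a rough sketch of that repackaging step in Python. It assumes py-wacz is installed (`pip install wacz`) and that its `wacz create` command accepts `-o` and `--detect-pages` as I remember from the README; please check `wacz create --help` on your version. The function names and file paths are made up for illustration.

```python
import subprocess
import zipfile
from pathlib import Path


def extract_warcs(wacz_path, dest_dir):
    """Unzip only the WARC files from an existing WACZ and return their paths."""
    dest = Path(dest_dir)
    with zipfile.ZipFile(wacz_path) as zf:
        # A WACZ is a ZIP file; skip the old index and page list, keep the WARCs.
        names = [n for n in zf.namelist() if n.endswith((".warc", ".warc.gz"))]
        zf.extractall(dest, members=names)
    return sorted(dest / n for n in names)


def build_wacz_command(warc_paths, output_path, detect_pages=True):
    """Build the py-wacz command line (flags assumed from the py-wacz README)."""
    cmd = ["wacz", "create", "-o", str(output_path)]
    if detect_pages:
        # Ask py-wacz to generate a fresh page list from the WARC records.
        cmd.append("--detect-pages")
    cmd.extend(str(p) for p in warc_paths)
    return cmd


def rebuild_wacz(old_wacz, new_wacz, work_dir="unpacked"):
    """Extract the WARCs from old_wacz and repackage them as new_wacz."""
    warcs = extract_warcs(old_wacz, work_dir)
    if not warcs:
        raise ValueError(f"no WARC files found in {old_wacz}")
    subprocess.run(build_wacz_command(warcs, new_wacz), check=True)
```

Calling `rebuild_wacz("old.wacz", "rebuilt.wacz")` should leave you with a new file to try in replayweb.page. Since the old file was reported as containing no pages, regenerating the page list with `--detect-pages` seems like the relevant part, but that is a guess about the cause.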

Thanks for posting here @jack!

Thanks, @edsu! That tool did the trick!

Unfortunately, I can’t easily share the wacz. I vaguely remember running into indexing issues when I created it, and I might have been doing something non-standard at the time.

@jack ok, no worries. If it happens again please bring it up!

If other people land here in search of another tool for doing this in JavaScript, check out Harvard’s Preparator: