ReplayWeb.page not displaying pages

Hello!

My colleagues and I are having an issue replaying WARCs created with ArchiveWeb.page in ReplayWeb.page. When we upload a WARC, we see the following message:

No Pages are defined in this archive. The archive may be empty. [Try browsing by URL].

When we upload a WACZ of the same collection, it loads as expected in ReplayWeb.page. We tried a few different webpages and got the same results.

Does anyone have any insight into why this may be happening? I haven’t had issues replaying WARCs in ReplayWeb.page in the past.

Thanks for your help!

Sarah

AFAIK this has to do with how ArchiveWeb.page writes the WARCs. Warcit has a similar issue.

EDIT: Looks like this is a feature request :)

You should still be able to browse all the URLs available in the file. I’d generally recommend WACZ where possible, since it always contains the pages index, so the index doesn’t have to be generated.

I’ve noticed this recently too. I believe that the WACZ file, when unzipped, should contain a pages.jsonl file. I wonder if it’s not getting written in some cases, or if there is a problem reading it. It should probably be filed as a bug somewhere, depending on what the problem is…
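In case it helps anyone narrow this down, here is a rough Python sketch for checking whether a WACZ actually contains a page list. It relies only on the standard library and treats the WACZ as an ordinary zip file. The function name `read_wacz_pages` is made up for illustration, and the code assumes the page list is a file named pages.jsonl with one JSON object per line, where page entries have a "url" key and any header line does not. I haven't verified that against every tool's output.

```python
import json
import zipfile


def read_wacz_pages(wacz):
    """Return the page records found in any pages.jsonl inside a WACZ (zip) file.

    `wacz` may be a filesystem path or a binary file-like object.
    """
    pages = []
    with zipfile.ZipFile(wacz) as zf:
        for name in zf.namelist():
            # Assumption: the page list is stored in a file named pages.jsonl.
            if not name.endswith("pages.jsonl"):
                continue
            for line in zf.read(name).decode("utf-8").splitlines():
                if not line.strip():
                    continue  # skip blank lines
                record = json.loads(line)
                # Assumption: page entries carry a "url" key; a header line does not.
                if "url" in record:
                    pages.append(record)
    return pages
```

Calling `read_wacz_pages("my-collection.wacz")` (a placeholder filename) and getting back an empty list would suggest the file was never written or has no page entries. A non-empty list would point toward a problem on the reading side instead.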