Export of Archive from ArchiveWeb.page does not include 301 redirect

When archiving this particular site, I found that it uses 301 redirects extensively.

While the archive can be played back from the Chrome extension, since the recording appears to retain the links, exporting it to a WARC / WACZ file does not appear to include the redirects. As a result, archived articles that span multiple pages end up with broken links.

It sounds like you are using ArchiveWebPage to create the archive? Can you share your WACZ file or the site you are archiving?

Hi, yes, the ArchiveWeb.page extension in Chrome.
For example:

https://www.dndbeyond.com/sources/basic-rules

Links from this main page to the individual chapters go to, for example:
Ch. 1: Step-By-Step Characters
https://www.dndbeyond.com/compendium/rules/basic-rules/step-by-step-characters
However, that is 301 redirected to
http://www.dndbeyond.com/sources/basic-rules/step-by-step-characters
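One way to check whether the exported file really lacks the redirects is to list the redirect responses stored in it. The following is a minimal sketch, not an official tool: it assumes the WACZ is a ZIP whose WARC data ends in `.warc` or `.warc.gz`, and that each response record stores a plain-text HTTP status line. The function names `iter_warc_records`, `find_redirects`, and `redirects_in_wacz` are made up for this illustration.

```python
import gzip
import io
import zipfile


def iter_warc_records(data: bytes):
    """Yield (warc_headers, block) for each record in uncompressed WARC bytes."""
    stream = io.BytesIO(data)
    while True:
        line = stream.readline()
        if not line:
            return  # end of data
        if not line.startswith(b"WARC/"):
            continue  # skip the blank lines that separate records
        headers = {}
        while True:
            line = stream.readline().strip()
            if not line:
                break  # blank line ends the WARC header section
            name, _, value = line.partition(b":")
            headers[name.strip().lower().decode()] = value.strip().decode()
        length = int(headers.get("content-length", 0))
        yield headers, stream.read(length)


def find_redirects(data: bytes):
    """Return (status, source URL, Location target) for each redirect response."""
    redirects = []
    for headers, block in iter_warc_records(data):
        if headers.get("warc-type") != "response":
            continue
        # The block of a response record starts with the HTTP status line and headers.
        head, _, _ = block.partition(b"\r\n\r\n")
        lines = head.split(b"\r\n")
        parts = lines[0].split()
        if len(parts) < 2 or not parts[1].isdigit():
            continue
        status = int(parts[1])
        if status not in (301, 302, 303, 307, 308):
            continue
        location = ""
        for line in lines[1:]:
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"location":
                location = value.strip().decode()
        redirects.append((status, headers.get("warc-target-uri", ""), location))
    return redirects


def redirects_in_wacz(path):
    """Scan every WARC file inside a WACZ (ZIP) file for redirect responses."""
    found = []
    with zipfile.ZipFile(path) as wacz:
        for name in wacz.namelist():
            if name.endswith(".warc.gz"):
                found += find_redirects(gzip.decompress(wacz.read(name)))
            elif name.endswith(".warc"):
                found += find_redirects(wacz.read(name))
    return found
```

Calling `redirects_in_wacz("my-archive.wacz")` (the filename is a placeholder) should list an entry for the `/compendium/...` URL above if the 301 was captured. If it shows up, the redirect is in the export and the broken links probably come from somewhere else.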

(I have had a brief look at Browsertrix, but I'll need to do some work on a custom behaviour for the site, and I'm not that competent.)

Apologies, I am getting intermittent behaviour.

When I posted this, every link was broken in ReplayWeb.page, and I believed that was due to the redirects.
However, I just deleted the archive from ReplayWeb.page and reloaded it from the existing WACZ file, and the links are working again.

An older archive I have in ReplayWeb.page shows the same broken-links issue, but when I deleted it and reloaded the WACZ file, the page loaded fine.

OK, so I have looked into this a bit more.

When I have multiple collections open in ReplayWeb.page, the collections load normally at first and links work as expected.
After some time, and after loading other collections, the links appear broken when I return to the earlier collection.
Removing the collection and reloading the WARC file restores the normal behaviour.