I am trying to reproduce an issue where ReplayWeb.page in a large (32 GB, 126K pages) collection randomly displays incorrect page content instead of the correct content or a message that the page is not in the archive.
I have now been able to partially reduce this to a small test case. My demo collection
https://replayweb.page/?source=https://app.browsertrix.com/api/orgs/01e6c292-95fb-4a85-b3f1-73dfa16132a7/collections/00ff5e44-3949-4cf0-a501-cd73019dcee0/public/replay.json
contains only four pages:
https://www.hoeflichepaparazzi.de/forum/memberlist.php
https://www.hoeflichepaparazzi.de/forum/memberlist.php?page=5&pp=25&order=asc&sort=username&ltr=A
https://www.hoeflichepaparazzi.de/forum/member.php?488-anko
https://www.hoeflichepaparazzi.de/forum/member.php?1-admin
(Parts of a user list and two user profiles.)
If I visit the page https://www.hoeflichepaparazzi.de/forum/member.php?488-anko
and try to navigate to any other user by clicking on their name, e.g.
“Ruebenkraut”: https://www.hoeflichepaparazzi.de/forum/member.php?9336-Ruebenkraut
“U_Sterblich”: https://www.hoeflichepaparazzi.de/forum/member.php?11080-U_Sterblich
“Murmel”: https://www.hoeflichepaparazzi.de/forum/member.php?7654-Murmel
the profile page of the user “admin” is always displayed instead, even though none of these profiles is in the collection. Only when I enter these URLs directly into the URL field is the correct message shown, e.g.
Archived Page Not Found
Sorry, this page was not found in this archive:
https://www.hoeflichepaparazzi.de/forum/member.php?11080-U_Sterblich
In my original collection, this happens not only for pages that are missing from the archive but also for pages that are in it, which is really disturbing.
TIA
Heinz