(Hi all, cross-posted on GitHub since I wasn’t quite sure where to put this. Thanks in advance.)
I’m using the web-archive-site-mirror and it is, simply, great. It does a wonderful job and we’re using it for multiple archives.

We’re noticing, though, that on some mirrored sites the links are not rewritten correctly on the archive mirror. For example, if you visit the mirror here, you’ll see the list of articles in the top bar. If you look at the HTML, the URLs for the links appear to be correct. But the moment you press the mouse button on a link (before letting go), a _ga query param is added to the end of the link’s URL (I’d assume that’s from Google Analytics).

This does not happen on the live site and, even stranger, it does not happen when you view the site on the Browsertrix website. I suspect this is a bug in the web-archive-site-mirror tool. I’m happy to investigate, but I’m not really sure where to look. Any thoughts?
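In case it helps anyone reproduce this, here is a minimal sketch (Python, standard library only) for checking whether a link URL copied out of the browser has picked up a _ga parameter, and what the URL looks like without it. The function names and the example.org URLs are made up for illustration; none of this is part of the mirror tool itself.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def has_ga_param(url: str) -> bool:
    """Return True if the URL's query string contains a `_ga` key."""
    query = urlsplit(url).query
    return any(key == "_ga" for key, _ in parse_qsl(query, keep_blank_values=True))


def strip_ga_param(url: str) -> str:
    """Return the URL with any `_ga` query parameter removed.

    Other query parameters and the fragment are kept as they are.
    """
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "_ga"
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    # Hypothetical link as it might appear after pressing the mouse button.
    clicked = "https://example.org/articles?page=2&_ga=2.1.2.3"
    print(has_ga_param(clicked))
    print(strip_ga_param(clicked))
```

Comparing the href in the page source with the href after mousedown this way should at least confirm that _ga is the only thing being changed.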
FYI: If you’d like to use the live site as a reference for debugging, note that it will be taken down on Monday, April 20.