So this seems like it should be easy but I can’t quite figure it out.
We have a public collection on Browsertrix for a site we want to decommission. When we decommission the site, we want to create an AWS Lambda that redirects to the collection page and autoloads the specific page in the collection. For example: we get a request to https://old.domain/path/to/page, our Lambda redirects it to the Browsertrix collection page (Browsertrix), and when the page loads, the replay-web-page container replays https://old.domain/path/to/page.
If we used the replay-web-page web component on our own site, we’d simply set the url to whatever the originally requested URL was. And there’s good embedding info on the collection page for how to embed the web component. But what I’m looking for is whether there’s a defined, permanent URL interface for doing this on the Browsertrix collection page.
When I play around with the collection, I do see the page URL change and it appears to follow the format of #url=url_safe_copy_of_url. Is that the URL format you use? Or is that just incidental and I’m totally misunderstanding it? And if it’s the URL format can I consider that “stable”? Or should I expect that won’t be very permanent?
Links to a version of https://globalchange.gov in the global-change-research-program collection on our usgov-archive account. By changing the URL in the hash (the part after #url=), you can navigate to a different URL in the collection, so that should work.
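For anyone wanting to try the redirect approach from the question, here is a minimal sketch of what such a Lambda might look like in Python. It is only an illustration under several assumptions: `COLLECTION_URL` and `OLD_ORIGIN` are placeholders, the event shape is the one used by Lambda Function URLs and HTTP API payload v2 (`rawPath`, `rawQueryString`), and the fragment uses only the `#url=` form observed above with the original URL fully percent-encoded. The collection page may add other hash parameters that this sketch does not set.

```python
from urllib.parse import quote

# Hypothetical placeholder: substitute the public collection's real page URL.
COLLECTION_URL = "https://example.org/path/to/public-collection"
# Hypothetical placeholder: origin of the decommissioned site.
OLD_ORIGIN = "https://old.domain"


def build_replay_url(path: str, query: str = "") -> str:
    """Return the collection URL with the originally requested page in #url=."""
    original = OLD_ORIGIN + path
    if query:
        original += "?" + query
    # Percent-encode everything (safe="") so the original URL's own
    # ?, & and = characters are not mistaken for hash parameters.
    return f"{COLLECTION_URL}#url={quote(original, safe='')}"


def handler(event, context):
    """Lambda entry point; assumes a Function URL / HTTP API v2 style event."""
    location = build_replay_url(
        event.get("rawPath", "/"),
        event.get("rawQueryString", ""),
    )
    # Temporary redirect; the browser keeps the fragment from the Location header.
    return {"statusCode": 302, "headers": {"Location": location}}
```

A 302 is used here so the redirect can be changed later if the hash format turns out not to be stable; that choice is a judgment call, not something Browsertrix requires.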
But, if you’re trying to automatically redirect an entire domain, you might be interested in the web archive site mirror system for your use case. This allows you to make a fully static site that will load from a web archive on a new domain, like https://new.domain/path/to/page and have it be loaded from the same collection. Or, if you have access to https://old.domain, you can repurpose it to work as a web archive. The above link provides a starter page to do that.
Thank you @ilya ! Thanks for the confirmation on the hashtag but an even bigger thank you for the tip on the mirror system. I did not realize there was a template for setting up a mirror system like that. I’ll talk to my client but that looks like it could be a great result.
I did have one question. Is there a particular reason globalchange.govarchive.us opens every internal link in a new window? Or was that just how the original website was designed?
@ilya So I’m trying out the web archive site mirror system and I’m struggling a bit in that some links are being changed to the correct domain and others are stuck using the old domain. So in effect, the site is “breaking out” of the service worker. I don’t see that behavior on the govarchive.us sites. The only thing I changed was the init file:
@ilya I did a bit more debugging. If I navigate directly to a page, all of the domains are properly replaced. If I navigate to a page and then click a link to go to another page, all of the links on that second page point to the wrong domain. I don’t know if that helps debug what might be going on?
Just so I know in the future, how long does it take for a cached version of the service worker to be updated? I haven’t worked with service workers much so this is a bit new to me.
You can switch to a specific version of wabac.js by refreshing https://cdn.jsdelivr.net/npm/@webrecorder/wabac/dist/sw.js locally, or, for production, you can pin to a specific release, e.g. change your local sw.js to use the version: