pchan3
January 21, 2025, 5:23pm
1
I was unable to successfully crawl the Google site at Storytelling With AI. I am wondering whether there are specific parameters or settings I need to adjust. Below are the original and captured screenshots for reference:
manual-20250121170734-8d4ef639-dd5.wacz - Google Drive
Is this the same problem as the one logged in the following issue?
opened 04:46AM - 14 Jan 25 UTC
bug
replay bug
### ReplayWeb.page Version
v2.2.5
### What did you expect to happen? What happened instead?
Images that display correctly on the live Google Sites website (they load through some weird nested iframe components) are missing on replay and fail with a 404 error. The image files appear to have been archived and are present in the WACZ.
**Screenshot showing the nested iframes**
<img width="1626" alt="Screenshot 2025-01-13 at 11 34 23 PM" src="https://github.com/user-attachments/assets/dc5aa3b2-3253-4724-a7db-5adde9855e94" />
**Screenshot showing the image asset in the archive and also on the live site**
<img width="1792" alt="Screenshot 2025-01-13 at 11 41 31 PM" src="https://github.com/user-attachments/assets/14961986-2eff-4608-bcc9-e0d97c302a1e" />
### Step-by-step reproduction instructions
1. Archive https://sites.google.com/stanford.edu/a-digital-humanities-project with ArchiveWeb.page, wait for all images to load
2. Replay the page in ReplayWeb.page
### Additional details
This issue was originally found by Peter Chan in [this forum thread](https://forum.webrecorder.net/t/problems-archiving-google-sites/741/3).
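The issue's observation that the image files are present in the WACZ can also be checked outside of replay. Below is a minimal Python sketch, not part of any Webrecorder tool. It assumes the WACZ is a ZIP file whose `indexes/` directory holds CDXJ index files (`.cdx`, `.cdxj`, optionally gzipped), where each line is a URL key, a timestamp, and a JSON object with fields such as `url`, `status`, and `mime`. The helper names `iter_cdxj_records` and `find_captures` are hypothetical, and the `googleusercontent.com` search string in the usage comment is only a guess at where Google Sites serves its images from.

```python
import gzip
import json
import zipfile


def iter_cdxj_records(wacz_path):
    """Yield (url, status, mime) for every CDXJ line found under indexes/."""
    with zipfile.ZipFile(wacz_path) as wacz:
        for name in wacz.namelist():
            if not name.startswith("indexes/"):
                continue
            if name.endswith((".cdx.gz", ".cdxj.gz")):
                # gzip.decompress also handles multi-member (block) gzip data.
                raw = gzip.decompress(wacz.read(name))
            elif name.endswith((".cdx", ".cdxj")):
                raw = wacz.read(name)
            else:
                continue  # skip secondary index files such as .idx
            for line in raw.decode("utf-8").splitlines():
                parts = line.split(" ", 2)  # url key, timestamp, JSON payload
                if len(parts) != 3:
                    continue
                entry = json.loads(parts[2])
                yield entry.get("url"), entry.get("status"), entry.get("mime")


def find_captures(wacz_path, needle):
    """Return the index records whose URL contains the given substring."""
    return [
        record
        for record in iter_cdxj_records(wacz_path)
        if record[0] and needle in record[0]
    ]


# Example usage (file name taken from the first post, search string assumed):
# for url, status, mime in find_captures(
#     "manual-20250121170734-8d4ef639-dd5.wacz", "googleusercontent.com"
# ):
#     print(status, mime, url)
```

If the image URLs are listed with a successful status, the capture itself is probably fine and the 404 is more likely a replay-side problem, which would match the "replay bug" label on the issue.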
pchan3
January 22, 2025, 7:26pm
2
Somehow, the problems seem to resolve themselves in subsequent crawls.