Recently set up Browsertrix and everything has been going great so far. It's just that I would love to be able to set the number of windows that are concurrently archiving to any custom number, since a lot of the sites I scrape don't have much rate limiting and don't care about traffic/server load. Thanks in advance.
(I use the GUI and don't really wanna mess around that much in the cluster.)
Glad to hear it’s going well! There’s a setting in values.yaml called max_browser_windows. It’s set to 8 by default, but you can increase it as much as you like and redeploy using Helm. Depending on how you’ve set it up, you can add that line to your own yaml file(s) and run the same helm upgrade --install btrix … command you used to first deploy it.
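For anyone following along, here is a minimal sketch of what that override could look like. The file name local.yaml and the value 16 are placeholders chosen for illustration, not something Browsertrix requires; only the max_browser_windows key comes from values.yaml.

```yaml
# local.yaml (hypothetical file name): your own overrides on top of values.yaml
# 16 is an illustrative value; the default mentioned above is 8
max_browser_windows: 16
```

If you don't already pass an override file, Helm's standard -f (--values) flag lets you add one, e.g. -f local.yaml appended to the same helm upgrade --install btrix … command. Everything else in the command stays as it was for your initial deploy.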
In addition to max_browser_windows, you may also want to consider crawler_browser_instances, which determines how many browser windows will run per crawler container. This can be raised if you want to run fewer containers per crawl, in which case it will likely make sense to increase the resources available to each crawler container as well.
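As a rough sketch, the two settings could sit side by side in the same override file. Both values here are illustrative assumptions, not recommendations, and the right numbers depend on what your cluster can handle.

```yaml
# Illustrative values only; tune to your own cluster
max_browser_windows: 16        # upper limit on browser windows per crawl
crawler_browser_instances: 4   # browser windows per crawler container
```

Assuming windows are split evenly across containers, 16 windows at 4 per container would work out to 4 crawler containers. The per-container resource settings are left out here on purpose, since their exact names are best checked in the values.yaml for your version.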
Thanks a lot to you both, will do that. Hope you have a great day!