HTTP* WACZ download performance setup?

In our local Browsertrix cluster (a virtual master attached to 2 separate physical servers and separately mounted storage) we have a download performance issue using curl: only 10-20 Mbit/sec, and we have terabytes to download. The receiving storage should be fast.
If we download the same files with “minio” through a tunnel on port 9000, the files are read at 140 Mbit/sec, which is currently OK. As I understand it, though, you will soon drop support for MinIO.
If we download from your cloud platform to the same internal servers, we get 100-120 Mbit/sec!
And if we do the same from our internal single-server virtual development platform to the same internal servers, we get 100-125 Mbit/sec!
We have no firewalls between the clusters and the internal servers we are downloading to.
Is there a special HTTP* download performance setup for curl in Browsertrix?
Where is the bottleneck? We need some help here.
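
To put the rates above in perspective, here is a rough back-of-the-envelope sketch in Python. The function name `transfer_hours` and the 1 TB example size are illustrative assumptions only; the rates are the ones quoted in this post.

```python
def transfer_hours(size_tb: float, rate_mbit_s: float) -> float:
    """Hours needed to move size_tb terabytes (decimal, 10**12 bytes)
    at a sustained rate of rate_mbit_s megabits per second."""
    bits = size_tb * 1e12 * 8               # terabytes -> bits
    seconds = bits / (rate_mbit_s * 1e6)    # megabits/s -> bits/s
    return seconds / 3600


# Rates reported above; 1 TB is an assumed example size.
for label, rate in [("curl, low end", 10), ("curl, high end", 20), ("minio tunnel", 140)]:
    print(f"{label}: {transfer_hours(1, rate):.1f} hours per TB")
```

At 10 Mbit/sec that is more than 9 days per terabyte, compared with under 16 hours at 140 Mbit/sec, which is why the curl path is not workable for us.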

Thanks for any hints!

I have 3 API scripts which can reproduce the cases and get valid download URLs using curl and MinIO.
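
For anyone who wants to compare paths without those scripts, the following is a minimal sketch of one way to measure raw read throughput in Python. It is not one of the 3 scripts mentioned above; `measure_mbit_s` and `measure_url` are hypothetical helper names, and the URL passed to `measure_url` would have to be a valid download URL from your own deployment. The data is read and discarded, so the receiving storage is deliberately left out of the measurement.

```python
import io
import time
import urllib.request


def measure_mbit_s(stream, chunk_size=1024 * 1024):
    """Read a file-like object to the end, discarding the data.

    Returns (bytes_read, megabits_per_second)."""
    total = 0
    start = time.perf_counter()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero elapsed time on very small inputs.
    rate = (total * 8 / 1e6) / elapsed if elapsed > 0 else float("inf")
    return total, rate


def measure_url(url):
    """Measure throughput for an HTTP(S) download URL (assumed to be valid)."""
    with urllib.request.urlopen(url) as response:
        return measure_mbit_s(response)


# Offline self-check with an in-memory stream instead of a real download.
size, rate = measure_mbit_s(io.BytesIO(b"\0" * (4 * 1024 * 1024)))
print(f"{size} bytes read")
```

Running `measure_url` against the same file via each path (curl-style HTTP URL, tunnel, cloud platform) from the same host should show whether the slow rate follows the URL type or the receiving side.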

Best regards

Tue

Hi Tue,

Would you mind giving us a little more detail so we can narrow this down a bit?

Namely, for the options other than grabbing the files directly from MinIO, what requests are being made? Is this using presigned URLs returned directly from some API endpoints, or using Browsertrix API endpoints to stream the WACZs (and if so, which ones)?

Thanks!

Feel free to send the scripts you referenced to support@webrecorder.net if that’s the clearest way to pass along the details of what you’re describing here.

I have sent my 3 download performance scripts to your support email. I have also opened an internal case with our KB storage people about the current storage setup, and I will send you their answers.

Thanks! We’ll take a look on our end and I’ll aim to give you an update next week.