How to load a custom behaviour into Docker

Good evening,

I have hacked together a behaviour to crawl
When I test it by copying the dist/behaviors.js file into a Chrome snippet, the behaviour runs “fine”.

How do I get it to run in the browsertrix-crawler Docker container?

I have created a custom-behaviours directory and copied the dist/behaviors.js file there, but when I run the container it errors out with:

"Waiting for custom page load failed","details":{"type":"exception","message":"Cannot read properties of undefined (reading 'siteSpecific')","stack":"TypeError: Cannot read properties of undefined (reading 'siteSpecific')\n at ( at CdpFrame.<anonymous> (file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/util/decorators.js:94:27), <anonymous>:0:20)\n at #evaluate (file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/cdp/ExecutionContext.js:341:23)\n at async ExecutionContext.evaluate (file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/cdp/ExecutionContext.js:268:16)\n at async IsolatedWorld.evaluate (file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/cdp/IsolatedWorld.js:96:16)\n at async CdpFrame.evaluate (file:///app/node_modules/puppeteer-core/lib/esm/puppeteer/api/Frame.js:339:20)\n at async Crawler.awaitPageLoad (file:///app/dist/crawler.js:1295:13)\n at async Crawler.loadPage (file:///app/dist/crawler.js:1266:9)\n at async Crawler.default [as driver] (file:///app/dist/defaultDriver.js:2:5)\n at async Crawler.crawlPage (file:///app/dist/crawler.js:537:9)\n at async PageWorker.crawlPage (file:///app/dist/util/worker.js:153:21)"}}

Ahh, OK,

so I did some further looking around in the browsertrix-crawler repository.

What I was doing was taking the compiled behaviors.js from browsertrix-behaviors and putting it into the custom-behaviors directory.

Seeing the tests in the browsertrix-crawler repository, I can see now that I am supposed to put the plain JS source of my custom behaviour directly in the custom-behaviors directory, not the compiled behaviors.js bundle.

Of course, I need to rework my behaviour, as I had written it in TypeScript instead of JS.

Nope, I think I need help understanding how to load a custom behaviour “properly”…

I have managed to get the behaviour to run in Docker by:

compiling browsertrix-behaviors with my custom behaviour code included.

running Docker, passing the compiled behaviors.js to the location

Running the crawl in Docker with

docker run -i -p 9037:9037 -p 9038:9038 -v $PWD/crawls:/crawls/ -v $PWD/dist/behaviors.js:/app/node_modules/browsertrix-behaviors/dist/behaviors.js -v $PWD/crawl.basic-rules.custom.yaml:/app/crawl.basic-rules.custom.yaml -it webrecorder/browsertrix-crawler crawl --config /app/crawl.basic-rules.custom.yaml

I somehow suspect this isn't the “right” way, so I would appreciate any pointers.

Hi @FrontLineFodder, take a look at this part of the crawler docs: Browser Behaviors - Browsertrix Crawler Docs.

Once you have the custom behaviors directory mounted as a volume, you can add your custom behaviors with --customBehaviors /path/to/volume.

Let us know how that works out!

Note that it expects a directory, not a link to a single behavior file.
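
As a rough sketch of what that could look like with your setup: this assumes your behavior .js files sit in a custom-behaviors directory next to your config, and that you mount it at /custom-behaviors/ inside the container. Both of those paths, and the run_crawl helper name, are placeholders I made up for illustration. The image name, ports, config file and /crawls/ mount are copied from your command.

```shell
# Host-side paths. The custom-behaviors directory name and the
# /custom-behaviors/ mount point are placeholders, not required names.
BEHAVIORS_DIR="$PWD/custom-behaviors"
CRAWLS_DIR="$PWD/crawls"
CONFIG_FILE="$PWD/crawl.basic-rules.custom.yaml"

# Make sure the directories exist before Docker bind-mounts them.
mkdir -p "$BEHAVIORS_DIR" "$CRAWLS_DIR"

# Mount the whole behaviors directory (not a single file) and point
# --customBehaviors at the path where it appears inside the container.
run_crawl() {
  docker run -it \
    -p 9037:9037 -p 9038:9038 \
    -v "$CRAWLS_DIR:/crawls/" \
    -v "$BEHAVIORS_DIR:/custom-behaviors/" \
    -v "$CONFIG_FILE:/app/crawl.basic-rules.custom.yaml" \
    webrecorder/browsertrix-crawler crawl \
    --config /app/crawl.basic-rules.custom.yaml \
    --customBehaviors /custom-behaviors/
}
```

Calling run_crawl starts the crawl. Compared with your command, the behaviors.js overlay under /app/node_modules is gone, and the directory mount plus --customBehaviors take its place.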