Device: Fedora 37 laptop, Ryzen5-4500U

Original MRE: https://ocv.me/stuff/bugs/chrome/crypto-subtle-mt/v1/crypto-subtle-mt.html
Improved MRE: https://ocv.me/stuff/bugs/chrome/crypto-subtle-mt.html

The issue: `crypto.subtle.digest` does not perform as expected with simultaneous access from multiple web-workers. Recent chrome versions have improved very slightly, but still perform well short of expected.

Observed on:
Chrome version: 111.0.5563.146 linux-x64 from Fedora repos @ 2023-04-05
Chrome version: 114.0.5698.0 linux-x64 from download-chromium.appspot.com @ 2023-04-05T17:41Z

Performance measurements:
chrome, 1 worker: 405 MiB/s
chrome, 2 workers: 584 MiB/s = 144% (less than expected 800 MiB/s, 200%)
chrome, 4 workers: 584 MiB/s = 144% (less than expected 1600 MiB/s, 400%)

Respective screenshots:
https://ocv.me/stuff/bugs/chrome/crypto-subtle-mt/v2/chr1.png
https://ocv.me/stuff/bugs/chrome/crypto-subtle-mt/v2/chr2.png
https://ocv.me/stuff/bugs/chrome/crypto-subtle-mt/v2/chr4.png

when comparing this to firefox, we see better (but still not quite as expected) performance:
firefox, 1 worker: 300 MiB/s
firefox, 2 workers: 585 MiB/s = 195% (close enough to 600 MiB/s, 200%)
firefox, 4 workers: 779 MiB/s = 260% (less than expected 1200 MiB/s, 400%)

Respective screenshots:
https://ocv.me/stuff/bugs/chrome/crypto-subtle-mt/v2/ff1.png
https://ocv.me/stuff/bugs/chrome/crypto-subtle-mt/v2/ff2.png
https://ocv.me/stuff/bugs/chrome/crypto-subtle-mt/v2/ff4.png

to check whether this could be explained by cpu boosting / turbo, let's compare to linux-native programs:
linux, 1 proc: 740 MiB/s
linux, 2 procs: 1444 MiB/s = 195% (close enough to 1480 MiB/s, 200%)
linux, 4 procs: 2470 MiB/s = 333% (a bit short of 2960 MiB/s, 400%)

Respective screenshots:
https://ocv.me/stuff/bugs/chrome/crypto-subtle-mt/v2/linux1.png
https://ocv.me/stuff/bugs/chrome/crypto-subtle-mt/v2/linux2.png
https://ocv.me/stuff/bugs/chrome/crypto-subtle-mt/v2/linux4.png

so even if we reduce our expectations to 333%
for 4 workers, none of the browsers currently reach the expected performance.

----

My real use case is slicing a large local file into 1 MiB chunks, then getting the sha512 checksum of each chunk. Since NVMe storage makes this CPU-bottlenecked, I launch several webworkers which receive commands to read 1 MiB from the file and then hash it. Only the file-read operation itself is synchronized across workers.

* On Firefox, this parallelizes acceptably, increasing the speed from 300 to 779 MiB/s (260%).
* On Chrome, the speed remains mostly unchanged, increasing only from 405 to 584 MiB/s (144%).

As a result, a fallback codepath which uses https://github.com/Daninet/hash-wasm instead of `crypto.subtle.digest` is actually faster than native sha512! The fallback was originally there to get around `crypto.subtle.digest` being unavailable for non-https sites, but now it serves an additional purpose...

Maybe there's a mutex in there to protect against some timing attack? That's the only explanation I could think of, at least.