I have seen 2-4 seconds on fast storage, like the aforementioned NVMe + Cortex-A53 setup. For a booted operating system, you can enable transparent filesystem compression for similar benefits. Modern filesystems such as btrfs and ZFS support this out of the box. Perhaps that would get you the speedup you’re after.
Further, when you run an ELF binary a second time, it is likely still cached in RAM from the previous run, unless its pages have been evicted in the meantime.
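If you want to see the caching effect for yourself, here is a rough sketch that reads the same file twice and times both passes. The helper name `time_two_reads` and the 1 MiB chunk size are my own choices for illustration, not anything standard. The first pass is only "cold" if the file was not already cached, so treat the numbers as indicative rather than as a proper benchmark.

```python
import time


def time_two_reads(path, chunk_size=1 << 20):
    """Read `path` twice in full and return (first_seconds, second_seconds).

    Hypothetical helper for illustration: the second read is usually served
    from the OS page cache, so it tends to be faster than the first.
    """
    timings = []
    for _ in range(2):
        start = time.perf_counter()
        with open(path, "rb") as f:
            # Read in chunks so large binaries do not have to fit in memory.
            while f.read(chunk_size):
                pass
        timings.append(time.perf_counter() - start)
    return tuple(timings)
```

Call it as `time_two_reads("/path/to/your/binary")` and compare the two values. On a system with enough free RAM, the second one is typically much smaller.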
Also, there are zram and zswap, swap-like mechanisms based on the principle that compressing memory is much faster than going to disk. They can further reduce the chance of a round trip to the disk.
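To get a feel for why compressing pages in RAM can pay off, here is a small sketch that compresses a buffer one page at a time and reports the ratio and the time taken. It is only an approximation of the idea: zlib stands in for the kernel's compressors (zram and zswap use algorithms such as LZO, LZ4 or zstd), the 4096-byte page size is an assumption, and `compress_pages` is a made-up name.

```python
import time
import zlib

PAGE_SIZE = 4096  # assumed page size; check your platform


def compress_pages(data, level=1):
    """Compress `data` page by page and return (ratio, seconds).

    Hypothetical helper: zlib is only a stand-in for the kernel compressors.
    """
    if not data:
        raise ValueError("data must not be empty")
    start = time.perf_counter()
    compressed_size = 0
    for offset in range(0, len(data), PAGE_SIZE):
        page = data[offset:offset + PAGE_SIZE]
        compressed_size += len(zlib.compress(page, level))
    elapsed = time.perf_counter() - start
    # A ratio above 1 means the pages shrank.
    return len(data) / compressed_size, elapsed
```

Feed it something representative of your workload, for example the bytes of the binary you are loading, and compare the elapsed time with how long a read of the same size takes on your storage. Data that is already compressed or random will barely shrink.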