PCIe 4.0 vs 5.0 SSDs: does doubling bandwidth actually make your PC faster?

PCIe 5.0 doubles per-lane bandwidth over PCIe 4.0 (4 GB/s vs 2 GB/s per lane), pushing x4 NVMe SSDs from ~7 GB/s to ~14 GB/s sequential. Early PCIe 5.0 drives (Phison E26-based, Samsung PM9E1) can saturate the link. The question is whether any real workload benefits from 14 GB/s when 7 GB/s already exceeds what most applications can consume.

Hardware tier
Storage
Persistent storage devices
Topic focus
PCIe 4.0 vs 5.0 SSDs
pcie-gen4-vs-gen5

How this is calculated

Sequential read speeds above 5 GB/s are only fully utilized by a handful of workloads: 8K video editing with uncompressed RAW footage, large dataset transfers between two PCIe 5.0 SSDs in the same machine, and AI training pipelines that stream massive datasets from local storage. For gaming, application launching, and general use, random 4K read performance is the bottleneck, and PCIe 5.0 drives are only marginally better than PCIe 4.0 drives at random reads (both are limited by NAND latency, not the interface). PCIe 5.0 SSDs also run hotter and often require active cooling. For most users in 2026, a good PCIe 4.0 drive is still the sweet spot.

Verdict

PCIe 5.0 SSDs are worth it for video professionals and AI/ML engineers who move multi-terabyte datasets. For everyone else, a top-tier PCIe 4.0 drive (Samsung 990 Pro, WD Black SN850X) is already fast enough that you won't notice the difference.

More Latency scenarios

Frequently asked questions

How much faster is L1 cache than RAM?
Roughly 40-60x faster for random access. L1 cache access is about 1 ns; DDR5 RAM is about 50-80 ns end-to-end including controller overhead. That's why keeping hot data in cache dominates real-world CPU performance.
Is NVMe SSD faster than RAM?
No. NVMe is fast for storage, but for random access its latency is around 50-150 µs versus RAM's 50-80 ns. That's a 1,000x gap. NVMe beats RAM only on raw capacity and persistence, never on latency.
Why is HDD so much slower than SSD?
A spinning HDD has to physically move a read head to the right track and wait for the platter to rotate into position, typically 5-15 ms per random access. An SSD has no moving parts and returns data in under 100 µs, roughly 100x faster for random reads.
What's the point of L3 cache?
L1 and L2 are tiny (KB to low MB) and per-core. L3 is much larger (tens of MB) and shared across cores, acting as a buffer before requests go to main RAM. It catches data evicted from L1/L2 and data shared between cores.
How many nanoseconds is one CPU cycle?
At 4 GHz, one cycle is 0.25 ns. At 5 GHz, 0.2 ns. Cache hits are measured in single-digit cycles; main memory access costs hundreds of cycles, which is why optimizing for cache locality matters enormously in performance-critical code.
Does DDR5 have lower latency than DDR4?
Not usually at the same relative tier. DDR5 improved bandwidth and capacity significantly, but true latency (in ns) for mainstream kits is similar to late-stage DDR4. The gains from DDR5 come from bandwidth and larger capacities, not lower memory latency.