In theory, a miss on a write-through cache could be handled by setting up the address and write-enable line to the cache, then latching the data (from main memory) into the cache RAM as soon as the data setup and hold times have been met. This should allow a cache miss to be filled from main memory without stalling the CPU any longer than a plain main-memory access would anyway. But a cache controller that fills multiple cache data words per miss (i.e. multiple data words are stored under each tag line), or that performs write-back caching, probably can't do this. Multiple main-memory cycles are needed to fill a miss in those cases (though the hit rate may go up significantly). In a direct-mapped write-back cache, if the victim line is dirty it must first be written out to memory before the memory read that satisfies the miss can even begin. In theory, a smart cache design could use a much wider main-memory bus and parallel cache data RAMs to increase the fill throughput, but when the cache data RAMs are deeper rather than wider I can't see how that would apply.
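The direct-mapped write-back miss sequence described above (check the tag, write back a dirty victim line first, then spend one memory cycle per data word filling the line) can be sketched in software. This is a toy simulator, not any particular controller's implementation; the names (`cache_read`, `cache_write`, `Line`) and the sizes (`NUM_LINES`, `WORDS_PER_LINE`) are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Toy direct-mapped write-back cache model (illustrative only). */
#define NUM_LINES      4    /* tag lines (sets)                  */
#define WORDS_PER_LINE 4    /* data words stored under each tag  */
#define MEM_WORDS      256  /* toy main memory size              */

typedef struct {
    int      valid, dirty;
    uint32_t tag;
    uint32_t data[WORDS_PER_LINE];
} Line;

static uint32_t mem[MEM_WORDS];
static Line     cache[NUM_LINES];
static int      writebacks;  /* dirty-victim write-backs performed */

/* Read one word through the cache, modeling the miss path:
 * dirty victim is written back BEFORE the fill read can begin. */
uint32_t cache_read(uint32_t addr)
{
    uint32_t offset = addr % WORDS_PER_LINE;
    uint32_t index  = (addr / WORDS_PER_LINE) % NUM_LINES;
    uint32_t tag    = addr / (WORDS_PER_LINE * NUM_LINES);
    Line *ln = &cache[index];

    if (!(ln->valid && ln->tag == tag)) {              /* miss */
        if (ln->valid && ln->dirty) {
            /* Write the dirty victim line back first. */
            uint32_t victim_base =
                (ln->tag * NUM_LINES + index) * WORDS_PER_LINE;
            memcpy(&mem[victim_base], ln->data, sizeof ln->data);
            writebacks++;
        }
        /* Fill: WORDS_PER_LINE memory words fetched for one miss. */
        uint32_t base = (addr / WORDS_PER_LINE) * WORDS_PER_LINE;
        memcpy(ln->data, &mem[base], sizeof ln->data);
        ln->valid = 1;
        ln->dirty = 0;
        ln->tag   = tag;
    }
    return ln->data[offset];
}

/* Write one word; the line is only marked dirty, not written through. */
void cache_write(uint32_t addr, uint32_t value)
{
    (void)cache_read(addr);  /* allocate on a write miss */
    uint32_t offset = addr % WORDS_PER_LINE;
    uint32_t index  = (addr / WORDS_PER_LINE) % NUM_LINES;
    cache[index].data[offset] = value;
    cache[index].dirty = 1;
}
```

Running a write and then touching a conflicting address shows the cost being discussed: the dirty data only reaches main memory when the line is evicted, so that miss pays for both a write-back and a multi-word fill before the CPU gets its data.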