For the benefit of anyone reading this thread in the future, I found what causes this hang when I was writing
PADD: The slowdown during the first "DIR" is caused by DOS determining the number of free clusters by scanning the FAT. To save memory, it reads the FAT from the disk itself, sector by sector. Additionally, it updates some internal structures every time it finds a free cluster, which is very slow because DOS is optimized for size, not speed. The larger the FAT, the longer this takes. A partition that completely fills the FAT limit (any power of two, such as 128M, 256M, 512M, 1G, 2G) takes the longest to scan because the FAT is at or near its maximum size (~65,000 entries). Subsequent DIR times are fine, but if an INT 25h/26h (absolute disk read/write) is performed, the values are reset and boom, a DIR takes a minute again. (BTW, the very first version of DOS to implement >32M partitions, Compaq DOS 3.31, also has this hang.)
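To make the scan concrete, here is a rough Python sketch of counting free clusters in a FAT16 table one sector at a time. It is not DOS's actual code: `read_sector` is a hypothetical callback standing in for a disk read, and 512-byte sectors are assumed. It only shows why the work grows with the number of FAT entries.

```python
import struct

SECTOR_SIZE = 512  # assumed bytes per sector


def count_free_clusters(read_sector, fat_sectors, total_clusters):
    """Count free FAT16 entries, reading the FAT one sector at a time.

    read_sector(i) is a hypothetical callback that returns the i-th
    512-byte sector of the FAT as bytes.
    """
    free = 0
    entries_per_sector = SECTOR_SIZE // 2  # FAT16 entries are 2 bytes each
    for s in range(fat_sectors):
        sector = read_sector(s)
        entries = struct.unpack("<%dH" % entries_per_sector, sector)
        for i, entry in enumerate(entries):
            cluster = s * entries_per_sector + i
            if cluster < 2:
                continue  # entries 0 and 1 are reserved
            if cluster >= total_clusters + 2:
                return free  # past the last data cluster
            if entry == 0:  # 0x0000 marks a free cluster
                free += 1
    return free
```

With ~65,000 entries, that is about 256 sector reads plus one check per entry, all on a 4.77 MHz CPU.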
One way to cut this time down is to create partitions that are just over a power of two in size, so that when DOS creates the filesystem, the cluster size doubles and the FAT size halves. So, creating partitions that are (for example) 260M, or 518M, or 1.05G, etc., can help because there are only roughly 32K entries to slog through instead of roughly 64K.
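A quick back-of-the-envelope check of that, as a Python sketch. The cluster-size table is my assumption of the usual FAT16 defaults (2K clusters up to 128M, doubling at each power of two), and the math ignores the space taken by the boot sector, the FATs and the root directory, so real counts come out slightly lower.

```python
# Assumed default FAT16 cluster sizes: (partition size limit in MB, cluster size in KB)
FAT16_CLUSTER_KB = [(128, 2), (256, 4), (512, 8), (1024, 16), (2048, 32)]


def fat16_entries(partition_mb):
    """Return (cluster size in KB, approximate number of FAT entries)."""
    for limit_mb, cluster_kb in FAT16_CLUSTER_KB:
        if partition_mb <= limit_mb:
            return cluster_kb, (partition_mb * 1024) // cluster_kb
    raise ValueError("larger than the 2 GB FAT16 limit")


for size in (256, 260, 512, 518):
    cluster_kb, entries = fat16_entries(size)
    print(f"{size:>5} MB: {cluster_kb:>2} KB clusters, ~{entries} FAT entries")
```

Under these assumptions, going from 256M to 260M drops the estimate from about 65,500 entries to about 33,300, at the cost of more slack space per file because the clusters are bigger.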
This is only a problem on 8088/8086/NEC V20/NEC V30 systems. On anything faster, the delay is so short that most people don't notice it.