1. Consider benchmarking new hardware as well. For me, there are already a few good-enough benchmarks for the 8088 through 80486, e.g. CheckIt and Landmark, but they crash on modern hardware, and I would really like to know how much faster today's Xeon boxes are than an IBM 5150. Of course, I understand that benchmarking modern CPUs using 8086 instructions has some limitations, but I still hope for a not-quite-useless common benchmark for *ALL* PCs...
I promise that I will test and verify operation on my Core i7. The numbers will be quite comical, but I promise it will not crash.
How you run a DOS executable in Windows 7 and/or Vista, however, is your problem.
BTW, Landmark, CheckIt, etc. weren't that good, and neither was Norton SI. They spent too much time trying to figure out how to estimate a "relative to 5150" value while still taking advantage of 286, 386, etc. instructions, had inaccurate results and/or crashed when the machine got too fast, etc. It is my frustration with those benchmarks that partially prompted me to start my own.
2. Consider benchmarking the FPU as another test suite, perhaps along with some emulation, to make it visible how much slower floating-point calculations are without an FPU.
I thought about that, but there are hardly any DOS games (the impetus of the benchmark) that use floating point. Also, floating-point benchmarks already exist (Whetstone, etc.). Finally, the only emulation library I could compare with would be Norbert Juffa's lib (the one I use with my development environment), so I'm not sure how useful the comparison would be.
3. Consider benchmarking not-quite-compatible PCs. E.g. I would like to run this benchmark on an Atari Portfolio, a palmtop with a 40x8 screen, and some other oddities...
The benchmark will require 80x25 or better to run the main display, but I do plan on including command-line options for taking a system's metrics and storing them in the database using only the console. You could snapshot your Portfolio using command-line options, then take the database to another machine to compare to other systems.
The benchmark will do all timing through the 8253 timer, but since its 16-bit counter wraps roughly every 55 ms, I will most likely use the BIOS tick count variable as well to time longer intervals. If anything is running that screws with the BIOS tick count, the results will not be accurate...
...actually, you've just given me an idea -- I can rely on ONLY the 8253 timer using interrupt-on-terminal-count and that way I can guarantee an accurate run without having to rely on the BIOS tick variable. Awesome, thanks for the idea.
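For anyone following the timing discussion, here is a rough sketch of the arithmetic involved, written in Python purely for illustration (the benchmark itself is a DOS program, and none of this is its actual code). It assumes the standard PC timer input clock of about 1.193182 MHz and a counter that decrements by one per clock, as in the 8253's mode 0 or mode 2; the BIOS default square-wave mode counts differently. The names `PIT_HZ`, `wrap_period_ms` and `elapsed_us` are made up for the example.

```python
# Illustrative arithmetic only; names and structure are hypothetical.
PIT_HZ = 1_193_182         # 8253 input clock on PC compatibles, in Hz
COUNTS_PER_WRAP = 65_536   # a 16-bit down-counter loaded with 0 runs 65536 counts


def wrap_period_ms():
    """Time for one full countdown of the 16-bit counter, in milliseconds."""
    return COUNTS_PER_WRAP / PIT_HZ * 1000.0


def elapsed_us(wraps, latched_count, reload=COUNTS_PER_WRAP):
    """Convert a (wraps, counter) pair to elapsed microseconds.

    wraps         -- completed countdowns, e.g. counted by an interrupt
                     handler or taken from a tick variable (assumption)
    latched_count -- current value read from the down-counter, where
                     `reload` means "just started" and 0 means "about to expire"
    """
    counts = wraps * reload + (reload - latched_count)
    return counts * 1_000_000 / PIT_HZ


if __name__ == "__main__":
    print(f"one countdown lasts {wrap_period_ms():.3f} ms")
    print(f"one count lasts {elapsed_us(0, COUNTS_PER_WRAP - 1):.3f} us")
```

The point is that the counter itself resolves well under a microsecond; the 55 ms figure is how long it runs before wrapping, so something (the BIOS tick variable or the benchmark's own interrupt handler) has to count the wraps for longer measurements.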