
8086 instruction exerciser

Svenska

Hi,

emulating a CPU is hard. No surprise there, I guess. :)

For the Intel 8080/8085 and Zilog Z80 processors, there are excellent instruction exercisers available (here for 8080/8085 or here for Z80), which will tell you if the CPU (or emulator) works correctly by testing all instructions and computing a checksum. They helped me a lot in fixing the last bugs in my own emulator (a quite fast i8080 emulator written in AVR assembly).

However, when looking at the 8086 (or newer), accurate information seems very hard to find. Intel's datasheets are incomplete or even wrong, and most emulators seem to stop at "it boots DOS and runs my games", so bugs in less commonly used instructions (e.g. BCD arithmetic) stay undetected.

Is there any instruction exerciser for the 8086 (or the 80286 real mode) available?

Best Regards,
Svenska
 
I had the same question many years ago and drew a blank.

I even went to Intel (from inside the large company I work for) and still got nowhere...

I use BOCHS as a simulator. I modified it for use with an Intel iSBC 286/10A CPU board in conjunction with an Intel iSBC 012CX memory board and some bespoke I/O equipment we use at work. The CPU card uses a couple of 8259 PICs and 8254 PITs (but at different locations to those in a PC) and doesn't have a screen, only an 8274 dual UART, so I had to modify the peripherals in BOCHS. This was quite simple in the end. We use our own multitasking real-time operating system and compiler tools, and I have successfully run these under BOCHS. Our operating system uses the 8259 PIC in a different mode to DOS, so I had to modify the PIC emulation to be more 'generic and Intel-faithful'.

Dave
 
That doesn't sound very promising, sadly.

Currently, I am building a tiny 8086/80186 emulator core and would like to be sure that it works correctly. While I could check against a true 80186, I wasn't really planning on building verification infrastructure...

It would be nice if something magically appeared.
 
Try this, it was written for an 80186 emulator but it only tests 8086 instructions - http://orbides.1gb.ru/80186_tests.zip

I took a short look, but these seem to just run the instructions and not test the results.


Anyway, I've had a more detailed look at how the original Z80/8080 exercisers work, and porting them to the 8086 can be done.
The per-instruction overhead is quite substantial (less so on the 8086), but the combinatorial explosion makes test design quite a challenge.

As an example, let's take the existing test for DAA/CPL/SCF/CCF (Z80) and compare to a comparable AAA/AAS/DAA/DAS (8086) test.

In both cases, all registers are initialized with random (but fixed) values.
The Z80 exerciser varies 2 instruction bits, 6 flag bits and 8 accumulator bits (16 bits total), generating 2^16 = 64k test cases.
The 8086 exerciser varies 2 instruction bits, 7 flag bits and 16 accumulator bits (25 bits total) to generate 2^25 = 32M test cases.

For this very simple instruction, the search space increases by a factor of 512. Running this on a true 8086 will be slow.
More advanced instructions will make this plainly infeasible (more and wider registers, many flexible addressing modes...).

A stock 8080 crunches through approximately 4000~5000 iterations per minute (~3 hours on the 750k iteration aluop test).

So, yeah... testing the ALU should be possible, but not with all combinations of inputs and outputs.
 
The search space can be reduced considerably by using bit patterns commonly used for memory testing: 55h, CCh, 00h, 01h, FEh, both by themselves as well as rotated 0-15 times as appropriate (ie. there's no point in rotating 00/FF, or CC more than 3 times, etc.). If you did this, you could test every opcode, not just the ALU.
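As a rough illustration of that idea, the following Python sketch builds the reduced operand set from the seeds listed above. The function names are hypothetical, and rotating within the operand width (8 or 16 bits, with the seed zero-extended) is an assumption about what "as appropriate" means. Collecting the results in a set drops the redundant rotations automatically.

```python
SEEDS = (0x55, 0xCC, 0x00, 0x01, 0xFE)

def rotl(value, count, width):
    """Rotate value left by count bits within a width-bit word."""
    mask = (1 << width) - 1
    count %= width
    return ((value << count) | (value >> (width - count))) & mask

def operand_patterns(width, seeds=SEEDS):
    """All distinct rotations of the seed patterns, sorted."""
    patterns = set()
    for seed in seeds:
        for count in range(width):
            patterns.add(rotl(seed, count, width))
    return sorted(patterns)

byte_patterns = operand_patterns(8)
word_patterns = operand_patterns(16)
print(len(byte_patterns), len(word_patterns))
print([hex(p) for p in byte_patterns])
```

For a two-operand 8-bit instruction, this cuts the operand space from 65536 pairs to the square of the byte pattern count, at the cost of no longer being exhaustive.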

You're right, it's highly impractical to perform a complete opcode exercise because it would take weeks to perform on real hardware.
 
Practically, you would either use a statistical testing method or break the problem down into smaller 'blocks' assuming that the blocks didn't interact too much (if at all).

For example, if you can decouple the registers themselves, from the logic of the ALU, from the addressing modes and from the instruction decode you could process/test these blocks separately.

A test of each register should demonstrate that you can set and clear each and every bit in each and every register. You would probably start off by setting, testing, clearing, testing each flag and then shifting a '1' into a register of 0000 and a '0' into a register of FFFF.
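One reading of that register test is a walking-ones and walking-zeros pass. The Python sketch below assumes that interpretation; `FakeRegister` is a hypothetical stand-in for whatever register access the emulator under test provides.

```python
def walking_ones(width=16):
    """0001, 0002, 0004, ...: a single 1 moving through a register of zeros."""
    return [1 << bit for bit in range(width)]

def walking_zeros(width=16):
    """FFFE, FFFD, FFFB, ...: a single 0 moving through a register of ones."""
    mask = (1 << width) - 1
    return [mask ^ (1 << bit) for bit in range(width)]

def check_register(write, read, width=16):
    """Return the patterns that did not read back correctly."""
    failures = []
    for pattern in walking_ones(width) + walking_zeros(width):
        write(pattern)
        if read() != pattern:
            failures.append(pattern)
    return failures

class FakeRegister:
    """Hypothetical 16-bit register; stuck_low forces the given bits to 0."""
    def __init__(self, stuck_low=0):
        self.value = 0
        self.stuck_low = stuck_low

    def write(self, value):
        self.value = value & 0xFFFF & ~self.stuck_low

    def read(self):
        return self.value

good = FakeRegister()
bad = FakeRegister(stuck_low=0x0008)
print(check_register(good.write, good.read))
print([hex(p) for p in check_register(bad.write, bad.read)])
```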

Next, you would pick a simple(ish) opcode (such as ADD) and make sure it worked in a simple (register) way.

You could then use the ADD to test the addressing modes out fully.

You could then test the various instructions themselves using the relatively simple register addressing mode.

Finally, you can reduce the testing on each instruction by concentrating on the boundary conditions and slipping a few random tests in amongst those.
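A minimal Python sketch of the last two steps, using ADD as the simple opcode: a reference model for 16-bit ADD with its six arithmetic flags, fed with boundary values plus a few random pairs. The choice of boundary values, the function names and the fixed random seed are all assumptions made for illustration.

```python
import random

# Hypothetical boundary values: zero, one, nibble carry, sign change, all ones.
BOUNDARY = (0x0000, 0x0001, 0x000F, 0x0010, 0x7FFF, 0x8000, 0xFFFE, 0xFFFF)

def add16(a, b):
    """Reference model of 16-bit ADD: returns (result, flags)."""
    full = a + b
    r = full & 0xFFFF
    flags = {
        "CF": full > 0xFFFF,                         # carry out of bit 15
        "PF": bin(r & 0xFF).count("1") % 2 == 0,     # even parity of low byte
        "AF": ((a ^ b ^ r) & 0x10) != 0,             # carry out of bit 3
        "ZF": r == 0,
        "SF": (r & 0x8000) != 0,
        "OF": ((a ^ r) & (b ^ r) & 0x8000) != 0,     # signed overflow
    }
    return r, flags

def operand_pairs(extra_random=4, seed=1):
    """Every boundary pair, plus a few reproducible random pairs."""
    rng = random.Random(seed)
    pairs = [(a, b) for a in BOUNDARY for b in BOUNDARY]
    pairs += [(rng.randrange(0x10000), rng.randrange(0x10000))
              for _ in range(extra_random)]
    return pairs

for a, b in operand_pairs()[:3]:
    print(hex(a), hex(b), add16(a, b))
```

The results of such a model would still need to be confirmed against real hardware before being trusted as "known good".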

It would be good for someone to come up with an x86 validation test I must admit...

I do suspect I know who has the Intel test vectors for the 8086 CPU... The problem is, they will not want to release them :-(!

Dave
 
An x86 validation test would be extensive since the behavior of the x86 line changed with every CPU. It is more practical, IMO, to pick a single CPU and concentrate on just that until all of the bugs are worked out.

Is the goal of the 8086 exerciser to find bugs in emulation? If so, then you'd need more than an ALU opcode tester. For example:

REP with segment overrides doesn't resume properly after an interrupt
POP CS is possible on 8086
Interrupt behavior after MOV SS vs. MOV any_segment_register
Behavior of 8D C2 ("LEA AX, DX") (See Raúl Gutiérrez Sanz's comments here: http://www.os2museum.com/wp/undocumented-8086-opcodes/comment-page-1/#comments )

...etc.

I do suspect I know who has the Intel test vectors for the 8086 CPU... The problem is, they will not want to release them

Couldn't hurt to ask. If they don't want to release the exact code, maybe they can release the methodology. The CPU is certainly beyond economic recovery for Intel.
 
I suspect that you're not interested in a stress test, as you're working with an emulator.

How about Sandsifter for exposing non-documented features? Run it on a "real" 8086, then run it on your emulator. Compare results.

In particular, have a look at the whitepaper under the "references" branch.
 
Sandsifter requires a 386+ because it relies on invalid opcode exceptions (and while the 286 has those, Sandsifter is written in Python, so it still needs a 386-class system to run on). So you can't run it on an 8086.

I know you wrote "real" 8086, but I guess I'm not sure what you meant?
 
I meant an 8086 licensed from Intel. Remember the Intel vs. NEC copyright lawsuit? The V20 doesn't execute some x86 instructions the way a real 8088 does, but it didn't seem to hurt sales. Similarly, differences in the way shift counts are treated in the 8086 vs. everything else didn't seem to bother anything when later CPUs came along.
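The shift count difference can be shown with a small Python model. It relies on the 80186 and later masking the count to its low 5 bits, while the 8086/8088 use the full count from CL. The function name and the `mask_count` parameter are made up for this sketch, and only the result is modelled, not the flags.

```python
def shl16(value, count, mask_count):
    """16-bit SHL result. mask_count=True models the 80186-and-later
    behaviour of using only the low 5 bits of the shift count."""
    if mask_count:
        count &= 0x1F
    return (value << count) & 0xFFFF

# A count of 32 shifts everything out on an 8086,
# but is treated as a count of 0 on later CPUs.
print(hex(shl16(0x0001, 32, mask_count=False)))
print(hex(shl16(0x0001, 32, mask_count=True)))
```

An exerciser that varies the full 8-bit count in CL would therefore produce different checksums on an 8086 and an 80186, even if both are working correctly.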

The more interesting question is "can one emulate an 8086 not only with respect to instructions, but also with respect to real-world timing?" In other words, can your emulated 8086 run at precisely the same speed as an 8 MHz 8086?

I'm not optimistic about that one. However, there is an FPGA implementation of an 8086/8088 running at 4.77 MHz. It's apparently done by running a simple 7-instruction 32-bit core at 100MHz. He claims it as a drop-in replacement and offers BIU modules for both the 88 and 86.
 
The search space can be reduced considerably by using bit patterns commonly used for memory testing: 55h, CCh, 00h, 01h, FEh, both by themselves as well as rotated 0-15 times as appropriate (ie. there's no point in rotating 00/FF, or CC more than 3 times, etc.).
The existing 8080/Z80 exercisers use a "test vector" (instruction bytes, register values including flags, a memory operand) and two additional vectors: one which is combinatorially explored, and one where every bit is flipped once. So a trade-off between coverage and execution time is possible. On the other hand, I think that covering the complete (relevant) search space is necessary to some extent: the 8080 exerciser was still able to uncover bugs even after all contemporary test programs were happy (the aux carry flag is hard to get right).
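The following Python sketch shows one plausible way to combine a base vector with a "counter" mask and a "shifter" mask and fold the results into a checksum. It is not the code of the original exercisers: the whole machine state is squeezed into a single integer, the way the two masks are interleaved is an assumption, and zlib's CRC-32 merely stands in for whatever checksum is actually used.

```python
import zlib

def bits_of(mask):
    """The individual set bits of mask, lowest first."""
    return [1 << i for i in range(mask.bit_length()) if (mask >> i) & 1]

def counter_values(mask):
    """Every combination of the bits set in mask (2**k values)."""
    values = [0]
    for bit in bits_of(mask):
        values += [v | bit for v in values]
    return values

def shifter_values(mask):
    """No bit flipped, then each bit of mask flipped once on its own."""
    return [0] + bits_of(mask)

def test_cases(base, counter_mask, shifter_mask):
    """Yield the base vector with counter and shifter bits XORed in."""
    for s in shifter_values(shifter_mask):
        for c in counter_values(counter_mask):
            yield base ^ c ^ s

def run(execute, base, counter_mask, shifter_mask, state_bytes=4):
    """Execute every case and fold the resulting state into one CRC."""
    crc = 0
    for case in test_cases(base, counter_mask, shifter_mask):
        result = execute(case)
        crc = zlib.crc32(result.to_bytes(state_bytes, "little"), crc)
    return crc

# Hypothetical "instruction": the low 16 bits of the state pass through.
reference = run(lambda s: s & 0xFFFF, 0x1234, 0x00FF, 0x0F00)
buggy = run(lambda s: (s & 0xFFFF) ^ (s == 0x1235), 0x1234, 0x00FF, 0x0F00)
print(hex(reference), hex(buggy), reference == buggy)
```

With k counter bits and s shifter bits this yields 2^k * (s + 1) cases, so moving a bit from the counter mask to the shifter mask is the knob for trading coverage against run time.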

If you did this, you could test every opcode, not just the ALU.
I do not aim for complete opcode coverage. Some instructions must work for the exerciser to work at all (e.g. JMP, Jxx), others depend on the environment (e.g. I/O, INT/IRET) or on behaviour beyond the application (e.g. the TRAP flag). Also, the existing exerciser structure does not work well for most addressing modes.

Practically, you would either use a statistical testing method or break the problem down into smaller 'blocks' assuming that the blocks didn't interact too much (if at all).
I have looked at a few emulator designs, but am by no means smart. My approach is strongly table-driven, trying to unify as much of the instruction behaviour as possible. Other approaches treat each opcode separately, implementing similar behaviour in many places (either manually or through macros). Then, there are JIT approaches, which generate code at runtime, possibly fusing instruction groups. All of these approaches will result in different degrees of interaction between blocks.
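To illustrate what "table-driven" can mean here (this is not the poster's actual design), the Python sketch below decodes the regular block of 8086 ALU opcodes 00h to 3Fh from two small tables. Bits 5..3 of the opcode select the operation and bits 2..0 the operand form; the names `ALU_OPS`, `ALU_FORMS` and `decode_alu` are invented for this example.

```python
# Operation selected by opcode bits 5..3.
ALU_OPS = ("ADD", "OR", "ADC", "SBB", "AND", "SUB", "XOR", "CMP")

# Operand form selected by opcode bits 2..0 (values 6 and 7 are other opcodes).
ALU_FORMS = ("r/m8, r8", "r/m16, r16", "r8, r/m8", "r16, r/m16",
             "AL, imm8", "AX, imm16")

def decode_alu(opcode):
    """Return (operation, operand form) for the regular ALU opcodes,
    or None if the opcode is not part of that block."""
    if opcode > 0x3F or (opcode & 7) > 5:
        return None
    return ALU_OPS[opcode >> 3], ALU_FORMS[opcode & 7]

for opcode in (0x00, 0x29, 0x3D, 0x27):
    print(hex(opcode), decode_alu(opcode))
```

Two tables cover 48 opcodes this way, which is the kind of sharing that makes a bug in one shared path show up in many instructions at once.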

It would be good for someone to come up with an x86 validation test I must admit...
I don't think that is feasible. The x86 space is far more varied than the 8080/Z80 space is, and behaviour varies a lot between vendors and implementations (e.g. CPUID or RDTSC). Specifically, I only care about a subset of 16-bit real mode - and am trying to ignore implementation differences.

Is the goal of the 8086 exerciser to find bugs in emulation?
Partly. The main goal is aiding the implementation of an emulator. I wouldn't be surprised to uncover bugs in existing emulators, either. My main project requires an x86-compatible CPU core to run a few DOS applications in a restricted environment. I do not plan on being PC-compatible (not more than necessary) or to even support evil software tricks - there are other projects better suited (PCem comes to mind). Basically, some form of DOSBox tailored to my specific needs.

If so, then you'd need more than an ALU opcode tester. For example:

REP with segment overrides doesn't resume properly after an interrupt
POP CS is possible on 8086
Interrupt behavior after MOV SS vs. MOV any_segment_register
Behavior of 8D C2 ("LEA AX, DX") (See Raúl Gutiérrez Sanz's comments here: http://www.os2museum.com/wp/undocumented-8086-opcodes/comment-page-1/#comments )

...etc.
I agree, but none of these behaviours are easily testable within the existing exerciser framework. Also, they test for very specific behaviour of the 8086, which is only useful if you want to accurately emulate that very specific processor including all of its quirks. Software written to run on x86 (rather than 8086 or "IBM 5150") should not rely on these behaviours at all. Also, I will exclude any attempt at testing timing behaviours or limitations with self-modifying code (outside of the exerciser itself, that is).

I suspect that you're not interested in a stress test, as you're working with an emulator.
No, although I will need to be able to run the exerciser on a real machine in order to get "known good" results. I won't be able to test on a true 8086 either - only a single 80186 and 80486SX each (possibly an 80486SL if I get the machine to work).

How about Sandsifter for exposing non-documented features? Run it on a "real" 8086, then run it on your emulator. Compare results.
The approach taken by Sandsifter relies on invalid instructions and page faults, neither of which exist on 8086/80186. Again, this is interesting and useful work if one wants to recreate a "true" CPU, but this is not my goal.

The more interesting question is "can one emulate an 8086 not only with respect to instructions, but also with respect to real-world timing?"
I don't see any reason why this shouldn't be possible. By now, it should be technically feasible to simulate the entire PC platform, starting from the 14.3 MHz main crystal.
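As a sketch of what "starting from the main crystal" means on the original IBM PC: the 14.31818 MHz crystal is divided by 3 for the 4.77 MHz CPU clock and by 12 for the 8253 timer input. The Python below uses exact fractions; the constant and function names are made up for illustration.

```python
from fractions import Fraction

CRYSTAL_MHZ = Fraction(315, 22)  # 14.31818... MHz, four times the NTSC colorburst
CPU_DIVIDER = 3                  # 4.77 MHz CPU clock
PIT_DIVIDER = 12                 # 1.193 MHz timer input clock

cpu_mhz = CRYSTAL_MHZ / CPU_DIVIDER
pit_mhz = CRYSTAL_MHZ / PIT_DIVIDER

def crystal_ticks(cpu_cycles):
    """Master clock ticks consumed by a number of CPU cycles."""
    return cpu_cycles * CPU_DIVIDER

def pit_ticks(cpu_cycles):
    """Whole timer ticks elapsed after a number of CPU cycles."""
    return crystal_ticks(cpu_cycles) // PIT_DIVIDER

print(float(cpu_mhz), float(pit_mhz))
print(pit_ticks(1000))
```

Counting everything in master clock ticks keeps the CPU and timer in a fixed integer ratio, so no rounding error accumulates between them.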

In other words, can your emulated 8086 run at precisely the same speed as an 8 MHz 8086?
Nope, not even trying. The emulator is far from done (hence this side project), but I already know that the Trap Flag does not behave correctly. Also, any undefined or FPU instruction (including POP CS) will instantly kill the core, and additional inaccuracies will definitely appear as well.

My 8080 core was originally written in AVR assembly. That version fits into 3 KB of flash memory and should run about as fast as a 2 MHz 8080 (which I cannot verify). If I can squeeze an 8086/80186 core into the same size region, it could be used to run an original VGA BIOS...
 
I'm having a bit of trouble determining your real objective, please forgive my density.

Why not use the DOSBox code as a starting point, if you want to emulate a complete PC or part thereof?
 
I do not aim for complete opcode coverage
...
I only care about a subset of 16-bit real mode - and trying to ignore implementation differences
...
My main project requires an x86-compatible CPU core to run a few DOS applications in a restricted environment. I do not plan on being PC-compatible (not more than necessary)

Your stated goals conflict with the whole point of writing an opcode exerciser (to be as accurate as possible). Why not just fork DOSBOX and be done?
 
Well, while an AVR might be a decent choice for an 8080 emulator, I'm not sure that it's the best fit for an 8088 with its preponderance of 16-bit operations.

MOHO only.
 
Why not use the DOSBox code as a starting point, if you want to emulate a complete PC or part thereof?
Because DOSBox lacks a crucial feature (SHARE support without booting DOS). It cannot be implemented easily either.

Your stated goals conflict with the whole point of writing an opcode exerciser (to be as accurate as possible).
Please look at the existing exercisers, their use and limitations. Neither suffices to fully reproduce its targeted CPU, yet they are extremely valuable tools when implementing emulators for said CPUs. On the other hand, it is impossible to accurately exercise all opcodes without also requiring the complete environment within which the CPU operates, especially from within the system itself.

The requirements of my main project (i.e. the emulator itself) are independent of the exerciser. If there was a good way to verify that I did not **** up the flag handling (as an example), then I would not need to bother in the first place - see the original post.

Why not just fork DOSBOX and be done?
Because it isn't that easy. Also, where'd be the fun? The prototype (which runs CP/M) consists of a 23 KB executable and requires 260 KB of memory.

Well, while an AVR might be a decent choice for an 8080 emulator, I'm not sure that it's the best fit for an 8088 with its preponderance of 16-bit operations.
It wouldn't need to be fast to be useful. After all, the VGA BIOS is mainly used for initialization and modesetting. The alternative involves tracing the hardware accesses and reproducing them within the AVR - which is always card-specific.
 