8088/8086 microcode disassembly

I'm definitely going to use this to improve my 8088 emulator. I don't think I'll interpret the actual microcode instructions (that would be slow), but I do plan on structuring the emulator to work the same way that the microcode program does (in terms of which opcodes share code, what the subroutines are - things like that). That should simplify it substantially.
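
As a rough illustration of what "opcodes sharing code" could look like in an emulator, here is a minimal Python sketch. It relies only on the documented opcode encoding, where bits 3-5 of opcodes 0x00-0x3F select one of eight ALU operations. The function name `alu16` is made up for the example, flags are not modelled, and whether the microcode groups these opcodes in exactly this way is an assumption.

```python
# Hypothetical sketch: one shared handler for the eight classic ALU opcodes.
# Bits 3-5 of the opcode pick the operation; flags are not modelled here.
ALU_NAMES = ["ADD", "OR", "ADC", "SBB", "AND", "SUB", "XOR", "CMP"]

def alu16(opcode, dest, src, carry=0):
    """Return (mnemonic, new_dest) for a 16-bit ALU opcode in 0x00-0x3F."""
    # Low three bits select the operand form; 6 and 7 are other instructions.
    if opcode > 0x3F or (opcode & 7) > 5:
        raise ValueError("not an ALU opcode")
    op = (opcode >> 3) & 7
    name = ALU_NAMES[op]
    if name == "ADD":
        result = dest + src
    elif name == "OR":
        result = dest | src
    elif name == "ADC":
        result = dest + src + carry
    elif name == "SBB":
        result = dest - src - carry
    elif name == "AND":
        result = dest & src
    elif name in ("SUB", "CMP"):
        result = dest - src
    else:  # XOR
        result = dest ^ src
    if name == "CMP":
        return name, dest & 0xFFFF  # CMP only sets flags; dest is unchanged
    return name, result & 0xFFFF
```

For example, `alu16(0x29, 3, 5)` goes through the same handler as `alu16(0x01, 3, 5)` and differs only in the operation selected by the opcode bits.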
 
Remarkably, I've uncovered some undocumented instructions which nobody seems to have noticed in the more than 42 years since the 8086 was released!
Good to see that you didn't reveal to the world the undocumented divide-by-zero instruction, an instruction added at the request of certain government agencies, an instruction that results in the 8088/8086 quickly heating up to the point of destruction (and burning a hole through the motherboard). :)
 
The world is not yet ready for a singularity on every desktop!
 
Is the REPZ MUL complementing specific to the 8086/8088? Trying it on several later CPUs causes the REP to be ignored.

e.g.

Code:
    MOV     AX,3
    MOV     BX,2
    REPE
    MUL     BX

Always returns AX=6 on later CPUs. Haven't tried it on a V20/V30 yet.
 
Is the REPZ MUL complementing specific to the 8086/8088?

Probably! Any CPU that doesn't use the same microcode program is extremely unlikely to have the same behaviour - reusing the REP flag to mean "negate" (and not resetting it at the start of the multiply and divide operations) is not a particularly obvious design choice. I'd be surprised if it worked on V20/V30 as well - they have significantly faster multiplies so certainly an entirely different microcode routine.
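
For anyone wanting to model the difference in an emulator, here is a minimal Python sketch of the test case above. It assumes the REP flag negates the whole 32-bit DX:AX product on the 8086/8088 (the exact scope of the negation is an assumption), and it ignores flags. The function names are made up for the example.

```python
# Hypothetical sketch of 16-bit MUL with and without the REP quirk.
def mul16_8086(ax, operand, rep_prefix=False):
    """Return (dx, ax) after MUL on an 8086/8088, flags not modelled."""
    product = (ax & 0xFFFF) * (operand & 0xFFFF)
    if rep_prefix:
        # Assumption: the REP flag is reused as "negate" for the full product.
        product = -product
    product &= 0xFFFFFFFF
    return (product >> 16) & 0xFFFF, product & 0xFFFF

def mul16_later(ax, operand, rep_prefix=False):
    """Return (dx, ax) on a CPU that ignores the prefix (286 and later)."""
    return mul16_8086(ax, operand, rep_prefix=False)
```

With AX=3 and BX=2, `mul16_later(3, 2, rep_prefix=True)` gives AX=6 as reported above, while `mul16_8086(3, 2, rep_prefix=True)` gives the two's complement of 6 under the stated assumption.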
 
Very nice work.

With the Zilog Z80 CPU people just copied it...

With NEC, they had to change things!

I see this with some of the older NEC UARTS (NEC uPD7201). They are supposed to work the same way as the Intel devices they were copies of (if you don't use the 'enhanced mode') but we could never get them to work properly. I had to get some Intel 8274 silicon encapsulated - that solved my problems!

It wouldn't, therefore, surprise me at all that the NEC V20/V30 departs from the Intel microcode.

Dave
 
As everyone else in this thread has already said: Really impressive work! :)

This may be a stupid question, but how do you (and/or the CPU) determine where the entry point for a particular opcode or subroutine is?

---

Even more interesting (IMHO) would be the microcode for the 286 and 386. Could answer the question why a task switch in microcode is slower than doing the same thing in software, or how exactly ICE mode works.

Slight derail regarding ICE mode on the 286:
I found out why F1 0F 04 doesn't work on some machines: apparently, if there is any other bus request (i.e. memory refresh!) at some point while this instruction is executing, it locks up so badly that even a reset signal can't bring it back. With refresh disabled, it seems to be 100% reliable.

One thing relevant to the thread topic: I observed that ALU reg<->r/m type instructions don't use temporary registers anymore (at least not the ten dumped by 0F 04). But reg<->imm still does.

And a possible easter egg: immediately after reset, one register is loaded with 002A - answer to life, the universe and everything?
 
As everyone else in this thread has already said: Really impressive work! :)

This may be a stupid question, but how do you (and/or the CPU) determine where the entry point for a particular opcode or subroutine is?

There is a decoder immediately above the main microcode ROM - 11 bits for each chunk of 4 microinstructions. Each of those bits can be 0, 1 or "don't care" (indicated by the 0, 1 and ? characters on the right of the disassembly every 4th line). The incoming opcode is compared to each of these (simultaneously) and the correct starting position is selected. The process works similarly for finding the right place in the microcode for long jumps, subroutines, effective address handlers and the "group" instructions (opcodes 0xf6, 0xf7, 0xfe and 0xff, the microcode for which depends on the R field of the modrm byte).
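
The matching step can be sketched in a few lines of Python. This is only an illustration: the patterns and addresses in `DECODER` are invented, not taken from the real ROM, and a loop stands in for what the hardware does on all rows simultaneously (so it assumes at most one row matches, or that the first match wins).

```python
# Hypothetical sketch of the 0 / 1 / "don't care" match used by the decoder.
def matches(pattern, value):
    """True if value agrees with every '0' and '1' in pattern ('?' = any)."""
    width = len(pattern)
    bits = format(value & ((1 << width) - 1), "0{}b".format(width))
    return all(p == "?" or p == b for p, b in zip(pattern, bits))

def find_entry(table, value):
    """Return the microcode address of the first matching row, else None."""
    for pattern, address in table:
        if matches(pattern, value):
            return address
    return None

# Invented 11-bit rows; each address is the start of a chunk of 4 microinstructions.
DECODER = [
    ("00000000???", 0),
    ("0000001????", 4),
    ("1??????????", 8),
]
```

For instance, `find_entry(DECODER, 0b00000010000)` selects the second row because only the first seven bits are compared there.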

Even more interesting (IMHO) would be the microcode for the 286 and 386. Could answer the question why a task switch in microcode is slower than doing the same thing in software, or how exactly ICE mode works.

I may have to leave those for someone else to do. As well as not having a die image, I don't know if the microcode for these chips is documented as thoroughly in patents as that for the 8086. If not, it might be necessary to do hardware level reverse-engineering of the chip's circuitry (I didn't have to do any of that for this disassembly, other than extracting the bits).
 
And a possible easter egg: immediately after reset, one register is loaded with 002A - answer to life, the universe and everything?

Hitchhiker's was broadcast 8 March 1978. The 8086 design was started early 1976 and released on June 8, 1978. So while it is technically possible, it is highly improbable, especially since the chip's design would very likely have been finalized more than 3 months prior to availability.
 
I think that's a 286 easter egg - the reset microcode on the 8086 only loads all-ones (into CS) and all-zeros.
 
Yes, I was talking about the 286 there.

I tried the REP prefix with MUL and IMUL on newer generation chips:
- The 286 and presumably everything newer ignores it (instruction works normally)
- The 186 does something bizarre: the result is not negated, and not loaded into AX but instead BP (for IMUL) or SP (for MUL)!

e.g.
MOV AX, 2
MOV DX, 3
MOV BP, 0
REP IMUL DX

leaves AX and DX unchanged, BP=0006
 
Oh wow, that's really interesting and actually pretty useful to be able to put the product elsewhere!

I would be interested to know the effect of the REPNE prefix, what effect both prefixes have on DIV and IDIV, and what happens if you use an 8-bit operand instead of a 16-bit one.
 
REPNE has the same effect, and 8 bit multiplies put their result into AH (MUL) or CH (IMUL).
IDIV with the prefix negates the result just like on the 8086, DIV works the same as without prefix.

My guess is this was added for the "IMUL rw,ew,imm" instruction - the REP flag is internally set to mean "use bits 3-5 of modr/m byte as destination", then it can reuse the normal multiply microcode.
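
A quick way to check that guess against the observations is to decode bits 3-5 of the modr/m byte for the F6/F7 group, where MUL is /4 and IMUL is /5. The Python sketch below does only that lookup; the function name `rep_destination` is made up, and it models the guess rather than confirmed 186 internals.

```python
# Hypothetical sketch: where the result would land if the REP flag means
# "use bits 3-5 of the modr/m byte as the destination register".
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]

def rep_destination(modrm, word):
    """Return the register named by the reg field (bits 3-5) of modrm."""
    reg = (modrm >> 3) & 7
    return REG16[reg] if word else REG8[reg]

# Encodings: F7 /4 is MUL r/m16, F7 /5 is IMUL r/m16 (F6 for 8-bit forms).
MUL_DX = 0xE2   # mod=11, reg=100 (MUL),  r/m=010 (DX)
IMUL_DX = 0xEA  # mod=11, reg=101 (IMUL), r/m=010 (DX)
```

Here `rep_destination(IMUL_DX, word=True)` gives BP and `rep_destination(MUL_DX, word=True)` gives SP, and the 8-bit forms give CH and AH, which lines up with the results reported above.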

As to the usefulness, if you're programming for this specific chip, and want the result in exactly those registers, then it might be :)
 
Ah, of course!

Very interesting that IDIV works the same way as the 8086 despite divisions being much faster on the 186. I guess they still needed that extra bit of state, and the old technique wasn't broken so they didn't fix it! Though that does make me wonder where they stash the negate bit for IMUL and why they couldn't use it for IDIV too.
 