8086 emulator tech discussion

oh my lookie here.. finally!

fake86-rombasic-new2.png


:cool:
 
Congratulations! And also, you've won an award for the most screen captures in a thread .. ;-0

On an administrative note, please consider posting thumbnails that link to a full size image, and cut your .sig down a little bit - we're trying to keep things readable and have asked others to do the same recently.
 
sorry for all the screenshots. i'm a bit proud of this program. i had given up and stopped work on it for a couple weeks, but i found my second wind. i'll shrink them in the future. it doubles up all the pixels anyway so i could shrink it 50% and you could still see everything.

drop on by my IRC channel soon again mike. we haven't seen you in a while. ;-0
 
Congratulations, Mike.

What's your next step? Convert the emulator into VHDL and dump it into an FPGA? :)

lol. never used either of those, but FPGAs look really cool. what i will do once DOS completely boots is port the entire thing over to C. it will be quicker, and portable. FB compiles for windows, linux, and 32-bit DOS. nothing non-x86 though. it's an excellent language for easily experimenting with code, and there's minimal fuss if i need to make major changes. can't beat it for prototyping stuff. lots of modern additions on top of the QB compatibility. pointers, unsigned data types, etc.

what's kind of funny, is that if i really wanted to, i could take the Fake86 code, spend about 15 minutes making minor changes, drop the emulated memory size to say 32 KB, then compile it in QuickBASIC 4.x/7.1 and run it on an 8088. oh god................


edit: i have it report total instructions executed, and average IPS upon exit. here's what it does with video rendering disabled on a 3.2 GHz Phenom II X4 (using only one of the cores of course) - this was measured as it executed alley cat for a few seconds.

Code:
Total instructions executed: 189,954,994
Average instructions/sec: 7,550,172

with video rendering, it dips down to about 7.2 million running the same code. so, yeah it could be faster. i'd call it acceptable for most things though.
 
Very Nice!

Hi Mike,

Congratulations on getting your emulator to this stage! Really nice effort!

Doesn't matter that you wrote it in FreeBASIC. The horsepower gap between classic and current is so wide, I reckon you probably could've written that emulator in MS QBASIC, and still emulate a 4.77MHz 8088 system on any modern PC close enough in speed to the real thing, IMHO..

Like the CGA implementation - palettes look good :cool:


Regards,
Valentin Angelovski
 
thanks valentin! very true about the speed. at 3.2 GHz it's roughly the speed of a slow 286. i've always wondered how tough it would be to code an x86 emulator. now i know. very! :eek:

thanks to this project, i'm comfortable with writing assembly now since i got to learn exactly what each operation does.
 
Fake86 running atop a REAL 8088???

thanks valentin! very true about the speed. at 3.2 GHz it's roughly the speed of a slow 286. i've always wondered how tough it would be to code an x86 emulator. now i know. very! :eek:

Indeed it is, but the most important thing is you now have a good portion of it working!

BTW, if you ever feel the need for speed(tm), here are a few more tips:

1.) Probably an obvious one, I found it matters when and how you calculate the equivalent address. One method I use to speed things up here for my emulator is to perform the 4-bit left shift portion of the EA calculation only when a jump/call is executed. This saves time, since one would normally fully calculate the equivalent address, i.e. EA = (CS*16) + IP, at every opcode fetch..

2.) Fetching opcode bytes four at a time, in four-byte 'chunks', helps speed things up considerably on any inline assembly executed on the virtual CPU. For example, my Flea86 XT went from 130 KIPS to more than 320 KIPS average instruction throughput on this technique alone!
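
A minimal C sketch of tip 2, for illustration only. It assumes a flat 1 MB `ram` array standing in for whatever slower memory the real hardware uses; `refill`, `fetch_opcode_byte`, `write_byte` and the `refills` counter are hypothetical names, not taken from Flea86 or Fake86. The idea is simply that one slow access loads four bytes, and the next three fetches are served from the small buffer.

```c
#include <assert.h>
#include <stdint.h>

#define RAM_SIZE 0x100000u             /* 1 MB real-mode address space */

static uint8_t  ram[RAM_SIZE];         /* hypothetical emulated memory */
static uint8_t  chunk[4];              /* the four prefetched opcode bytes */
static uint32_t chunk_base  = 0;       /* linear address of chunk[0] */
static int      chunk_valid = 0;
static unsigned refills     = 0;       /* counts the "slow" memory accesses */

/* Load four consecutive bytes starting at addr (wrapping at 1 MB). */
static void refill(uint32_t addr)
{
    for (uint32_t i = 0; i < 4; i++)
        chunk[i] = ram[(addr + i) & (RAM_SIZE - 1)];
    chunk_base  = addr;
    chunk_valid = 1;
    refills++;
}

/* Return the opcode byte at a linear address, refilling only on a miss.
   A jump needs no special handling: the target just misses the chunk. */
static uint8_t fetch_opcode_byte(uint32_t addr)
{
    addr &= RAM_SIZE - 1;
    /* unsigned subtraction also catches addr < chunk_base */
    if (!chunk_valid || addr - chunk_base >= 4u)
        refill(addr);
    assert(addr - chunk_base < 4u);
    return chunk[addr - chunk_base];
}

/* Memory writes must drop the chunk if they touch it,
   otherwise self-modifying code would execute stale bytes. */
static void write_byte(uint32_t addr, uint8_t value)
{
    addr &= RAM_SIZE - 1;
    ram[addr] = value;
    if (chunk_valid && addr - chunk_base < 4u)
        chunk_valid = 0;
}
```

Fetching eight sequential bytes through `fetch_opcode_byte` costs two refills instead of eight accesses. Whether that pays off depends on the host: the gain quoted above was measured on Valentin's hardware, and an emulator whose memory is already a plain array on a PC may see little or none.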


thanks to this project, i'm comfortable with writing assembly now since i got to learn exactly what each operation does.

Quite a heck of a way to learn though, isn't it?
I mean, couldn't one just open up a book and read up on it or something? ;-)

Seriously though what I find most satisfying about doing this kind of project: Awesome feeling one gets after finding some silly state-behavioral problem in the CPU (or as in my case was it in the BIOS/chipset/hardware?) after stewing over the bug for (sometimes) up to days at a time.

what's kind of funny, is that if i really wanted to, i could take the Fake86 code, spend about 15 minutes making minor changes, drop the emulated memory size to say 32 KB, then compile it in QuickBASIC 4.x/7.1 and run it on an 8088. oh god................

That made me laugh so hard, once I digested the meaning of that statement. :happy7:

In any case, it sounds like you're having fun as well and there's still some more fun in store for you. :) Once again great work!


Cheers,
Valentin Angelovski

PS: Just for the record, I also tried running other classic home computer emulators on top of my emulator (pretty silly huh?). Whilst they ran very slow, they might actually run ok if I had more processor 'oomph' available. Some of these early vintage computer emulators included Tandy (CoCo1/2), Apple II and Sinclair (ZX80/81)..
 
hmm, yeah. the way i'm doing it now, i fetch an opcode byte and any prefixes for it like this:

Code:
reptype = 0
useseg = ds
segoverride = 0

'keep fetching bytes until something other than a prefix turns up
Do
    savecs = cs: saveip = ip: opcode = getmem8(cs, ip): StepIP 1
    totalexec = totalexec + 1

    'segment prefix check
    Select Case opcode
        Case &H2E 'segment CS
            useseg = cs: segoverride = 1
        Case &H3E 'segment DS
            useseg = ds: segoverride = 1
        Case &H26 'segment ES
            useseg = es: segoverride = 1
        Case &H36 'segment SS
            useseg = ss: segoverride = 1

    'repetition prefix check
        Case &HF3 'REP/REPE/REPZ
            reptype = 1
        Case &HF2 'REPNE/REPNZ
            reptype = 2
        Case Else 'not a prefix, so this is the real opcode
            Exit Do
    End Select
Loop

and then what follows is of course a big select case statement with all the other possible opcodes, and if the opcode being executed has a ModRegRM byte, i call my ModRegRM function which looks like this and calculates the EA:

Code:
Sub modregrm()
    'split the ModRegRM byte into its three fields
    temp1 = getmem8(cs, ip): StepIP 1
    mode = temp1 Shr 6
    reg = (temp1 Shr 3) And 7
    rm = temp1 And 7

    'fetch the displacement, and default to SS when BP is part of the address
    Select Case mode
        Case 0
            If rm = 6 Then Disp = getmem16(cs, ip): StepIP 2
            If ((rm = 2) Or (rm = 3)) And (segoverride = 0) Then useseg = ss
        Case 1
            Disp = signextend(getmem8(cs, ip)): StepIP 1
            If ((rm = 2) Or (rm = 3) Or (rm = 6)) And (segoverride = 0) Then useseg = ss
        Case 2
            Disp = getmem16(cs, ip): StepIP 2
            If ((rm = 2) Or (rm = 3) Or (rm = 6)) And (segoverride = 0) Then useseg = ss
        Case Else
            Disp = 0
    End Select

    'calculate the EA (mode 3 is register-to-register, so no EA needed)
    Select Case mode
        Case 0
            Select Case rm
                Case 0: EAptr = (useseg Shl 4) + getreg16(bx) + si
                Case 1: EAptr = (useseg Shl 4) + getreg16(bx) + di
                Case 2: EAptr = (useseg Shl 4) + bp + si
                Case 3: EAptr = (useseg Shl 4) + bp + di
                Case 4: EAptr = (useseg Shl 4) + si
                Case 5: EAptr = (useseg Shl 4) + di
                Case 6: EAptr = (useseg Shl 4) + Disp
                Case 7: EAptr = (useseg Shl 4) + getreg16(bx)
            End Select
        Case 1, 2
            Select Case rm
                Case 0: EAptr = (useseg Shl 4) + getreg16(bx) + si + Disp
                Case 1: EAptr = (useseg Shl 4) + getreg16(bx) + di + Disp
                Case 2: EAptr = (useseg Shl 4) + bp + si + Disp
                Case 3: EAptr = (useseg Shl 4) + bp + di + Disp
                Case 4: EAptr = (useseg Shl 4) + si + Disp
                Case 5: EAptr = (useseg Shl 4) + di + Disp
                Case 6: EAptr = (useseg Shl 4) + bp + Disp
                Case 7: EAptr = (useseg Shl 4) + getreg16(bx) + Disp
            End Select
    End Select
End Sub
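
Since the plan is to port to C, here is a rough sketch of how the same ModRegRM decoding could look there. This is not Fake86's actual code: `struct cpu`, `split_modrm`, `ea_offset`, `uses_ss` and `linear` are hypothetical names, and the displacement is assumed to be already fetched and sign-extended by the caller. One thing the sketch makes explicit is that on an 8086 the 16-bit offset wraps at 64 KB before the segment base is added, and the final address wraps at 1 MB; the BASIC snippet doesn't show its variable types, so it may or may not already behave that way.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register file, only the registers used for addressing. */
struct cpu { uint16_t bx, bp, si, di; };

struct modrm { uint8_t mode, reg, rm; };

/* Split the ModRegRM byte: mode = bits 7-6, reg = bits 5-3, rm = bits 2-0. */
static struct modrm split_modrm(uint8_t b)
{
    struct modrm m;
    m.mode = (uint8_t)(b >> 6);
    m.reg  = (uint8_t)((b >> 3) & 7);
    m.rm   = (uint8_t)(b & 7);
    return m;
}

/* 16-bit offset of a memory operand. Only valid for mode 0..2
   (mode 3 means the operand is a register, so there is no address). */
static uint16_t ea_offset(const struct cpu *c, struct modrm m, uint16_t disp)
{
    uint16_t base;
    assert(m.mode < 3);
    switch (m.rm) {
    case 0:  base = (uint16_t)(c->bx + c->si); break;
    case 1:  base = (uint16_t)(c->bx + c->di); break;
    case 2:  base = (uint16_t)(c->bp + c->si); break;
    case 3:  base = (uint16_t)(c->bp + c->di); break;
    case 4:  base = c->si; break;
    case 5:  base = c->di; break;
    case 6:  base = (m.mode == 0) ? 0 : c->bp; break; /* mode 0: direct address */
    default: base = c->bx; break;
    }
    if (m.mode == 0 && m.rm != 6)
        disp = 0;                       /* mode 0 has no displacement */
    return (uint16_t)(base + disp);     /* wraps at 64 KB */
}

/* True when the default segment is SS, i.e. BP is part of the address. */
static int uses_ss(struct modrm m)
{
    return m.rm == 2 || m.rm == 3 || (m.rm == 6 && m.mode != 0);
}

/* Segment:offset to 20-bit linear address, wrapping at 1 MB. */
static uint32_t linear(uint16_t seg, uint16_t off)
{
    return (((uint32_t)seg << 4) + off) & 0xFFFFFu;
}
```

For example, the byte `0x42` splits into mode 1, reg 0, rm 2, which is `[bp+si+disp8]` with SS as the default segment unless a segment override prefix was seen.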



Quite a heck of a way to learn though, isn't it?
I mean, couldn't one just open up a book and read up on it or something? ;-)

i'm sort of a masochist, lol. i don't know. i do have Art of Assembly which is a great book, but this way i REALLY REALLY understand what all the instructions are doing. :mrgreen:



[QUOTE]Seriously though what I find most satisfying about doing this kind of project: Awesome feeling one gets after finding some silly state-behavioral problem in the CPU (or as in my case was it in the BIOS/chipset/hardware?) after stewing over the bug for (sometimes) up to days at a time.[/QUOTE]

yeah, it really is. it was such a relief once i finally saw the DOS 2.11 disk image start to run MSDOS.SYS and at least start executing COMMAND.COM after one really stupid little bug fix!



That made me laugh so hard, once I digested the meaning of that statement. :happy7:

In any case, it sounds like you're having fun as well and there's still some more fun in store for you. :) Once again great work!

thanks, yeah i'm having a great time. it would be interesting to see it run on an 8088 though, no? ;)


PS: Just for the record, I also tried running other classic home computer emulators on top of my emulator (pretty silly huh?). Whilst they ran very slow, they might actually run ok if I had more processor 'oomph' available. Some of these early vintage computer emulators included Tandy (CoCo1/2), Apple II and Sinclair (ZX80/81)..

have you considered upgrading the CPU you are using? that would be really cool.
 
hmm, yeah. the way i'm doing it now, i fetch an opcode byte and any prefixes for it like this:

/* Prefix opcode checking code snipped */

This part looks ok - the only major difference between this and my code is that I simply leave the checking of those prefixes inside the 'big select case statement', set override values when the appropriate prefix is encountered and then finish off each case with a 'goto' back to the beginning of the instruction decoder (to fetch the 'prefixed' instruction..)
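
The arrangement described above (prefixes as ordinary cases that jump back to the fetch) might look roughly like this in C. It is only a sketch under assumptions: `decode_one`, `struct decoded` and the enum names are made up for illustration, the code bytes come from a plain buffer rather than emulated memory, and only the six prefixes from the BASIC snippet are handled (so no LOCK).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { SEG_ES, SEG_CS, SEG_SS, SEG_DS, SEG_NONE };
enum { REP_NONE, REP_E, REP_NE };

/* What the decoder learned about one instruction (hypothetical layout). */
struct decoded {
    uint8_t  opcode;        /* first non-prefix byte */
    int      seg_override;  /* SEG_NONE if no segment prefix was seen */
    int      rep;           /* REP_NONE, REP_E (F3) or REP_NE (F2) */
    size_t   length;        /* bytes consumed, prefixes included */
};

/* Decode prefixes plus one opcode byte from code[0..len-1].
   The buffer must contain a non-prefix byte before it ends. */
static struct decoded decode_one(const uint8_t *code, size_t len)
{
    struct decoded d = { 0, SEG_NONE, REP_NONE, 0 };
    uint8_t op;

next_byte:
    assert(d.length < len);
    op = code[d.length++];
    switch (op) {
    /* prefixes: remember the override, then go fetch the next byte */
    case 0x26: d.seg_override = SEG_ES; goto next_byte;
    case 0x2E: d.seg_override = SEG_CS; goto next_byte;
    case 0x36: d.seg_override = SEG_SS; goto next_byte;
    case 0x3E: d.seg_override = SEG_DS; goto next_byte;
    case 0xF3: d.rep = REP_E;           goto next_byte;
    case 0xF2: d.rep = REP_NE;          goto next_byte;
    /* everything else: a real emulator would have one case per opcode
       here and execute it; the sketch just records the byte */
    default:
        d.opcode = op;
        break;
    }
    return d;
}
```

Feeding it the bytes `F3 26 A4` yields opcode `A4` with an ES override, `REP_E`, and a length of 3. Compared with a separate prefix loop in front of the main switch, this saves a second comparison chain on the common no-prefix path, though how much that matters will depend on how the compiler lays out the switch.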


and then what follows is of course a big select case statement with all the other possible opcodes, and if the opcode being executed has a ModRegRM byte, i call my ModRegRM function which looks like this and calculates the EA:

/* ModRegRM byte handler code snipped */

Once again this looks about right. You do not show the guts of your getmem8() function, where all the optimisations I am referring to apply.. :)


thanks, yeah i'm having a great time. it would be interesting to see it run on an 8088 though, no?

Having had the misfortune (understatement) of needing to run my emulator at a crawl (say, fifty instructions/sec) my vote would be no.. lol

Seriously though, there was one really bizarre situation where I had to do this, desperately trying to find a really tricky problem (which turned out to be two obscure internal variables fighting for the same memory space in my code). Let me tell you, it was pure torture watching any emulated program load this way..


have you considered upgrading the CPU you are using? that would be really cool.

Apart from allowing the host CPU to be overclockable in software (which I recently added), I could use high-speed SRAM instead of the significantly slower DRAM. Trouble with this idea is that high-speed SRAMs are very expensive, but otherwise they should help a lot (I would also need to respin the PCB artwork and completely rewrite all of my memory management and graphics routines in the firmware - by no means a small task, but it's possible).

Regards,
Valentin
 
ah i thought you were referring to calculating the address that the modregrm byte calculates for all those wacky addressing modes for some reason. reading your post again, i don't know how i came to that conclusion lol. yeah that would be faster definitely. i'm just calculating it every time. didn't even think of that.
 
i actually started re-writing the whole emulator in C last night, btw. i decided i'd like it to be faster and portable to non-x86 platforms. :p

it was pretty quick in FreeBASIC, but this will still be faster. probably by quite a bit. i am calling the FB version a "practice run" and i get to improve the whole thing from the ground up this time. i'm getting pretty far already. it was a slow day at my shop, so i had nothing better to work on.
 
ah i thought you were referring to calculating the address that the modregrm byte calculates for all those wacky addressing modes for some reason. reading your post again, i don't know how i came to that conclusion lol. yeah that would be faster definitely. i'm just calculating it every time. didn't even think of that.

Correct. Note that I was strictly referring to optimising the EA calculation for opcode fetches where CS is not modified (far jumps/calls etc.).

Basically, you only need to do the x16 multiply of CS part for the EA calc whenever an update to CS is requested. You then store the result as a temporary 24-bit variable (call it CSx16_temp or whatever), then simply add this variable to IP to obtain EA for successive opcode fetches.. This saves on the CSx16 multiply operation for the vast majority of opcodes..
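
In C, that caching idea could be sketched as below. The names are hypothetical (`cs_base` plays the role of CSx16_temp; `set_cs`, `far_jump` and `fetch_address` are invented for the example), and a 32-bit field is used where the post suggests a 24-bit one. The important part is that every path that changes CS (far jumps and calls, RETF, IRET, interrupts) has to go through the one setter, otherwise the cached base goes stale.

```c
#include <assert.h>
#include <stdint.h>

struct cpu {
    uint16_t cs, ip;
    uint32_t cs_base;   /* cached cs << 4, refreshed only when CS changes */
};

/* The single place where CS is written, so the cache cannot go stale. */
static void set_cs(struct cpu *c, uint16_t value)
{
    c->cs = value;
    c->cs_base = (uint32_t)value << 4;
}

/* Example of an instruction that changes CS. */
static void far_jump(struct cpu *c, uint16_t seg, uint16_t off)
{
    set_cs(c, seg);
    c->ip = off;
}

/* Linear address of the next opcode byte: one add and a 1 MB wrap,
   no shift. Advances IP, which wraps at 64 KB because it is 16-bit. */
static uint32_t fetch_address(struct cpu *c)
{
    uint32_t addr;
    /* catches a CS write that bypassed set_cs */
    assert(c->cs_base == ((uint32_t)c->cs << 4));
    addr = (c->cs_base + c->ip) & 0xFFFFFu;
    c->ip++;
    return addr;
}
```

After `far_jump(&c, 0xF000, 0xFFF0)`, successive calls to `fetch_address` return 0xFFFF0, 0xFFFF1 and so on without touching `cs` again. On a modern PC host a 4-bit shift is close to free, so the saving may be small there; it presumably matters more on a microcontroller host like the one described in this thread.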


Regards,
Valentin
 
i actually started re-writing the whole emulator in C last night, btw. i decided i'd like it to be faster and portable to non-x86 platforms. :p

In that case, I would recommend searching up on 'Just-in-time' (JIT) processor emulation techniques. Note that it does add an extra layer of complexity and also increases emulator resource requirements, though the resultant speedup may be worth it..

Note I haven't used JIT processor emulation in my own project thus far due to the difficulties involved in making it all fit inside 16K code space maximum..


Regards,
Valentin
 
... and JIT has WHAT to do with portable C code exactly?!? JIT compilation is by definition processor target specific and ASSEMBLY LANGUAGE coding... since it involves recompiling the code you want to emulate to the processor target.
 
True, 100% portability is not possible with JIT since it adds code compilation features to the emulator. However, any host CPU-specific definitions should be stored as a separate driver file to maintain portability on the emulator core modules. My main reason for suggesting JIT to Mike was simply to make him aware of faster emulation solutions - but after going back and re-reading Mike's original post I think it'll be safe to say I went off half-cocked on this one! lol


Regards,
valentin
 
fake86-pcdos1-small.png


i see an A> :)

typing stuff at the prompt just fills the screen with garbage though. hmm.

i wonder what is going on here.
 
I see an A>

To me this looks like either the result of execution of a dodgy opcode (possibly a faulty math/stack/string handling instruction?) or even perhaps a problem in your chipset emulation that somehow hasn't quite resulted in a complete system crash?

I have encountered a few similar errors on bootup a (long) while back, but these all varied from bad CPU/BIOS/chipset emulation to hardware problems..

I take it you were at least able to enter the date and press the return key successfully, prior to reaching the prompt right?
 
yes, it lets you enter the date first. this is a nightmare to debug because one small mistake can cause all kinds of symptoms so it's hard to know where to begin. i dont have much in the way of chipset emulation right now. just the CPU and (most of) the 8259 features. here's another pic showing what it does with MS-DOS 6.22:

fake86-dos622-small.png


as you can see, hitting F8 there works as it should. F5 does too, but this is as far as it'll get. there isn't even a config.sys or autoexec.bat file on the disk image.
 