
The beginnings of my 4004 project

This is version 0.70 and it is much more functional. CP/M, MS-DOS, and WIN32 console executables are attached.

Breakpoints are implemented along with go.

You can set the speed (Hz) with throttle, or turn the throttle off for maximum speed. (Not on CP/M: there is no timer in CP/M, so there are no throttle or performance update commands.)

Performance Update shows the speed on the screen every 5 seconds; for example, <740000> means 740 kHz.

You can specify the monitor break key (default is ESC/0x1b).

It loads with this default program, which waits for the user to type a key and then echoes that key to the display. I've added some comments (the U command does not show comments). It relies on a special instruction I added, 0xFE, called EMU (emulator call). The accumulator is used to pass data back and forth. This is a list of the commands supported by the EMU instruction:

Code:
0xFE EMU instruction (uses accumulator for data transfer)

Sent              | Returned         || Sent              | Returned         || Sent              | Returned
EC_HALT (0x0)     | none             ||                   |                  ||                   |
EC_STATUS0 (0x1)  | status*          ||                   |                  ||                   |
EC_RX0 (0x2)      | high nibble      || any               | low nibble       ||                   |
EC_TX0 (0x3)      | none             || high nibble       | none             || low nibble        | none

*status
  bit0 - ready to rx
  bit1 - ready to tx

Code:
-U 0-10
PGM:000   D1      LDM 1            ;call EMU command for serial status
PGM:001   FE      EMU
PGM:002   F6      RAR              ;let's put bit0 (RX ready) in carry
PGM:003   1A 00   JCN A(JNC) 000   ;if no carry we retry this loop from 0
PGM:005   D2      LDM 2            ;if we are here there is a key ready to receive
PGM:006   FE      EMU              ;receive high nibble of key
PGM:007   B0      XCH 0            ;put it at r0
PGM:008   FE      EMU              ;receive low nibble of key
PGM:009   B1      XCH 1            ;put it at r1
PGM:00A   D3      LDM 3            ;we want to echo it back with the tx command
PGM:00B   FE      EMU
PGM:00C   A0      LD 0             ;get high nibble from r0
PGM:00D   FE      EMU              ;send it
PGM:00E   A1      LD 1             ;get low nibble from r1
PGM:00F   FE      EMU              ;send it (character is echoed)
PGM:010   40 00   JUN 000          ;jump back to start
-

Code:
A [addr]                      Assemble
B [[-]breakpoint] [-all] ...  Breakpoints
C srcrange tgtaddr              //Copy
D [addr|range]                Dump
E [addr]                      Enter
F range data                    //Fill
G [addr]                      Go
H                             Help
I [instructions]              Step into
K srcrange tgtaddr              //Compare
LI range                        //Load intel hex
LF range file                   //Load file
M [key]                       Monitor break key
N                             Initialize CPU
O [instructions]              Step over
P [on|off]                    Performance update every 5 seconds
Q                             Quit
R [reg=value] ...             Registers
SI range                        //Save intel hex
SF range file                   //Save file
T [speed|off]                 Throttle (decimal cycles per second or off)
U [addr|range]                Unassemble
V [on|off]                    Display pause when full
X                             Address spaces
Z range data                    //Search
 

Attachments

  • emu4004_0.70.zip
    81.3 KB · Views: 1
More changes:

Save file and load file added (load file still to be improved to support shorter files than the requested load, but it works)

Data is now packed two nibbles to a byte instead of using a whole byte to hold a nibble

Optimizations

Added go to command (temporary breakpoint)

Improved throttling (WIN32/MS-DOS only, not CP/M)
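
For anyone curious what packing nibbles into bytes looks like, here is a minimal sketch. The names `nib_get` and `nib_set` are made up for illustration, and the choice of putting even-numbered nibbles in the high half of each byte is my assumption, not necessarily what the emulator does.

```c
#include <stddef.h>

/* Read nibble n from a packed buffer (two nibbles per byte).
   Assumption: even-numbered nibbles sit in the high half of each byte. */
unsigned char nib_get(const unsigned char *buf, size_t n)
{
    unsigned char b = buf[n >> 1];
    return (unsigned char)((n & 1) ? (b & 0x0F) : (b >> 4));
}

/* Write the low 4 bits of v to nibble n without touching its neighbor. */
void nib_set(unsigned char *buf, size_t n, unsigned char v)
{
    unsigned char *b = &buf[n >> 1];
    v &= 0x0F;
    if (n & 1)
        *b = (unsigned char)((*b & 0xF0) | v);
    else
        *b = (unsigned char)((*b & 0x0F) | (v << 4));
}
```

This halves the storage for nibble-wide spaces at the cost of a shift and mask on every access.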
 

Attachments

  • emu4004_0.80.zip
    86.6 KB · Views: 1
I'll do another build tonight; I got a lot done on it last night. It can now load and save binary files and Intel hex.
 
Here it is, 0.90.

Code:
A [addr]                      Assemble
B [[-]addr] [-all] ...        Breakpoints
C srcrange destaddr           //Copy
D [addr|range]                Dump
E [addr]                      Enter
F range data                  //Fill
G [addr]                      Go
GT addr                       Go to address (temporary breakpoint)
H                             Help
I [instructions]              Step into
K srcrange destaddr           //Compare
LI range [file]               Load intel hex
LF range file                 Load file
M [key]                       Monitor break key
N                             Initialize CPU
O [instructions]              Step over
P [on|off]                    Performance update every 5 seconds
Q                             Quit
R [reg=value] ...             Registers
SI range [file]               Save intel hex
SF range file                 Save file
T [speed|off]                 Throttle (DECIMAL cycles per second or off)
U [addr|range]                Unassemble
V [on|off]                    Display pause when full
X                             Address spaces
Z range data                  //Search

Addresses are in the format [name:]address
Ranges are in the format [name:][start[-stop]]
A range with only the name: refers to the entire address space
Use the X command to view address space names
Fill or search data can be multiple FF FF or F F
All values are in hexadecimal except throttle and intel hex line numbers
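
The [name:]address format above could be parsed along these lines. This is only a sketch under my own assumptions: `Addr`, `parse_addr`, and the `def_name` fallback (which space is used when no name is given) are hypothetical and not taken from the emulator.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char name[8];        /* address space name without the colon */
    unsigned long addr;  /* hexadecimal address within that space */
} Addr;

/* Parse "[name:]address". Returns 1 on success, 0 on a malformed string.
   def_name is the space assumed when no "name:" prefix is present. */
int parse_addr(const char *s, const char *def_name, Addr *out)
{
    const char *colon = strchr(s, ':');
    char *end;

    if (colon) {
        size_t len = (size_t)(colon - s);
        if (len == 0 || len >= sizeof out->name)
            return 0;
        memcpy(out->name, s, len);
        out->name[len] = '\0';
        s = colon + 1;
    } else {
        if (strlen(def_name) >= sizeof out->name)
            return 0;
        strcpy(out->name, def_name);
    }
    if (*s == '\0')
        return 0;                     /* a bare "name:" is a range, not an address */
    out->addr = strtoul(s, &end, 16); /* values are hexadecimal */
    return *end == '\0';
}
```

Note that `strtoul` also accepts leading whitespace, a sign, and a "0x" prefix, so a stricter parser would check the characters itself.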
 

Attachments

  • emu4004_0.90.zip
    87.1 KB · Views: 1
Finally, all options implemented and many optimizations completed.

Still at version 0.95 until I get some testing done on it. The next part of this project will be writing some code to run on it so hopefully that will show if there are any bugs!

Code:
A [addr]                      Assemble
B [[-]addr] [-all] ...        Breakpoints
C srcrange destaddr           Copy
D [addr|range]                Dump
E [addr]                      Enter
F range data                  Fill
G [addr]                      Go
GT addr                       Go to address (temporary breakpoint)
H                             Help
I [instructions]              Step into
K srcrange destaddr           Compare
LI range [file]               Load intel hex
LF range file                 Load file
M [key]                       Monitor break key
N                             Initialize CPU
O [instructions]              Step over
P [on|off]                    Performance update every 5 seconds
Q                             Quit
R [reg=value] ...             Registers
SI range [file]               Save intel hex
SF range file                 Save file
T [speed|off]                 Throttle (DECIMAL cycles per second or off)
U [addr|range]                Unassemble
V [on|off]                    Display pause when full
X                             Address spaces
Z range data                  Search

Addresses are in the format [name:]address
Ranges are in the format [name:][start[-stop]]
A range with only the name: refers to the entire address space
Use the X command to view address space names
Fill or search data can be multiple FF FF or F F
All values are in hexadecimal except throttle and intel hex line numbers
 

Attachments

  • emu4004_0.95.zip
    89.1 KB · Views: 1
My CP/M performance is a bit disappointing - it ran 4095 nop instructions in 4.69s - so 4095*8/4.69 = a clock speed of 6.985 kHz!

I'll test the MS-DOS version on my Toshiba T1100 plus again and see how it does.
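
The arithmetic above, written out as a sketch. The function names are hypothetical; the factor of 8 is the number of clock periods in a one-word 4004 instruction such as NOP, and 740 kHz is the nominal clock of the real chip.

```c
/* Effective 4004 clock in Hz for a run of one-word instructions,
   at 8 clock periods per instruction. */
double effective_clock_hz(unsigned long instructions, double seconds)
{
    return instructions * 8.0 / seconds;
}

/* How many times slower than a real 740 kHz 4004 the emulation runs. */
double slowdown_vs_740khz(double hz)
{
    return 740000.0 / hz;
}
```

For 4095 NOPs in 4.69 s this gives roughly 6985 Hz, a bit over 100 times slower than the real part.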
 
The 4004 is actually quite fast compared to an 8080 or Z80. About 100:1 for a simulation is not all bad. Running on a current processor is what I thought you had in mind.
The Pittman assembler is about the best test you can find.
Dwight
 
I was looking through your commands. How do you attach RAM and I/O to your simulator? There is not a lot you can do without RAM. It should be using 4002s for the basic RAM but you can add such things as 42xx parts. The 4289 is quite useful but there are others (I used one on my Maneuver Board). The 4265 is really complicated but one of the most interesting for expanding I/O or even adding RAM space for more data.
Dwight
 
Hi Dwight!

I probably still need to do some optimizations on the loop that does the emulation.

My goal was to make a program that could run on all three - CP/M, DOS, and WIN32 console.

It only implements 4001's and 4002's.

I've got these address spaces:

Code:
EMU4004 emulator 0.95
Provided as is; no warranty; use at own risk
Type H <enter> for help
-X
Name      Description                   Range     Type
PGM:      Program memory                0-FFF     Bytes
PGMI:     Program memory input ports    0-F       Nibbles
PGMO:     Program memory output ports   0-F       Nibbles
DATA:     Data memory                   0-7FF     Nibbles
STAT:     Status memory                 0-1FF     Nibbles
DATAO:    Data memory output ports      0-1F      Nibbles
-

PGM: is the full 4K program memory that the CPU can access. (16x 4001's)
PGMI: represents any ROM chip ports configured as inputs (16x 4001's, one nibble per chip for 16 chips).
PGMO: is the same but the outputs (16x 4001's).
DATA: represents the data part of the RAM (32x 4002's).
STAT: represents the status part of the RAM (32x 4002's).
DATAO: represents the output ports of the RAM (32x 4002's).
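
The ranges in the X listing follow directly from the chip counts. A small sketch of that arithmetic, using the standard 4001/4002 organization (256 bytes and one 4-bit port per 4001; 4 registers of 16 data nibbles plus 4 status nibbles, and one output port, per 4002). The constant names and the flat `data_index` layout are my own assumptions, not the emulator's.

```c
/* Chip counts from the list above; per-chip figures are the standard
   4001/4002 organization. */
enum {
    ROM_CHIPS = 16,          /* 4001: 256 bytes and one 4-bit port each */
    ROM_BYTES = 256,
    RAM_CHIPS = 32,          /* 4002: 4 registers and one output port each */
    RAM_REGS = 4,
    REG_DATA_NIBBLES = 16,   /* main memory characters per register */
    REG_STAT_NIBBLES = 4     /* status characters per register */
};

enum {
    PGM_SIZE   = ROM_CHIPS * ROM_BYTES,                    /* 0-FFF */
    PGMI_SIZE  = ROM_CHIPS,                                /* 0-F   */
    PGMO_SIZE  = ROM_CHIPS,                                /* 0-F   */
    DATA_SIZE  = RAM_CHIPS * RAM_REGS * REG_DATA_NIBBLES,  /* 0-7FF */
    STAT_SIZE  = RAM_CHIPS * RAM_REGS * REG_STAT_NIBBLES,  /* 0-1FF */
    DATAO_SIZE = RAM_CHIPS                                 /* 0-1F  */
};

/* Hypothetical flat index into DATA: for one nibble, assuming chips,
   registers, and characters are laid out in that order. */
unsigned data_index(unsigned chip, unsigned reg, unsigned ch)
{
    return (chip * RAM_REGS + reg) * REG_DATA_NIBBLES + ch;
}
```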

I don't have a way to connect the ports to anything yet (PGMI:, PGMO:, DATAO:), but when I move this project to an atmega1284p, I will have to come up with a facility to connect them to actual AVR I/O pins.

Do you think the Pittman is better than the A04, or just different? I was thinking of modifying the Pittman to run with my emulator's EMU instruction for I/O instead of it doing the bit-bang serial I/O.

Thanks for taking a look at it!
 
Pittman's assembler isn't a particularly good assembler but it does hit just about every primary capability of the 4004 instruction set.
The output isn't a particularly friendly format. It has no macro ability. It is an excellent test of a simulator.
Dwight
 
I would hope for 4289 support in this project, configurable with some kind of a file. This has been a shortcoming of other emulators.


I’m glad you mentioned that the 4265 is complicated because I found it amazingly complicated and I thought it was just because I was dumb! There COULD be other reasons.
 
I did emulate the 4008/4009 or 4289 ability to read/write program memory:

case 0xe3:
//wpm, implemented in 4008/4009 or 4289, a way to read or write to program memory
//rom port 14, 0=read, 1=write
//rom port 15, high address nibble of memory to read/write
//src, middle and low address nibbles of memory to read/write
//write, set rom port 14=1, set rom port 15, load src, load acc high nibble, wpm, load acc low nibble, wpm, set rom port 14=0
//read, set rom port 15, load src, wpm, wpm, read rom port 14 high nibble, read rom port 15 low nibble
//as a consequence of this instruction, data memory for dcl:src is also written if present; save and restore it if important,
// or use dcl to select a nonexistent memory bank
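
Read literally, the comments above describe something like the following. This is a rough sketch built only from those comments, not the emulator's code: the `Wpm` struct and its fields are hypothetical, I am assuming a flag that alternates between the high and low nibble on successive WPMs, and the side effect on data memory is left out.

```c
typedef struct {
    unsigned char pgm[4096];    /* program memory, one byte per address */
    unsigned char rom_out[16];  /* ROM output port nibbles (14 and 15 used) */
    unsigned char rom_in[16];   /* ROM input port nibbles (14 and 15 used) */
    unsigned char src;          /* last SRC value: middle and low address nibbles */
    int second_half;            /* 0 = next WPM is the high nibble, 1 = low */
} Wpm;

/* One execution of WPM (0xE3) with the given accumulator. */
void wpm(Wpm *m, unsigned char acc)
{
    unsigned addr = ((unsigned)(m->rom_out[15] & 0x0F) << 8) | m->src;
    unsigned char *cell = &m->pgm[addr];

    acc &= 0x0F;
    if (m->rom_out[14] == 1) {            /* port 14 = 1: write */
        if (!m->second_half)
            *cell = (unsigned char)((*cell & 0x0F) | (acc << 4));
        else
            *cell = (unsigned char)((*cell & 0xF0) | acc);
    } else {                              /* port 14 = 0: read */
        if (!m->second_half)
            m->rom_in[14] = (unsigned char)(*cell >> 4);   /* high nibble */
        else
            m->rom_in[15] = (unsigned char)(*cell & 0x0F); /* low nibble */
    }
    m->second_half = !m->second_half;     /* alternate high/low on each WPM */
}
```

A write is then two WPMs with port 14 set to 1, and a read is two WPMs with port 14 at 0 followed by reading ROM ports 14 and 15.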
 
0.96

Much more responsive key processing.

Redesigned emulation loop increases performance on all platforms, but especially CP/M. On CP/M the switch() was not compiling to a jump table but to a long chain of up to 256 compare/jmp operations. I replaced it with an if tree that can isolate any of the 256 opcodes in 8 comparisons.
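
The 8 comparisons come from halving the opcode range at each level, since 2^8 = 256. The real thing is hand-written nested ifs; the loop below is just a sketch that walks the same decisions and counts them. `isolate_opcode` is a hypothetical name.

```c
/* Narrow an opcode down to one of 256 values by halving the candidate
   range, the way a balanced if tree would. Stores the isolated value in
   *found and returns the number of comparisons used. */
int isolate_opcode(unsigned char op, unsigned *found)
{
    unsigned lo = 0, hi = 256;   /* candidate range is [lo, hi) */
    int compares = 0;

    while (hi - lo > 1) {
        unsigned mid = (lo + hi) / 2;
        compares++;
        if (op < mid)
            hi = mid;            /* take the "less than" branch */
        else
            lo = mid;            /* take the "greater or equal" branch */
    }
    *found = lo;
    return compares;
}
```

Every opcode costs exactly 8 comparisons this way, where a linear compare chain averages around 128.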

Added execution time and average speed (DOS and WIN32 only)

CP/M testing (2MHz 8080) v0.95 = 4133 Hz, v0.96 = 13470 Hz (3.25 times faster)
DOS testing (7.16MHz 8086) v0.95 = 92297 Hz, v0.96 = 103391 Hz (1.12 times faster)
DOS testing (VM workstation modern PC) v0.95 = 525615560 Hz, v0.96 = 880065334 Hz (1.67 times faster)
WIN32 testing (modern PC) v0.95 = 600929291 Hz, v0.96 = 2340454835 Hz (3.89 times faster)

Certainly the modern PC has no problem keeping up with the i4004's 740kHz!

I'm pleased that the CP/M version is finally up to speed. I found something online about Intel saying that a 10MHz 8086 was about 10 times faster than a 2MHz 8080, and my benchmarks are pretty close to that. Scaling the 7.16MHz result up to 10MHz and dividing by 10 gives 103391*10/7.16*0.1 = 14440 Hz, and my CP/M version is running at 13470 Hz.

I also added an INST.DAT which is what I use for benchmarking. It tries to run all of the instructions and then loops back to itself. You can load it with "LF PGM: INST.DAT".
 

Attachments

  • emu4004_0.96.zip
    89.7 KB · Views: 1