On an 11-second run, the baud rate isn't that significant, and it is actually insignificant in your revised code. Your times show your eZ80 system to be the equivalent of a Z80 at about 91 MHz, whereas my various benchmarks have always been in the range of 167-195 MHz.
Since your eZ80 times are more than twice mine, I'll repeat my unanswered questions:
- Is any of your code running from flash memory (internal defaults to 4 wait states, external = ? waits)?
- Is your RAM set to use wait states? I use zero and that's where CP/M and MBASIC run on my system.
- Has PHI been checked to be 50 MHz? Most F91 systems use the PLL programmed with a multiplier value.
I would agree that the UART should not make that big a difference, but it does.
I am running from RAM.
I don't think I have wait states (WS) set, but I will check.
I think you said that using the RTC to time such things can be off by 100%. Maybe that is it: it's off by 100%. Instead of taking 1 second to calculate all the square roots between 1 and 1000, maybe it is taking 0 seconds. I don't know.
Again, this is not meant as a benchmark, just a general indicator. Clearly, when I run this same thing on a real Model 4 or an emulator, there is a huge difference.
Since this is interpreted BASIC, many things are happening here, such as garbage collection, moving bytes around, math, etc.
This brings me back to points I have made before about speeding up TRSDOS without modifying the OS. In many places in LOWCORE, a do-nothing loop of 256 iterations is enough to let the BUS SETTLE, etc.
That is nowhere near long enough on an eZ80.
Then you are faced with making longer loops, and that requires more memory for counters.
I have decremented a 16-bit counter in assembly and the eZ80 burns right through it. Since there is not one spare byte in some places to lengthen the loops, that creates new problems to be solved.
Time delays such as:
        ld   b,0        ; B = 0 gives 256 passes through the loop
loop:   nop
        djnz loop
seem to be all cached up in the eZ80 and execute blindingly fast. You would need to run this loop thousands of times to get the same effect as on a real Model 4. I guess you would only need long delays when interfacing to older legacy peripherals.
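As a rough sanity check on the DJNZ loop above, here is a small Python sketch that totals its clock cycles. The Z80 T-state counts (LD B,n = 7, NOP = 4, DJNZ = 13 taken / 8 not taken) are the standard documented ones. The eZ80 cycle counts, the 4 MHz Model 4 clock and the 50 MHz zero-wait-state eZ80 clock are assumptions for illustration only, and the function name is made up. Substitute measured values for your own board.

```python
def djnz_loop_cycles(ld_b, nop, djnz_taken, djnz_not_taken, iterations=256):
    """Total clock cycles for:  ld b,0 / loop: nop / djnz loop."""
    # B = 0 wraps around, so DJNZ executes 256 times:
    # 255 taken branches plus one final fall-through.
    return (ld_b
            + iterations * nop
            + (iterations - 1) * djnz_taken
            + djnz_not_taken)


# Documented Z80 T-states.
z80_cycles = djnz_loop_cycles(ld_b=7, nop=4, djnz_taken=13, djnz_not_taken=8)

# ASSUMED eZ80 cycle counts with zero wait states; check the CPU manual.
ez80_cycles = djnz_loop_cycles(ld_b=2, nop=1, djnz_taken=4, djnz_not_taken=2)

Z80_HZ = 4_000_000     # assumed nominal Model 4 clock
EZ80_HZ = 50_000_000   # assumed PHI, as asked about earlier in the thread

z80_us = z80_cycles / Z80_HZ * 1e6
ez80_us = ez80_cycles / EZ80_HZ * 1e6

print(f"Z80 : {z80_cycles} cycles, {z80_us:.1f} us")
print(f"eZ80: {ez80_cycles} cycles, {ez80_us:.1f} us")
print(f"repeats needed to match the Z80 delay: {z80_us / ez80_us:.1f}")
```

With these inputs the Z80 loop comes to 4354 T-states, roughly 1.09 ms at 4 MHz. The eZ80 figure depends entirely on the assumed cycle counts, clock and wait states, so treat it as an estimate, not a measurement.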
As I said, I can decrement a 16-bit number to 0 and not even notice the delay. Try a loop using 32-bit numbers on a real Model 4.
The TRSDOS 'pause' SVC would need to be rewritten to take advantage of the eZ80 programmable timers.
I have macros for my assembler that let me do things like for/next.
This loop causes a delay of about 1 second, just long enough to read a few lines on screen.
Code:
cpu_dly   FOR  1,10,1
cpu_dly_1 FOR  1,65535,1
          NEXT cpu_dly_1
          NEXT cpu_dly
The general form is FOR START,STOP,STEP.
This loop counts 10*65535 and I get a delay of about 1 second.
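To put that 1 second in perspective, here is a quick Python sketch of the time budget per loop pass. It takes the 10*65535 figure and the 1 second delay at face value. The 50 MHz clock is an assumption (it is the PHI value asked about earlier, not something I have measured), and the function name is invented for the example.

```python
def per_pass_budget(total_seconds, passes, clock_hz):
    """Return (seconds per pass, clock cycles per pass) for a delay loop."""
    seconds_per_pass = total_seconds / passes
    return seconds_per_pass, seconds_per_pass * clock_hz


passes = 10 * 65535        # loop count as stated in the post
ASSUMED_PHI_HZ = 50_000_000  # assumption, not a measured value

seconds_per_pass, cycles_per_pass = per_pass_budget(1.0, passes, ASSUMED_PHI_HZ)

print(f"passes          : {passes}")
print(f"time per pass   : {seconds_per_pass * 1e6:.2f} us")
print(f"cycles per pass : {cycles_per_pass:.0f} (at the assumed clock)")
```

If the real clock or the real delay differs, change the two inputs. The point is only that each pass through FOR/NEXT costs on the order of a microsecond or two, so even hundreds of thousands of passes add up to about a second.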
Code:
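FOR     MACRO START,STOP,STEP
        SCOPE
        ld   HL,($1)        ; Load the START value.
        ld   ($4),HL        ; Store it as the running count (NEXT re-enters here, +3).
        JR   $5             ; Skip over the data words.
$4      DW   0              ; Holds the count while running (+8).
$1      DW   START          ; Start count from xx (+10).
$2      DW   STOP           ; Stop when equal or greater than STOP (+12).
$3      DW   STEP           ; Step to increment by (+14).
$5      equ  $              ; Loop body starts here.
        ENDMAC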
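NEXT    MACRO pointer
        ld   hl,(pointer+8)     ; Current count.
        ld   bc,(pointer+14)    ; STEP.
        add  hl,bc              ; HL = count + STEP.
        PUSH HL                 ; Save the new count.
        LD   BC,(pointer+12)    ; STOP limit.
        OR   A                  ; Clear carry flag.
        SBC  HL,BC              ; Compare new count with STOP (unsigned).
        POP  HL                 ; Restore the new count; POP leaves the flags alone.
        jp   c,pointer+3        ; Still below STOP: store the count and loop again.
        endmac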
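For anyone who wants to check how many passes a given FOR/NEXT pair makes, here is a small Python model of the macros. It is only a sketch: it follows the documented rule (stop when the count is equal to or greater than STOP), assumes 16-bit unsigned counters as in Z80 mode, and the function name is hypothetical. The count is stepped and tested in NEXT, so the body does not run for the STOP value itself.

```python
def count_passes(start, stop, step):
    """Return how many times the body between FOR and NEXT runs."""
    if step <= 0:
        raise ValueError("this model only handles a positive STEP")
    count = start & 0xFFFF
    passes = 0
    while True:
        passes += 1                      # the loop body runs once
        count = (count + step) & 0xFFFF  # NEXT: 16-bit add of STEP
        if count >= stop:                # stop when equal or greater than STOP
            return passes
        if passes > 0x10000:             # guard: counter wrapped past STOP
            raise RuntimeError("counter wrapped without reaching STOP")


outer = count_passes(1, 10, 1)       # cpu_dly   FOR 1,10,1
inner = count_passes(1, 65535, 1)    # cpu_dly_1 FOR 1,65535,1
total = outer * inner                # inner loop restarts on every outer pass

print(outer, inner, total)
```

For the nested delay shown earlier this gives 9 outer passes of 65534 inner passes each, 589806 in total, which is a little under the round 10*65535 figure.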