• Please review our updated Terms and Rules here

A little VM for 8bit machines, stack based or reg based?

anormal

Experienced Member
Joined
Nov 18, 2010
Messages
103
Location
Canary Islands / Spain
hi!,
I am doing some coding here for a script-language and a VM. I also want to implement this in older cpus (6202 and z80), if possible.
My original arch. design has 8 8bit generic registers (one for flags), and 2 16bit (PC and SP), able to address 64KB, generic arithmetic/logic operations, and the usual stuff (but simpler).

Now, I am in the process of generating code (I already coded lex/syntax tasks in these examples: http://rosettacode.org/wiki/Compiler).
And I was thinking maybe is a better idea to implement this via stack-based operations instead of reg-based (as the rosettacode example).

The scripts will be small routines linked to objects for a little rpg I am thinking, functions return address will work via stack as usual, I plan to use vars globably, so maybe I don't need to pass anything in the stack for function params.

People with experience in 6502/z80 coding, what do you think is better? I mean, space is more important than speed for this.
Maybe I need to implement the VM in these micros and run some tests?

Anything to think about before going to code specific code for these cpus? I'll implement first the VM in C for testing in my W10 machine.
 
It really depends. On typical 8-bitters, both stack manipulation and handling of virtual registers are going to require a fair amount of juggling, and there's no definitive way to tell other than to hack up some test cases and compare. That said, if space is your primary concern, I've found that it's somewhat easier to go with a stack-based model. Having the parameter specifications be implicit makes your token encoding/decoding simpler (since you don't have to bother extracting a register specifier from the "opcode,") allows each implementation to optimize specifically for a fixed behavior, and frees up bits that would otherwise be required in the token (an eight-register VM would require at least three, if it's a simple register-to-accumulator type design) for other purposes.

F'rexample, you could reserve a range of opcodes (say, $00-$7F) to serve as a freebie "push constant" instruction that doesn't require any additional bytes other than the constant data; depending on how your dispatch routine is coded, it wouldn't be any slower and could even be faster (if you optimize for it - though that would make other tokens slower.) Or to serve as the offset for a relative branch instruction, or...it's up to you, really. Take the time to really sit down and think about the most common operations you're going to be using and work out an instruction set that fits what you aim to do, then design the more basic elements of your architecture around that.

If you do decide to go for a stack-based model, a review of Brad Rodriguez's "Moving Forth" series of articles and other 8-bit Forth literature is very instructive as far as implementation models go.
 
Last edited:
A stack based VM makes for a simple compile target, which is important if you plan on compiling on an 8 bit machine. Compiler simplicity is under-appreciated in this day and age of multiple GHz machines with multiple GBs of memory. There are always trade-offs, of course. A small, simple VM may take up less memory, but with less efficient compiled code generation. Interpreting byte codes on an 8 bit machine is slooowwww compared to native code. The more efficient and better match of your byte codes to your source language, the faster the speed of execution at the expense of a larger VM.

My decade+ experiment with a self hosted compiler and accompanying VM on the 6502 can be found here:
https://github.com/dschmenk/PLASMA
along with the VM description:
https://github.com/dschmenk/PLASMA#the-virtual-machine
and bytecodes:
https://github.com/dschmenk/PLASMA/wiki/PLASMA-Byte-Codes

Version 2.0 even includes a JIT compiler to convert oft-used routines to native code on the fly, taking advantage of the 6502 or 65816 - whichever is found:
https://github.com/dschmenk/PLASMA/wiki/PLASMA-JIT-Compiler-Implementation

It is also being used as the high level machine code for Lawless Legends, an in-development adventure game for the Apple II:
https://www.facebook.com/LawlessLegends/

The self-hosted compiler is itself written in PLASMA. It is small and fast enough to develop on a 1 MHz 6502 and floppy disks without extreme pain. A little acceleration and a hard disk/modern SS storage is actually a comfortable development environment.

Good luck with your endeavor. Just remember to think small. The icon on your desktop for your web browser is probably larger than your compiler will be.
 
Generally I would say that stack based VMs are easier to target. However, a well thought out register VM can be pretty impressive.

I've written about two recently which you may like to look at for inspiration and to compare:

The Mouse Programming Language on CP/M
This is a simple stack orientated language that interprets a stream of characters. The language itself is interesting and while the interpreter is only about 2k, this is for the complete language, which includes support for arrays, functions, procedures, nested control structures, local variables, recursion and several methods of passing parameters from one procedure to another. This is much more than you need, but playing with it and the other one, Sweet 16, may help you to get a feel for what would be right for you.

Sweet 16 (The 6502 Dream Machine) Ported to the VIC-20
This uses sixteen 16-bit registers and was created by Steve Wozniak to reduce code size and make it easier to handle 16-bit pointers and arithmetic for his Apple Integer BASIC. It has a nice instruction set and the code for it is only about 300k. One nice feature is that you can switch back and forth between Sweet16 and native 6502 code. This could help with code generation when you need to make things more performant.

Best wishes

Lorry
 
Guys, really thanks for all the info. You made me think in another ways I was no focusing

@commodorejohn's: You are right, my initial architecture needed some bit juggling for encoding registers, as you said, one simple

move reg1,reg2,
needed at least 2 bytes, my initial idea was, sorry too much years in x86 :D, 8x16bit registers, r0 PC and r7 SP, and the rest
addresable hi/lo byte, aka ax, ah, al :D
but the scripting needed for managing a little rpg doesn't require 16bit vars I thought, except PC and SP.

My current idea is a mix of what really does the vm, and what should do the interpreter (gfx, sound, basic i/o), so some opcodes are
just talking to the interpreter. But I still have more than 200 free opcodes entries, so you idea of packing info the opcode space would made
the scripts smaller and so more space for level data and graphic tiles.

@resman: compiling in 8bit machine is out of my scope, for some things, but primarily my lack of knowledge of 8bit machines, except basic cpu architectures, some asm I've read, and some machines as Sinclair Spectrum/c64 I've seen some long videos explaining architecture and basic programing, memory layout etc... So no totally unknown, but by far out my experience, I know the theory but never coded in 8bit.

As you said, the compiler is much easier with a stack machine, because it is just walking the AST and output to memory or to stack.
Register allocation is out my knowledge by now, so I'll need to read about it, etc. Maybe in the future, if I first do the stack-based.

So, I am doing a visual GUI for designing and coding the scripts, and this will produce a final binary, you append the interpreter for a specific machine and that's the idea by now.
I am going to check all the links you sent, they seem very interesting thanks!


@Lawrence Woodman: thanks for this, I'll check them to get more ideas. Especially the register allocation and how you managed the regs.


Well, I am thinking now about the stack-based, it will be easier at first, and also the interpreters. A turn-based old-school RPG with simple 2d tiles, doesn't need speed, it could be a better idea to start with this and then test some easy code in a C64 and a Spectrum for example.

Thanks for ideas guys!

Edit: The PLASMA project seems very advanced :D, going to read it all, probably rip many ideas as at first sight seems impressive :D
That game, I have the same idea for the layout, but with simpler everything, as my initial idea was targetting Sinclair Spectrum, so tiles
will be monochrome, etc, etc... You also have a great guy doing the graphics.
Probably we could have another conversation in the future about text compression, I planned to put really lot of text (similar to a Gamebook).
Thanks!
 
Last edited:
My current idea is a mix of what really does the vm, and what should do the interpreter (gfx, sound, basic i/o), so some opcodes are just talking to the interpreter. But I still have more than 200 free opcodes entries, so you idea of packing info the opcode space would made the scripts smaller and so more space for level data and graphic tiles.
Yup - and depending on how your dispatch routine is written, you could choose to optimize with special handling for "range" opcodes, i.e. breaking out of a normal jump-table sequence if you can tell ahead of time that it's a special range (i.e. if the MSB is set or clear, which on the 6502 at least you can determine as soon as the token is read.) This can make the special range slightly faster while making the normal opcodes slightly slower, but the other thing it can do (from a space perspective) is cut the size of a jump table dramatically - if (say) 128 of the 256 possible opcodes are reserved for a special handler, you'd only require 256 bytes for the table vs. 512. Whether that's a meaningful savings in the scope of the overall project is up to you to determine, but it's one possible way of doing things.

Text compression is another fun topic - many ways of doing things there!
 
hi!,
I am doing some coding here for a script-language and a VM. I also want to implement this in older cpus (6202 and z80), if possible.
My original arch. design has 8 8bit generic registers (one for flags), and 2 16bit (PC and SP), able to address 64KB, generic arithmetic/logic operations, and the usual stuff (but simpler).

For the 6502 case, have you already looked at how SWEET16 does things on an Apple II?

https://en.wikipedia.org/wiki/SWEET16
 
If this is a for-purpose design, take your requirements and design your instruction set around it. Stack architectures are nice, but, depending on your needs, not essential.

I can speak from experience. Years ago (1979), I designed a BASIC compiler that was still in use commercially until about 5 years ago. I designed two instruction sets--one for the compiler that worked in terms of abstract compiler dataypes. Very little in the way of arithmetic instructions, but very rich in terms of symbols and attributes.

For the runtime, I designed yet another instruction set that was oriented toward BASIC requirements, including descriptors for variables. So (digging around), here's a simple BASIC program (99 bottles of beer) and its compiled instructions. Note that there are no integers in this--it's all floating point math. So 1043 hex = 100 decimal.

Code:
   0010     1      NumberOfBeers = 100
0010    460210	NPUSI	021043
0014    8801D6	NPOP	NUMBEROFBEERS
   0017     2  
   0017     3      for ThisMany = NumberOfBeers to 0 step -1
0017    8601D6	NPUS	NUMBEROFBEERS
001A    8801D7	NPOP	THISMANY
001D    1F	NPUSQ	01
001E $aaab:
001E    0F	NPUSQ	00
001F    27	NPUNII	10
0020    7A01D7	NSCL	THISMANY
0023    F12810	PBF	$aaac
   0026     4          print ThisMany," bottles of beer on the wall"
0026    660100	IOI	01 00
0029    8601D7	NPUS	THISMANY
002C    67	IOIN	
002D    501C20	SPUSI	1C20626F74746C6573206F662062656572206F6E207468652077616C6C
004B    68	IOIS	
004C    69	IOT	
   004D     5          print ThisMany," bottles of beer."
004D    660100	IOI	01 00
0050    8601D7	NPUS	THISMANY
0053    67	IOIN	
0054    501120	SPUSI	1120626F74746C6573206F6620626565722E
0067    68	IOIS	
0068    69	IOT	
   0069     6          print ThisMany," Take one down, pass it around,"
0069    660100	IOI	01 00
006C    8601D7	NPUS	THISMANY
006F    67	IOIN	
0070    501F20	SPUSI	1F2054616B65206F6E6520646F776E2C20706173732069742061726F756E642C
0091    68	IOIS	
0092    69	IOT	
   0093     7          if ThisMany <> 0 then print(carr=2) ThisMany-1," bottles of beer."
0093    8601D7	NPUS	THISMANY
0096    4A00	NCNEI	00
0098    F128A2	PBF	$aaad
009B    2F	NPUSQ	02
009C    3F	NPUSQ	03
009D    660101	IOI	01 01
00A0    8601D7	NPUS	THISMANY
00A3    22	NSUBII	1
00A4    67	IOIN	
00A5    501120	SPUSI	1120626F74746C6573206F6620626565722E
00B8    68	IOIS	
00B9    69	IOT	
   00BA     8      next ThisMany
00BA $aaad:
00BA    0F	NPUSQ	00
00BB    F227EC	B	$aaab
00BE $aaac:
   00BE     9      print
00BE    660100	IOI	01 00
00C1    69	IOT	
   00C2    10      print "No more beer!"
00C2    660100	IOI	01 00
00C5    500D4E	SPUSI	0D4E6F206D6F7265206265657221
00D4    68	IOIS	
00D5    69	IOT	
   00D6    11      end
00D6    7801	CMR	01

Only about 200 bytes of compiled code.
 
Yes, this is only for this purpose, just run a rpg in a classic layout, map, stats, log, with limited resources. So, after some thinking, I moved functionality to the interpreter, for example have opcodes like these:

>load_item_stat,item,stat
>set_item_stat,item,stat

All of this is just very preliminary, some gui design, binary formats, and doing the compiler now, but I'll rethink everything via stack-based, and maybe I can do some testing later.

And yes, I read about Sweet16, when I read 300bytes, I thought, well that should be the size of the VM entry/exit code :D :D
8bit coding really makes you think in another ways, and Wozniak is really a smart guy...
 
Yes, this is only for this purpose, just run a rpg in a classic layout, map, stats, log, with limited resources. So, after some thinking, I moved functionality to the interpreter, for example have opcodes like these:

>load_item_stat,item,stat
>set_item_stat,item,stat

All of this is just very preliminary, some gui design, binary formats, and doing the compiler now, but I'll rethink everything via stack-based, and maybe I can do some testing later.

And yes, I read about Sweet16, when I read 300bytes, I thought, well that should be the size of the VM entry/exit code :D :D
8bit coding really makes you think in another ways, and Wozniak is really a smart guy...


If that is your goal, then I wouldn't even bother with a compiler. Just tokenize your script and interpret it directly. You will probably use a stack while evaluating expressions (if you get that sophisticated). Don't make it harder than it needs to be.
 
Thanks guys, you really help me a lot.
New points of views and great code make me think about things I haven't contemplated.
If I got something of this running I'll put here some examples.
 
Back
Top