New to ASM, assistance setting up

ASM · Aug 8, 2012

Hello everyone,
I have been putting it off for a long time but I really wish to start learning Assembly.
I have done bits here and there but I really want to knuckle down and start learning (I am at school so I can learn in my free time).
I currently have a 1990s book I am learning from for the x86 instruction set.

Soon I will be getting my hands on a Compaq portable, similar to the IBM 5155.
I was wondering what the best approach is to writing and executing the code itself.
I assume I could use 'debug' in MS-DOS. Again, I am quite new; I have not really looked at the different syntaxes for NASM, MASM etc., so I am not sure how DOS works in this regard.
I hear that the BIOS can run BASIC from the ROM, but I am not sure whether ASM is supported (I would assume that it would need the instruction set stored).

Thank you for any suggestions and helping clear this up for me!

per · Aug 8, 2012

You can test small pieces of code in DEBUG, but if you want to make actual programs then A86 is a good choice for a beginner. A86 will make simple .com programs and the source code can be pure code, as it doesn't need all the headers and arguments that MASM source code needs. If you want to make bigger programs, then you should consider MASM instead.

You can't really do assembly in BASIC. You can poke some machine-code into RAM and run it, but that's even more tedious than using DEBUG (In DEBUG you can at least use actual Assembly mnemonics). Compaq PCs didn't have BASIC in ROM either; only the IBM PCs had that.

Chuck(G) · Aug 8, 2012

For very simple programs, such as determining how certain instructions work, you can use DEBUG itself as it contains a small assembler--you simply have to calculate your own storage addresses. Use the "INT 3" instruction to get back to the debug display from your program. As an example, suppose that I wanted to see how the AAD instruction works. From the reference, we know that it operates on the AX register to compute AL=(10*AH)+AL; AH=0. So we can set the AX register to a specific value, run an AAD and see what the result is:

Code:

D:\>[B]debug[/B]
-[B]rax[/B]
AX 0000
:[B]0304[/B]
-[B]a100[/B]
13C7:0100 [B]aad[/B]
13C7:0102 [B]int 3[/B]
13C7:0103
-[B]r[/B]
AX=0304  BX=0000  CX=0000  DX=0000  SP=FFEE  BP=0000  SI=0000  DI=0000
DS=13C7  ES=13C7  SS=13C7  CS=13C7  IP=0100   NV UP EI PL NZ NA PO NC
13C7:0100 D50A          AAD
-[B]g=100[/B]

AX=0022  BX=0000  CX=0000  DX=0000  SP=FFEE  BP=0000  SI=0000  DI=0000
DS=13C7  ES=13C7  SS=13C7  CS=13C7  IP=0102   NV UP EI PL NZ NA PE NC
13C7:0102 CC            INT     3
-[B]q[/B]

D:\>

Note that my entries are in boldface. I used location 0100H as a starting point because that's the default for the start of COM-type programs.
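As a cross-check on the session above, here is a minimal Python sketch of what AAD computes. The function name `aad` is made up for illustration, and the sketch models only the AX result, not the flags.

```python
def aad(ax, base=10):
    """Model AAD on a 16-bit AX value: AL = (AH * base + AL) mod 256, AH = 0."""
    ah = (ax >> 8) & 0xFF
    al = ax & 0xFF
    al = (ah * base + al) & 0xFF
    return al  # AH is cleared, so the new AX is just the new AL


# Same input as the DEBUG session: AH=03, AL=04
print(f"AX={aad(0x0304):04X}")  # AX=0022
```

The result 22h is decimal 34, i.e. the unpacked BCD digits 3 and 4 combined into one binary value.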

At some point, you're going to get tired of computing your own branch and storage addresses and you'll need to move to a good assembler. There are different proponents for various tools. I like Microsoft MASM, but that by no means is a universal choice.

ASM · Aug 8, 2012

per said:
You can test small pieces of code in DEBUG, but if you want to make actual programs then A86 is a good choice for a beginner. A86 will make simple .com programs and the source code can be pure code, as it doesn't need all the headers and arguments that MASM source code needs. If you want to make bigger programs, then you should consider MASM instead.

You can't really do assembly in BASIC. You can poke some machine-code into RAM and run it, but that's even more tedious than using DEBUG (In DEBUG you can at least use actual Assembly mnemonics). Compaq PCs didn't have BASIC in ROM either; only the IBM PCs had that.

Thank you for the information!
I will look into A86; I like the idea of no headers.
I see that you own a Compaq Portable. What is actually loaded onto the ROM then?

Chuck(G) said:
For very simple programs, such as determining how certain instructions work, you can use DEBUG itself as it contains a small assembler--you simply have to calculate your own storage addresses. Use the "INT 3" instruction to get back to the debug display from your program. As an example, suppose that I wanted to see how the AAD instruction works. From the reference, we know that it operates on the AX register to compute AL=(10*AH)+AL; AH=0. So we can set the AX register to a specific value, run an AAD and see what the result is:

Code:

D:\>[B]debug[/B]
-[B]rax[/B]
AX 0000
:[B]0304[/B]
-[B]a100[/B]
13C7:0100 [B]aad[/B]
13C7:0102 [B]int 3[/B]
13C7:0103
-[B]r[/B]
AX=0304  BX=0000  CX=0000  DX=0000  SP=FFEE  BP=0000  SI=0000  DI=0000
DS=13C7  ES=13C7  SS=13C7  CS=13C7  IP=0100   NV UP EI PL NZ NA PO NC
13C7:0100 D50A          AAD
-[B]g=100[/B]

AX=0022  BX=0000  CX=0000  DX=0000  SP=FFEE  BP=0000  SI=0000  DI=0000
DS=13C7  ES=13C7  SS=13C7  CS=13C7  IP=0102   NV UP EI PL NZ NA PE NC
13C7:0102 CC            INT     3
-[B]q[/B]

D:\>

Note that my entries are in boldface. I used location 0100H as a starting point because that's the default for the start of COM-type programs.

At some point, you're going to get tired of computing your own branch and storage addresses and you'll need to move to a good assembler. There are different proponents for various tools. I like Microsoft MASM, but that by no means is a universal choice.

Thanks for the info!
(I am guessing putting 100 will change to 0100 because of the 16bit registers?)

per · Aug 8, 2012

ASM said:
I see that you own a Compaq Portable. What is actually loaded onto the ROM then?

The ROM just contains the BIOS and the Power-On-Self-Test/Bootloader. If it boots with nothing in the disk drives, then it will try to load BASIC from a Compaq-OEM MS-DOS disk instead.

Chuck(G) · Aug 8, 2012

Yes, all numbers in DEBUG are hexadecimal. It does not speak decimal. So the next number after 109 is 10A.
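To make the hexadecimal point concrete, here is a small Python sketch that parses an address the way DEBUG reads it (always base 16) and prints it back as four hex digits, which is why typing 100 shows up as 0100. The helper names are hypothetical, and the four-digit padding is an assumption based on the offsets shown in the session earlier in the thread.

```python
def as_debug_offset(text):
    """Parse text as hexadecimal and format it as a 4-digit 16-bit offset."""
    return f"{int(text, 16) & 0xFFFF:04X}"


def next_offset(text):
    """Return the offset that follows text, wrapping at 16 bits."""
    return f"{(int(text, 16) + 1) & 0xFFFF:04X}"


print(as_debug_offset("100"))  # 0100
print(next_offset("109"))      # 010A
```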

My point is that you really need to know how instructions work before you go and write large chunks of code. For example, what flags does the INC instruction affect, versus the ADD instruction? It's good to know that before you set off scribbling madly.
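For reference, on the 8086 ADD updates the carry flag while INC leaves it unchanged (both update OF, SF, ZF, AF and PF). A toy Python model of that difference follows; the function names are invented for illustration and only CF and ZF are tracked, so treat it as a sketch rather than an emulator.

```python
MASK = 0xFFFF  # 16-bit registers


def add16(dest, src, carry_in):
    """ADD: returns (result, CF, ZF). CF is recomputed from the addition."""
    total = dest + src
    result = total & MASK
    return result, total > MASK, result == 0


def inc16(dest, carry_in):
    """INC: returns (result, CF, ZF). CF keeps whatever value it had before."""
    result = (dest + 1) & MASK
    return result, carry_in, result == 0


print(add16(0xFFFF, 1, False))  # (0, True, True)
print(inc16(0xFFFF, False))     # (0, False, True)
```

Both instructions wrap FFFF to 0000 and set ZF, but only ADD reports the overflow out of bit 15 in CF.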

cr1901 · Aug 8, 2012

My suggestion is to read the DOS version of "Art of Assembly Language Programming" by Randall Hyde (which also has a lot you need to know about the IBM PC architecture as well as assembly), and decide on an assembler (one that doesn't run on just DOS- I wouldn't use DOS as a programming bench right now) and stick with it. There are too many x86 assemblers to count, and I daresay NONE of them are fully compatible with one another- it's extremely frustrating.

Most assemblers (especially DOS) accept assembly source in a syntax known as Intel syntax (the other syntax being AT&T- which is a pain and completely opposite to Intel syntax- SRC then DEST vs. DEST then SRC, for example). Intel syntax can further be classified into having support for Microsoft's Macro Assembler features/syntax (MASM) (supports Intel memory models, ASSUME directive, '@' for address-of operator, etc), support for Netwide Assembler features/syntax (NASM) (no memory models, no ASSUME directive, '%' for address-of operator, etc), and small COM-only or "basic" assemblers (probably with their own syntax). The only modern assembler I can think of using AT&T syntax is the GNU Assembler; I don't know anyone who uses it for 386 and above code.

I have found so many legacy/modern code examples in MASM syntax that I can't currently justify learning all the differences of NASM. Maybe a good project to do is to create a "NASM2MASM" parser and vice versa (it would also brush up on my FSM skills lol).

The following is a list of assemblers that I have used/tried in the past, along with some quick comments.

A86- Older assembler which runs on Windows. I don't know which features it supports, b/c I didn't know how to use the command line well back then.
FASM- Flat assembler. Closer to NASM in principle, but incompatible syntaxes? I don't remember.
GAS- GNU assembler. AT&T syntax. That. is. all!
HLA- High Level assembler. Randall Hyde's x86 assembler. Used throughout non-DOS versions of Art of Assembly. Supports numerous features/high-level constructs, but not compatible with MASM, to my knowledge.
MASM- Widely used assembler supporting various macros and features (one could write a book on them- in fact people have!). Modern versions (which come with Visual Studio) are only meant for Windows exes. I haven't tried older versions, and never had much luck with the related MASM32 project.
NASM- Gets rid of all the confusing features of MASM- a "What You Code is What You Mean" Assembler- does not make assumptions about the location of code/data that MASM does (and therefore the Intel memory models are not required). Easier to make a flat binary.
TASM- Dos-only assembler which competed directly with DOS versions of MASM. Run in DOSBOX or a VM. Works fine, just not a fan of using a VM as a programming environment.
WASM (Watcom)- Modern Assembler which supports some variant of MASM syntax and features, but not all. Runs on Windows/Linux, compiles for DOS, windows, etc depending on linker settings. Supposedly, JWASM has better MASM support.
WASM (Wolfware)- Old assembler that I occasionally find modern code for. DOS only. Supports a macro to invoke DOS interrupts directly, making porting a possible pain.
YASM- NASM rewrite/compatible assembler with AT&T syntax support. Joy!

I personally use WASM (Watcom) currently, because of its versatility in host platform and target, as well as sufficient MASM compatibility (most corrections to legacy code become obvious).

So yes, there are A LOT of x86 assemblers lol. Don't go through the same struggle I did lmao!

reenigne · Aug 8, 2012

I prefer to use YASM to cross-assemble from a modern machine myself. YASM and NASM are very similar, but I think I chose the former because of Avery Lee's recommendation, though VC-compatible debug information isn't useful for 8088 assembly.

Back when I was running an assembler under DOS, I used to use A86 which is small and fast but the syntax isn't quite compatible with YASM/NASM.

The other assembler I have some experience with is GAS but that's really designed to be a compiler backend so it's not terribly useful for assembling code written by hand.

Chuck(G) · Aug 8, 2012

I think your understanding of MASM is rather limited. You are not constrained by the standard memory models--they exist in MASM to reduce effort on your part, but you don't have to use them. It's perfectly possible to write "flat" 32-bit code in MASM--and has been almost since the 386 came out.

I started 8086 assembly with ASM86 under ISIS-II, long before the 5150 came out. It was really, really slow, because it had to first read in and digest a long list of OPDEFs.

pearce_jj · Aug 8, 2012

If you're just starting out, Borland's Turbo Pascal might be worth a look, as assembler code can be dropped in-line either for code chunks or entire procedures. And with v6 there is the super IDE that will step through ASM too with register watches, and better yet it will all run just fine on that Compaq portable.

cr1901 · Aug 8, 2012

Chuck(G) said:
I think your understanding of MASM is rather limited. You are not constrained by the standard memory models--they exist in MASM to reduce effort on your part, but you don't have to use them. It's perfectly possible to write "flat" 32-bit code in MASM--and has been almost since the 386 came out.

I know that you don't have to use them- NASM proves that. Additionally, I was mostly limiting my discussion to real mode assembly- I don't understand protected mode well enough (nor have I had the time to read the 386 technical reference) to really do 32-bit x86 assembly. And I don't see the point in using assembly on modern systems since Windows (NT at least- maybe 9x) will not allow direct hardware access using OUT and IN instructions, instead forcing you to use the Windows API. Assembly by this point becomes 90% API calls and 10% assembly instructions.*

*Take this statement with a grain of salt. I hardly have done enough of it to really know whether that's true or not.

ASM · Aug 8, 2012

cr1901 said:
My suggestion is to read the DOS version of "Art of Assembly Language Programming" by Randall Hyde (which also has a lot you need to know about the IBM PC architecture as well as assembly), and decide on an assembler (one that doesn't run on just DOS- I wouldn't use DOS as a programming bench right now) and stick with it. There are too many x86 assemblers to count, and I daresay NONE of them are fully compatible with one another- it's extremely frustrating.

Most assemblers (especially DOS) accept assembly source in a syntax known as Intel syntax (the other syntax being AT&T- which is a pain and completely opposite to Intel syntax- SRC then DEST vs. DEST then SRC, for example). Intel syntax can further be classified into having support for Microsoft's Macro Assembler features/syntax (MASM) (supports Intel memory models, ASSUME directive, '@' for address-of operator, etc), support for Netwide Assembler features/syntax (NASM) (no memory models, no ASSUME directive, '%' for address-of operator, etc), and small COM-only or "basic" assemblers (probably with their own syntax). The only modern assembler I can think of using AT&T syntax is the GNU Assembler; I don't know anyone who uses it for 386 and above code.

I have found so many legacy/modern code examples in MASM syntax that I can't currently justify learning all the differences of NASM. Maybe a good project to do is to create a "NASM2MASM" parser and vice versa (it would also brush up on my FSM skills lol).

The following is a list of assemblers that I have used/tried in the past, along with some quick comments.

A86- Older assembler which runs on Windows. I don't know which features it supports, b/c I didn't know how to use the command line well back then.
FASM- Flat assembler. Closer to NASM in principle, but incompatible syntaxes? I don't remember.
GAS- GNU assembler. AT&T syntax. That. is. all!
HLA- High Level assembler. Randall Hyde's x86 assembler. Used throughout non-DOS versions of Art of Assembly. Supports numerous features/high-level constructs, but not compatible with MASM, to my knowledge.
MASM- Widely used assembler supporting various macros and features (one could write a book on them- in fact people have!). Modern versions (which come with Visual Studio) are only meant for Windows exes. I haven't tried older versions, and never had much luck with the related MASM32 project.
NASM- Gets rid of all the confusing features of MASM- a "What You Code is What You Mean" Assembler- does not make assumptions about the location of code/data that MASM does (and therefore the Intel memory models are not required). Easier to make a flat binary.
TASM- Dos-only assembler which competed directly with DOS versions of MASM. Run in DOSBOX or a VM. Works fine, just not a fan of using a VM as a programming environment.
WASM (Watcom)- Modern Assembler which supports some variant of MASM syntax and features, but not all. Runs on Windows/Linux, compiles for DOS, windows, etc depending on linker settings. Supposedly, JWASM has better MASM support.
WASM (Wolfware)- Old assembler that I occasionally find modern code for. DOS only. Supports a macro to invoke DOS interrupts directly, making porting a possible pain.
YASM- NASM rewrite/compatible assembler with AT&T syntax support. Joy!

I personally use WASM (Watcom) currently, because of its versatility in host platform and target, as well as sufficient MASM compatibility (most corrections to legacy code become obvious).

So yes, there are A LOT of x86 assemblers lol. Don't go through the same struggle I did lmao!

Thank you for the long list of opinions on various assemblers!
I am actually currently reading a book called 'The Art of Assembly'; it might be a later edition, I will have to check.
NASM has been suggested to me by various people from the cross-compatibility perspective; I guess I will do a little research on YASM.
The reason I wanted to get into ASM in the first place was so I could get the entire picture of what is going on; I suppose that could also be the reason why I enjoy old technology.
I will be going to university soon, choosing my course carefully.

sergey · Aug 8, 2012

pearce_jj said:
If you're just starting out, Borland's Turbo Pascal might be worth a look, as assembler code can be dropped in-line either for code chunks or entire procedures. And with v6 there is the super IDE that will step through ASM too with register watches, and better yet it will all run just fine on that Compaq portable.

Same applies to Turbo / Borland C... Turbo Debugger is pretty useful and much more user friendly than DOS debug.

per · Aug 8, 2012

ASM said:
The reason I wanted to get into ASM in the first place was so I could get the entire picture of what is going on, I suppose that could also be the reason why I enjoy old technology.
I will be going to university soon, choosing my course carefully.

That's a very good approach to learning about microcomputer architecture.

One book I could recommend is "The IBM PC From the Inside Out". It starts off by teaching how a generalized CPU works, then it goes on to teach x86 assembly (including how to use stuff like the BIOS and the DOS environment). It also explains TTL logic circuits, how the PC is designed, the 8088 (on a hardware level) and how to generally code for most I/O devices in the PC (including the most common ISA cards). If you need more info, the book does contain a lot of references for further reading.

barythrin · Aug 8, 2012

Some of the free assembly IDEs will be nicer to code in, but I still think you should just start off playing with debug. It's native, already installed, and you quickly get to learn how to edit your memory locations when you make code edits. For the absolute intro and playing with "what does this int do?", it's quick and easy to make a .com file. One other reference I enjoyed (actually I used several books, including the Indispensable PC Hardware Book) was helppc, a DOS program with interactive help and examples on interrupts, which I found extremely useful when dabbling with writing a proof-of-concept OS. There's an HTML-converted version here: http://stanislavs.org/helppc/. It lets you find what each interrupt does, or look up which interrupt provides a given function (like printing a character to the screen, etc).

I've seen better, but here is also a quick sample on writing a program using debug. Again, the fun will wear off the more you modify your code, or you can use edit to maintain a text file plus a bat file that automatically enters your code into debug (debug < sample.txt). In sample.txt you go ahead and include the commands:

Code:

a
mov ax,0200 	;set the register AX with hexadecimal value 0200h; the interrupt sub-function in AH
mov dx,0041 	;set the register DX with the hex value 0041h; the character to be output in DL
int 21 	;call interrupt 21h; since the value 02h is in AH, this tells DOS to output a character
int 20 	;call interrupt 20h; this tells DOS to terminate the program

n printa.COM
rcx
000A
w
q

your bat file could be whatever (make.bat) and make.bat calls:
debug < %1

make printa.txt would now route your little macro and assembly to debug and at the end it tells debug how large the program is, writes it to whatever .com you told it to (n filename) and quits leaving you back at the shell.
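The 000A given to rcx is just the byte length of the four instructions. A quick Python sketch can confirm it by listing the machine code by hand; the opcode bytes below are the standard 8086 encodings (B8/BA for MOV AX/DX with a 16-bit immediate, CD for INT), and the variable name PROGRAM is arbitrary.

```python
# Hand-encoded bytes for the four instructions in the DEBUG script.
# 16-bit immediates are stored low byte first.
PROGRAM = bytes([
    0xB8, 0x00, 0x02,  # mov ax,0200
    0xBA, 0x41, 0x00,  # mov dx,0041
    0xCD, 0x21,        # int 21
    0xCD, 0x20,        # int 20
])

# DEBUG wants the length in hex in CX before the W command.
print(f"rcx value: {len(PROGRAM):04X}")  # rcx value: 000A
```

If you add or remove instructions in the script, the rcx value has to be recomputed the same way, which is one of the chores a real assembler takes over.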

It's low tech but it'll get you started until you find the need for better/easier coding. Of course I'm sorta getting you started the "wrong" way, but I liked it lol. Then you figure out what a lifesaver labels are in the smarter assemblers.

Chuck(G) · Aug 8, 2012

Another book that I recommend, if you can find it, is Chris Morgan and Mitch Waite's "8086/8088 16-Bit Microprocessor Primer", McGraw-Hill, 1982, ISBN 0-07-043109-4. It came out just as the 5150 came out (it refers to the 5150 as the "Acorn"), so there's nothing there that relates specifically to the IBM PC, which is probably a good thing. The assembler used is Microsoft XMACRO-86, which is not an altogether bad thing. While the syntax is very different from standard Intel syntax, it's easier to answer the question "What does MOV mean?"

barythrin · Aug 9, 2012

Since I'm lost in archive.org at the moment, here's something interesting: Day 1 Part 1 of Intel x86 architecture and assembly programming from the controversial Khan University thing. Sorta interesting on two levels, since I've seen lots of articles complaining about the quality (or lack of it) of this kind of training. Anyway, I'm not sure sitting through video-based training on assembly would be any better than reading a book though lol.

ASM · Aug 10, 2012

Hello again,
I really appreciate the help.

I have a few questions regarding machine code.
A few people have stated that machine code which is directly executed on the CPU is not visible to the user.
I am not sure if this is true, but are early computers an exception to this?

I am not too clear on this area; things like .bin files have viewable machine code, right (in hex)?
Is this different to what is executed on the machine?
If, theoretically, you were able to write in machine code, would you be able to pump it into the processor without a compiler/assembler?

thanks,

Chuck(G) · Aug 10, 2012

ASM said:
I have a few questions regarding machine code
A few people have stated that machine code which is directly executed on the CPU is not visible by the user
I am not sure if this is true but do early computers have an exception to this?

I'm not entirely certain of what you mean by "not visible by the user". Care to elucidate?

I am not too clear on this area, things like .bin files have viewable machine code right (in hex)?
Is this different to what is executed on the machine?

It depends upon the file. .COM files are essentially memory-image--they load at 100H in a segment of the system's choosing and are entered at the beginning (100H). There is no attempt by the system to determine the file type. You can rename a letter to your grandmother to be a .COM file and the system will attempt to execute it. One exception to this in some later versions of DOS and Windows is the case where the file starts with the characters "MZ". The system will try to interpret this as a .EXE file (see below). If it fails, it will fall back to treating it as a memory image file. (This is what's known in the trade as a "kludge").

.EXE files are heavily structured and have much added information for things such as segment sizes, segment relocation and so on. They are essential for providing larger (more than 64K) programs a way to execute. Since, by definition, a program that large involves multiple segments, a special structured format was needed to tell the system loader how the various segments are related and where they should be placed. In Windows, .EXE files also contain resource information. The .EXE format is very flexible, so it's possible to have, for example, an OS/2 "bound" executable where the DOS version and the OS/2 PM version of the same program are combined--the operating system selects which to load.
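The loader decision described above can be sketched in a few lines of Python. This is a simplification: the function name `guess_loader` is hypothetical, and the sketch only looks at the first two bytes, whereas a real loader goes on to validate the rest of the header before committing to the .EXE path.

```python
def guess_loader(data: bytes) -> str:
    """Guess how the file contents would be treated, based on the first two bytes."""
    if data[:2] == b"MZ":
        return "EXE"  # structured header follows (sizes, relocations, ...)
    return "COM"      # raw memory image, loaded at offset 100h and entered there


print(guess_loader(b"MZ\x90\x00"))           # EXE
print(guess_loader(b"\xB8\x00\x02"))         # COM
print(guess_loader(b"Dear Grandma, hello"))  # COM (and it would be "executed" as such)
```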

There are other executable file types, mostly associated with Windows, such as .DLL (dynamic link libraries) which contain commonly-used code that can be shared by several programs, but I'm getting off-track here.

If theoretically you were able to write in machine code would you be able to pump it into the processor without a compiler/assembler?

Certainly! A file is just a container. In fact, there are some very cute batch scripts that create executable files using nothing more than ECHO statements.
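As a rough illustration of the "just a container" point, here is a Python sketch that writes two raw bytes to a file with no assembler involved. The bytes CD 20 encode INT 20h (terminate program), so the result is about the smallest .COM program that does anything sensible; the file name EXIT.COM and the helper name write_com are made up for the example.

```python
import os
import tempfile

CODE = bytes([0xCD, 0x20])  # int 20h: return to DOS


def write_com(path, code=CODE):
    """Write raw machine code bytes to path and return the file size."""
    with open(path, "wb") as handle:
        handle.write(code)
    return os.path.getsize(path)


# Written to a temporary directory here; on the real machine you would
# copy the file to a DOS disk and run it by name.
path = os.path.join(tempfile.mkdtemp(), "EXIT.COM")
size = write_com(path)
print(size)  # 2
```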

ASM · Aug 10, 2012

Chuck(G) said:
I'm not entirely certain of what you mean by "not visible by the user". Care to elucidate?

I was reading that you are not able to view the machine code being processed in modern CPUs because it would compromise their security, the ability to reverse engineer and such.
