
Pseudo 3d ASCII

goostaw

Member
Joined
Nov 11, 2018
Messages
26
Location
Poland
An idea for combining a top-down view (roguelike style) with a grid-based first-person RPG (à la Wizardry) in text mode. The target is an 8088/8086-based PC with a CGA card.

First attempt in vintage C++ (Open Watcom) with PDCurses.

Video from a test run (music unfortunately not mine):


On a 386 with VGA it works fine, but on my 5155 it's very slow. The program logic itself runs quickly, so the problem is probably in the display code. My guess is that PDCurses uses BIOS functions to display characters in this case.

That's why I'm going to get rid of PDCurses, write directly to screen memory, and use INT 16h for the keyboard.
If someone would like to look at the current code and comment or offer advice, I would be grateful.

https://github.com/goostaw/pseudo3d_ascii/tree/master

Since I'm just learning, I will probably have a few questions for more experienced forum users.
I hope you will help.
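
For readers following the plan to write straight to screen memory, here is a minimal sketch of the addressing arithmetic involved. It assumes the standard 80x25 colour text layout, with two bytes per cell: character first, then attribute. The names `cell_offset` and `make_cell` are made up for illustration and are not taken from the project's code. It is written as plain portable C++ so it can be tried on any compiler; on the real target the offset would be applied to a far pointer into segment B800h.

```cpp
#include <cassert>

// 80x25 colour text mode: every cell is two bytes, character then attribute.
const unsigned TEXT_COLS = 80;
const unsigned TEXT_ROWS = 25;

// Byte offset of a cell from the start of video memory.
unsigned cell_offset(unsigned row, unsigned col)
{
    assert(row < TEXT_ROWS && col < TEXT_COLS);
    return (row * TEXT_COLS + col) * 2;
}

// Pack character and attribute into one 16-bit word
// (character in the low byte, attribute in the high byte).
unsigned short make_cell(unsigned char ch, unsigned char attr)
{
    return (unsigned short)((attr << 8) | ch);
}
```

For example, `make_cell(' ', 0x07)` gives 0720h, a grey-on-black space, and the last cell of the screen sits at offset 3998.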
 
Yes. Cool. PDCurses is fired.
Direct CGA addressing solves the speed issue, BUT:

although it works great on the HP 200LX, a stock IBM CGA produces a lot of nice snow. :)


It looks like I will have to deal with this issue.

What could be the most effective method in this case?

btw. inline assembly question!

The question is about Open Watcom 1.9, but I also tested it with Borland C++ 3.1 (same result).

An _asm {} block used in a method of a class does not "see" the members of that class, whether private or public static.
Only global or local variables can be referenced inside the _asm block.
There is of course a workaround for this, but it seems weird.

Is it supposed to be like that?

Sample, test code:

Code:
typedef unsigned char byte;

byte far* cga_global = ( byte far* ) 0xB8000000L;

class Screen{
		byte far* cga_priv;
	public:
		Screen() : cga_priv( cga_global ){}
		static byte far* cga_static;
		void clear();
};
byte far* Screen::cga_static = cga_global;

void Screen::clear(){
	byte far* cga_local = cga_priv;
	byte far* &cga_localRef = cga_priv;

	_asm
	{
		les DI,cga_global // works
		//les DI,cga_local // also works
		//les DI,cga_priv // variable not declared
		//les DI,cga_static // variable not declared
		//les DI,cga_localRef // It compiles but hangs 
		mov AX,0720h
		mov CX,2000
		rep stosw
	}
}
int main(){
	Screen scr;
	scr.clear();
	return 0;
}
 
Writing directly to video memory will be much quicker than pdcurses and/or BIOS calls. However, if you're targeting IBM CGA cards in 80-column text mode you might want to take into account snow suppression.

Also, as a general rule you should avoid guessing the cause of performance problems - you should instead measure and see exactly how long each part of your program is taking. One really crude but effective method of doing this (if you don't have a proper profiler) is to switch the border colour at the start of each section of code. Then the colours that show up most often correspond to the parts of your program that are taking the most time.
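
A minimal sketch of the border-colour trick, assuming a CGA in text mode, where the low four bits of the colour-select register at port 3D9h set the border colour. The port write is passed in as a function pointer (`PortWriter`, a hypothetical name) so the idea can be tried off-target; on the real machine it would wrap whatever port-output routine the compiler provides. `record_write` is only a stand-in for experimenting.

```cpp
#include <cassert>

// CGA colour-select register; in text mode bits 0-3 choose the border colour.
const unsigned CGA_COLOUR_SELECT = 0x3D9;

// Hypothetical hook for writing a byte to an I/O port.
typedef void (*PortWriter)(unsigned port, unsigned char value);

// Call at the start of each section you want to time,
// using a different colour (0-15) for every section.
void mark_section(PortWriter out, unsigned char colour)
{
    assert(out != 0);
    out(CGA_COLOUR_SELECT, (unsigned char)(colour & 0x0F));
}

// Stand-in writer that only remembers the last write.
unsigned last_port = 0;
unsigned char last_value = 0;

void record_write(unsigned port, unsigned char value)
{
    last_port = port;
    last_value = value;
}
```

In the game loop this would look like `mark_section(out, 1)` before the logic, `mark_section(out, 4)` before drawing, and so on; the widest colour band in the border then marks the slowest section.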
 
That looks very good! I have never seen 3D ASCII used in this way.

(music unfortunately not mine):
Nice music, but it keeps playing inside my head. Very familiar, but no idea what the title is or where it comes from. So what is it?
Thank you very much!
 
Sergei Prokofiev, "Dance of the Knights" from the ballet Romeo and Juliet. Rather vintage.

Cheers! :)


Btw, I seem to have just crossed the magic 10-post line, so my posts finally show up immediately.
Yesterday I wrote a post in this topic that is not yet visible. So as not to repeat myself, I will wait a while.
 
One really crude but effective method of doing this (if you don't have a proper profiler) is to switch the border colour at the start of each section of code. Then the colours that show up most often correspond to the parts of your program that are taking the most time.

Thanks a lot. A great way indeed. :)
 
A little update:
1. To get started on the CGA snow problem, I rewrote in inline asm the two functions that put characters on the screen.

[Two attached JPEG images] (Sorry for the JPEGs - it's a long way to the internet)

The first one writes a single character. The second writes a string. That's all.
(Btw, I can't access the class members directly, hence these local definitions. See one of my posts above; maybe someone can help.)
This alone sped up the display a bit and reduced the "snow".

2. I read a few threads about cga snow by forum experts and came to the conclusion that I must thoroughly understand this phenomenon.

3. So far I know:

  • The cause of "snow" is simultaneous access to video memory by the CPU and the CRT controller.
  • If I understand correctly, to minimize the phenomenon the processor should write data to video memory only while the CGA is not scanning RAM?
  • Port 3DAh among the CGA I/O ports is the status register; bit 0 or bit 3 will tell me when it is safe to copy data.
 
That's correct... you can use either one of those bits, but the behavior is different:

Bit 0 is the real indicator for when it's "safe" to write to CGA RAM. It signifies "blanking", which is either horizontal (happens for a very short period, once per scanline) or vertical (lasts for 62 CGA scanlines, once per frame). Throughout these blanking periods, the output to the CRT is the overscan (border) color; nothing is being read from video RAM, so there's no bus contention (and no snow) when you write to it.
On a 4.77MHz 8088 PC the horizontal blanking period only gives you enough time to copy 2-4 bytes "snowlessly", while vertical blanking allows something like 920-960 IIRC, but that's if you're doing sustained word transfers (rep movsw); your mileage may vary. Polling won't tell you if the blanking is horizontal or vertical, so you can't know how much data it'll let you copy - in practice you have to poll before every byte/word transfer. Yep, that's slow, and I think that's also what the PC BIOS does, if you wonder about the appalling speed of the BIOS routines (I could be wrong, haven't checked in a while).
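
The "poll before every transfer" approach can be sketched roughly as follows. This is only the control flow, under the assumption that the status register is read through a hook (`StatusReader`, a hypothetical name). On a real 8088 this loop is normally hand-written in assembly, usually with interrupts disabled, since compiled code may be too slow to use the short horizontal interval. `fake_status` is a stand-in that replays a fixed sequence so the logic can be tried off-target.

```cpp
#include <cassert>

const unsigned char STATUS_BLANKING = 0x01;  // bit 0 of port 3DAh
const unsigned char STATUS_VRETRACE = 0x08;  // bit 3 of port 3DAh

// Hypothetical hook returning the current status register value.
typedef unsigned char (*StatusReader)();

// Wait for the start of a blanking interval, then store one cell.
void write_cell_no_snow(StatusReader status, unsigned short* cell,
                        unsigned short value)
{
    assert(status != 0 && cell != 0);
    // Already inside a blanking interval? We cannot tell how much of it
    // is left, so let it run out first.
    while (status() & STATUS_BLANKING) {}
    // Wait for the next interval to begin and write right away.
    while (!(status() & STATUS_BLANKING)) {}
    *cell = value;
}

// Stand-in status source: blanking, blanking, active, active, blanking.
const unsigned char fake_sequence[] = { 0x01, 0x01, 0x00, 0x00, 0x01 };
unsigned fake_reads = 0;

unsigned char fake_status()
{
    unsigned char s = fake_sequence[fake_reads % sizeof(fake_sequence)];
    ++fake_reads;
    return s;
}
```

Waiting for the transition from 0 to 1, rather than just testing for 1, is what guarantees that the write lands at the beginning of an interval instead of at its tail end.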

Bit 3 signifies vertical *retrace*, i.e. when the electron beam is being deflected back to the top of the CRT before drawing the next frame. Yep, I'm over-explaining stuff you may already know, but a lot of people seem to confuse "retrace" and "blanking" so that should be cleared up.
This retrace takes place during vertical blanking - it starts 24 scanlines after the beginning of a vblank, and lasts for 16 scanlines, compared to the total vblank period of 62 scanlines. So whenever bit 3 changes from 0 to 1, you already know you're in a blanking period and don't need to check bit 0. You also know that you have time to copy around 920*(1-24/62) = ~562 bytes, again in the best case of "rep movsw".

If you use the same routine for all your screen writes (so that you always have a constant speed), you could poll for bit 3 and see how much data this lets you copy before you start seeing snow.

Otherwise, I'd do all my writes to an off-screen buffer and use the same polling to determine when it's safe to send a ~562-byte chunk of that buffer to CGA-land w/rep movsw.
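
A rough sketch of that off-screen buffer idea, assuming a hook that returns once vertical retrace has just begun (`RetraceWaiter`, a hypothetical name). The chunk size is a parameter because the safe amount has to be found by experiment on the actual machine. `std::memcpy` stands in for the `rep movsw` that would do the copy on the real target, and `count_wait` is only a stand-in for trying the logic elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical hook: returns when status bit 3 has just gone from 0 to 1.
typedef void (*RetraceWaiter)();

// Copy `total` bytes from the shadow buffer to the screen, one chunk per
// retrace. Returns the number of frames (retraces) that were used.
unsigned flush_in_chunks(const unsigned char* shadow, unsigned char* screen,
                         std::size_t total, std::size_t chunk,
                         RetraceWaiter wait_for_retrace)
{
    assert(chunk > 0);
    unsigned frames = 0;
    std::size_t done = 0;
    while (done < total) {
        std::size_t n = total - done;
        if (n > chunk)
            n = chunk;
        wait_for_retrace();
        std::memcpy(screen + done, shadow + done, n);
        done += n;
        ++frames;
    }
    return frames;
}

// Stand-in waiter that only counts how often it was called.
unsigned waits = 0;

void count_wait() { ++waits; }
```

With a 4000-byte text page and 562-byte chunks, the whole page goes out over 8 retraces.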

There are ways to go one better and use the entire vertical *blanking* period as well as horizontal blanking, at acceptable speeds, but that's a big headache which you really don't need... honestly, for a turn-based 80x25 text game, even the above is probably overkill. :)
 
VileR - Thanks a lot for the very detailed answer. Very helpful. :)

I am testing 2 versions:

1. No buffer. - Word written during horizontal blanking. (Checked bit 0) So far, I'm just glad that it works and I haven't noticed a single snowflake. The program has slowed down of course, but it works much faster than with BIOS calls.

2. Writing to a buffer, then copying from the buffer to the screen during vertical retrace. (Checking bits 0 and 3.)

I need to think about it in more detail. Judging by the first results, it looks the most promising.


I can't copy the entire buffer to the screen in one go (rep movsw) for one reason: my screen is logically divided into windows (class Window). While the windows communicate with each other, each displays different logic and each has its own dynamically allocated buffer.
The buffer is linear and the visible slice of the screen is obviously not, so I need a shift between rows.
So far, I have 2 windows.
The first is 16x16 characters (the 2D map).
Its 512 bytes go through without problems. Zero snow.
The second window is too big to fit in the available time. I have to think about how to divide such large windows elegantly and effectively. I've had enough for today. :l
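
One possible way to handle the row shift and the splitting, sketched in portable C++ with made-up names (`blit_window`, `rows_per_frame`); it is not taken from the project's Window class. Each window row is contiguous in its own buffer, while on screen consecutive rows are a full 80 cells apart. A large window could then be sent a few whole rows at a time, with the byte budget per retrace treated as an assumption to be measured.

```cpp
#include <cassert>

const int SCREEN_COLS = 80;
const int SCREEN_ROWS = 25;

// Copy a w-by-h window, stored row after row in a linear buffer, to screen
// position (x, y). After every w cells the destination jumps to the next
// screen row, which starts SCREEN_COLS cells further on.
void blit_window(const unsigned short* win, int w, int h,
                 unsigned short* screen, int x, int y)
{
    assert(x >= 0 && y >= 0 && x + w <= SCREEN_COLS && y + h <= SCREEN_ROWS);
    for (int row = 0; row < h; ++row) {
        const unsigned short* src = win + row * w;
        unsigned short* dst = screen + (y + row) * SCREEN_COLS + x;
        for (int col = 0; col < w; ++col)
            dst[col] = src[col];
    }
}

// How many complete window rows fit into one retrace budget (in bytes)?
int rows_per_frame(int w, int budget_bytes)
{
    assert(w > 0);
    return budget_bytes / (w * 2);  // two bytes per cell
}
```

With a budget of about 562 bytes, a 16-cell-wide window fits 17 rows per retrace, which agrees with the 16x16 map going through in one pass. A 40-cell-wide window would manage only 7 rows per retrace, and the per-row loop overhead will likely lower the real figure.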

You can probably write it faster and better: [Attachment 64284]



Another take: Get your game working at an acceptable speed first, then worry about snow avoidance later.

At first I thought the logic was fast enough for an 8088 and CGA (when I look at this video, it runs pretty well), but this snow avoidance eats up quite a lot of time.
I wonder if it is worth doing. Some good compromise is probably the solution.
 
Were there many CGA clones that still had issues with snow, or was that primarily an issue with the original IBM card? I don't know that I'd consider it worth the trouble, unless it's really important to you that it look pretty on Ye Olde Original Hardware. Certainly every EGA/VGA I've ever seen has avoided the issue.
 
Gotta say, even the no-buffer version with bit-0 polling looks good! True, it's visibly slower but that's really not a big deal with a game of this type. You could say it's faithful to the look-and-feel of a lot of 1980s text-based PC games that did the same.

If you really want more speed, then your off-screen buffer will probably have to be arranged the same way as the CGA framebuffer, so using just the bit-3 method you could copy the whole thing in 8 frames, or 7.5 times per second. But tbh I don't know if that's really worth the trouble, since the current snowless version looks quite acceptable I think.
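
The arithmetic behind those figures, as a small sketch. It assumes a 4000-byte text page (80 x 25 cells, two bytes each), roughly 562 bytes per retrace, and a refresh rate of about 60 Hz; the function names are made up for illustration.

```cpp
#include <cassert>

// Frames needed to move total_bytes when each retrace allows bytes_per_frame.
unsigned frames_needed(unsigned total_bytes, unsigned bytes_per_frame)
{
    assert(bytes_per_frame > 0);
    return (total_bytes + bytes_per_frame - 1) / bytes_per_frame;  // round up
}

// Full-screen updates per second at the given refresh rate.
double updates_per_second(unsigned frames, double refresh_hz)
{
    assert(frames > 0);
    return refresh_hz / frames;
}
```

`frames_needed(4000, 562)` is 8, and `updates_per_second(8, 60.0)` is 7.5.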


Were there many CGA clones that still had issues with snow, or was that primarily an issue with the original IBM card? I don't know that I'd consider it worth the trouble, unless it's really important to you that it look pretty on Ye Olde Original Hardware. Certainly every EGA/VGA I've ever seen has avoided the issue.

Mostly IBM and a minority of early 3rd-party clone CGAs.

I'd say the snow avoidance should be optional, as in a command line argument or similar. If your setup's video output doesn't have a snow problem, it'd be common courtesy to refrain from slowing it down. :)
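
A minimal sketch of such a switch. The option name `/snow` is purely hypothetical, chosen here for illustration; the idea is simply that the slower snow-free path is used only when the user asks for it.

```cpp
#include <cassert>
#include <cstring>

// Returns true when the (hypothetical) "/snow" switch is on the command line.
bool snow_avoidance_requested(int argc, const char* const* argv)
{
    assert(argv != 0);
    for (int i = 1; i < argc; ++i)
        if (std::strcmp(argv[i], "/snow") == 0)
            return true;
    return false;
}
```

The result could be stored once at startup and used to choose between the plain and the polled screen-write routines.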
 
It seems that both snow-avoidance methods, "direct horizontal" and "buffered vertical", are working.


The speed is indeed acceptable, considering it is supposed to be an option.
Especially in this type of game, as noted by VileR.
It can definitely be optimized, but not at this stage.
Now it's time for some fun. I guess it could use some monsters wandering here and there.
 
Love the ASCII graphics style! I'd be keen to try this on my HP 100LX when it is in a playable state :)
 