
Trixter's latest magic... Holy how-in-the-hell!!!???

Impressive! I'd love to read some writeup of the techniques behind this.

I've been writing some Z80 sprite-drawing code recently, trying to get it as fast as possible. Right now the fastest I've gotten drawing with transparency is at 1/4th the speed of the same image being drawn by a set of unrolled ldir-s (no transparency). Nevertheless, for the target system this is still a bit limited!
 

I can give you a quick-and-dirty explanation of how my sprite routine works. I basically divided things up in all possible cases... based on 16-bit words. Something like:
- Starting on even or odd scanline (CGA stores even and odd scanlines in separate memory banks)
- Starting on even or odd x-coordinate (there are 2 pixels packed in a byte in the mode I use)
- All pixels in word opaque
- All pixels in word transparent
- Some pixels in word opaque/transparent
- All pixels in byte opaque
- All pixels in byte transparent
- Some pixels in byte opaque/transparent
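To illustrate why the first two cases matter, here is a minimal Python sketch of how a pixel coordinate could map to a video memory offset. It assumes the standard CGA graphics layout (80 bytes per row, odd scanlines starting 0x2000 bytes into video memory) and assumes the left pixel of each pair sits in the high nibble. The function name is made up for this example and is not taken from the demo's source.

```python
BYTES_PER_ROW = 80        # assumption: standard CGA graphics row pitch
ODD_BANK_OFFSET = 0x2000  # odd scanlines start 8 KB into video memory


def cga_byte_offset(x, y):
    """Return (offset, is_low_nibble) for pixel (x, y) at 2 pixels per byte."""
    # Even and odd scanlines are interleaved across two banks.
    bank = ODD_BANK_OFFSET if (y & 1) else 0
    # Each bank holds every second scanline, so the row index is y // 2.
    offset = bank + (y >> 1) * BYTES_PER_ROW + (x >> 1)
    # Assumption: even x is the high nibble, odd x is the low nibble.
    return offset, bool(x & 1)
```

Moving down one scanline alternately adds and subtracts a bank offset, and an odd starting x shifts every pixel by one nibble, which is why each combination gets its own code path.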

I then coded hand-optimized assembly 'templates' for each case. Then I derived some heuristics for when to select which variation for the fastest/smallest possible code (sometimes it is faster to process things per word, other times it is faster to process per byte. Not a problem you would have on Z80).
Then I made a 'compiler' for this: I load a bitmap, and have the compiler automatically generate the proper blocks of code for each case, inserting the proper pixel/masking data into the 'template', with a few peephole optimizations added (e.g., if you have multiple opaque bytes/words next to each other, it merges the pointer updates into a single instruction)
So basically the sprite code is 'perfect' hand-optimized code for drawing sprites with transparency.
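As a rough illustration of the idea (not the actual tool), here is a minimal Python sketch of such a 'compiler' for a single sprite row. It assumes 2 pixels per byte with the left pixel in the high nibble, uses `None` for a transparent pixel, and emits NASM-style text that writes through ES:DI. It only covers the opaque/transparent/mixed cases plus merging of pointer updates; the even/odd scanline handling and the word-versus-byte heuristics are left out, and all names are invented for the example.

```python
def compile_row(pixels):
    """Emit assembly text that draws one sprite row at ES:DI.

    pixels: list of colour values 0..15, or None for transparent.
    """
    if len(pixels) % 2:
        pixels = pixels + [None]          # pad to a whole byte
    # Pack into (mask, data) per byte; mask bits are kept from the screen.
    packed = []
    for hi, lo in zip(pixels[0::2], pixels[1::2]):
        mask = (0xF0 if hi is None else 0) | (0x0F if lo is None else 0)
        data = ((hi or 0) << 4) | (lo or 0)
        packed.append((mask, data))

    code, pending, i = [], 0, 0

    def flush():
        # Peephole: merge all skipped/advanced bytes into one pointer update.
        nonlocal pending
        if pending == 1:
            code.append("inc di")
        elif pending > 1:
            code.append(f"add di, {pending}")
        pending = 0

    while i < len(packed):
        mask, data = packed[i]
        if mask == 0xFF:                  # all pixels in byte transparent
            pending += 1
            i += 1
        elif mask == 0 and i + 1 < len(packed) and packed[i + 1][0] == 0:
            flush()                       # two opaque bytes: one word store
            word = data | (packed[i + 1][1] << 8)   # little-endian
            code.append(f"mov word [es:di], 0x{word:04X}")
            pending += 2
            i += 2
        elif mask == 0:                   # all pixels in byte opaque
            flush()
            code.append(f"mov byte [es:di], 0x{data:02X}")
            pending += 1
            i += 1
        else:                             # some pixels opaque/transparent
            flush()
            code.append(f"and byte [es:di], 0x{mask:02X}")
            code.append(f"or byte [es:di], 0x{data:02X}")
            pending += 1
            i += 1
    return code
```

For example, `compile_row([1, 2, 3, 4, None, None, None, 5, 6, 7])` yields one word store, a single `add di, 3` that skips past the stored word and the transparent byte, an and/or pair for the mixed byte, and a final byte store. The pixel and mask data end up as immediates inside the instructions, so no mask table is read at draw time.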
 
@Scali

Found out the problem with running it on my XT.
I have a 5160 with a Hercules card AND a Hercules GB200 (CGA) card.
SW1 is on the MONO setting.
With "MODE co80" and "MODE MONO" I switch between the modes/cards.

So in CGA mode I got the message "Runtime error 201 at 0AEC:015F".

I took out the MONO card, SW1 at 40x25 - so only CGA GB200 Hercules card.
Up and running fine with that GB200 card.

Tried a Tulip CGA card (switchable between MDA and CGA with a switch) but that card had sync issues with the program 8088MPH.
Found the manual by chance via a Google search; it's a card called "DGA" with a YAMAHA V6363 video processor.

Pics of that card can be found here on the forum
http://www.vintage-computer.com/vcf...t-Videocard-with-Yamaha-V6363&highlight=v6363

My pics of running the program on the 5160 XT with Hercules GB200 CGA card

With both cards (MDA and CGA) in the XT: error code
5160-XT_8088mph-foutmelding.jpg


Tulip-DGA_Yamaha-V6363_MDA-CGA.jpg


5160-XT_8088mph-CGS-start.jpg

The Tulip DGA card (Yamaha V6363) sync error - the scrolling text between the demo pics went OK.
5160-XT_8088mph-DGA_V6363.jpg
 
The demo is meant to be run on an NTSC composite monitor, not an RGB monitor. It takes advantage of the NTSC color fringing artifacts to display all those colors.
 

More specifically, it takes advantage of the color fringing artifacts as generated by an 'old style' IBM CGA card.
'New style' IBM CGA cards have slightly different output, making the colours slightly different. This means that things will look 'wrong' in our demo.
All the clones we have tried so far are even more off colour-wise than the 'new style' IBM CGA cards.
 
Thanks a lot!

I didn't think much about instruction-encoded images, but it's clear it's faster than a universal draw-image routine for things with transparency.

The target system is blessed with one big bitplane for the entire display, but you have 4, 2 or 1 bit per pixel so X-position matters. All modes use the full 32K of video-RAM as bitmap, and that's precisely why speed is everything. There is a vertical scroll register and a proper 256-color palette, but otherwise the only other thing the graphics hardware provides is an extra wait-state when video-RAM is paged in.
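To make the X-position point concrete, here is a small Python sketch that locates a pixel within a packed row at 4, 2 or 1 bits per pixel. It assumes the leftmost pixel occupies the most significant bits of each byte; that ordering and the function name are assumptions for illustration, not a description of any particular machine.

```python
def locate_pixel(x, bpp):
    """Return (byte_index, shift, mask) for pixel x in a packed row.

    Assumption: the leftmost pixel of a byte sits in its top bits.
    """
    per_byte = 8 // bpp                       # 2, 4 or 8 pixels per byte
    byte_index = x // per_byte
    shift = (per_byte - 1 - x % per_byte) * bpp
    mask = ((1 << bpp) - 1) << shift          # bits this pixel occupies
    return byte_index, shift, mask
```

A sprite drawn at an X that is not a multiple of the pixels-per-byte count needs its data shifted (or a pre-shifted copy), which is where code generated per alignment can pay off.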
 
Does this demo have any hope of working on a Turbo XT with an 8 MHz V30? I'm curious if a PC Transporter's internal CGA circuit will work. Chances are the full-length CGA clone card I got with a "CIC 8645BE" won't work :p
 
Yes, to be exact, the capture on YouTube is also the one that was shown at Revision. We captured it on the spot from my PC/XT.
My configuration was like this:
- IBM PC/XT 5160 from 1987
- Old style IBM CGA card
- Serial card
- Floppy controller
- 5.25" 360k FD drive
- Harddisk controller
- Seagate ST225 HDD
- 640k of memory
- Sound Blaster Pro 2.0
- IBM PC DOS 3.30 (note that the demo does not work with 2.x versions of DOS)

The HDD and serial port were only for convenience during development, and were not actually used during the demo.
The Sound Blaster Pro 2.0 was used for the capture, because it has a PC speaker connection on its mixer. This allowed us to tap the signal from the motherboard, and pass it through the SB Pro mixer, then out to a 3.5 mm jack, so we could connect it to the capture device, and adjust the levels for recording.
The SB Pro itself was not actually used during the demo of course, and in fact, no SB software was installed on my machine whatsoever. Not even a SET BLASTER-statement in my autoexec.bat.


The old style IBM CGA card is a must here. New style CGA made for a jumpy image on both a CGA and a composite monitor whenever the extreme color screens appeared. Anything else will likely be worse.

The regular PC music is output at a good volume, but the MOD music at the end really requires some type of amplifier. I am not a fan of the PC Speaker input on the Sound Blaster Pro and later Creative cards, but if that is what these guys used, I'm good with it for this.
 
That is the reason why the demo will probably crash on emulators.
But even if it doesn't crash, some effects will not look/sound right, because they rely on cycle-exactness of the CPU, the CRTC and video memory wait states.
And then there is probably no emulator out there that will correctly simulate the high-colour tweakmodes with NTSC artifacting.

This demo will also not work entirely correctly on most clones, because just having a 4.77 MHz 8088 and a CGA-compatible adapter is no guarantee for cycle-exactness with the real IBM PC/XT and CGA. We have also found that the artifact colours on clone CGA (ATi Small Wonder/Paradise PVC4) tend to be different from real CGA.

So no dice on my PVC4.... :( I am going to run it anyway - could probably get proper result by shifting phase by 135 degrees. Does it time right on a 10 MHz?
 
I am very impressed that this demo uses just about every conventional method and then some to display color from a CGA card on an NTSC monitor: 320x200 color composite graphics, 640x200 color composite graphics, 160x100 color graphics, 40-column text and hacked 80-column text modes.
 
I tried it on my Tandy 1000SX with a NEC V20 running in 4.77 MHz mode (which it reported as being 8% too fast) and it mostly ran fine, except the first animation kept going much longer than it should have, causing the entire demo to take about 15 minutes to complete, instead of just under 8½ minutes. Also, the artifact colors were wrong, but that was expected.
 
This demo won first place in the oldskool demo category at Revision 2015 against some tough competition. It also got the third highest number of positive votes of any demo shown at the party, so heartfelt congratulations are in order!
 
Very, very stunning demo. Makes me wonder how games would have looked back then, had the programmers known such tricks.

You can get a few hints from people who did know tricks back then; try running Spy Hunter (uses vertical scrolling) or Super Zaxxon (uses diagonal scrolling). And of course, California Games uses a custom timer interrupt programming to switch RGB palettes mid-screen a few times to simulate up to 7 colors in 320x200. Personal favorite part of the game that does that is the hackysack, which arranges the graphics cleverly so the transition between red-cyan-white and red-green-yellow is hidden.
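As a back-of-the-envelope illustration of the timing involved in a mid-screen palette switch (not a reconstruction of what California Games actually does), the Python sketch below works out how many timer ticks correspond to a scanline. It relies on the PIT input clock being the 14.31818 MHz system crystal divided by 12, and on a CGA scanline lasting 912 periods of that crystal with 262 scanlines per frame. The helper name is made up.

```python
CRYSTAL_HZ = 14_318_180            # 14.31818 MHz system crystal
PIT_HZ = CRYSTAL_HZ / 12           # ~1.193182 MHz timer input clock
HDOTS_PER_LINE = 912               # crystal periods per CGA scanline
LINES_PER_FRAME = 262

PIT_TICKS_PER_LINE = HDOTS_PER_LINE // 12                 # 76
PIT_TICKS_PER_FRAME = PIT_TICKS_PER_LINE * LINES_PER_FRAME
FRAME_HZ = PIT_HZ / PIT_TICKS_PER_FRAME


def pit_count_for_scanline(line):
    """Timer ticks from the top of the frame to the start of `line`."""
    return line * PIT_TICKS_PER_LINE
```

Because a scanline is a whole number of timer ticks, a timer reloaded with a multiple of 76 stays locked to the raster once it has been synchronised, so an interrupt handler can change the palette at the same screen position every frame.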

The only thing I don't quite get is the intro text. Why would anyone think an IBM PC from 1981 would crush a C64 in a demo compo? The C64 didn't even exist in 1981. It's later hardware and also made for games. Shouldn't it say: "C64 would crush IBM in a compo, right?"

That's because most demosceners think "286+EGA+sound device" when you mention "old PC demo", as that was the realistic birth of the PC demo scene. There has never been an 8088+CGA demo of this caliber and we wanted the audience to understand exactly what we were dealing with.

I have a 5160 with a Hercules card AND a Hercules GB200 (CGA) card.
SW1 is on the MONO setting.
With "MODE co80" and "MODE MONO" I switch between the modes/cards.
So in CGA mode I got the message "Runtime error 201 at 0AEC:015F".

I believe that address is in the video detection code. What's odd is that I have a CGA and an MDA card in my 5160, and the detection code works fine, so I'm afraid I don't know what to tell you, sorry.

I tried it on my Tandy 1000SX with a NEC V20 running in 4.77 MHz mode (which it reported as being 8% too fast) and it mostly ran fine, except the first animation kept going much longer than it should have, causing the entire demo to take about 15 minutes to complete, instead of just under 8½ minutes. Also, the artifact colors were wrong, but that was expected.

What was the "first animation" that ran too long? I'm curious.

This demo won first place in the oldskool demo category at Revision 2015 against some tough competition. It also got the third highest number of positive votes of any demo shown at the party, so heartfelt congratulations are in order!

Definitely a dream come true for us.

Q: "How can Trixter possibly create anything better than 8088 Domination?"
A: "Work with people who are better than he is!"
 
So I tried this, and here are the results.
I have an IBM PC 5160, IBM CGA, 256 KB RAM.
On my IBM DOS 2.0 it didn't work; there was a runtime error...
I tried DOS 3.0. OK, it works, but the colors were different. I know, it is for an NTSC monitor, not for RGB.

IMG_2407_změna velikosti.jpg
IMG_2408_změna velikosti.jpg
IMG_2409_změna velikosti.jpg
IMG_2410_změna velikosti.jpg
IMG_2411_změna velikosti.jpg
 
The 2 scenes before this were not displayed; I don't know why.

IMG_2417_změna velikosti.jpg

After this scene I saw only this and nothing more, no exit.

IMG_2418_změna velikosti.jpg
 
The 2 scenes before this were not displayed; I don't know why.

I think it's because you only have 256k memory. Some parts need more, the largest part needs about 507k.
And indeed, we currently use some code that is not compatible with DOS 2.x. If we do a final version, we may address this problem. We mostly used some DOS 3.x functionality because it was easier to code, not because it would not be possible at all with DOS 2.x.
DOS 1.x will not work though, for the simple reason that it does not support 360k floppies. However, even THAT could be fixed, in theory.. if we were to make a version that runs off a 120k floppy on both sides, and prompt the user to flip the disk at the appropriate time.
 
On my IBM DOS 2.0 it didn't work; there was a runtime error...

We had originally hoped to target DOS 2.0, but discovered quite late on (too late to fix) that the Turbo Pascal runtime that Trixter used for his parts has some DOS calls that were not supported in versions before 3.0. We might see if we can fix that up for a final version.

The 2 scenes before this were not displayed; I don't know why.

Probably due to lack of RAM - the 3D shapes and Kefrens bars require more than 256kB.

After this scene I saw only this and nothing more, no exit.

That's similar to what I see for the final part on DOSBox - what CPU are you using? If it's something other than an i8088 (especially if the prefetch queue works differently) that might account for that. Again, lack of RAM is the more likely explanation though.
 