Problems with 8/A & RL02 drives

DrCharles (Experienced Member; joined Sep 2, 2014; 101 messages; West Plains, MO)
I've posted on cctalk and alt.sys.pdp8 already, but figured I would join here for a more specific forum ;)

Lately my 8/A (32K SRAM board kit, two RL02 drives, OS/8) has been refusing to boot.

Does anyone know if OS/8 uses the interrupt facility of the RL8A controller? I discovered that the console TTY port can also interrupt and when I hit a key on the TTY, the port is holding the int request asserted... since interrupts are by default disabled, it seems that OS/8 is enabling them during the boot process even though it's not completing. But if the interrupt line is being held low by the TTY port, it can't also "hear" the RL02 if it's trying to interrupt.

At one point I was applying finger-tip pressure to the RL8A card, hit the boot switch and OS/8 came up! I started listing a directory, which worked fine until I let go the pressure on the board - the system crashed immediately. Can't get it to boot again, except it will sometimes print "V" on the tty instead of the "." prompt and then won't do anything else (endless loop in zero page but ultimately waiting for drive at a skip-on-done instruction).

Ran some diagnostics this evening. DJKKBA (CPU exerciser) ran 8 passes without a failure. DHKMAD (memory checkerboard) also ran until I got tired of waiting. AJRLHA (RL seek/fctn) works on both drives. But - AJRLIA (read/write) on a scratch pack fails immediately. Then I unloaded the drives and tried AJRLAC (RL8A diskless diagnostic) a.k.a. controller card test - MANY failures all involving the middle 4 bits of the 12 bit data starting with DAR - silo. If this were an interrupt problem it'd be all bits, not the middle nibble...

So I believe that a 4-bit TTL device (buffer, RAM, register) on the card has failed. Hoping for a bad solder joint, which would explain how flexing the card temporarily fixed it but then made it worse. More likely a defective IC (pin or bond wire inside the plastic DIP). E13, an 8234, is the first suspect the next time I feel like wrestling with this.

Anyone got an RL8A (M8433) card I can buy or borrow, just to make sure the failure is actually on that card before I spend a lot of hours with a logic analyzer?
thanks
Charles
 
There is this thread in a Swedish forum where Anders is repairing a similar 8/a with RL02. It runs now but he had several problems with it, one being the interrupts. Now this thread is in Swedish so it can be hard to understand, but you might contact him for help.
 
Charles,

I too have an RL8A and pdp-8/a. I am very sure that OS/8 uses the databreak operation of the RL8A. I think your diagnosis is right on the mark so far. I've seen your postings on cctalk, but have never been able to post there. It took a long time just to get someone to add me to the e-mail distribution list.

Good luck finding a loaner RL8A. They are precious and I would not trust mine to any kind of shipping. It took me a long time to find and buy the one that I have, and it was not cheap. Perhaps you should put the RL8A in a quad height card extender and start shooting locations with cold spray while running AJRLAC. You have a mechanical problem, and cold spray is great for finding those.

AJRLAC is a very thorough test. I studied it closely when I had problems using my RL8A with my home-made 32kW SRAM based memory. The problem was with my memory, which I did fix, but for a long time I studied the RL8A because the failure would occur during the databreak transfer. An unrelated side comment - the RLV11 and RLV12 qbus pdp-11 controller's diskless diagnostics work very similarly to the one for the RL8A. From end to end I've found the RL drives and their controllers a fine bunch of engineering, which explains why we still run them today.

FYI (and also slightly unrelated) I have built Reinhard Heuberger's RL02 emulator and it does work fine with the RL8A. I have built a bootable RL02 OS/8 pack image, with it all passing the ultimate test of running Colossal Cave Adventure.

Lou
 
There are pictures of the backplane of our PDP-9 on that forum!
 
Continued frustration. I tried cold spray and found nothing... today I looked at the board under a bright light and found two IC pins that had no solder at all. Not sure how a wave-soldered board can miss a pin here and there (bubbles?) but it obviously was that way from the factory! Anyway I soldered the dry pins, and even the junctions of the gold fingers to the traces in case the bond was broken there. Same problem. If the board is flexed JUST right, AJRLAC will run without errors (although the system still won't boot). But if you so much as touch it with a finger, immediately the errors begin with the DAR-Silo failure (expected 5337, actual 5017), MA not cleared by RLCD, data break to wrong field... everything pointing to a data bus fault.
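A quick way to see which data bits that DAR-Silo failure implicates is to XOR the expected and actual words. The sketch below is only an illustration: the function names and the 0360 "middle nibble" mask are my own choices, not anything from AJRLAC, and it assumes the usual DEC bit numbering where bit 0 is the most significant bit of the 12-bit word.

```python
WORD_BITS = 12            # PDP-8 word width
WORD_MASK = 0o7777        # all 12 bits
MIDDLE_NIBBLE = 0o0360    # bits 4-7 in DEC numbering (assumed meaning of "middle nibble")


def differing_bits(expected, actual):
    """Return DEC-numbered bit positions (0 = MSB, 11 = LSB) where the words differ."""
    diff = (expected ^ actual) & WORD_MASK
    return [bit for bit in range(WORD_BITS)
            if diff & (1 << (WORD_BITS - 1 - bit))]


def confined_to_middle_nibble(expected, actual):
    """True if every differing bit lies inside bits 4-7."""
    diff = (expected ^ actual) & WORD_MASK
    return diff != 0 and (diff & ~MIDDLE_NIBBLE) == 0


if __name__ == "__main__":
    expected, actual = 0o5337, 0o5017   # values from the DAR-Silo error above
    print("XOR            = %04o" % (expected ^ actual))
    print("differing bits =", differing_bits(expected, actual))
    print("middle nibble only:", confined_to_middle_nibble(expected, actual))
```

For the values above the XOR is 0320, so only bits 4, 5 and 7 differ, and all three read 0 where a 1 was expected. That fits a fault confined to the middle four data bits rather than something affecting the whole word.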

I've tried the 32K SRAM in different slots, and the RL8A in different slots (while complying with the tech manuals that say the RL8A has to go between the CPU and the memory board). Still the same problem.

The reason I thought it might be a bad solder joint or cracked trace on the backplane is that (before this RL02/RL8A failure) my core memory boards were also showing failures. It's a bit of work to take the whole chassis apart for examination of the backplane under a light/magnifier. Besides, what backplane problem would exist in any slot for the RL8A but allow the SRAM and CPU to pass all tests? I'm stumped...

I'd be willing to mail you my RL8A for you to try it in your working system. That'll tell me for sure if the fault is on the RL8A or somewhere else! What do you think?

thanks
Charles
 
I think I'm the guilty person. We got sidetracked from the main discussion and talked about the PDP-9 backplane. Hope it's ok to link to your images like that.

You can do anything you want with images from the RICM WWW page. That image only shows 2/3 of the backplane, so it is almost as tall as a person. MattisLind has a PDP-9 if you would like to see one up close.
 
Had some more time to play with the 8/A. It's cooler upstairs (around 70) since the cold front/rain came through.

Turned on the system... and drive 0 would not finish loading. The odd thing is that it didn't show a fault light either, just spun for several minutes without the Ready light coming on. I tried another pack, same thing. (All lamps are working, I checked; in particular the Ready lamp has a dim keep-alive glow visible.)
So I just left it on for a while, and while running the diskless controller test it showed the error "Drive ready asserted"... sure enough, the drive had finally gone ready after ten or so minutes!

I then ran AJRLIA (read/write test) and AJRLKA (performance exerciser) for a good half-hour each and there were no errors on either drive... aargh. If these errors won't stay broken they are a royal PITA to find...

Am I correct in assuming that if a good pack is loaded, and the drive is seeing the clock signal from the RL8A card, that failure to load is a problem in the RL02 itself?
 
It's been nearly a year since I had time to play with this, and not surprisingly it had not fixed itself ;) The board would boot OS/8 in another system but crash when opening a file with EDIT.

I posted the gory details on cctalk but in short (pun intended :) ) something on the RL8A board was pulling down the MD4 bus line to below 2 volts, and would change when I flexed the board. With light and magnification, following that trace on the board, I could not locate the fault. So I tapped a cliplead connected directly to the hefty +5 supply to an IC pin on MD4, and it cleared the short. Must have been a tin whisker or something tiny and conductive... I would rather have seen it, but it seems to be gone now!

Now it runs AJRLAC diskless diagnostics without error even when the board is flexed. Using a scratch pack and AJRLIA, the drives themselves only show an occasional stay-on-track error, which is becoming more rare with continued use of the read/write diagnostic. It's really hot in the computer room too (90F) and the drives have not been operated since last year.

Hopefully it will boot OS/8 once again, after I remake the pack which I suspect has been corrupted by the short to that data bit line.
 
The problem came back :( but I think I've got it fixed for good this time.

I created another OS/8 pack with vtserver on the 11/23+ (takes three hours!) but it booted up and was working - then it crashed. Flexing the board made it work for a few more minutes, but then I was back to square one.

As I just posted on cctalk and alt.sys.pdp8, *this* time I think I finally got it fixed! Famous last words, I know...

The first fault on the diagnostic was CA not returning the proper value. So I started there with short toggle programs and scope. While tracking down an intermittent fault on D1 CA 3 (Command Reg A bit 3, the fourth data bit, which also goes to the disk address register), I spotted a tiny solder whisker under the W8/W9 jumper from CA 3 and barely touching another trace (unknown function, didn't feel like tracking it down).

But after clearing that whisker, all diagnostics work even while flexing the board, the drives themselves pass read/write tests, and the intermittent stay-on-track error for Drive 1 seems to be gone too. :)

David Gesswein just sent me a modified version of dumprest for RL and Omni-USB card at 40/41. Looking forward to trying it out!
 
System is up and running OS/8 again. Had some ASR-33 problems to fix, too, so that took a while. Those things are complicated machines. And I had to find a problem with the boot loader code being corrupted - luckily fixed by reseating the boot ROMs on the Option 2 board.

There is a remaining strange symptom that I can't track down. I posted the details on cctalk mostly in the thread "Even more PDP-8/A weirdness". When running AJRLHA (seek/function diagnostic) the tested drive will show a fault immediately, initially an operation incomplete during seek. Then the fault will clear and the drive will complete the pass (about 10 min) with no further errors. The diagnostic then switches to the other drive and the same thing happens.

The only other time I see Fault lights is when I boot OS/8 and that's only temporary. When I flip the Boot switch, both drives flash their Fault lights, then immediately go Ready again and the "." OS/8 prompt prints on the TTY.

AJRLAC (controller test), AJRLIA (read/write), AJRLKA (Performance exerciser) all work correctly for extended periods of time reading and writing to scratch packs, as does the toggle-in Oscillating Seek program in the RL manual. OS/8 boots, formats packs, copies files to/from either drive, runs Focal, PAL, etc.

I did learn that the controller reports "operation done" immediately after a Seek command is sent to the drive, even though it can take up to 100 ms to actually complete. It occurred to me and Henk that perhaps the diagnostic itself is buggy (e.g. thinking the first seek is complete when it's not, and issuing another command), but I don't have the source code or know where to find it. That wouldn't explain the flash of the Fault lights when booting, either.

BTW, if the SR is set to 4000, these diagnostics will automatically continue after printing out each error (the default with SR=0000 is to pause after an error until a CR is typed on the console) which I discovered by guesswork.

Heck with it, the system does everything I want it to do, so I may just stop debugging at this point...

I'm getting an RX01 later this week (and paid big bucks for an RX8E controller card), so that will be the next fun install and debug :)

-Charles
 
RX01/RX8E installed and working, as is a high-speed paper-tape reader via a PC8E card I had sitting around. (Other threads).

Today I booted up OS/8 on RL02 (Drive 0)... but Drive 1 will not go ready. The LOAD light comes on, push the button and it spins up, but the READY light never illuminates and diagnostic tests show that the drive is indeed not ready. Tried a different pack, same result. It was working all last week and during the extended debugging sessions last month. Now what. :mad:

This happened once a long time ago and while getting ready to remove the covers and start scoping, I let the drive spin for an extended period (I think about 30 mins) and suddenly it went ready and then worked perfectly thereafter, without touching anything. Now apparently whatever is failing has returned. My guess would be a leaky electrolytic capacitor that has to reform, or possibly a cold solder joint, but there are a lot of places it could be (spin-up, head read amp, servo positioner)... sigh.
 
Slid the drive out on its rails. Don't even think it "bumped" against the stops, but it went ready. Pushed it back in, still working, but unload & reload and it wouldn't show ready again.

So I slid it back out, disconnected and reseated the cable and terminator, cable-tied them to the rack as shown in the engineering drawings, and now it's working every time, in or out. No point in taking the covers off and testing now, since the failure as usual disappeared when I approached with a threatening oscilloscope :)
 