
PDP-8/e mysterious halts

thunter0512 (Veteran Member, joined Sep 27, 2020, 859 messages, Perth in Western Australia)
My new PDP-8/e had another weird fault, which appeared and then fixed itself after about 20 - 30 minutes.
The problem was that the system dropped out of the running state as if I had toggled the Halt switch.
The run-light turned off and sometimes toggling Cont would get it going for a few seconds and then it would halt again.
Sometimes Cont, Load and Ext Load wouldn't do anything.
After a few power cycles and some head scratching the system recovered and now again works perfectly.

Any thoughts about possible causes?

These intermittent problems are very hard to diagnose and fix.
Hard errors are nice and fixable using DEC's excellent documentation and the great contributions on this forum.

I still haven't been able to work out what caused the weird core memory faults in the extended memory checkerboard test. I have not been able to reproduce them in the weeks since I originally reported them.

Thanks
Tom
 
I wasted many hours trying to debug an intermittent issue like that on the IC level...
It turned out to be a flaky 5V pin on the backplane's power cable. I'd check that first before doing anything else; in my case I could force it to halt by jiggling the cable or tapping the connector with a screwdriver handle.
 
Just now I tried jiggling the cable and also tapping the connector with a screwdriver handle. The system won't halt as it did earlier.

Something that affects the RUN FF is the Power Good signal. I wonder if something between the Power Good signal generation in the power supply and the Run FF is dying. There is the delay circuit consisting of E36, E26 and E42 plus a cap, two resistors and two diodes shown on page 3-52 of the maintenance manual.
 
I had a very similar problem previously, before the 8/M chassis, while running with mini backplane v1 and the voltage protector board. It turned out to be a marginal high level on the POWER OK signal driven by the home-brew voltage protector board. (Unfortunately, I designed the POWER OK driver based on the Omnibus spec and didn't clone the DEC circuit.)

Even when the system isn't flakey and has a solid failure, this is a difficult area to debug because of the wired-OR function at the input of the Run FF.

You've probably already thought of this, but you could set up the logic analyzer to trigger on the Run FF toggling to zero and capture the Run FF async clear along with STOP L (DS2), POWER OK (BV2), KEY CONTROL L (DU2), and the received POWER OK, which on my schematic is called PWR NOT OK L. So, the question is: what's clearing Run?

Do the high logic levels of POWER OK, STOP L and KEY CONTROL L look good?
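
Once a capture of the Run FF dropping exists, sorting out which input went low first can be scripted. The following is only a rough sketch: the capture format (one dict per sample, 1 = high, 0 = low), the helper names and the lookback window are all hypothetical, and it assumes, from the signal names alone, that a low level on any of the four lines is the state that can clear Run.

```python
# Hypothetical post-processing of an exported logic analyzer capture.
# Assumption: each sample is a dict of signal name -> level (1 = high, 0 = low),
# and a LOW level on any suspect line is the state that can clear the Run FF.
SUSPECTS = ["STOP L", "POWER OK", "KEY CONTROL L", "PWR NOT OK L"]


def sample(run, stop=1, pok=1, key=1, pnok=1):
    """Build one capture sample (helper for the example below)."""
    return {"RUN": run, "STOP L": stop, "POWER OK": pok,
            "KEY CONTROL L": key, "PWR NOT OK L": pnok}


def find_run_clear_causes(samples, lookback=4):
    """Find the first RUN 1->0 edge and list suspects seen low around it.

    Returns (index_of_edge, [suspect names]) or None if RUN never drops.
    """
    for i in range(1, len(samples)):
        if samples[i - 1]["RUN"] == 1 and samples[i]["RUN"] == 0:
            # Look at the samples just before the edge plus the edge itself.
            window = samples[max(0, i - lookback): i + 1]
            low = [name for name in SUSPECTS
                   if any(s[name] == 0 for s in window)]
            return i, low
    return None


# Made-up capture: POWER OK sags, then PWR NOT OK L goes low, then RUN drops.
capture = [sample(1), sample(1), sample(1, pok=0),
           sample(1, pok=0, pnok=0), sample(0, pnok=0), sample(0)]
print(find_run_clear_causes(capture))
```

With the made-up capture above, the function reports the edge at sample 4 with POWER OK and PWR NOT OK L as the lines seen low beforehand. Real data would of course need the actual sample rate and polarities checked against the schematic.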
 
Even when the system isn't flakey and has a solid failure, this is a difficult area to debug because of the wired-OR function at the input of the Run FF.

I can handle fixing hard failures. The new 8/e has these strange problems which appear out of nothing and then disappear without any trace (core issues a few weeks ago and now the RUN FF getting cleared). These problems are difficult. Note that this problem is not even intermittent, but a hard failure for a very short while followed by perfect behavior for a very long time after. The core problem was very similar - since the "self repair" it has run the extended memory checkerboard without a hiccup for several days non-stop.

You've probably already thought of this, but you could set up the logic analyzer to trigger on the Run FF toggling to zero and capture the Run FF async clear along with STOP L (DS2), POWER OK (BV2), KEY CONTROL L (DU2), and the received POWER OK, which on my schematic is called PWR NOT OK L. So, the question is: what's clearing Run?

If I could reproduce the problem, I would have already hooked up the logic analyzer. The problem has "fixed itself" though, so I have to try to investigate all possible sources of the problem to then check the components involved.

Do the high logic levels of POWER OK, STOP L and KEY CONTROL L look good?

They looked fine when I checked, but that was after it "fixed itself".

Thanks for all the suggestions.
 
Despite the fact that you're probably operating in a stable, gentle, thermal environment, I'll throw in an oddball suggestion of turning off the fans, maybe sticking it in a large cardboard box (or small closet), and running it for a while to see if there is possibly a subtle thermal issue. I would be careful to not put it into "extremis" conditions, but short of that you might elicit some reversible behavior without too much effort. Or demonstrate that it's *not* a thermal issue ...
 
Hmmm - I have now run the 8/e with the lid on and the two air-inlets partially blocked and indeed the system now halts occasionally (after about 5 - 30 minutes run-time).
It is not a hard error like before, but it may enable me to locate the cause.
As before, it is as if the Halt switch had been toggled. Toggling Cont nicely resumes where it left off.

I have considered adjusting the 5V rail up/down by say 5 % to see if that provokes the problem. We used to do this on CDC CYBER mainframes during the monthly 2 hour dedicated maintenance periods to locate any failing or marginal modules.
 
I have adjusted the 5V rail from the original 5.15V down to 4.6V and up to 5.25V. At neither extreme was I able to provoke the spurious halt (the RUN FF turning off).

I have now returned the 5V rail to the old 5.15V and will play some more with increased temperature as suggested by Paul.
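
For reference, the margins those settings correspond to can be worked out quickly. This is just a small arithmetic sketch; it takes 5.0 V as the nominal value of the "5V rail" (an assumption read from the name), and the function name is made up for illustration.

```python
# Margin arithmetic for the 5V rail settings mentioned above.
# Assumption: nominal is 5.0 V; 5.15 V is the original as-found setting.
NOMINAL = 5.0
ORIGINAL = 5.15


def deviation_pct(measured, reference=NOMINAL):
    """Percent deviation of a measured voltage from a reference voltage."""
    return (measured - reference) / reference * 100.0


for volts in (5.15, 4.6, 5.25):
    print(f"{volts:.2f} V: {deviation_pct(volts):+.1f} % vs nominal, "
          f"{deviation_pct(volts, ORIGINAL):+.1f} % vs original setting")
```

Against a 5.0 V nominal, 4.6 V works out to -8 % and 5.25 V to +5 %, so the low side was pushed further than the 5 % originally considered.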
 