My PET 2001-8 Revival Project

jimdinunzio · Jun 1, 2015

Hi,
I started by replying to another thread and decided I had better start a new one rather than hijack that one!
My first posts start here:
http://www.vintage-computer.com/vcforum/showthread.php?47472-New-Project-Commodore-Pet-2001#4

First I'd like to thank all those who have contributed to this knowledge base on the PET. It's already helped me tremendously!

However, I'm now a bit stuck with a computer that boots to BASIC and is usable for about 10-15 minutes before hanging with "Illegal Quantity Error". Then I have to keep it off for 2 hours before it will work again for another 10 minutes. It's very frustrating after initially making good progress with this thing. Note: I do not have a scope... not yet anyway. The IRQ line is a steady low when the machine is hanging. More on that below.

I suspected the 6520 PIA #1 (handling keyboard/tape/blinking cursor) because without that chip the machine always boots. Thinking this was a thermal issue, I tried spray freezing different areas of the board and cycling power, and after that didn't work I even tried putting the whole thing in the freezer for 10 minutes. (the condensation scared me) No dice. It is more complicated than just temperature I guess. Could it be some lingering capacitance?

I've read several other repair blogs where a capacitor was acting up after warm up in the reset circuit. Not mine. The reset line seems good. Voltage supply lines seem good at a steady +4.85-5.10. Another blog reported the same exact symptoms of a hang after tape loading, but swapping the 6520 chip with the second one solved that problem. I've swapped them and had no change in behavior. I even warmed up the machine to the hang then swapped the #1 6520 with the other one which I had pulled out earlier.

I did an inspection of the board under a magnifier and improved some questionable solder joints. No change.

Then I checked the IRQ line during the hang. It is steady low. It seems like an interrupt occurred that never returned. Clock signal is still going. Could this be a flaky ROM problem involving the ISR somehow not returning from the interrupt or the IRQ line not rising when it is supposed to? There's a 3.3k resistor connected to 5v to do the rising and both look ok. With the PIA#1 out I noticed that the IRQ line at the 6502 is steady high meaning there are no interrupts occurring, so that's why it boots. Something is going wrong with interrupts. During normal operation the IRQ line is buzzing at least with the cursor. I guess my machine just gets sick of being interrupted!!

Could it be a bad capacitor somewhere even when measured voltages all seem stable? I bought some and am ready to start replacing old ones but not sure which ones to replace. Maybe the power electrolytic ones or decoupling ones near the 6520 or 6522?

I tried swapping a different 6502 from my old Apple II+ on a hunch, but the problem persisted.

Maybe I'm going too far down a rabbit hole. Any suggestions?

Next I'm planning to warm it up then pull the board out and do an integrity check on the ROM sockets.
Another thing I am thinking of doing is rig up test clips to connect to IEEE port and Keyboard plug to have the PET do its built in diagnostic during my 10 minutes of working computer. Has anyone had any luck with that? Other than that maybe I need to identify a flaky ROM, perhaps where that ISR is, or get a PET Vet.

Thanks for reading, and any suggestions would be welcome!
Jim

jimdinunzio · Jun 14, 2015

After seeing that the CPU was stuck at an interrupt with $FFFF on the address bus once the hang occurs, and after trying the old PET self test (using a keyboard connector patch I made and a user port patch with alligator clips), which seemed to get stuck verifying the F800 ROM, I decided to order a replacement for that ROM chip (H7). That is cheaper than buying the EEPROM reader and adapter necessary to check the ROM's data, so I'm trying this first.
Now I'm trying to get a copy of SuperMon or TinyMon that will run on my 2001-8. The best I've come up with on the internet is a complete text listing of TinyMon. I have a cassette with SuperMon4 but that doesn't work on the 8k. The reason I want a monitor is that I hope it could help me see the registers when the computer crashes. Does anyone know if this is true, or is the monitor not very useful for troubleshooting ROM issues? Does anybody have a copy of a monitor: Supermon 1 for 8k, or the original TinyMon that Commodore had on tape for original PETs?
On the bright side repairing the tape deck was as easy as buying a replacement rubber belt.
No replies here. Have I stumped everyone? I hope this thing is fixable.

Jim

jimdinunzio · Jun 20, 2015

Hi,
Well I got the ROM replacement, tried it out and, alas, it did not fix the problem. It's back to the drawing board. What the heck could go "bad" after 5-10 minutes of use and cause the computer to hang at the $FFFE/F IRQ vector interrupt and rebooting not help until several hours go by and freezing the board has no effect? I'm going back to the I/O chips, PIA 1 and VIA. The IRQ originates from there and maybe there is a problem. Since I encountered identical behavior with the two PIA chips, perhaps it is the VIA. Pulling out the PIA #1 allows the system to boot even when warm but then no interrupts are happening.

I could really use the loadable (machine language) monitor so maybe it will output some registers before it hangs. I'm still trying to locate it on the net. Does anyone have it in a TAP or PRG format that they could send me?

Thanks,
Jim

jimdinunzio · Jun 28, 2015

Hi,
In case anyone has any ideas, I'm continuing to update this thread.
I typed in the original TIM from the printed code. I found several mistakes when disassembling, due to 8's looking like B's or 0's with the strike line. Thanks go to the VICE emulator, without which I could not have done any of this. TIM works on my machine. I left TIM running until the system crashed and got this:

B* PC SR AC XR YR SP
.; F75F B4 32 E6 00 E4

The SR flags indicate that a BRK software interrupt has occurred and that interrupts are disabled. The location $F75F is in the middle of an instruction as shown below. Somehow the PC was set to $F75E, which has value 00 and caused the BRK. I noticed that the PC value printed is the BRK address + 1.
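
As a rough illustration of how the SR value above breaks down, here is a minimal Python sketch. It assumes the standard 6502 status register layout (N V - B D I Z C from bit 7 down to bit 0); the function name `decode_sr` is made up for this example and is not part of TIM.

```python
# Standard 6502 processor status bits, most significant (bit 7) first.
# "-" is the unused bit 5, which normally reads as 1.
FLAG_NAMES = ["N", "V", "-", "B", "D", "I", "Z", "C"]

def decode_sr(value):
    """Return the names of the status flags that are set in an 8-bit SR value."""
    return [
        name
        for bit, name in zip(range(7, -1, -1), FLAG_NAMES)
        if value & (1 << bit)
    ]

# SR value taken from the TIM register dump in this post.
print(decode_sr(0xB4))  # ['N', '-', 'B', 'I']
```

B set means the break flag was recorded, and I set means interrupts are masked, which matches the reading of the dump given above.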

The closest code in the rom as documented by "Programming the Commodore PET" is $F75B which compares the jiffy clock's value with the constant held in a table and if it matches, it resets the jiffy clock to 0.

F75D BD 00 02 iF75D LDA x0200,X

I also tried swapping the zero page RAM again with RAM from higher up that I have run tests on, but the hang persisted.

I thought I should verify the F000-F800 ROM. I wrote a small BASIC program, saved out the proper code for F000-F800, and loaded it in at $0800. The ROM checked out, at least when everything is working. I will use this program to check other ROMs as well. However, I let it verify repeatedly until the computer inevitably crashed, and at that moment the program reported that location F7C7 returned $D1 instead of the correct value, $D0. The low bit turned on for some reason. Interesting.
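
The same kind of check can be sketched off the machine in Python. This is not the BASIC program described above, only an illustration of the comparison idea; the function name `find_mismatches` is hypothetical, and the one-byte "dumps" simply replay the single mismatch reported here.

```python
def find_mismatches(reference, observed, base_address):
    """Yield (address, expected, actual, flipped_bits) for each byte that differs."""
    for offset, (expected, actual) in enumerate(zip(reference, observed)):
        if expected != actual:
            diff = expected ^ actual  # bits that differ between the two reads
            flipped = [bit for bit in range(8) if diff & (1 << bit)]
            yield base_address + offset, expected, actual, flipped

# Replay of the reported mismatch: $F7C7 returned $D1 instead of $D0.
reference = bytes([0xD0])
observed = bytes([0xD1])
for address, expected, actual, flipped in find_mismatches(reference, observed, 0xF7C7):
    print(f"${address:04X}: expected ${expected:02X}, got ${actual:02X}, flipped bits {flipped}")
```

Reporting the flipped bit positions along with the address makes it easier to tell a single stuck data line from random corruption.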

I'm thinking something may be going weird with the IRQ generation circuit after 5-10 minutes. I'll need a scope if I'm ever going to check that. I'll see if I can locate an inexpensive option on that.

Jim

pski · Jul 2, 2015

Hey Jim,

Just wanted to say that I'm waiting to see how it turns out. I think you're a bit far down the rabbit hole where very few travel these days so I can't offer much help from here.

Good luck and keep at it!

Pete

dave_m · Jul 2, 2015

jimdinunzio said:
I'm thinking something may be going weird with the IRQ generation circuit after 5-10 minutes.

Please check the state of the signal 'SYNC' (G8-pin18 ) when the hang occurs. It is the 60 Hz signal that the PIA#1 is programmed to respond with the interrupt. It should always be running and will have a DC average voltage of about 3.5V or 4.0V on the voltmeter. If it gets stuck at zero or +5V, it may tell us something. You have a very interesting problem, but it is quite fixable.
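
For reference, the voltmeter reading relates to the waveform roughly as follows. This is only a sketch: it assumes an ideal rectangular signal switching between 0 V and 5 V, and the function name `duty_cycle_from_average` is made up for the example.

```python
def duty_cycle_from_average(average_volts, high_volts=5.0, low_volts=0.0):
    """Estimate the fraction of time a two-level signal spends high from its DC average."""
    return (average_volts - low_volts) / (high_volts - low_volts)

# Readings in the expected range, plus the two "stuck" cases.
for reading in (3.5, 4.0, 0.0, 5.0):
    fraction = duty_cycle_from_average(reading)
    print(f"{reading:.1f} V average -> about {fraction:.0%} of the time high")
```

Under that assumption, an average of 3.5 V to 4.0 V corresponds to a signal that is high most of the time, while a reading of 0 V or 5 V means the line is not toggling at all.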

jimdinunzio · Jul 7, 2015

continuing

dave_m said:
Please check the state of the signal 'SYNC' (G8-pin18 ) when the hang occurs. It is the 60 Hz signal that the PIA#1 is programmed to respond with the interrupt. It should always be running and will have a DC average voltage of about 3.5V or 4.0V on the voltmeter. If it gets stuck at zero or +5V, it may tell us something. You have a very interesting problem, but it is quite fixable.

Hi,
Thanks for the encouragement! It helps a lot because without it I'm not sure if I'm just wasting my time. I had to take a break after spending a lot of time getting frustrated and worked on some other home projects.

Since I last wrote, I tried debugging from the hardware side again and briefly had a portable ARM-based DSO201 scope with a bandwidth of only 1 MHz. The first thing I checked was the SYNC line you mentioned. The signal shape looks the same before and after the hang, and the frequency is steady at 60 Hz +/- 3 or so. The IRQ line signal on the CPU and exiting the PIA #1 looks just like text book with a sharp dip and then a curved rise back to 5v for the first 5-10 minutes (also at 60Hz) . then abruptly flatlines to low and stays there. I tried checking various clock points thinking that maybe the crystal is having issues, but this scope can't handle that bandwidth. I'm getting a real scope with 50 MHz bandwidth (Siglent SDS1052DL) this week and should be able to check anything on that board, and I'll post the waveforms here.

From other blogs this usually means the VIA chip is bad or marginal. I suppose there's a non-zero chance that both of my VIA chips are equally marginal. Maybe I should get a replacement just to rule that out. I think I saw a modern compatible replacement for it somewhere. If anybody has a confirmed source please let me know.

Jim

dave_m · Jul 8, 2015

jimdinunzio said:
The IRQ line signal on the CPU and exiting the PIA #1 looks just like text book with a sharp dip and then a curved rise back to 5v for the first 5-10 minutes (also at 60Hz) . then abruptly flatlines to low and stays there.

OK, if the 60 Hz SYNC signal is running when the interrupt gets stuck, the problem may be in the PIA #1 (G8 ) or in the ROM where the interrupt handler exists. In BASIC 4 I know it is in the $E000 area ROM. I'm not sure what type of ROM you have: the old 6540 or the 2316?
-Dave

jimdinunzio · Jul 19, 2015

dave_m said:
OK, if the 60 Hz SYNC signal is running when the interrupt gets stuck, the problem may be in the PIA #1 (G8 ) or in the ROM where the interrupt handler exists. In BASIC 4 I know it is in the $E000 area ROM. I'm not sure what type of ROM you have: the old 6540 or the 2316?
-Dave

Hi Dave,
I have 6540 ROMs in this 2001-8. I verified the ROM data from E000-E800 against the one in the VICE emulator with a BASIC program and a small machine language "peek" user function. It checked out fine while the computer was running.
I narrowed the memory verification to E66B-E75B, which includes the ISR, and had it verify that range repeatedly until the machine crashed. So it looks like the ROM is OK. If the problem were the PIA #1 chip itself, then why did the machine behave exactly the same after swapping it with the PIA #2 chip? I just haven't found any smoking guns yet, but I now have the new scope, so I'm going to try looking again.

Jim

jimdinunzio · Aug 1, 2015

Captured waveforms

Hi,
I've captured some waveforms while trying to debug the board. 6502_00_in is the clock in at 1 MHz. 6502_01_out/6502_02_out are the two clock outs. These are sawtooth in appearance. Is that correct? I already swapped in the 6502 from my Apple II+ early on in debugging the PET board and it didn't behave any differently, so I thought the CPU was fine.

6502_00 in

6502_01 out

6502_02 out

6502_IRQ was taken when the computer was working. I noticed a slight right to left jiggle of the down transition when monitoring the IRQ pin, which you can see by comparing 6502_IRQ with 6520_IRQA image from the next post. I know those are from different sources but I'm telling you I saw a jiggle just like the difference between these two images while on the 6502 IRQ pin.

The 6520 CB1 pin shows the 60Hz hw interrupt SYNC signal which is maintained before and after the hang.

6502_IRQ

6520_CB1

continued on next post.

jimdinunzio · Aug 1, 2015

captured waveforms (continued)

Here is the IRQA from the 6520 chip (PIA #1)

I was concerned that I saw a right to left (time) wobble in the downward transition which *may* get worse as the machine is heating up. I'll have to do some more monitoring to see that for sure. Once the computer hangs the IRQ lines stay low of course.

I can capture and post any signal now so if you have any ideas, please let me know.

6520_IRQA

dave_m · Aug 12, 2015

jimdinunzio said:
Once the computer hangs the IRQ lines stay low of course.

When it hangs, I would expect that the address lines are in the interrupt handler range in $E000 ROM (H3). Can you check the 16 address lines and list which ones are 1, 0, or pulsing? Pulsing address lines may indicate the CPU is stuck in a loop. Also check the 8 data lines for some clue. Perhaps because of a bad $E000 ROM (when warm), it executed a KIL instruction (illegal instruction) which will hang the CPU. If so, the CPU Sync signal (CPU-pin 7) would be stuck and not pulsing. If it is pulsing, the CPU is still fetching instructions and did not hit a KIL instruction.

When it hangs, did you say you squirted the H3 ROM with circuit freeze and gave it a reset?

KC9UDX · Aug 12, 2015

It seems to me that my 2001-32N does this. If my memory worked better I'd probably be able to help very quickly... Might help too if I wasn't reading this on a tiny screen at 3AM.

If /IRQ gets asserted continuously after some point, try pulling out every socketed chip that has an /IRQ output and see if it stays low.

VIA chips are still manufactured and available at multiple sources, especially Jameco.

dave_m · Aug 12, 2015

KC9UDX said:
If /IRQ gets asserted continuously after some point, try pulling out every socketed chip that has an /IRQ output and see if it stays low.

KC, we would expect that the interrupt is coming from the G8 PIA#1 due to the proper pulsing of the Vertical Sync signal on its CB1 line, but it wouldn't hurt to verify. I think Jim has stated that with G8 removed, the PET always makes it to the Commodore screen where it patiently waits for an interrupt which never happens. Since he has swapped PIAs, to me the problem may indicate bad zero page RAM or a bad H3 ROM although maybe the interrupt routine calls subroutines on other ROMs? A check of the address lines when stuck in the interrupt may tell us something. Also if he had a scope, I would have him carefully check signals on the PIA for suspicious activity.
-Dave

KC9UDX · Aug 12, 2015

When I got my 2001-32N, it had one signal that would at random be asserted and get stuck that way. It would usually happen within the first minute of booting.

I can't recall if it was /IRQ, /NMI, /RESET, or what. Whatever the signal was, it went to too many soldered chips for me to isolate it, so I put a jumper wire to short the pullup resistor, which made the whole system work very reliably for a long time. In retrospect, that doesn't make any sense, especially if it was any of those three signals. But, I lost my notes. Whatever it was, it did make sense at the time.

I went to look at it last night to see if I could figure out what I did, and now my 2001-32N is not working. This is a pretty big coincidence, because I've been using it a lot. I'll have to put it on the bench, and find the new problem. When I do, I can investigate the old problem.

KC9UDX · Aug 13, 2015

Well I guess I won't be taking my 2001-32N out of service, it came back to life.

One ?WYNTAX EVVOV on boot, then a quick SYS64790, and it's back in business.

dave_m · Aug 13, 2015

KC9UDX said:
One ?WYNTAX EVVOV on boot, then a quick SYS64790, and it's back in business.

The text should be SYNTAX ERROR, so hex 13 (S) became hex 17 (W) and hex 12 (R) became hex 16 (V), indicating data bit 2 is getting stuck high. Check RAMs I1 thru I8.
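
The bit comparison can be sketched in a few lines of Python. As an assumption for the example, it uses ASCII codes rather than PET screen codes; for the letters involved the low bits are the same in both, so the XOR result is unchanged. The function name `differing_bits` is made up for illustration.

```python
def differing_bits(expected, observed):
    """Return the sorted bit numbers that differ between two equal-length strings."""
    bits = set()
    for e, o in zip(expected, observed):
        diff = ord(e) ^ ord(o)  # non-zero only where the characters differ
        bits.update(bit for bit in range(8) if diff & (1 << bit))
    return sorted(bits)

print(differing_bits("SYNTAX ERROR", "WYNTAX EVVOV"))  # [2]
```

Every corrupted character differs from the intended one in bit 2 only, which is what points at a single data line rather than random corruption.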

KC9UDX · Aug 13, 2015

RAMs are all new and have been swapped several times. I thought I wrote this all up in a thread here, but I looked last night and couldn't find it. This particular machine almost always does the WYNTAX EVVOV thing on a cold boot. It sometimes says GOMMODOVE FAWIC and -2.56E35 FYTEW something or other. Whatever causes it never lasts long enough for me to find it. I suspect I may have a tiring ROM, or uP. I lean toward ROM though. I think I swapped the 6502 during my initial troubleshooting but I'm not positive.

Sometimes zero page must be corrupt because I'll have to do a warm boot and it will be fine. Other times, it will work just fine in spite of the corrupted boot message.

It's very stable and reliable after it's been on a minute. If that ever stops happening, I'll put it on the bench. I can't open it where it sits (on a shelf below a ][+)

Last night was an exception, it has never done that before. It had been on continuously for over a week. Several days in, it was still running, but then I ran a program that cleared the screen and went into an infinite loop. When I tried to break out of it last night, it wouldn't respond, and upon cycling the power I got a blank screen.

I wouldn't leave it on like that but I started a very long file transfer with another machine accidentally on the same GPIB bus, and turning off the PET would have disrupted that.

jimdinunzio · Aug 13, 2015

dave_m said:
KC, we would expect that the interrupt is coming from the G8 PIA#1 due to the proper pulsing of the Vertical Sync signal on its CB1 line, but it wouldn't hurt to verify. I think Jim has stated that with G8 removed, the PET always makes it to the Commodore screen where it patiently waits for an interrupt which never happens. Since he has swapped PIAs, to me the problem may indicate bad zero page RAM or a bad H3 ROM although maybe the interrupt routine calls subroutines on other ROMs? A check of the address lines when stuck in the interrupt may tell us something. Also if he had a scope, I would have him carefully check signals on the PIA for suspicious activity.
-Dave

Hi Dave,
Thanks for the help.
Yes, it is still true that when the G8 is removed the PET always makes it to the Commodore screen. I've swapped the PIAs with no change. I've also verified the zero page RAM pretty well by moving the chips up to higher locations and running the BASIC RAM test more than once. I have run ROM verification tests as well for E000-E7FF and F000-FFFF successfully before the system hangs. When the system hangs, the memory checker suddenly printed a mismatch, and when TMON was installed a BRK instruction was hit just before the hang (see earlier posts). Every time I check after the hang, the address bus is at $FFFF and the data bus is $E6. The 6502 pin 7 is stuck LOW, which you pointed out indicates an illegal instruction was hit. The chip selects are CS4:LOW, CS3:LOW, CS1:HI on H7, which I believe means it's selected. The address and data lines of the PIA, 6502, and H7 are all consistent.
The values at FFFE/FFFF are 6B E6, which indicates location $E66B, the IRQ handler. At this point I would think the next step would be the CPU trying to fetch an instruction from that location, but it doesn't seem to make it there. Wouldn't the address bus change to load the instruction? I have a ROM replacement board and chip I bought flashed to the H7 ROM contents. Swapping it in didn't make any difference in this behavior.
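
For anyone following along, the vector arithmetic is just the usual 6502 little-endian convention (low byte at the lower address). A minimal Python sketch, with `read_vector` as a made-up helper name:

```python
def read_vector(low_byte, high_byte):
    """Combine a little-endian 6502 vector (low byte first) into a 16-bit address."""
    return (high_byte << 8) | low_byte

# Bytes reported at $FFFE (low) and $FFFF (high).
irq_target = read_vector(0x6B, 0xE6)
print(f"IRQ handler at ${irq_target:04X}")  # IRQ handler at $E66B
```

Note that $E6 on the data bus with $FFFF on the address bus is exactly the high byte of this vector, so the bus state after the hang is consistent with the CPU stopping on the second half of the vector fetch.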

Thanks,

Jim

KC9UDX · Aug 13, 2015

Keep in mind that the 6502 could be fetching either an instruction or data at $FFFF. Which it is, determines what happens next. I don't think the wrong thing is happening here.

Also, assuming it is correctly reading the vector (which I believe it is), and /IRQ is held low, the IRQ vector is being continuously fetched.

Is /IRQ being held low at the time of the crash? Does removing the PIA after the crash cause the system to resume? (For best practise, install the chip with the IRQ output pins bent up, and use small screwdrivers or wires in the socket to make and break contact)

My memory is being jogged here and it seems this is what I did. Mind you, the only diagnostic tools I used were a logic probe and the PET's usual CRT display.

I may have to put my PET on the bench anyway, just to help you out. I think it's likely it has the same problem yours has. And again, my solution isn't actually correct, but it does make for a working PET.
