So I have a mystery I’d like to understand. My NetDrive code works with every packet driver I’ve tested except the 3Com PCI packet drivers. Some of those are known to be buggy, but let’s assume that’s not the problem here.
My device driver loads at CONFIG.SYS time and hooks the timer interrupt (IRQ0) for two purposes:
- Implementing a simple countdown timer so I can detect timeouts.
- Sending ARP responses
With no remote drives connected everything is fine. If you connect a remote drive, the machine crashes a few seconds later. I’m pretty sure that it’s trying to send an ARP response when it crashes. If there is no ARP traffic everything is fine, but if it has to send an ARP response under the timer interrupt the machine will crash.
The failing code looks like this:
Code:
timerInt proc far
cmp cs:arp_pending, 0
jz timerInt_chain_only
<switch to private stack>
<push some regs>
<call to a routine that does int 0x60 to send the packet>
<pop the regs>
<restore original stack>
timerInt_chain_only:
jmp cs:[timerOrigInt]
timerInt endp
The pseudo-code in the middle is not special, I just wanted to keep this readable. There is a call to a routine to do the interrupt because the packet driver can be anywhere from interrupt 0x60 to 0x6F and I need to use a jump table to branch to the right INT instruction. All registers that get touched are saved and restored, and that code works for both sending regular UDP packets and ARP packets - just not for ARP packets under the timer interrupt with the 3Com PCI packet driver.
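For the curious, that dispatch routine can be sketched like this (label and variable names here are made up for illustration). The INT opcode takes its vector as an immediate byte, so there has to be one INT instruction per possible vector, and the jump table selects the right one:

Code:
; Sketch only - pktIntNo holds the packet driver's vector (60h-6Fh),
; found at install time. Index the table by (vector - 60h).
callPktDriver proc near
        mov     bl, cs:pktIntNo         ; packet driver vector, 60h-6Fh
        sub     bl, 60h
        xor     bh, bh
        shl     bx, 1                   ; scale to word-sized entries
        jmp     word ptr cs:[intTable + bx]
do60:   int     60h
        ret
do61:   int     61h
        ret
        ; ... one INT/RET pair for each vector up to ...
do6F:   int     6Fh
        ret
intTable dw do60, do61                  ; ... through do6F
callPktDriver endp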
After much experimentation I found a variation of code that works:
Code:
timerInt proc far
pushf
call cs:[timerOrigInt]
cmp cs:arp_pending, 0
jz timerInt_chain_only
<switch to private stack>
<push some regs>
<call to a routine that does int 0x60 to send the packet>
<pop the regs>
<restore original stack>
timerInt_chain_only:
iret
timerInt endp
The code in the middle is exactly the same. The big difference is in how I chain to the previous interrupt handler. In the failing code I do my work first, then use a far jump to chain to the next handler. In the working code I push the flags and make a far call; that combination makes it look to the old handler as though an interrupt just occurred, so its IRET pops both the return address and the flags off the stack and control comes back to my code.
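In stack terms, the working variant hands the old handler exactly the frame its IRET expects (a sketch of the stack just after the far call, highest address first):

Code:
; After "pushf / call cs:[timerOrigInt]" the old handler sees:
;
;   sp+4: FLAGS        <- pushed by my PUSHF
;   sp+2: CS           <- pushed by the far CALL
;   sp+0: IP           <- pushed by the far CALL
;
; That is the same three-word frame the CPU builds for a hardware
; interrupt, so the old handler's IRET pops IP, CS, and FLAGS and
; execution resumes in my handler, which later finishes with its
; own IRET.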
The code should be equivalent except for when my code runs. In the first variation my code runs first (tries to send a packet), then chains to the other interrupt handlers. In the second variation my code chains first, and then runs.
The only thing I can think of that might matter is that in the working code the original handler has already sent its EOI, clearing the interrupt condition on the 8259, before my code runs. While IRQ0 is still in service the 8259 won't deliver any lower-priority interrupt - which includes whatever IRQ the NIC is on - and somehow the 3Com code is sensitive to that. Is there something else I’m missing?
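One way to probe that theory in the original (failing) layout would be to send a non-specific EOI to the master 8259 before doing the send - strictly as a diagnostic experiment, since the chained BIOS handler will later issue a second EOI, which is not safe in general:

Code:
        ; Diagnostic sketch only: acknowledge IRQ0 on the master 8259
        ; before the send, so lower-priority IRQs (like the NIC's)
        ; can be delivered while the packet driver works. The chained
        ; handler will send a second EOI later, which could clear some
        ; other in-service interrupt.
        mov     al, 20h                 ; non-specific EOI command
        out     20h, al                 ; master 8259 command port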
(I can't send packets under the received packet interrupt either, even though it works on every other packet driver. I suspect that path has the same problem, but my interrupt chaining technique wouldn't be the cause there.)