Hi,
Is this a good place to have a discussion with the XT IDE authors?
I would like to add support for the CH375 USB device to XT IDE, and to suggest some speed optimizations (XT IDE can easily be sped up on the 8086/8088).
Hi,
The best place is probably via e-mail to make sure everyone is included in the discussion (I don't think Aitotat visits this forum much anymore). E-mail addresses are at the top of XTIDE_Universal_BIOS\Src\Main.asm. But if you prefer to discuss here in this thread then that is also fine of course.
I wanted to add support for the CH375 too a few years ago but the lack of documentation (in English) certainly made it a challenge. If you want to take a stab at it then just go ahead. I'll help in any way I can.
Regarding speed optimizations for 8086/8088; I'm guessing it's going to involve loop unrolling?
The CH375 supports multiple drives, right? Each drive has its own DPT (Drive Parameter Table) so the drive-related variables should probably go there. The rest can go somewhere else in RamVars (see RamVars.inc).
Hi,
Thanks.
The lack of documentation is not a problem; there are some PDFs and the existing driver/BIOS.
Yes, I unrolled the loop and removed the subfunction CALL and the delays (not needed).
I also did a 16-bit I/O version, as the LoTech board allows using the A1 address line instead of A0 (it is working).
I have multiple things to check:
- How the RAM is used to store variables.
- How to integrate a new "driver" in XT IDE
I could describe it but it's a bit involved. LOL :D
- How the boot sequence works (in the case of a dual board (CF and USB))
You don't have to worry about that at all. It depends entirely on the order of option ROM initialization and drive detection. The order of drive detection (for drives under XUB control) depends on the controller configuration in XTIDECFG (Primary, Secondary, etc).
- In fact I have to learn everything about XT IDE first
Have fun! (And do not hesitate to ask questions!)
- The CH375 part is not a problem at all, it is so simple.
Some things that come to mind;
- How to "officially" add my work to XT IDE.
Just send a patch to me when ready.
- I found that part of XT IDE can be improved to speed up 8-bit PIO on the 8086 and 8088... So XT IDE can still be improved.
Yes, I'm aware of that, but only by unrolling the PIO transfer loops even more than they already are, right? The problem is that doing that takes up a lot of ROM space. The current amount of unrolling is a compromise between speed and size. The small builds are too small for more loop unrolling but we might be able to do something about the large builds. Or did you have something else in mind?
It is simpler to write a driver than a BIOS.
For example, DOS block drivers all work in LBA mode, so we don't really have to worry about CHS and its associated problems...
Older DOS versions do not support extended interrupt 13h, so they speak CHS only, which is why we need the LBA to P-CHS conversion as mentioned above.
Did you start coding something anyway?
No, I did not.
Older DOS versions do not support extended interrupt 13h, so they speak CHS only, which is why we need the LBA to P-CHS conversion as mentioned above.
Yes, this is why I say it is simpler in a driver: we don't have to do the CHS support.
* No IDENTIFY_DEVICE response means there are no P-CHS values. You will need to come up with a sensible way to convert the LBA count to P-CHS, preferably in such a way that it does not waste disk space unnecessarily and also is compatible with "everything else". Perhaps just make it X:16:63? Though that might not be the most efficient way regarding disk space usage. Aitotat might have some idea on how to best do this.
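To make the X:16:63 idea concrete, here is a minimal sketch in C of deriving such a geometry from an LBA sector count and seeing how many sectors end up unused. The names `Geometry`, `geometry_from_lba` and `unused_sectors` are hypothetical and not part of XTIDE Universal BIOS; the cap of 16383 cylinders is the ATA maximum mentioned in this thread, and whether a USB drive should be capped that way is an assumption.

```c
#include <stdint.h>

/* Hypothetical type, for illustration only (not from the XUB source). */
typedef struct {
    uint32_t cylinders;
    uint8_t  heads;
    uint8_t  sectors;   /* sectors per track */
} Geometry;

/* Build an X:16:63 geometry from a total LBA sector count.
 * Assumption: cylinders are capped at 16383, the ATA P-CHS maximum. */
static Geometry geometry_from_lba(uint32_t total_sectors)
{
    Geometry g = { 0, 16, 63 };
    uint32_t sectors_per_cylinder = 16u * 63u;   /* 1008 */

    g.cylinders = total_sectors / sectors_per_cylinder;
    if (g.cylinders > 16383u)
        g.cylinders = 16383u;
    return g;
}

/* Sectors that exist on the device but are not reachable through g. */
static uint32_t unused_sectors(uint32_t total_sectors, Geometry g)
{
    return total_sectors - g.cylinders * g.heads * g.sectors;
}
```

For a device reporting 7864320 sectors, this gives 7801:16:63 with 912 sectors left over. A device reporting 20000000 sectors hits the cap and leaves 3485936 sectors unreachable, which is the "wasted space" concern in numbers.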
I am still not clear about what a DPT is. I saw in other BIOS code that it is a table for the disk, and that the pointer to these tables is in the interrupt vector table.
But anyway, it needs to be in RAM if we want to modify it. (Is it also in RamVars?)
Also, the RamVars are stored in PC RAM by stealing 1 KB at the end of RAM, and it seems to be a really small structure. You said in another answer that RamVars is small, but 1 KB is huge.
The CH375 can support multiple drives, I suppose with a USB hub, but as XTIDE is mostly for booting, detecting and showing only one is sufficient.
I understood the size problem, as doing PIO is more complex on the XTIDE due to the two different ports for the high and low bytes. This makes the code bigger, and even more so if unrolled.
Currently the XTIDE reads 16 bytes per loop, which is a good compromise for performance.
I was thinking about doing a lodsw for disk writes, but in fact it is already done (I did not read everything in detail, sorry).
Anyway, the 8086 code can be optimized for reads with a stosw and two added exchanges; it adds only one byte to the code and speeds up 8086 reads by 10% (if using 16-bit memory).
The "Physical CHS" values (P-CHS) are presented by the drive in the IDENTIFY_DEVICE response. The upper limits for those are set by the ATA specification and are (should be) a maximum of 16383:16:63. The P-CHS values are then converted in various ways into "Logical CHS" values (L-CHS) that the BIOS (and DOS) uses. This is what's commonly known as the "geometry translation" and how it is done should preferably be the same across all software/BIOSes for obvious reasons.
Thanks for all the tips.
Regarding this specific point, I am aware of the 504 MB disk limit, due to the problem of using 8 bits for the head.
XT IDE shows big disks with 255 heads anyway, instead of the physical limit of 16 that existed in the past.
Do you know how old DOS versions behave if there are more than 16 heads? Do they just ignore it and act as if there are 16? Then we lose a lot of space (as only 16 heads are used out of 256).
If we present the USB disk as X:16:63 like you propose, this limits the size to 504 MB, so booting from FAT16 disks larger than 504 MB and from FAT32 will not be possible.
For CF, I bought 256 MB and 512 MB cards, so I suppose XT IDE shows them as having 16 heads. This will be more of a problem for USB keys, since if we want to use USB it will be with big keys (4 GB or more), and Windows still needs to be able to read them.
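For reference, the 504 MB figure falls out of simple arithmetic. The INT 13h CHS interface has a 10-bit cylinder, 8-bit head and 6-bit sector field, while ATA only has 4 bits for the head, so without translation the usable geometry tops out at 1024:16:63. A minimal sketch in C (the helper name `chs_capacity_bytes` is made up for illustration, and 512-byte sectors are assumed):

```c
#include <stdint.h>

/* Bytes addressable with a given CHS geometry, assuming 512-byte sectors.
 * Hypothetical helper, for illustration only. */
static uint64_t chs_capacity_bytes(uint32_t cylinders, uint32_t heads,
                                   uint32_t sectors_per_track)
{
    return (uint64_t)cylinders * heads * sectors_per_track * 512u;
}
```

`chs_capacity_bytes(1024, 16, 63)` is 528482304 bytes, which is exactly 504 MiB. With translation to 255 heads, `chs_capacity_bytes(1024, 255, 63)` is 8422686720 bytes, roughly 7.8 GiB, which is where the familiar "8 GB" CHS limit comes from.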
Hi,
With the "standalone" CH376 BIOS (https://gitlab.com/hakfoo1/v40-bios/-/blob/main/disc.asm), I queried the device for an LBA sector count, then set the geometry to 16 heads, 63 sectors, and as many cylinders as could fit. This caps at 504 MB, and the drives seem to work when taken to another PC (although, obviously, wasting a lot of space). This is based on the assumption that flash drives are cheap and disposable in a way CompactFlash never got to be.
My naive thought is that it doesn't matter how you do the CHS->LBA math as long as it's the ONLY system trying to "set it", and the other systems either access content only LBA direct, or rely on some stored-on-disc info to determine the chosen geometry. It would get uglier if you had one device trying to map a specific CHS location assuming 16 heads/63 sectors, and then another device used 256 heads/63 sectors without actually confirming.
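The mismatch described above is easy to show with the standard CHS to LBA formula. A minimal sketch in C (the function name `chs_to_lba` is just for illustration):

```c
#include <stdint.h>

/* Standard CHS -> LBA mapping for a geometry with `heads` heads and
 * `spt` sectors per track. Sector numbers start at 1, not 0. */
static uint32_t chs_to_lba(uint32_t c, uint32_t h, uint32_t s,
                           uint32_t heads, uint32_t spt)
{
    return (c * heads + h) * spt + (s - 1u);
}
```

The same CHS address 2:5:10 lands on LBA 2340 if the geometry is assumed to be 16 heads/63 sectors, but on LBA 32580 if it is assumed to be 256 heads/63 sectors. Only cylinder 0 maps identically under both assumptions, so two systems that disagree on the head count will read different sectors for almost every CHS request.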
https://github.com/homebrew8088/8088-PC-Compatible/blob/main/bios/CGA_bios/asm/int13.asm does something interesting; it walks the partition table to try to read geometry info out of the Volume Boot Record. I assume this is prone to failure if the partitioning is nonstandard, but it does effectively let people "throw their hands up" and rely on the (assumed) bigger machine that formatted the drive to know its geometry.
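As a rough idea of what "reading the geometry from the Volume Boot Record" involves, here is a minimal sketch in C. It is not a transcription of the linked int13.asm, and all function and type names are hypothetical. It relies only on the standard on-disk layout: the MBR partition table starts at offset 0x1BE with four 16-byte entries (type byte at offset 4, starting LBA at offset 8), and a FAT boot sector stores sectors per track at offset 0x18 and the number of heads at offset 0x1A. Extended partitions and non-FAT volumes are ignored here, which is one way this approach can fail on nonstandard partitioning.

```c
#include <stdint.h>

typedef struct {
    uint16_t sectors_per_track;
    uint16_t heads;
} VbrGeometry;

static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Both the MBR and a VBR end with the bytes 0x55, 0xAA. */
static int has_boot_signature(const uint8_t *sector)
{
    return sector[510] == 0x55 && sector[511] == 0xAA;
}

/* Starting LBA of the first non-empty MBR partition, or 0 if none. */
static uint32_t first_partition_lba(const uint8_t *mbr)
{
    if (!has_boot_signature(mbr))
        return 0;
    for (int i = 0; i < 4; i++) {
        const uint8_t *entry = mbr + 0x1BE + 16 * i;
        if (entry[4] != 0x00)            /* partition type 0 = unused */
            return read_le32(entry + 8);
    }
    return 0;
}

/* Extract geometry from a FAT boot sector; returns 1 on success and
 * 0 if the values look implausible. */
static int geometry_from_vbr(const uint8_t *vbr, VbrGeometry *out)
{
    if (!has_boot_signature(vbr))
        return 0;
    uint16_t spt = read_le16(vbr + 0x18);
    uint16_t heads = read_le16(vbr + 0x1A);
    if (spt == 0 || spt > 63 || heads == 0 || heads > 256)
        return 0;
    out->sectors_per_track = spt;
    out->heads = heads;
    return 1;
}
```

A caller would read sector 0, pass it to `first_partition_lba`, read the sector at the returned LBA, and pass that to `geometry_from_vbr`, falling back to a default geometry when either step fails.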
We can also combine both the BIOS and the block driver, to be able to boot and create an initial partition/boot sector, and then to mount other partitions and access more data. But who will need it? More than 8 GB on an 8086/256...
I never (and still don't) use more than 8G (more like 4G because of BIOS limitations on some 486 BIOSes) on sub-Pentium machines (aka non-Windows machines, to me).
Anyway, the ideal target should be to use the CH376 and its ability to read a FAT32 disk. Then we could have the disk images as .IMG files and build a configuration text file to tell the BIOS which .IMG file to boot from... This way we could simply add floppy emulation as well.
Freddy, you should read more of the source; some of the optimizations (other than unrolling) you are mentioning are already in the codebase. They are enabled via different assemble-time defines. For example, on M24 systems, you can't do a 16-bit read from port I/O, but you can do a 16-bit write to memory, so there is a special define for M24 systems. So there are different builds for different targets with different optimizations: 8088, 8088+chuck mod, M24, 80186+, etc.
Yes, that is what I wrote: the memory write optimization with lodsw is already present (for disk writes).
Hi,
It seems like it might be possible to say "I want to open file X, then seek to sector offset 123456 in that file, then read", but I wonder if that's going to have surprise performance gotchas, i.e. having to walk its way through a potentially fragmented filesystem to the 123456th sector of the file, compared with a more block-oriented "read sector 123456". Some of this is probably fine if you're doing a low-intensity "I write one sector every few hours when someone saves configuration data" style operation, but a disc getting a lot of random seeks might not do well.
I'm not sure how much of the BIOS I worked on is directly reusable, because it made assumptions about how the CH375/6 was wired that are sort of specific to the "EMM Computers" mainboard design. I expect these are different than the "add-on card" used-- I know the port numbers are different at a minimum, and maybe it makes more sense to use an interrupt-driven model instead of the busy-waiting it does.
You added support for some BIOS functions that are not in the original BIOS, but are in XTIDE anyway.
In XTIDE Code:
; If 8088/8086
in al, dx ; Load low byte from port
xor dl, bl ; IDE Data Reg to XTIDE Data High Reg
stosb ; Store byte to [ES:DI]
in al, dx ; Load high byte from port
xor dl, bl ; Restore to IDE Data Register
stosb ; Store byte to [ES:DI]
There is no distinction between the 8088 and the 8086 here, which makes sense as there are more 8088s around.
Anyway, we can do this for the 8086:
in al, dx ; Load low byte from port
xor dl, bl ; IDE Data Reg to XTIDE Data High Reg
xchg ah, al ; Park low byte in AH
in al, dx ; Load high byte from port
xor dl, bl ; Restore to IDE Data Register
xchg ah, al ; AL = low byte, AH = high byte
stosw ; Store word to [ES:DI]
The first example is 8 opcode bytes and two bytes written, for a total of 10 memory I/Os. The second example is 11 opcode bytes and two bytes written, for a total of 13 I/Os. Even though the STOSW on 8086 is a single I/O, they look roughly the same speed "on paper"... but of course benchmarking with the 8253 timer on hardware is the only way to truly know.
Yes, on paper, but I tried it anyway and saw a big difference on my Amstrad PC.
Just FYI: Here's a link to when I helped benchmark some new transfer code that was for 8-bit I/O transfers, but on an 8086 CPU (the Olivetti M24 can't do 16-bit I/O due to a hardware design flaw): https://forum.vcfed.org/index.php?t...-v2-0-0-beta-testing-thread.30259/post-863851
Earlier in the thread somewhere is the difference in the code, so that might be a point of direct comparison.