
Are there any file size limitations with MD5 and SHA1?

computerdude92

Hello guys,

Sometime in the next couple of weeks I will be starting my long-awaited Blu-ray M-DISC backup. The discs I will be using are 100GB each, and most discs will hold a single, roughly 95GB RAR archive containing all the files for that particular disc.

Next to that file will be a text document containing the MD5 or SHA1 code so I can verify the files any time I want.

Up until now I have never generated an MD5 or SHA(x) code for a file larger than about 4GB. Will there be any problems generating codes for near-100GB archive files?

Or, do I need to have multiple archive files on the disc at the max capacity supported by the checksum codes? Can I use either MD5 or SHA1, or is SHA256 needed?

BTW, I just learned that ZIP archives have a limitation: they can't have files larger than 4GB. So now I plan to use RAR. Are RAR archives going to work without issues?

Thank you for your knowledge.
 
SHA should generate an effective checksum for much larger files. If the documentation is to be trusted, SHA would process a single file of around a million terabytes (SHA-1 and SHA-256 are specified for inputs of up to 2^64 bits, which is about 2 million terabytes), though it might take quite a while. Depending on the number of files, you might want to generate a SHA for each file within the overarching RAR file, to catch any hidden corruption. I doubt the longer hashes would provide much benefit, even if a hash collision is theoretically possible, but check the timing. If SHA256 takes only a little longer, why not use it?
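
For what it's worth, here is a minimal Python sketch of how a hash can be computed over a very large file without loading it into memory, plus a small helper for the timing comparison. The function names `hash_file` and `time_algorithms` and the 1 MiB chunk size are my own assumptions for illustration; only the standard `hashlib` module is used.

```python
import hashlib
import time


def hash_file(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in fixed-size chunks.

    Memory use stays around chunk_size no matter how large the file is,
    so a ~95GB archive is handled the same way as a small file.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def time_algorithms(path, algorithms=("md5", "sha1", "sha256")):
    """Hash the same file with each algorithm.

    Returns a dict mapping algorithm name to (hex digest, elapsed seconds).
    """
    results = {}
    for name in algorithms:
        start = time.perf_counter()
        value = hash_file(path, name)
        results[name] = (value, time.perf_counter() - start)
    return results
```

Running `time_algorithms` on one of the smaller archives first should give a rough idea of how the algorithms compare on a given machine, although with optical media the drive's read speed may well be the bottleneck rather than the hash.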

RAR claimed to support 9 GB files within a (theoretical) 8 exabyte RAR file. May support even larger files now; the documentation is very bad at providing simple answers.

What would concern me is a buffer underrun while writing such a big file, ruining a very expensive M-DISC. Others probably have more experience with archives like this and might provide more salient suggestions.
 
Hash functions used as checksums have no practical file size limit. ZIP has actually supported files over 4GB since the Zip64 extension was added to the spec in 2001. Both ZIP and RAR already store a CRC for each file and check it automatically on extraction, so there is no need to do this yourself. But I suppose you can hash the entire archive if you want.
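
As a rough illustration of the built-in CRC check, here is a short sketch using Python's standard `zipfile` module. The helper names `write_zip` and `first_corrupt_member` are made up for this example; `allowZip64` and `testzip()` are part of the standard library.

```python
import os
import zipfile


def write_zip(archive_path, file_paths):
    """Create a ZIP archive; Zip64 lets entries and the archive exceed 4GB."""
    # allowZip64=True is already the default in Python 3; it is spelled out
    # here only to make the Zip64 point explicit.
    with zipfile.ZipFile(archive_path, "w",
                         compression=zipfile.ZIP_DEFLATED,
                         allowZip64=True) as archive:
        for path in file_paths:
            # Store each file under its base name rather than its full path.
            archive.write(path, arcname=os.path.basename(path))


def first_corrupt_member(archive_path):
    """Read every member and check its stored CRC.

    Returns the name of the first member that fails, or None if all pass.
    """
    with zipfile.ZipFile(archive_path) as archive:
        return archive.testzip()
```

A result of `None` from `first_corrupt_member` means every file's data matched the CRC stored in the archive. This only covers ZIP; for RAR you would rely on the archiver's own test function.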

ZIP is convenient because it's natively supported by Windows, but RAR has some advantages including better compression. 7z is also a good (and free) alternative. If you are concerned about data integrity, you may want to consider RAR with Recovery Record enabled.
 
I have a Pioneer BDR-212EBK (BD-XL) writer drive to use for the project. Based on reviews, I think Pioneer is the better quality brand compared to the LG or Asus counterpart drives.

My disc burning software is very old, so I might be in need of an upgrade. This will be my very first time making Blu-rays too. My regular software is Nero 8 Micro from 2008. It does mention Blu-ray support in the menu, but is it too outdated for me? Got any recommendations that run on Windows 7 SP1? Thanks for the answers so far.

(BTW, I only use Windows 7 offline)
 
So many questions... that I have.

Why back up to Blu-ray optical discs? It does not look like a good option for long-term retention: it's obsolete technology nowadays, and 20 years from now finding a functioning Blu-ray drive may be "vintage realm" territory.

Also, why use a proprietary format like RAR? You are betting that it won't become forbidden in the future. You have open-source compression in 7-Zip and Bzip; use that instead.
 