I'd be surprised if that wasn't the case. Checksums are calculated from the file contents: if the contents change, which is exactly what happens when something is compressed, the sums change too. Basic example of that below; notice how the sum is different once the file is compressed.
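A minimal sketch, assuming sha1sum and gzip are installed; sample.txt is just a throwaway test file:

Code:
$ echo "hello world" > sample.txt
$ sha1sum sample.txt
22596363b3de40b06f981fb85d82312e8c0ed511  sample.txt
$ gzip -k sample.txt        # -k keeps the original file around
$ sha1sum sample.txt.gz

The last command prints a completely different sum. Running the whole example again from scratch will usually give you yet another sum for the .gz, because gzip records the input file's name and timestamp in the archive header.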
Summing up: when you compress something, the sums are supposed to change. That's how they work: a checksum is effectively a fingerprint of a specific file's contents. Keep in mind, though, that collisions are possible, especially across huge numbers of files. But it's practically impossible to hit a collision by accident through file corruption, which is why sums are useful for checking whether a file was corrupted or not. Any change to a file, even a single flipped bit, will change the resulting checksum (see the sketch below), whether it's SHA-1, SHA-512, MD5 or whatever you want. I used SHA-1 to keep it simple, and for plain integrity checks anything will do. For anything security-related (password storage, signatures and so on), though, you will want something more robust like SHA-256 or SHA-512, since SHA-1 and MD5 are both considered broken for those purposes.
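To get a feel for how sensitive this is, here's a quick sketch comparing two strings that differ by a single bit ('d' is 0x64, 'e' is 0x65):

Code:
$ printf 'hello world' | sha1sum
2aae6c35c94fcfb415dbe95f408b9ce91ee846ed  -
$ printf 'hello worle' | sha1sum

The second sum bears no resemblance to the first, even though only one bit of input changed. That avalanche behaviour is exactly what makes checksums good at catching corruption.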
p.s.: If it wasn't clear, a "collision" means two different files/strings/etc resulting in the same hash (checksum, in this case). A hash can't be truly unique to its content: for that, it would need to carry enough information to reconstruct the file, meaning it would have to be about the same size as the original. A SHA-1 sum is always 160 bits, so there are only 2^160 possible sums; generate enough distinct inputs and sooner or later two of them must share a sum, simply because you run out of unique values (a toy demo of that is below). It's just that those collisions are so improbable to hit by chance that a random file corruption will never produce one. The same applies to password hashing, by the way, and deliberate collision attacks are one of the reasons MD5 is considered broken.
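You can watch the pigeonhole effect on a toy scale by shrinking the "checksum" to just the first hex digit of the SHA-1 (only 16 possible values); feeding it more inputs than that mathematically guarantees a repeat:

Code:
$ # hash the numbers 1-20 down to 16 buckets; uniq -d prints the repeated digits
$ for i in $(seq 1 20); do printf '%s' "$i" | sha1sum | cut -c1; done | sort | uniq -d

A real 160-bit sum works the same way, just with 2^160 buckets instead of 16, which is why nobody ever runs into an accidental collision in practice.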