Last Active: Nov 02, 2019
Threads: 18
Posts: 57
Reputation:
0
After downloading some of what I'd call end-of-life torrents, like 1001 Albums You Must Hear Before You Die and 500 Greatest Albums of All Time, I find that my torrent client has accepted bad data from seeders. When I force a recheck I re-enter the swarm, only to have my files complete in exactly the same way. The only explanation I can see is that the seeders have bad data and have not detected it. Seriously, nearly 75% of the albums from these torrents have missing or corrupted data.
So I put this forward to those who make these decisions: it would be better if all torrent clients forced a recheck when a download completes, if only to ensure the data was written to disk correctly. Also, would a different approach to compressing files be of interest? I have one that is completely fault tolerant; you just might need a few thousand years or computers to get it done, though.
Last Active: Oct 17, 2024
Threads: 226
Posts: 6,322
Reputation:
26
I've never had a torrent with that kind of corruption, except one little bugger. I suppose the BT protocol has checks, but it won't check the contents IF the uploader creates the torrent from broken files to begin with, SO a force recheck after download would be useless. Uploaders should check their files before creating torrents, and those checks vary with the material (audio, video, games, etc.) and with the degree of integrity, completeness, authenticity, etc. that is wanted. Sadly, many guys only care about putting their hasty "masterpieces" up on the Internet.
Nov 24, 2018, 10:30 am
(This post was last modified: Nov 24, 2018, 14:23 pm by ID10TError. Edited 8 times in total. Edit Reason: I edit stuff lots, lots and lots of afterthoughts etc)
(Nov 24, 2018, 09:15 am)dueda Wrote: I've never had a torrent with that kind of corruption, except one little bugger. I suppose the BT protocol has checks, but it won't check the contents IF the uploader creates the torrent from broken files to begin with, SO a force recheck after download would be useless. Uploaders should check their files before creating torrents, and those checks vary with the material (audio, video, games, etc.) and with the degree of integrity, completeness, authenticity, etc. that is wanted. Sadly, many guys only care about putting their hasty "masterpieces" up on the Internet.
But the torrent client has incorrectly assumed that the data it received is correct and never realizes its mistake, which it would have caught if the files were checked when the download completed. Also, I'm given to think that hash collisions and torrent poisoning are occurring more frequently these days.
Am I wrong? I'd like to know!
Also, if you made the torrent from the incorrect data, it would have a different hash. You can force a recheck as often as you like: the downloaded amount drops and once again the files are 'broken'. So the downloaded data was already corrupt.
Also, my compression algorithm expects collisions. They vary so wildly from the original data and from each other that they should be obviously incorrect selections, and they (the collisions) should not pass the overall checksum verification.
At the moment my torrent client thinks it has successfully downloaded the 2 aforementioned torrents. I don't have the heart to tell it to recheck the files again, because if it were human it would react badly to the news.
Other torrents with seeders giving false data:
LATEST VST BANKS VOL-61 (7% or so corruption 40% files unusable)
MEGA SAMPLES VOL-40 (under 12% corruption 50% unusable files)
MEGA SAMPLES VOL-82 (about 12% corruption 35% unusable files)
These are estimates, as I cannot be f'd rechecking them again and writing down numbers, but the corruption is there.
Last Active: Oct 03, 2024
Threads: 67
Posts: 6,379
The piece is checked before it is written to disk. If it is bad, it is rejected and re-downloaded from the swarm. Some clients will ban a peer if it sends too many bad pieces.
If you are finding bad pieces after doing a force recheck, then something on your system is modifying the data after it is saved by your client.
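The check described above can be sketched roughly as follows. This is only an illustration, assuming a v1 torrent where each piece has a 20-byte SHA-1 digest in the metainfo; the names `piece_is_valid`, `accept_piece`, and the dict used as `storage` are hypothetical stand-ins, not any real client's code.

```python
import hashlib


def piece_is_valid(piece: bytes, expected_sha1: bytes) -> bool:
    """Return True if the piece's SHA-1 digest matches the digest from the torrent."""
    return hashlib.sha1(piece).digest() == expected_sha1


def accept_piece(piece: bytes, index: int, expected_hashes, storage: dict) -> bool:
    """Store a piece only if it verifies (hypothetical helper).

    `storage` is a plain dict standing in for the write to disk.
    Returns False when the piece is rejected, in which case a real
    client would request that piece again from the swarm.
    """
    if piece_is_valid(piece, expected_hashes[index]):
        storage[index] = piece
        return True
    return False
```

A force recheck is essentially the same comparison run again over the data already on disk, which is why a piece that passed on arrival but fails later points at the data having changed after it was stored.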
Last Active: Oct 17, 2024
Threads: 226
Posts: 6,322
Reputation:
26
Most cases of bad downloads relate to bad hardware or software. The BT metainfo format includes an optional MD5 field for files but (surprisingly) the protocol doesn't use that info*, though some clients go beyond the standard protocol.
An algorithm that recreates files is better than just verifying and finding out that one or more files need to be downloaded again; but depending on size and speed, it will be easier just to retry the download than to run it for "thousands of years".
It seems to me that a BEP to include that won't make it this decade, as other simpler and/or more pressing matters are still waiting. It would be easier to propose that all clients include the MD5 of each content file (the field is already there) at the moment the torrent file is created, and check it at the moment the download completes.
* BT Wiki: https://wiki.theory.org/index.php/BitTor...Dictionary
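A minimal sketch of the completion-time check proposed above, assuming the torrent carries a hex MD5 string for the file (the optional `md5sum` field). The function names and the chunk size are illustrative choices, not part of any client.

```python
import hashlib


def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the hex MD5 of a file, reading it in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5sum(path: str, expected_hex: str) -> bool:
    """Compare a finished file against the MD5 string taken from the torrent."""
    return file_md5(path) == expected_hex.strip().lower()
```

Like the piece hashes, this only proves the downloaded file equals what the uploader hashed; it cannot tell whether the uploader's copy was good to begin with.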
Nov 25, 2018, 07:56 am
(This post was last modified: Nov 25, 2018, 08:46 am by ID10TError. Edited 3 times in total.)
The original seeder sets up another seed that hasn't detected a corrupted download, the original seed disappears, and the new seed assumes the data it has is correct and shares it without regard to the checksum.
If I recheck the files again they will again be corrupted despite matching sizes; obviously the content does not match the checksums. Given partial data availability, how much space would have to be 'reconstructed'? Do ya know? How do I calculate the spacing between the checksums? Probably the chunk size, hey? If you have a pattern to fit, it's a lot better than reconstructing from the checksum only. You calculate all the sectors, then weave them into cluster possibilities, then wander that tree searching for combinations that match the final file checksum. I say that, but going from that to an actual piece of working software that finishes in my lifetime is a massive undertaking. I guess I'll have to factor in how much time a client would spend working with other clients to reconstruct a stream?
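On the spacing question: in a v1 torrent the checksums are one per piece, and for a multi-file torrent the files are laid end to end before being cut into pieces. A rough sketch of the arithmetic, with hypothetical function names and sizes given in bytes:

```python
def piece_count(total_size: int, piece_length: int) -> int:
    """Number of pieces (and so checksums) needed to cover total_size bytes."""
    return (total_size + piece_length - 1) // piece_length


def pieces_for_files(file_sizes, piece_length: int):
    """Map each file, in torrent order, to the inclusive (first, last) piece
    indices it touches. Zero-length files touch no piece and map to None."""
    ranges = []
    offset = 0  # byte position of the current file in the concatenated stream
    for size in file_sizes:
        if size == 0:
            ranges.append(None)
            continue
        first = offset // piece_length
        last = (offset + size - 1) // piece_length
        ranges.append((first, last))
        offset += size
    return ranges
```

For example, with files of 100, 50 and 200 bytes and a 64-byte piece length, the files touch pieces 0-1, 1-2 and 2-5, so pieces 1 and 2 are each shared by two files. A failed check on a shared piece marks both neighbouring files as incomplete, and the amount to fetch again is one piece length per bad piece.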
My torrent client rarely bans a peer for a corrupted download, so how are the errors getting through? Mathematically perfect, i.e. a hash collision?
I cannot see how the files would be modified by any other application while Tixati is also writing to them.
Last Active: Oct 17, 2024
Threads: 226
Posts: 6,322
Reputation:
26
Mr. Masami tested a hashing algorithm to rebuild data, but even on a good machine it takes some time (days).
Please see this thread: https://pirates-forum.org/Thread-Decentr...le-Sharing
My idea was about searching, not certifying, but Masami is a pragmatic guy and proposed something similar to what you have in mind.
Again, I should point out that the operating system is responsible for the data and should guarantee a file is correct; but if a user just renames a .JPG to .MP3, tampers with the data, or the OS can't detect file corruption before the upload, that's enough to make a bad upload. If the underlying systems at both ends (uploader/downloader) aren't solid, you'll end up with bogus data.
So we're at the uploaders' mercy: they should check and make sure the data is 100% accurate to the source. Rippers should do the same. It's like me buying a CD to rip, encode, and upload; some sites demand that I use X software, under Y circumstances, then follow Z method to upload. If you propose to make an error-free transfer tool, remember it only covers the transfer.
Note that it's impossible to make a tool capable of testing whether the content of a file is what it's supposed to be unless the source author includes some info (a hash), and few file types include certification and reconstruction structures; the contents of EXEcutable files, for example, are unpredictable.
Finally, I don't know much about Tixati, but I do know some programs can read and write data even while a file is in use; that's "file sharing" for multi-tasking. A bank database does it with thousands of users at once, and home computers do it too; I'm just not sure how Tixati handles it. Maybe it makes a file "read only" for other processes while the torrent is active, but idk.
Nov 26, 2018, 07:50 am
(This post was last modified: Nov 26, 2018, 10:18 am by ID10TError. Edited 1 time in total.)
(Nov 25, 2018, 14:47 pm)dueda Wrote: Mr. Masami tested a hashing algorithm to rebuild data, but even on a good machine it takes some time (days).
I do not disagree with you, but I am certain that no file is being modified by any other program. I'll do process monitoring if you would like me to, but time and again a recheck finds some data incorrect and then the torrent downloads it again. I'm certain that if I do this on another computer I will just be repeating the same steps and never get a 100% intact copy, because I believe there is now nobody around with a clean copy.
Thanks for pointing out Masami's thread, it's right up my alley.
Last Active: Oct 03, 2024
Threads: 67
Posts: 6,379
I'm still calling bullshit. The possibility of an entire swarm having the same bad data that wasn't already rejected by standard piece checks is next to impossible.
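A rough back-of-the-envelope for "next to impossible", assuming v1 torrents with 160-bit SHA-1 piece hashes and assuming accidental corruption behaves like random data against the hash (an idealization that says nothing about deliberately crafted collisions). With n pieces in the torrent, the chance that any accidentally corrupted piece still passes its check is bounded by:

```latex
P(\text{some corrupted piece passes}) \;\le\; n \cdot 2^{-160} \;\approx\; n \cdot 6.8 \times 10^{-49}
```

Even for a torrent with a hundred thousand pieces that works out to roughly 7 in 10^44.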
On the other hand, downloaded data being modified by another process such as an anti-virus or, in the case of music files, just listening to them, is a regular occurrence.
Share the torrent links if you have some examples of one of these bad swarms.
Last Active: Nov 02, 2019
Threads: 18
Posts: 57
Reputation:
0
Hi Moe, lube your tag, oh shit I meant love. Seriously, "love", guys, what did you think I was sayin'? Don't taze me bro.
Look, I'm just downloading the files I excluded from the torrent to get it to 100% completion after a recheck, to ensure this is the case; this will be more than the 5th time I have done this.
I will post direct links to the torrent files (usually above 50 GB), with the specific files listed that never pass verification after a recheck is forced.