Information on torrent country of origin?
#1
I hope someone can assist me - I'm conducting academic research on piracy.  I already have one academic publication on the subject:

https://www.tandfonline.com/doi/full/10....17.1398767

Is there any way to get information on a given torrent file regarding where it was created (such as the country)?  What I'm trying to do is use country of origin (perhaps even time zone) as an instrumental variable for some additional analysis.  If I look at a given torrent file in PirateBay I see some nice information on the format, video, and audio specs.  But is there any way to get information on where it was uploaded from?  We have a timestamp for when it was uploaded and by what username, but I'm wondering if there is any other information.  IP addresses would be great, but I have to believe that anyone uploading or downloading files is using a VPN to hide that.

Alternatively, if there was even public information on the uploader that would be useful, such as if they self-identify as being in a certain country (or even gender or age group).  I am not trying to identify any particular user - just wondering if there is any information that can tie back to the torrent file itself so we can see if it was created/uploaded in China, Russia, Germany, Canada, etc. 

Any suggestions are greatly appreciated, thanks.
#2
No.

All you can do is monitor recent uploads and try to determine the first peer, but even then any analysis will be flawed because of VPNs and seedboxes.
#3
Something else as well: a torrent is basically a box with some information and one other box inside it that contains the file hashes. If the box in the box is modified, that creates a new torrent, but the container box that holds the creation date can be changed without affecting the box of file details.
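To make the "box in a box" picture concrete, here is a minimal sketch in Python. It assumes the standard .torrent layout, where the outer dictionary carries fields such as the creation date and the inner "info" dictionary carries the file details, and where the info hash is the SHA-1 of the bencoded "info" dictionary. The tracker URL, file name, and piece data are made up for illustration.

```python
import hashlib

def bencode(value):
    """Minimal bencode encoder for ints, strings/bytes, lists and dicts."""
    if isinstance(value, int):
        return b"i" + str(value).encode() + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return str(len(value)).encode() + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must be written in sorted order.
        items = sorted(
            (k.encode("utf-8") if isinstance(k, str) else k, v)
            for k, v in value.items()
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")

def info_hash(torrent):
    """SHA-1 over the bencoded inner 'info' dictionary only."""
    return hashlib.sha1(bencode(torrent["info"])).hexdigest()

# Hypothetical torrent: the outer box holds the date, the inner box the file details.
torrent = {
    "announce": "udp://tracker.example.invalid:1337/announce",  # made-up tracker
    "creation date": 1523530500,
    "info": {
        "name": "example.bin",
        "length": 32768,
        "piece length": 16384,
        "pieces": hashlib.sha1(b"a" * 16384).digest()
        + hashlib.sha1(b"b" * 16384).digest(),
    },
}

# Changing the outer box leaves the info hash alone.
redated = {**torrent, "creation date": 0}
# Changing anything in the inner box produces a different torrent.
renamed = {**torrent, "info": {**torrent["info"], "name": "renamed.bin"}}

print(info_hash(torrent) == info_hash(redated))  # True
print(info_hash(torrent) == info_hash(renamed))  # False
```

So the creation date proves nothing about origin: anyone can rewrite it and the torrent still identifies as the same torrent.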
#4
(Apr 12, 2018, 10:55 am)Kingfish Wrote: Something else as well: a torrent is basically a box with some information and one other box inside it that contains the file hashes. If the box in the box is modified, that creates a new torrent, but the container box that holds the creation date can be changed without affecting the box of file details.

Thanks, Kingfish - I appreciate the information.  Is there anything in the 'info hash' itself that could contain some information on origination?  Or is it just random hexadecimal characters that address what items are in the box?  Thanks for your help.
#5
No.

The "info hash" is not random. It is the very VERY VERY!!! specific result of a calculation performed on a number of files, such that changing a single bit of a single file in a set of thousands of files amounting to thousands of thousands of millions of bytes will result in a different info hash if the calculation is redone. In other words, the hash represents what the torrent contains, not where it came from.
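The single-bit sensitivity can be sketched in a few lines of Python. This assumes the usual scheme in which the content is cut into fixed-size pieces and each piece is hashed with SHA-1; those piece hashes are stored inside the torrent, so any change to them changes the info hash too. The 16-byte piece length and the 64 bytes of data are toy values chosen only to keep the example small.

```python
import hashlib

PIECE_LENGTH = 16  # toy value; real torrents use far larger pieces

def piece_hashes(data, piece_length=PIECE_LENGTH):
    """Return the SHA-1 digest of each fixed-size piece of the data."""
    return [
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    ]

original = bytes(range(64))  # stand-in for the shared files

# Flip one single bit in byte 40, which falls in piece number 40 // 16 = 2.
modified = bytearray(original)
modified[40] ^= 0x01
modified = bytes(modified)

before = piece_hashes(original)
after = piece_hashes(modified)

# Indices of the pieces whose hash no longer matches.
changed = [i for i, (a, b) in enumerate(zip(before, after)) if a != b]
print(changed)  # → [2]
```

Nothing in these digests records who computed them or where, which is why the hash says what is in the box and nothing about its origin.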

I assume that you are seeking reliable information rather than mere assumptions such as "if it has hard-coded Korean subtitles then it MUST have originated in Korea"?

In addition to what Kingfish has already said, there are other problems:
- a torrent's appearance on TPB is not necessarily its first appearance on the Internet. It may have been copied from elsewhere. And it may have been copied verbatim or modified in some way so as to make it appear new or different.
- uploaders who "super" (aka "initial") seed don't show up as seeders--they appear to peers to be downloading the file.
- some uploaders deliberately don't start seeding torrents until a swarm has built up, so the first peers within a torrent cannot reliably be assumed to be the originators.

As for self-identifying, any uploader who does that may or may not be lying. If they are a major uploader they will almost certainly be lying.

(Apr 12, 2018, 10:43 am)tk48197 Wrote: is there any way to get information on where it was uploaded?...IP addresses would be great, but I have to believe that anyone uploading or downloading files is using a VPN to hide that.

You are certainly right to some degree, but the bigger problem is that we would never share that information under any circumstances. We're not Facebook. Seriously.

(Apr 12, 2018, 10:43 am)tk48197 Wrote: Any suggestions are greatly appreciated, thanks.

Depending on the nature of your research:

You could arrange with researchers in other countries for them to create and upload torrents (containing non-infringing content if that is an issue) which you could then track with absolute certainty of their point of origin.

You could attempt to contact outed individuals (e.g. Yify) to see if they would co-operate with you.

You could approach BREIN, who seek to identify uploaders (in order to harass them), to see if they will co-operate with you.
#6
Thanks, Sid, that's very informative and helpful. I appreciate all the info. You are correct - the self-identifying information could be erroneous, and subtitles do not guarantee a country of origin. Thanks to you and Kingfish for the useful information.
#7
You're welcome.

If you explain what it is you are trying to confirm/disprove then perhaps we will be able to suggest how it might be possible.

[If there are competitive issues which mean you would rather not reveal that in public I can move this thread to a secure area where only staff will be able to see it.]
#8
(Apr 13, 2018, 01:57 am)Sid Wrote: You're welcome.

If you explain what it is you are trying to confirm/disprove then perhaps we will be able to suggest how it might be possible.

[If there are competitive issues which mean you would rather not reveal that in public I can move this thread to a secure area where only staff will be able to see it.]

Thanks, Sid. What my co-author and I are trying to do is control for some of the endogeneity in the econometric model.  For instance, one paper on file sharing (https://www.journals.uchicago.edu/doi/10.1086/511995) used German students' time off from school (such as spring break) to help account for their free time and download behavior.  That paper also had the benefit of using a sample where individuals opted in to share information, so more was known about the individuals.

Not that we need user information - what we're trying to do is account for endogeneity with some country-specific effects as a possible instrumental variable.  We have data on which files are downloaded, how often, and when (and my co-author has some legal data such as laws/regulations as possible natural experiments when paired with country data).  Even information on the time zone might be useful.  We're trying to associate download activity with box office sales, but account for some of the endogeneity in the availability/downloading.  I appreciate your help and thoughts, Sid.
#9
For the reasons explained previously, I can't see any way for you to obtain reliable geographical information, and that includes time zones. If you don't know which country an uploader is in, and you don't know at what time of his/her day they upload, you cannot even derive their time zone from the absolute time of their upload. Even with the added advantage of being able to see the IP addresses, email addresses and language preferences of uploaders, and with a significant vested interest in blocking the uploading of fake or infected torrents--which anecdotally do tend to originate from certain geographic areas--TPB hasn't been able to manage that feat in 15 years.

But (if you look outside the TPB/torrent box) the absolute time of upload is readily and reliably available. And it might even be more relevant. Downloaders all know (or can know) when a file becomes available but none of them have any idea where the file originated. So the time of upload can and does factor into download decisions but country of origin does not (other than in the rather indirect and imprecise sense that something from China is more likely to be in Chinese than English and that a downloader in the US is more likely to speak English than Chinese and so more likely to not download that particular item).

Are you focusing specifically on TPB?

Are you focusing specifically on torrents?

Because most content originates from "the Scene" or from "release groups", and much, perhaps even most, of it does not originate on the BitTorrent network.

Scene and p2p release groups are highly competitive, and the times of first releases are recorded on sites such as predb.me.

Multiple players will then race to source that content from newsgroups or private torrent sites or other such locations with limited access and to upload it to TPB and/or other torrent sites which provide general public access.

predb has an RSS feed so you could tap that to get a continually updated list of the first appearances of content in the filesharing world, reliably cross reference that with its first appearance on TPB or other torrent sites, and reliably cross reference that with the information you have on "which files are downloaded, how often, and when".
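As a rough sketch of that cross-referencing step in Python: the release names, timestamps, and data layout below are all hypothetical, and matching on the exact release name is an assumption made for illustration. It does not model the actual predb feed or any torrent site's listing format; it only shows the join once both sets of timestamps have been collected in UTC.

```python
from datetime import datetime, timezone

def parse_utc(text):
    """Parse 'YYYY-MM-DD HH:MM:SS' as a timezone-aware UTC timestamp."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)

# Hypothetical first-release ("pre") times, keyed by release name.
pre_times = {
    "Some.Movie.2018.1080p.BluRay.x264-GRP": parse_utc("2018-04-10 03:15:00"),
    "Other.Film.2018.720p.WEB-DL.x264-ABC": parse_utc("2018-04-11 18:00:00"),
}

# Hypothetical public torrent listings: (release name, upload time).
torrent_uploads = [
    ("Some.Movie.2018.1080p.BluRay.x264-GRP", parse_utc("2018-04-10 05:45:00")),
    ("Some.Movie.2018.1080p.BluRay.x264-GRP", parse_utc("2018-04-10 04:00:00")),
    ("Unlisted.Release.2018-XYZ", parse_utc("2018-04-12 09:00:00")),
]

def first_upload_lags(pre_times, uploads):
    """Map each matched release to (earliest public upload - pre time)."""
    first_seen = {}
    for name, uploaded_at in uploads:
        if name not in pre_times:
            continue  # no recorded pre time, so nothing to compare against
        if name not in first_seen or uploaded_at < first_seen[name]:
            first_seen[name] = uploaded_at
    return {name: first_seen[name] - pre_times[name] for name in first_seen}

lags = first_upload_lags(pre_times, torrent_uploads)
for name, lag in lags.items():
    print(name, lag)
```

The resulting lag per release could then be merged with the download counts by release name and date.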
#10
Thanks, Sid – this is helpful and quite informative. I understand the issues with any geography – it was more academic optimism that some nugget of information might be embedded somewhere. Thanks also for the additional insight on the sourcing and ‘race to source’. This is useful and I’ll take a look into predb – we had been focusing on TPB, but if other avenues have data and information we’ll start to examine those now as well. I really appreciate your input.