The Megaupload takedown took a whole lot of data with it, and that data was eventually wiped. Some of it was pirated, but some of it was legit. New research estimates that at least 10 million innocent files got deleted.

Researchers at Boston's Northeastern University, together with colleagues from France and Australia, ran a study to assess the copyright-infringement status of a ton of files that had been uploaded to Megaupload shortly before the takedown.

They examined metadata from links to content that had been hosted on Megaupload, and took representative samples of 1,000 files at a time, manually classifying each file as infringing, non-infringing, or undecided.

The researchers found that a whopping 31 percent of Megaupload's content was clearly infringing, but at least 4 percent of the 250 million uploads, or about 10 million files, was not. For the remaining 65 percent, the majority, the researchers couldn't tell one way or the other.
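For anyone who wants to check the math, here is a quick sketch in Python that turns those percentages into rough file counts. It uses only the figures quoted above (250 million uploads, split 31/4/65); the function and variable names are made up for illustration, and the results are back-of-the-envelope estimates, not numbers taken from the study itself.

```python
# Figures as quoted in the article; the shares are the researchers' estimates.
TOTAL_UPLOADS = 250_000_000
SHARES_PERCENT = {"infringing": 31, "non_infringing": 4, "undecided": 65}


def estimated_files(total, shares_percent):
    """Convert percentage shares into approximate file counts.

    Hypothetical helper for illustration; integer math avoids float rounding.
    """
    return {label: total * pct // 100 for label, pct in shares_percent.items()}


counts = estimated_files(TOTAL_UPLOADS, SHARES_PERCENT)
for label, n in counts.items():
    print(f"{label}: about {n:,} files")
```

By this arithmetic, the 4 percent share works out to 10 million files, while the undecided 65 percent covers more than 160 million.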

Four percent isn't a lot, but copyright-infringing files are duplicates by their very nature, while non-infringing files are far more likely to have been unique, meaning their deletion was a real loss.

[TorrentFreak]