Speeding up peer-to-peer music transfers

Source: scenta
 

A computer scientist from Carnegie Mellon University has found a way to speed up the process of transferring large music files across the Internet.

We spoke to David G. Andersen, assistant professor of computer science at Carnegie Mellon, who designed the system called Similarity-Enhanced-Transfer (SET).

What old and new can do together

The program identifies files similar to the one a P2P (peer-to-peer) user wants to download. In doing so, SET greatly increases the number of potential sources for a download, speeding it up by almost 75 per cent.

Andersen explained: “The ‘similar’ audio files SET finds are basically identical other than the ‘header’ information such as the artist, song name, etc., etc. This information is stored in the first part of the file.”

Current P2P technology runs on internet services that give much more bandwidth for downloading than they do for uploading. This imbalance slows P2P transfers: one user's limited upload bandwidth caps the speed at which another user can download. “I want P2P downloads to be faster,” Andersen said. “One way to do that is by finding more sources. One limitation that makes downloads slow is limited upload bandwidth.”
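
The effect of extra sources on this bottleneck can be illustrated with some rough arithmetic. The sketch below is not taken from the SET research; the function name and all bandwidth figures are hypothetical, and it assumes every source uploads at the same rate.

```python
def download_seconds(file_mb, download_mbps, upload_mbps_per_source, sources):
    """Estimate transfer time when one receiver pulls from several sources at once."""
    # The combined rate is capped by whichever side is slower:
    # the receiver's download link or the sum of the sources' upload links.
    rate_mbps = min(download_mbps, upload_mbps_per_source * sources)
    return file_mb * 8 / rate_mbps  # megabytes -> megabits, then divide by rate


# Hypothetical numbers: 100 MB file, 8 Mbit/s down, 0.5 Mbit/s up per source.
for n in (1, 4, 16, 32):
    print(n, "sources:", download_seconds(100, 8.0, 0.5, n), "seconds")
```

Under these assumed figures, adding sources helps until the receiver's own download link is full, after which further sources make no difference.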

The Carnegie Mellon team also looked into other factors that can affect how much SET helps. “Based on our experience and experiments, there's a ‘sweet spot’ of file popularity,” Andersen said. “Really, super-duper popular files don't benefit too much from SET, because there are already many hundreds of sources. Medium-popularity files benefit quite a bit. Unpopular files still benefit, but somewhat less than the medium-popularity files.”

Current P2P programs such as BitTorrent, Gnutella and ChunkCast can benefit from SET. It speeds up data transfer by simultaneously downloading different portions of a chosen file from many different sources, including sources of similar files, as opposed to current methods that draw only on copies of one particular file, which Andersen says is slow in comparison.

He said: “The major benefit SET provides is that it lets clients download from similar files, not just identical files. In BitTorrent terms, each distinct file has its own ‘swarm’ of clients that are downloading it.”

SET behaves much like BitTorrent, except that the swarms for similar files can also share between each other, instead of just among the other clients downloading the exact same file.

BitTorrent works by breaking the file down into fragments and looking for the best peer connections to download them from. Yet the program can still be slow when the network cannot offer enough sources for those fragments. This is why SET takes the additional step of searching not only for the exact file, as BitTorrent does, but also for files that are similar.

For instance, the file for New Order’s Blue Monday might contain the same data in its first five bars as some part of the file for Kylie Minogue’s Can’t Get You Out Of My Head. Two files that share identical stretches of data in this way could help each other’s downloads along.

Andersen clarifies: “SET always downloads an exact copy of the file that someone is trying to download. It merely is capable of downloading parts of that file from sources of different files.”

Getting down to the bytes

The basic operation of SET is very similar to that of BitTorrent: both divide the source file into individual chunks. SET can divide a one-gigabyte file into 16-kilobyte chunks, for example. The different chunks are downloaded at the same time and then reassembled into the single file the user requested.
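
The chunk-and-reassemble idea can be sketched in a few lines of Python. This is an illustration only, not SET's or BitTorrent's actual code: the function names are made up for the example, and the use of SHA-1 to label each chunk is an assumption.

```python
import hashlib

CHUNK_SIZE = 16 * 1024  # 16 KB, the example chunk size quoted in the article


def split_into_chunks(data, chunk_size=CHUNK_SIZE):
    """Cut the data into consecutive fixed-size chunks (the last may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def chunk_ids(chunks):
    """Label each chunk with a hash so identical chunks get identical labels."""
    return [hashlib.sha1(c).hexdigest() for c in chunks]


def reassemble(chunks):
    """Join chunks, already in file order, back into one file."""
    return b"".join(chunks)


original = bytes(range(256)) * 400  # 102,400 bytes of stand-in data
chunks = split_into_chunks(original)
ids = chunk_ids(chunks)

# Chunks may arrive in any order; the index restores each one's position.
received = {i: c for i, c in reversed(list(enumerate(chunks)))}
rebuilt = reassemble([received[i] for i in range(len(chunks))])
print(len(chunks), "chunks, rebuilt correctly:", rebuilt == original)
```

At this chunk size, and counting in binary units, a one-gigabyte file would come to 65,536 chunks.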

The process SET uses to search for similar files while it downloads is called ‘handprinting’. It was inspired by the techniques used for clustering search results and detecting spam, and it is also a little similar to compression coding. “Some of the techniques that SET uses to identify blocks have also been used for data compression, though those aren't the main contributions of SET,” Andersen said.
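
The article does not spell out how a handprint is computed, so the following is only a guess at the general idea: summarise each file by a small, deterministic sample of its chunk hashes, and treat two files as candidates for sharing if their samples overlap. The choice of the k smallest hashes, the tiny chunk size and all names here are assumptions made for illustration.

```python
import hashlib


def chunk_hashes(data, chunk_size=4):
    """Hash every fixed-size chunk of the data and return the set of hashes."""
    return {hashlib.sha1(data[i:i + chunk_size]).hexdigest()
            for i in range(0, len(data), chunk_size)}


def handprint(data, k=3, chunk_size=4):
    """Assumed scheme: keep only the k smallest chunk hashes as a compact summary."""
    return set(sorted(chunk_hashes(data, chunk_size))[:k])


def maybe_similar(a, b, k=3):
    """Two files are candidates for sharing if their handprints overlap."""
    return bool(handprint(a, k) & handprint(b, k))


body = b"aaaabbbbcccc"
song_a = b"HDR1" + body   # same audio data, different header
song_b = b"HDR2" + body
unrelated = b"wwwwxxxxyyyyzzzz"

print(maybe_similar(song_a, song_b))     # files differing only in the header
print(maybe_similar(song_a, unrelated))  # files with no chunks in common
```

Because every file picks its sample by the same rule, files that share most of their chunks tend to pick some of the same hashes, so only a handful of values per file need to be compared.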

Rather than compressing anything, SET works by identifying chunks that match and downloading them. “For example, let's say I want a file that looks like ‘AAAAAAAAAABBBBBBBBBBBBB’ and you have a file that looks like ‘AAAAAAAAAAAAAAAAAAAAAAA’. Using SET, I could download the first ‘AAAAAAAAAA’ from your file, and the ‘B’s from some other source, and then re-assemble them to form the file I wanted,” Andersen explained.
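
Andersen's example can be played out in code. The sketch below is a toy, not the SET implementation: the five-byte chunk size, the source names and the helper functions are all assumptions, and the strings are approximations of the ones in the quote.

```python
import hashlib


def chunk(data, size):
    """Cut data into consecutive chunks of the given size."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def digest(piece):
    """Label a chunk by its SHA-1 hash."""
    return hashlib.sha1(piece).hexdigest()


def fetch(source_data, wanted_id, size):
    """Return the chunk of source_data whose hash matches wanted_id, if any."""
    for piece in chunk(source_data, size):
        if digest(piece) == wanted_id:
            return piece
    return None


def assemble(target_ids, sources, size):
    """Rebuild the wanted file chunk by chunk from whichever source has each chunk."""
    parts, origins = [], []
    for wanted in target_ids:
        for name, data in sources.items():
            piece = fetch(data, wanted, size)
            if piece is not None:
                parts.append(piece)
                origins.append(name)
                break
        else:
            raise LookupError("no source holds this chunk")
    return b"".join(parts), origins


wanted_file = b"A" * 10 + b"B" * 13      # the file "I" want
sources = {
    "similar_file": b"A" * 23,           # "your" file: similar, not identical
    "other_source": b"B" * 13,           # some other source holding the Bs
}
# In practice the list of chunk hashes would come from the file's description;
# here it is computed from the wanted file purely for the demonstration.
target_ids = [digest(c) for c in chunk(wanted_file, 5)]

rebuilt, origins = assemble(target_ids, sources, 5)
print(rebuilt == wanted_file, origins)
```

The result is an exact copy of the wanted file, even though neither source holds that file in full.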

In tests based upon current P2P networks, SET was found to improve the transfer speed by 71 per cent. The researchers are hoping that SET will become part of the next generation of high-speed multimedia delivery. In fact, Andersen hopes the technology is something people will ‘steal’, as he and his colleagues have no intention of releasing it as an add-on themselves.

“My goal is for our research to have impact and get adopted for use in real systems. For P2P technology, the best approaches seem to be either creating a startup (I'm not too interested in doing so right now) or by creating and distributing useful software. We haven't created a gorgeous, usable GUI [graphical user interface] for SET, so I think the best way for the software to get used is for people to incorporate it into an existing client such as Azureus [a Java BitTorrent client].”

The speed of the future?

With on-demand television and music downloads more popular now than ever, SET has a clear place on our computer systems. The immediacy of our lives and the “I want it now!” attitude might well be the force behind introducing SET, or something similar, to our downloading lifestyles.


Date Published: May 02, 2007
 