Written by: George Ou
4/27/2008 4:56 PM

Update: How Comcast customers can seed 100 times faster and bypass TCP resets

The FCC hearing at Stanford University on April 17th 2008 was filled with inaccurate testimony from various witnesses.  Since that testimony seems to be carrying significant weight both on Capitol Hill and in the media, I feel compelled to set the record straight.  I have filed a copy of this letter on FCC docket 07-52.

Problems with Jon Peha's testimony
Jon Peha testified that BitTorrent was like a telephone and implied that if a TCP reset is used by Comcast to stop a TCP stream, then that constituted a blockage of BitTorrent.  Furthermore, Professor Peha implied through his telephone analogy that if BitTorrent is blocked, then the user must manually redial to reestablish the connection.  These assertions are highly inaccurate and here's why.

The first problem is that Jon Peha did not understand the multi-stream aspect of BitTorrent or P2P.  Peha seemed very surprised immediately before our panel at Stanford when I told him that a P2P download typically used 10 to 30 TCP streams at the same time.  His surprised reply to me was "all active?" and I replied yes.  The reality is that if a certain percentage of BitTorrent TCP streams are reset and temporarily blocked by an ISP, say 15% for example [1], then the "Torrent" (the file that's being exchanged amongst multiple peers over the BitTorrent protocol) is essentially slowed down by an average of 15%.  In other words, the "Torrent" would suffer a 15% partial blockage, which is accurately described as a "delay" since the file transfer didn't actually stop.  This would be like filling up a bathtub with 20 faucets and then closing 3 of those faucets.  The rate of water flowing into the tub would slow but not stop.
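The faucet analogy can be put into numbers with a minimal sketch. It assumes that every stream contributes the same rate and that the surviving streams do not speed up to compensate; the function name and the 50 kbps per-stream figure are hypothetical and chosen only for illustration.

```python
def aggregate_rate(num_streams, per_stream_kbps, num_reset):
    """Total transfer rate when num_reset of num_streams are temporarily blocked.

    Assumes equal-rate streams and no compensation by the remaining ones.
    """
    if not 0 <= num_reset <= num_streams:
        raise ValueError("num_reset must be between 0 and num_streams")
    return (num_streams - num_reset) * per_stream_kbps


# 20 faucets, 3 of them closed (hypothetical 50 kbps per stream).
full_rate = aggregate_rate(20, 50, 0)
reduced_rate = aggregate_rate(20, 50, 3)
slowdown_percent = round(100 * (full_rate - reduced_rate) / full_rate)

print(full_rate, reduced_rate, slowdown_percent)
```

Under these assumptions the transfer slows in proportion to the fraction of streams that are reset, and it only stops if every stream is reset at once.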

The second problem with Jon Peha's testimony is his implication that the user must take some sort of action to resume the BitTorrent connection or else the connection wouldn't resume.  Peha's assertion can easily be proven false by a simple experiment with BitTorrent.  One can easily confirm that BitTorrent will always resume a lost connection within a matter of seconds without any user intervention just by physically disconnecting a network cable on the test machine and reconnecting it.  Not only does BitTorrent automatically resume, it picks up where it left off and does not need to start all over again.  So even if all TCP streams in a Torrent were simultaneously blocked for a short period of time, it will quickly resume by itself and eventually finish the file transfer.  Therefore this is by definition a "delay" and not a "blockage".
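The difference between a delay and a restart can be illustrated with a toy simulation. This is only a sketch under simple assumptions: the transfer moves one piece per time step, an outage blocks every connection for the listed steps, and the client keeps the pieces it already has. The function and parameter names are hypothetical and do not come from any BitTorrent client.

```python
def ticks_to_finish(total_pieces, outage_ticks):
    """Count the time steps needed to collect every piece.

    Pieces already downloaded are kept across outages, so an outage
    pauses progress instead of resetting it.
    """
    have = set()
    tick = 0
    while len(have) < total_pieces:
        if tick not in outage_ticks:
            have.add(len(have))  # fetch the next missing piece
        tick += 1
    return tick


uninterrupted = ticks_to_finish(10, set())
interrupted = ticks_to_finish(10, {3, 4, 5})

print(uninterrupted, interrupted)
```

In this model the three-step outage adds exactly three steps to the transfer, because the download picks up from the pieces it already holds.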

This is not to say that Comcast's existing form of network management is without problems: the system clearly has flaws and unintended consequences, like the accidental blockage of IBM Lotus Notes.  The use of TCP resets also have a more drastic effect on rare Torrents, which are BitTorrent files that are not popular and have few seeders or other peers who have parts of the file to download from.  These rare Torrents aren't healthy to begin with and a TCP reset can in some cases trigger a complete temporary blockage.  The rare Torrent will still get there eventually, but it will suffer significantly more than a normal Torrent that is resilient to partial blockage.

It should be noted that BitTorrent in general is not an appropriate file transfer protocol for rare Torrents.  BitTorrent tracker sites tend to rank and sort a Torrent based on the Torrent's "health", which is based on the number of available seeders and pre-seed peers.  Users generally tend to avoid the "unhealthy" Torrents at the bottom of the list.  Since Comcast offers a vastly superior alternative in the form of 1 gigabyte of web storage space, Comcast customers can use that service to distribute files 10 to 20 times faster than any single Comcast BitTorrent seeder could ever provide.  To further illustrate this point, Richard Bennett posted a copy of the King James Bible on his Comcast-provided web space.  By contrast, I couldn't find a non-copyrighted version of the King James Bible or any non-copyrighted Barbershop music on any BitTorrent tracker site, which illustrates how unlikely it is that you'll find rare or legal content on BitTorrent.

Problems with Robert Topolski's testimony
Robert Topolski also had problems in his testimony.  Topolski, a software tester who does not work in the networking field, insists that the TCP reset mechanism isn't common in network devices and declared that I was wrong in my testimony.  In my experience as a Network Engineer who designed and built networks for Fortune 100 companies, the TCP reset mechanism is common in routers and firewalls.  For many years, Internet service providers, including LARIAT (owned and operated by Brett Glass, who has filed comments in this docket), have used TCP RST packets to protect the privacy of dialup Internet users. When a dialup user's call is over, RST packets are sent to terminate any remaining connections. This prevents a subsequent caller from receiving information -- some of which might be confidential -- that was intended for the caller who disconnected. Thus, the transmission of RST packets by a device other than the one(s) which established a connection -- for the purpose of informing the endpoints that the connection has been terminated -- is not only commonplace but salutary.  Network architect Richard Bennett, who works for a router maker, explained to me that TCP resets are the standard mechanism used by consumer routers to deal with NAT table overflow, which itself is typically caused by excessive P2P connections.  Are we to believe a single software tester or three networking experts?

The second key problem with Topolski's testimony is that to my knowledge, he has never provided any forensic data from Comcast in the form of packet captures that can be independently analyzed.  Even if Topolski did produce packet captures and we assumed that those packet captures are authentic, one man's packet captures wouldn't be a large enough sample to draw any conclusions by any legal or scientific standards.  The Vuze data may constitute a large enough sample but the data isn't very granular because it doesn't tell us what percentage of reset TCP sessions are due to an ISP versus other possible sources [1].  The Vuze data, which even Vuze admits isn't conclusive, is highly suspect because it even shows as much as 14% resets on 10-minute TCP sessions from ISPs who do NOT use TCP reset packets to manage their network.

Furthermore, even if a TCP reset was used by Comcast at 1:45AM, we cannot assume that there was no spike in congestion at 1:45AM.  As I have indicated in the past, just 26 fulltime BitTorrent seeders in a neighborhood of 200 to 400 users can consume all of the available upstream capacity in a DOCSIS 1.1 Cable Broadband network.  That means roughly 6.5% to 13% of the population seeding all day and night can cause congestion at any time of the day.  Based on what little evidence Robb Topolski has presented, no conclusions can be drawn regarding the question of whether Comcast uses TCP resets for purposes other than congestion management.
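The arithmetic behind the 26-seeder figure can be sketched as follows. The two constants are assumptions chosen for illustration and are not stated in this letter: roughly 10 Mbps of shared upstream capacity per neighborhood, and a 384 kbps upload cap per subscriber, with every seeder uploading at that cap around the clock.

```python
# Assumed figures for illustration only; neither appears in the letter.
UPSTREAM_CAPACITY_KBPS = 10_000  # assumed shared upstream per neighborhood
PER_SEEDER_UPLOAD_KBPS = 384     # assumed per-subscriber upload cap


def upstream_utilization(num_seeders):
    """Fraction of shared upstream used if every seeder uploads at its cap."""
    return num_seeders * PER_SEEDER_UPLOAD_KBPS / UPSTREAM_CAPACITY_KBPS


def seeder_share(num_seeders, neighborhood_size):
    """Fraction of the neighborhood that the seeders represent."""
    return num_seeders / neighborhood_size


utilization = upstream_utilization(26)
share_small = seeder_share(26, 200)
share_large = seeder_share(26, 400)

print(f"{utilization:.1%} of upstream, "
      f"{share_large:.1%} to {share_small:.1%} of users")
```

With these assumed numbers, 26 seeders account for nearly all of the shared upstream while making up between 6.5% and 13% of a neighborhood of 200 to 400 users.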


 1. The reason I use 15% as the example is that the Vuze data, gathered via thousands of users' computers, indicated that a Comcast broadband network typically suffered resets on 14% to 23% of all TCP streams during 10-minute sampling periods.  That 23% figure is not restricted to just BitTorrent or P2P traffic, and even those TCP resets pertaining to BitTorrent aren't necessarily from Comcast.  It's quite possible that the reset actually came from the client on the other end, or it may have come from a customer-premise router on either end-point trying to conserve NAT (Network Address Translation) resources.  It is undeniable that a certain percentage of the TCP resets have nothing to do with Comcast's network management practices.  We really can't know what percentage of those TCP resets were due to Comcast, and figuring out the exact percentage isn't trivial because there are so many factors to consider.

Note: Any assertions on behalf of Brett Glass and Richard Bennett that I have made in this document have been approved by Brett Glass and Richard Bennett.

24 comments so far...

Re: Comments on (mostly)-accurate testimony at the FCC Stanford hearing

Dear George,

I noticed that you consulted with Brett Glass and Richard Bennett before posting your article. You certainly have my permission to consult with me, too. Why should we be on completely opposing sides of this debate? You know how to find my phone number.

FIRST, AN APOLOGY: At the Stanford hearing, you reported that some smart gateway/router devices know when a host behind it has gone offline, and will issue RSTs in lieu of forwarding to the known-offline host. At the hearing, I misunderstood what you were saying and I countered that you were describing behavior covered in the standards. While I don't know of any particular smart devices that do this, if they do (and it makes sense to me that they do so), then I was incorrect to counter you without acknowledging that gateway devices might do this exactly as you described. For that, I WHOLEHEARTEDLY APOLOGIZE. Had I understood your statement (which is clear enough in the video --- the fault of misunderstanding is all mine), the behavior you described is probably not in the standards (AFAIK). However, the device sending the RST is acting as the end point and was instructed to do so by some administrator. It was not a secret. It was not DPI changing the behavior of the Internet. As far as the Internet peers were concerned, the SYN-RST exchange was completely expected. The entire discussion ended appropriately when Professor Peha, countering both of us, made the entire discussion moot when he pointed out that these are not examples of using RST for congestion control. You weren't wrong in what you said. I WAS WRONG to dismiss it as errant and I apologize for doing so.

SECOND: "P2P download typically used 10 to 30 TCP streams at the same time." This is essentially correct, but moot. Also the statement is true only for DOWNLOADING WITH BITTORRENT specifically. It is not true for eMule or Gnutella. It is not true while BitTorrent is only uploading. While partly true, the statement is completely unimportant. The important direction is the upload direction -- because the transmitting host handles congestion control on the Internet, because the last-mile congestion is the congestion we are discussing (this is the uploader's first mile), and because P2P uploads are what Comcast is tearing down. With BitTorrent, only 3-4 streams are allowed to upload simultaneously. This is a major flaw in YOUR testimony to the FCC and other bodies (I'm referring to your article and graphic showing a single connection comparison to 11 connections). Limiting the uploading connections to 3-4 is a specific feature of the BitTorrent protocol designed to prevent congestion by quickly responding to changing network conditions. See http://wiki.theory.org/BitTorrentSpecification#Choking_and_Optimistic_Unchoking to understand how this feature is designed to prevent congestion.
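The upload limit described above can be sketched roughly as follows. This is a simplified illustration of choking with one optimistic unchoke, not the code of any actual client: the function name, the slot count of three regular uploads plus one optimistic slot, and the peer names are all assumptions, and real clients re-evaluate the choice periodically.

```python
import random


def choose_unchoked(peer_rates, regular_slots=3, rng=random):
    """Pick the peers allowed to receive uploads in this round.

    peer_rates maps a peer name to the rate at which that peer has been
    sending data to us. The fastest peers get the regular slots, and one
    additional peer is picked at random as an optimistic unchoke.
    """
    ranked = sorted(peer_rates, key=peer_rates.get, reverse=True)
    unchoked = ranked[:regular_slots]
    remaining = ranked[regular_slots:]
    if remaining:
        unchoked.append(rng.choice(remaining))
    return unchoked


# Ten hypothetical peers with distinct rates.
peers = {f"peer{i}": i * 10 for i in range(10)}
uploading_to = choose_unchoked(peers, rng=random.Random(0))

print(len(uploading_to), uploading_to[:3])
```

However many peers are connected, only four of them receive uploads at any one time in this sketch; the rest stay choked until a later round.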

THIRD: "The second problem with Jon Peha's testimony is his implication that the user must take some sort of action to resume the BitTorrent connection or else the connection wouldn't resume." Jon Peha is entirely correct, as long as you understand "user" as being the application and not the person. The response to a RST action is completely dependent upon the application programming. An application developer has to choose whether to retry an RST-aborted connection, and when. The scenario is not covered in the BitTorrent protocol, so the developer has free rein to drop the peer, retry it immediately, or drop the peer from the peer list. The behavior is particularly damaging to users of current versions of eMule and Shareaza (which supports Gnutella and Gnutella 2) as a client who resets is penalized by those applications. With eMule and Shareaza peers, you lose your place in line and have to wait at the back of the queue. With Shareaza, the affected peer is also marked as "suspicious" on the network until it is able to complete an upload transfer to someone.
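How much a reset costs depends on the policy the application applies, which can be sketched with a toy queue. The policy names and the function are hypothetical and are not taken from the source code of BitTorrent clients, eMule, or Shareaza; they only illustrate the three kinds of reaction described above.

```python
from collections import deque


def handle_reset(policy, peer, queue):
    """Return the new waiting queue after a peer's connection was reset.

    policy is one of three hypothetical reactions:
      "retry_now"      reconnect immediately, keeping priority
      "back_of_queue"  the peer loses its place and waits at the end
      "drop"           the peer is removed entirely
    """
    remaining = deque(p for p in queue if p != peer)
    if policy == "retry_now":
        remaining.appendleft(peer)
    elif policy == "back_of_queue":
        remaining.append(peer)
    elif policy != "drop":
        raise ValueError(f"unknown policy: {policy}")
    return remaining


waiting = ["a", "b", "c"]
for policy in ("retry_now", "back_of_queue", "drop"):
    print(policy, list(handle_reset(policy, "b", waiting)))
```

The same reset leaves peer "b" first in line, last in line, or absent, depending solely on the choice the developer made.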

FOURTH: 'So even if all TCP streams in a Torrent were simultaneously blocked for a short period of time, it will quickly resume by itself and eventually finish the file transfer. Therefore this is by definition a "delay" and not a "blockage".' YOUR STATEMENT IS NOT TRUE when the affected peer is trying to upload unique content, such as their own musical tracks or movies. When the Associated Press and EFF created new BitTorrent uploads and tried to transfer them over Comcast links, the transfers were prevented. A disconnected TCP session cannot carry data. I have some responsibility for the word, "delay" as I used the word when I first posted about this on DSLReports.com. (I also, unfortunately, used the word "manage," too.) As it turns out, both of those words proved to be too generous.

FIFTH: In your article today, the very next paragraph recognizes the problem. You said, "The use of TCP resets also have a more drastic effect on rare Torrents, which are BitTorrent files that are not popular and have few seeders or other peers who have parts of the file to download from. These rare Torrents aren't healthy to begin with and a TCP reset can in some cases trigger a complete temporary blockage." I think we're close to agreement, here. However, being the lone seeder of my own unique content is EXPECTED. The swarm cannot become healthy if that lone seeder is prevented from uploading. This was the case for me with Gnutella. Until February 20th or so, I always have enjoyed some ability to upload with BitTorrent, despite considerable interference. As of my last tests, I cannot seed at all. Because new torrents are rare torrents until sufficiently seeded, your sentence "It should be noted that BitTorrent in general is not an appropriate file transfer protocol for rare Torrents," is obviously false. As an experiment, I uploaded the Public Domain Gutenberg version of DaVinci's Notebook to The Pirate Bay last July. Despite not being Moviez, Tunez, or Warez, the swarm started with only me, and grew to hundreds large. A single uploader is quite important, and P2P introduced and maintained the availability of this content for months. I've long since deleted my copy, but the file remains available on TPB today! See http://thepiratebay.org/tor/3735336/The_Notebooks_of_Leonardo_Da_Vinci___Complete_by_Leonardo_da_Vin and download it, should you want.

SIXTH: "Topolski, a software tester who does not work in the networking field," George, at age 20, I was on the air with RTTY, Fax, Slow-Scan TV, etc. These are point-to-point protocols, to be sure. By 24, I was heavily into store-and-forward systems. Law Enforcement was my profession then, but digital communications was my hobby. It would become my career in my 30s. Except for about 3 years, all of my tech jobs have involved networking. Desqview/