Re: [tana] FW: New Version Notification for draft-penno-tana-app-practices-recommendation-00

"Robb Topolski" <robb@funchords.com> Wed, 29 October 2008 21:20 UTC

Message-ID: <3efc39a60810291420g4e154551m642a6330575bb6ca@mail.gmail.com>
Date: Wed, 29 Oct 2008 14:20:31 -0700
From: Robb Topolski <robb@funchords.com>
To: Steve <cubic1271@gmail.com>
In-Reply-To: <367782b60810291026x205d6250ya23a3518243e6e89@mail.gmail.com>
References: <20081027220242.0BB2F3A69EC@core3.amsl.com> <C52BB110.13B9F%rpenno@juniper.net> <3efc39a60810281646l104a39bds2fc98518c39c4160@mail.gmail.com> <367782b60810291026x205d6250ya23a3518243e6e89@mail.gmail.com>
Cc: tana@ietf.org
Subject: Re: [tana] FW: New Version Notification for draft-penno-tana-app-practices-recommendation-00

Thanks for all of your responses, Steve.  I agreed with a lot of them,
but wanted to respond to some of the others.

On Wed, Oct 29, 2008 at 10:26 AM, Steve <cubic1271@gmail.com> wrote:
>> Users frequently favor transfers over P2P networks because they seem
>> more efficient and robust than those over client-server.  The appeal
>> of P2P is that it doesn't require or burden a central server yet
>> remains light enough to be a background operation on user machines.
>> P2P transfers generally avoid hammering traffic through a congested
>> route when other routes exist.
>
> Unfortunately, they create many, many more congested routes by doing so.

Everyone uses congested routes, and neither client-server nor P2P will
perform any better or worse if all of the routes (or the single route)
are congested.  The sin occurs when one continues to use a congested
route while uncongested routes exist.

With BitTorrent, this sin occurs only briefly and only fractionally.
A long-term transfer pairing forms when both sides establish a
relatively strong exchange -- uploading and downloading at the highest
rates.  There are 2-3 of these, plus one upload slot that rotates from
peer to peer (30 seconds per try) looking for an even better candidate
to replace the weakest of the established pairings.  Since congestion
reduces speed (obeying normal congestion controls), a congested
pairing won't last very long.  And even during that 30-second trial,
the other 2-3 streams on uncongested paths continue.
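The rotation described above can be sketched roughly as follows.  This
is an illustrative sketch only -- the function, peer structure, and
slot counts here are my own simplification, not code from any real
client:

```python
import random

def choose_unchoked(peers, n_regular=3):
    """Pick peers to upload to: the n_regular strongest exchangers keep
    their slots, plus one 'optimistic' slot rotates to a random other
    peer (every 30 seconds in real clients) to audition a replacement
    for the weakest established pairing."""
    # Strongest reciprocal exchangers keep their upload slots.
    by_rate = sorted(peers, key=lambda p: p["download_rate"], reverse=True)
    regular = by_rate[:n_regular]
    # One slot rotates among the rest, looking for a better candidate.
    # A congested peer gives a poor download_rate, so it loses its slot.
    rest = [p for p in peers if p not in regular]
    optimistic = [random.choice(rest)] if rest else []
    return regular + optimistic

peers = [{"id": i, "download_rate": r}
         for i, r in enumerate([120, 5, 80, 300, 0, 45])]
slots = choose_unchoked(peers)   # 3 strong pairings + 1 trial slot
```

The effect is the one described in the text: a pairing over a congested
path exhibits a low exchange rate and is quickly rotated out.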

With ED2K or Gnutella, this sin likewise occurs only fractionally.
These clients upload a set amount of data (usually up to a maximum
time), and if one of their queues is stuck on a very slow link, they
service the other queues at their own speeds (obeying normal
congestion controls).  The result is a natural one -- their
contribution to the file-sharing network is biased toward the fastest
routes.

And while I've admitted that a sin occurs at all in P2P modes, keep in
mind that the client-server alternatives have no such ability to shift
use to uncongested routes -- they "sin" because they can't make
choices when a route is congested.  A client-server transfer has no
choice but to pound the congested route until the transfer is
completed or aborted by the user.


> I can see how p2p would keep
> connections open / in use for longer than the "average", "traditional"
> client server app.  I guess I see "client/server" as defined by
> transactions and "p2p" as defined by time.  A file being transferred
> over a p2p link neither has a beginning nor an end; the transfer
> starts when you switch the client on, and stops when you tell the
> client to stop seeding the file.

A download is a download.  Once you've got the file, you stop
downloading it.  Uploaders run a range: some will upload to a certain
percentage of the file, some for a certain amount of time, and some
will keep a particular file perpetually available.  The first two are
more transactional in nature, while the last more closely mimics the
operator of an FTP or web server.


>> PEER-TO-PEER CONSEQUENCES VS. LARGE MULTIMEDIA FILES
>> ... a lot of the called-out effects listed here are the effects of the
>> type of data being moved, not how it is being moved
>>
>> For example, the goal of widely popular client-server protocols is also
>> to provide the requested data quickly, yet the paper assigns this
>> feature to P2P.  To accomplish this goal, the paper says that P2P
>> turns to opening multiple connections.  But the multiple connections
>> just aren't used in the way that most people think!  Fire up a client
>> and see for yourself!
>
> Multiple connections to a server rate-limited by the overwhelming
> number of clients connected to it means that these multiple
> connections might not consume your entire link, as opposed to p2p
> stuff which, in all likelihood, will.

They're exactly the same.  There is nothing special about these
sockets to the network software (be it a server or a P2P node);
they're presented and managed by the OS and application in the same
way.

Residential users do have some additional factors working against
them, and these do make a difference (the lack of flow controls,
overwhelming the local modem), but the paper doesn't go into them, and
they generally aren't the issues that a scavenger class would address.

>> P2P applications do not use more download bandwidth than their non-P2P
>> counterparts.  P2P clients do not download a lot of duplicate data.
>> Again, download bandwidth consumption is the market taste changing to
>> larger files.
>
> I'd argue that this isn't necessarily true.  p2p applications *do* use
> more download bandwidth than their traditional client/server
> counterparts, but not because they can create any more bandwidth in a
> single connection.  Instead, p2p dramatically increases the average
> bandwidth usage of all the links on a network, which means downloads
> aren't rate-limited by a single, overloaded server anymore.  Instead,
> downloads are min(client download speed, sum of clients' upload speeds
> to the client).

Whether I download Ubuntu 8.10's new release tomorrow over
client-server at 100 KB/s or over BitTorrent at 1 MB/s, I will have
used the same amount of bandwidth.  The first choice will take longer,
so any impact I have on the network will last longer; the second will
not last as long, but my impact will be greater during that time.  And
if the first choice is on an already heavily saturated route, most of
my second choice is likely to avoid heavily saturated routes.
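The arithmetic behind the Ubuntu example can be sketched as follows
(the ISO size and rates are illustrative assumptions):

```python
# Toy arithmetic for the example above: the total bytes moved are the
# same either way; only the duration of the impact changes.
iso_size = 700 * 1024   # KB -- an assumed ISO size for illustration

cs_rate = 100           # KB/s over client-server
bt_rate = 1024          # KB/s over BitTorrent

cs_duration = iso_size / cs_rate   # a long, low-rate impact
bt_duration = iso_size / bt_rate   # a short, high-rate impact

# Bandwidth consumed (bytes on the wire) is identical:
assert cs_rate * cs_duration == bt_rate * bt_duration == iso_size
```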


> The drawback, I would guess, would be the additional overhead 10
> connections would introduce as opposed to 2.

These are literally bytes (not KB).  Once established, an idle TCP
connection has no cost (although some application developers do send
pings or "NOOP" commands every so often as a keep-alive).
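To put rough numbers on "literally bytes": the header sizes below are
the standard IPv4/TCP minimums, while the payload and interval are
assumptions for illustration.

```python
# Rough cost of keeping an idle TCP connection alive with periodic
# application-level "NOOP" pings.
ip_tcp_headers = 40     # bytes: 20 IPv4 + 20 TCP, no options
noop_payload = 6        # bytes: e.g. "NOOP\r\n" (assumed payload)
interval = 120          # seconds between keep-alives (assumed)

bytes_per_ping = ip_tcp_headers + noop_payload
bytes_per_hour = (3600 // interval) * bytes_per_ping
# 30 pings/hour * 46 bytes = 1380 bytes per hour -- bytes, not KB.
```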

>> The whole "multiflows means that congestion signals are less
>> effective" argument forgets that these users are behind a
>> flow-uncontrolled modem which, no matter what, is going to keep speeds
>> from growing beyond a certain relatively tiny point and will
>> constantly be trimming the sails of its own heavier flows to allow its
>> smaller ones to grow in a never-ending struggle to reach equilibrium
>> locally.  This means that any fractional benefit from multiple streams
>> across X point in the Internet is quite temporary, as the knocked-down
>> TCP stream from a particular host that is growing again is going to
>> soon cause the others from that host to get knocked down as they
>> eventually try for equilibrium at a subscribers first bottleneck --
>> their own modem.
>
> This is only true if the bottleneck later in the internet has a
> greater amount of available bandwidth than the user's modem does.

That's true in all of the ISP access networks that are the subject of
this effort.


> Also, I'd argue that congestion control means that 10 connections
> would own 2 connections when you're fighting for control of an
> already-congested link (i.e. if your sister is downloading ubuntu ISOs
> via BT while you're downloading gentoo ISOs via HTTP).

We have to flip that example to the upload side to make use of it
(congestion control is a sender response).

We'd also see that the USER-and-SISTER dynamic is not a good example,
because they transmit at unconstrained rates into a tightly
constrained upstream modem.  In the ISP case, users send at
constrained rates toward a less constrained upstream router.  The
question at that point is SUBSCRIBER1-to-SUBSCRIBERn fairness.

(We should stop here, since the example doesn't apply to this
situation, but let's go on because it's interesting.  Ten uploading
connections would "own" two uploading connections at the first packet
drop and perhaps the next several -- but then the Japanese proverb
"the tallest tree catches the most wind" applies: the most prolific
sender of data during the next several packet drops is the one most
likely to be affected next.  In the end, we ought to run it before
declaring anything for sure -- but it cannot simply be assumed that
USER would have a 500% advantage over SISTER, because USER would also
take 500% more packet drops than SISTER.  I don't think it would be
quite 50%/50% either.)
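As the parenthetical says, one would have to run it to know.  Below is
one toy model a person might start from; the function and its drop
rule ("halve the biggest flow" on congestion) are my own illustrative
assumptions, and the outcome depends heavily on which drop model is
assumed:

```python
def share_of_link(n_a, n_b, capacity=100.0, rounds=5000):
    """Toy AIMD-style simulation: n_a flows for USER vs n_b flows for
    SISTER sharing one link.  Every flow grows additively; when demand
    exceeds capacity, the largest flow (the 'tallest tree') is halved.
    Returns USER's long-run share of the link.  Illustrative only."""
    flows = [("A", 1.0)] * n_a + [("B", 1.0)] * n_b
    for _ in range(rounds):
        # Additive increase: every flow grows by one unit per round.
        flows = [(owner, rate + 1.0) for owner, rate in flows]
        # Multiplicative decrease: congestion halves the biggest flow,
        # so the most prolific sender takes the hit.
        while sum(rate for _, rate in flows) > capacity:
            i = max(range(len(flows)), key=lambda j: flows[j][1])
            owner, rate = flows[i]
            flows[i] = (owner, rate / 2.0)
    a_total = sum(rate for owner, rate in flows if owner == "A")
    return a_total / sum(rate for _, rate in flows)

share = share_of_link(10, 2)   # USER's share with 10 flows vs 2
```

Under this particular drop rule the per-flow rates tend to equalize,
which favors the many-flow user; a drop model that penalizes the
heaviest aggregate sender, as the proverb suggests, would land
somewhere else.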



> 3 uploading connections via bittorrent using full link rate generally
> wins over 3 upload connections to FTP servers using full link rate,
> when I've tried  ("Why's my download so slow?!  Oh. . . I'm still
> seeding that ISO. . .").

Predictably true, because those upload connections were chosen as the
best candidates from a list of 35-70 choices.  When you're uploading
to an FTP server, you're using whatever route you've got, and you're
stuck with its conditions.

>> (secondly and much more minor) -- When did middleboxes become a
>> consideration?  As far as end-host communications that cross some
>> private gateway, sure.  But with respect to network operators,
>> middlebox considerations ought to be off the table.  There is nothing
>> in the role of access or transit provider that requires one, and most
>> of the middleboxes we've seen have been attempts to do something
>> outside of their scope.
>
> Why?

Because all the data an ISP needs to do its job is in the IP header.
Most of the middleboxes are on the network edges (not the Internet).
Those used on the Internet are doing something that is usually
communication neutral (such as looking for spam and attack patterns,
blocking frequently attacked ports, creating segments within bigger IP
pipes, and other non-controversial things) or communication harmful
("enhancing the experience" and "monetizing every transaction" and
"inserting the ISP into the content value chain").


> I agree that p2p doesn't necessarily work *exactly* as described by
> the paper, but I'd argue that p2p does actually consume more bandwidth
> than client/server applications designed to do the same thing.  Thus,
> the problem statement is valid.

Hopefully, I've persuaded you.  I could agree that P2P users tend to
use the Internet differently than client-server users and, owing to
those different uses, tend to use more bandwidth as a group.  But if
the goal is file sharing, there is no appreciable difference in
bandwidth use that is owing to the different architectures of P2P and
client-server.

I appreciate this chance to hopefully shed some light.

-- 
Robb Topolski (robb@funchords.com)
Hillsboro, Oregon USA
http://www.funchords.com/
_______________________________________________
tana mailing list
tana@ietf.org
https://www.ietf.org/mailman/listinfo/tana