Re: [tana] FW: New Version Notification for draft-penno-tana-app-practices-recommendation-00

"Robb Topolski" <robb@funchords.com> Thu, 30 October 2008 04:27 UTC

Date: Wed, 29 Oct 2008 21:27:00 -0700
From: Robb Topolski <robb@funchords.com>
To: Reinaldo Penno <rpenno@juniper.net>
In-Reply-To: <C52E2B08.13ED8%rpenno@juniper.net>
References: <3efc39a60810281646l104a39bds2fc98518c39c4160@mail.gmail.com> <C52E2B08.13ED8%rpenno@juniper.net>
Cc: tana@ietf.org
Subject: Re: [tana] FW: New Version Notification for draft-penno-tana-app-practices-recommendation-00

On Wed, Oct 29, 2008 at 3:00 PM, Reinaldo Penno <rpenno@juniper.net> wrote:
> Hello Robb,
>
> Thanks for the comments. Inline..
>
>
> On 10/28/08 4:46 PM, "Robb Topolski" <robb@funchords.com> wrote:
>
>> With GREAT respect to Reinaldo, who did the infinitely harder job of
>> creating a document than it takes to critique one -- and please
>> understand that my critique is on the document and the process, not on
>> the persons...
>>
>> I feel like we're back to square one.
>>
>> From the first paragraph, the paper draws undeserved attention to
>> P2P.  It attributes to P2P clients behaviors and motives that are
>> generally untrue and draws conclusions that would not logically
>> follow even if they were true.  The motivation it offers for using
>> P2P over other methods isn't especially good, either.
>>
>> Users frequently favor transfers over P2P networks because they seem
>> more efficient and robust than those over client-server.
>
> Do you really think the average user goes by these beliefs? I would think
> users (as in the majority of people who are not power users) do not
> care about the technology. The biggest driver of P2P is that the client is
> easy to configure (click and forget) and you have access to tons of
> extremely appealing content.
>
> And no, I'm not talking about Ubuntu's distribution, which compared to the
> traffic generated by the 'Iron Man' torrent is a drop in the ocean.
>

Easier to use and configure than a web browser that can access the
same content?

Yes, I really think that robustness and speed matter to users.
Although HTTP and FTP can resume a broken download only when both
client and server are equipped to do so, resumption has been a
consistent feature of common P2P clients.
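
To be concrete about what "equipped to do so" means on the HTTP side,
here is a minimal sketch (Python, standard library only; the URL,
filename, and helper name are placeholders, not anyone's actual code)
of resuming a partial download with a Range request.  It only resumes
if the server answers 206 Partial Content; otherwise it starts over:

    import os
    import urllib.request

    def resume_download(url, path):
        # If a partial file exists, ask the server for the remaining bytes.
        offset = os.path.getsize(path) if os.path.exists(path) else 0
        req = urllib.request.Request(url)
        if offset:
            req.add_header("Range", "bytes=%d-" % offset)
        with urllib.request.urlopen(req) as resp:
            # 206 means the server honored the Range header and resumed;
            # a plain 200 means it ignored it, so we rewrite from byte 0.
            mode = "ab" if resp.status == 206 else "wb"
            with open(path, mode) as f:
                while True:
                    chunk = resp.read(64 * 1024)
                    if not chunk:
                        break
                    f.write(chunk)

    # resume_download("http://example.com/ubuntu-8.10.iso", "ubuntu-8.10.iso")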

I think that P2P clients are frustrating to configure.  Do it wrong,
and there is no "click and forget" because everything crawls.  There
are further complications in locating the content you want in a place
where a complete copy is being shared.

The availability or appeal of the content is and has been a valid
driving point, but it is quickly becoming less of a factor.  Legal
and free (or very inexpensive) choices abound, and studios and
publishers are becoming more comfortable and forthcoming with
business models that work online.  Sandvine released a study
predicting that P2P would grow 400% over five years (which is about
normal net growth), but within that statistic legal P2P would grow at
ten times the rate of infringing or illegal uses.

So we can talk about Iron Man, but what problem are we here to solve?
The Iron Man issue has been declining in relevance for about two
years, but using P2P to reduce the cost and other burdens of
spreading multimedia is going to be a factor for some time to come.

> If users had access to the same content (and at the same price) through
> another technology that was as easy to configure, they would go for it.

I think you're probably right, and this is eroding two of the common
drivers for P2P file-sharing fans --

 -- the lack of availability of content online causes people to
"hoard" it when it does appear: they download it even if they don't
intend to watch or listen to it right away, simply because it's
available now.  This factor is shrinking because nearly everything
more than six months old is now available online.

 -- the "free" offer is very compelling against the $15 retail price
of CD or DVD content.  This factor is shrinking because the new
online price -- $2 down to free per track or episode, or a low
monthly subscription -- is a price people are willing to pay for a
professionally encoded and unencumbered multimedia file.

But this effort isn't about these infringing factors.  If that were
the target, then we should also look at the client-server practices
of HTTP direct-download sites and NNTP hosters.  Sure, they're there,
but what's the point?


> True. At some point I say that multiple connections are an old issue that
> was amplified by P2P. I can make this issue clearer.
>

>> For example, the goal of widely popular client-server protocols is also
>> to provide the requested data quickly, yet the paper assigns this
>> feature to P2P.  To accomplish this goal, the paper says that P2P
>> turns to opening multiple connections.  But the multiple connections
>> just aren't used in the way that most people think!  Fire up a client
>> and see for yourself!
>
> I have; what do you mean by that, exactly?

I mean, exactly, that many are told by authoritative sources that P2P
clients fire up hundreds or thousands of connections.  At best, the
leading voices fail to mention that most of these connections are
idle; in the worst examples, they even assert that all of these
connections are actively uploading.  And finally, they reach the
conclusion that this behavior is intended and designed to skirt the
effectiveness of congestion control.

One of these voices you mentioned by name.


>> In Gnutella and eMule, the non-transferring clients don't stay
>> connected after requesting a place in the queue, except for a few
>> coordination connections to keep the searches and peer-assisted
>> connections working.  BitTorrent clients probably only stay connected
>> because it's fewer packets than closing and reopening connections to
>> keep the rest of the swarm updated as to newly available pieces and to
>> send choke/unchoke and interested/uninterested flags.  But in all cases
>> a P2P client is only sending significant data (the file transfer) over
>> a small handful of connections at any one time.
>
> Correct. And I point out that in this case what matters is not bandwidth but:
>
> * State tables in firewalls and NATs
> * TCP control blocks (TCPCBs) in the case of certain devices.
>
> Do you think this is not clear?

It's almost an entire change of gears, and attacking that problem
seems out of place.  It's really not the problem that anyone who came
to the P2PI workshop sought to solve, and as a companion to the
scavenger-class effort it's a strange coupling.




>> The "using more bandwidth" effect has nothing to do with either the
>> P2P architecture or the overhead of the P2P network.  It has to do
>> with the fact that multimedia files are large files.  If P2P were to
>> suddenly be removed as a choice, then their client-server replacements
>> would use a large amount of bandwidth on the net and spread that use
>> more throughout the day.
>
> With this part I disagree. You are assuming everything has the same
> availability and the same constraints. With P2P you can download at 10 kbps
> from 10 people and get 100 kbps. Let's also assume that the download
> constraint is on the uploader's side.
>
> In a pure client-server model you would be downloading at only 10 kbps.
>
> Would you agree?

I can't say no, because there is a time factor.  I mentioned this to
Steve in the other message.  See
http://www.ietf.org/mail-archive/web/tana/current/msg00111.html

---quote---
Whether I download Ubuntu 8.10's new release tomorrow over
Client-Server @ 100 KB/s or BitTorrent @ 1 MB/s, I will have used the
same amount of bandwidth.  To further illustrate, the first choice
will take longer, and any impacts that I have on the network will last
longer.  The second will not last as long, but my impacts will be
greater during that time.   But if the first choice is on an already
heavily-saturated route, most of my second choice is likely to avoid
heavily saturated routes.
---endquote---

But on the other hand, I'm not going to download duplicate data,
either, so by the end of the download -- whether it took an hour or a
day -- I've still taken nnnn megabytes.  All of that was to point out
that this wasn't a P2P factor, it was a large-file factor.  The paper
focused on P2P as a problem because of its bandwidth use.  My
argument is that P2P doesn't create significantly higher bandwidth
use (save a statistically tiny amount of overhead).
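
To put numbers on that quote (a back-of-the-envelope sketch; the
~700 MB ISO size is an assumption, and the two rates are simply the
ones used above):

    FILE_SIZE_MB = 700                       # assumed size of the Ubuntu 8.10 ISO

    def transfer(rate_kb_per_s):
        total_kb = FILE_SIZE_MB * 1024               # bytes moved are fixed by the file
        hours = total_kb / rate_kb_per_s / 3600.0    # only the duration changes
        return total_kb, hours

    for name, rate in [("client-server @ 100 KB/s", 100),
                       ("BitTorrent    @ 1 MB/s  ", 1024)]:
        kb, hours = transfer(rate)
        print("%s -> %d KB moved in %.1f hours" % (name, kb, hours))

Either way the meter reads the same; what changes is how long the
transfer sits on the network.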

What's driving bandwidth use higher fastest today is the addition of
video to the Internet user's experience.  We can blame P2P, but we're
missing the point that if P2P went away, people would still want to
create and share this stuff.  Fifteen years ago, facing much the same
stress, we could have blamed HTTP for the additional burden of
graphics files and throttled all traffic that wasn't Gopher, but we
didn't, because the cause wasn't HTTP; the cause was that people
liked using graphics in their presentation of web information.

>> Using upload bandwidth is a factor for all file-sharing, regardless of
>> architecture.  Most ISPs disallow public HTTP and FTP servers, so
>> consuming upload bandwidth is not so much a P2P thing as a
>> file-sharing thing.  It would save upload bandwidth if everyone took
>> advantage of central servers for their desired downloads, but there
>> are drawbacks there as well: battles over costs, disk-space quotas,
>> and terrible overloading when popular content is released.
>
> I tend to agree; that's why P2P allows more bandwidth to be consumed --
> because the constraint is on the uploader.

NO, no no!  P2P doesn't allow more bandwidth to be consumed (except
perhaps as a side-effect of avoiding congested routes and thus gaining
some better throughput).

Maybe this is the crux of a misunderstanding -- if you believe that
P2P does something fundamental that skirts the bandwidth usage or
speed limit imposed by the ISP at the modem, then tell me why or how
you believe that, because it may be the fundamental misunderstanding
between us.  The fact is that these are just network sockets to the
software; there's nothing going on under the hood.  The upload
bandwidth consumed would be the same whether the file-sharer was
sharing using client-server or peer-to-peer.

We have to be able to say why peer-to-peer as an architecture uses
more bandwidth than client-server would to do the same tasks;
otherwise we'll be focused on the wrong problem because of an
assertion that just isn't true.
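
If it helps pin down the disagreement, here is that claim in
miniature (a sketch under the assumption that the modem enforces the
uplink cap; 384 kbps is the Comcast tier I mention further down):

    UPLINK_KBPS = 384        # assumed modem cap (kilobits/s)

    # A fixed uplink cap is shared by however many sockets are open.
    # One client-server upload or fifty peer connections: the aggregate
    # leaving the modem is the same, only the per-socket share changes.
    for sockets in (1, 5, 50):
        per_socket = UPLINK_KBPS / sockets
        print("%2d sockets: %6.1f kbps each, %5.1f kbps total"
              % (sockets, per_socket, per_socket * sockets))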


>
>>
>> P2P applications do not use more download bandwidth than their non-P2P
>> counterparts.  P2P clients do not download a lot of duplicate data.
>> Again, download bandwidth consumption is rising because the market's
>> taste is changing to larger files.
>>
>> PLEASE READ THIS ESPECIALLY
>>
>>> The advantages of P2P applications come from the fact that they open
>>> multiple TCP connections to different peers. On the other hand, this
>>> is also their major drawback.
>>
>> There has been no evidence shown that there is a "major drawback."
>
> For whom? Clients, equipment vendors, ISPs? Clients, probably not; with
> ISPs we have been seeing major problems.

And my points, summarized, are that no one has demonstrated that
these drawbacks are due to the number of connections or to something
else special about P2P connections, or has even characterized the
size or nature of these drawbacks well enough to design a particular
solution that will cover the need.

We've had plenty of assertions -- likely good-faith assertions,
highly respected assertions -- but they don't hold up when you try
them!  The clients don't open as many connections as claimed, they
only actually upload on a handful of connections at a time, and the
nefarious intentions assigned to them don't match the personalities
involved (let alone the specs).

P2P has been compared to download accelerators, but I know of no
clients that open multiple connections to the same host.  If the
intention were to skirt congestion control, then why wouldn't P2P
developers follow that well-established example?  Why leave most of
those open connections idle?  The uploader could be transferring on
them and would effectively never slow down much when the net drops a
packet.  Why use TCP or OS sockets at all when the developer could
use UDP and roll his own rules?  BECAUSE P2P developers generally see
congestion control and congestion signals as good things, and it is
in their own best interests to use and respect them.  With limited
upload bandwidth, it just doesn't help the P2P network to hammer
packets down a congested path that is going to drop half of them
anyway; it makes more sense to slow down on that path and use another
path to someone else in the meantime.  That's WHAT they do, and
that's WHY they do it.
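
Since the "hundreds of connections" claim keeps coming up, here is
roughly what "a handful of connections at a time" looks like in a
BitTorrent-style uploader.  This is a simplified sketch of one choke
round, not any particular client's code; the peer count, slot count,
and rates are illustrative:

    import random

    MAX_PEERS = 60        # connections held open, mostly idle swarm chatter
    UPLOAD_SLOTS = 3      # peers actively receiving data at any moment

    peers = ["peer-%02d" % i for i in range(MAX_PEERS)]
    recent_rate = {p: random.random() for p in peers}   # stand-in for measured rates

    def choke_round(peers):
        # Unchoke the few peers who have recently given us the most data
        # (tit-for-tat), plus one random "optimistic" unchoke so newcomers
        # get a chance.  Everyone else stays connected but choked (idle).
        by_rate = sorted(peers, key=lambda p: recent_rate[p], reverse=True)
        unchoked = set(by_rate[:UPLOAD_SLOTS])
        unchoked.add(random.choice([p for p in peers if p not in unchoked]))
        return unchoked

    active = choke_round(peers)
    print("connections open:   %d" % len(peers))
    print("actively uploading: %d  %s" % (len(active), sorted(active)))

Sixty sockets show up in a connection table; three or four of them
are moving data.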

You asked for studies.  But I asked you first, and since it is you
framing the problem statement, the onus is on you.



>> The whole "multiple flows mean that congestion signals are less
>> effective" argument forgets that these users are behind a
>> flow-uncontrolled modem which, no matter what, is going to keep speeds
>> from growing beyond a certain relatively tiny point and will
>> constantly be trimming the sails of its own heavier flows to allow its
>> smaller ones to grow, in a never-ending struggle to reach equilibrium
>> locally.  This means that any fractional benefit from multiple streams
>> across point X in the Internet is quite temporary: the knocked-down
>> TCP stream from a particular host, as it grows again, will soon cause
>> the others from that host to get knocked down as they all try for
>> equilibrium at a subscriber's first bottleneck -- their own modem.
>
> Do you have references for this claim? If possible, I would like to cite
> such studies in the next revision.

I can probably work up the test case (it's pretty plainly stated
above).  I'm interested in knowing what you think would happen
instead.
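
As a starting point, the shape of that test case might look like this
(a toy discrete-time sketch, not a packet-level simulation; the step
size, capacity, and flow counts are arbitrary): however many flows
one subscriber opens, they end up fighting each other at the same
modem-limited bottleneck, and the aggregate stays pinned near its cap.

    def aggregate_throughput(num_flows, capacity=384.0, rounds=2000):
        # Toy AIMD model: each flow adds 1 kbps per round; whenever the sum
        # exceeds the bottleneck, the largest flow halves (a stand-in for a
        # drop at the subscriber's own modem).  Returns the average aggregate.
        rates = [1.0] * num_flows
        total = 0.0
        for _ in range(rounds):
            rates = [r + 1.0 for r in rates]             # additive increase
            while sum(rates) > capacity:                 # congestion at the modem
                biggest = max(range(num_flows), key=lambda i: rates[i])
                rates[biggest] /= 2.0                    # multiplicative decrease
            total += sum(rates)
        return total / rounds

    for n in (1, 4, 50):
        print("%2d flows -> aggregate ~%.0f kbps of a 384 kbps uplink"
              % (n, aggregate_throughput(n)))

Whatever fractional gain many flows get in keeping the sawtooth full
is bounded by the modem; none of it lets the host exceed its cap at
an interior point of the network.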


>
>>
>> This long-repeated complaint that P2P is gaining some kind of benefit
>> here is completely unsupported.  When the Comcast affair started,
>> their base subscription was 6 Mbps/384 kbps.  Comcast felt that its
>> upload was being strained, so it used Sandvine to attack P2P in the
>> upload direction.  Sure, like anything else, P2P apps can be
>> misconfigured to open hundreds of connections, but most won't reach
>> 100 on one download, and doing so would require either massive
>> misconfiguration or a very large upload pipe.  On my Comcast account,
>> I could run 1-2 simultaneous torrents with ~50-70 connections each and
>> 3 or so upload slots (actively uploading connections) each.  That's
>> not hundreds of connections in simultaneous use for the purpose of
>> cheating congestion control.
>
> Agree. But it depends on the number of torrents you share and whether you
> are a seeder (or a seed box) or not.

The number of torrents you share simultaneously is actually a
function of the configuration, which is set based on the upload and
download speeds.  Start any more than that and the surplus is queued
until previous tasks finish.  When I wrote my P2PI paper, I was on a
6/384 connection, which was also generous for the industry.  I think
Comcast offers 12M/2M in that class now, so the 2 x (3 or 4 slots)
figure probably changes.
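
For reference, those slot counts fall straight out of the division
(an illustration using only the numbers in this mail; the per-slot
rates are just what is left after splitting the uplink, not a spec):

    UPLINK_KBPS = 384
    UPLINK_KBYTES = UPLINK_KBPS / 8.0        # ~48 KB/s on the old 6M/384k tier

    # The active upload slots split the uplink; more torrents or more
    # slots just thins out what each slot gets.
    for torrents, slots_each in [(1, 3), (2, 3), (2, 4)]:
        slots = torrents * slots_each
        print("%d torrent(s) x %d slots = %d slots -> ~%.0f KB/s per slot"
              % (torrents, slots_each, slots, UPLINK_KBYTES / slots))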


>> (secondly and much more minor) -- When did middleboxes become a
>> consideration?  As far as end-host communications that cross some
>> private gateway, sure.  But with respect to network operators,
>> middlebox considerations ought to be off the table.  There is nothing
>> in the role of access or transit provider that requires one,
>
> I think this does not reflect reality. Firewalls and NATs are widespread
> in ISPs today.  Firewalls are needed for security -- securing the ISP's
> infrastructure, its customers, and vice versa.
>
> NATs are needed due to IP address depletion. I suggest you look at BEHAVE
> and the discussion on v4/v6 coexistence, and see the many types of NATs
> that exist in the network today. We could argue about what should happen
> in a perfect world vs. what is actually deployed.

Please note the weaker emphasis of this objection.  I'm not strongly
against going down this path somewhere and somehow.  It's a real
problem; it's just a very different problem from congestion, and it
seems orthogonal to the intents and purposes of this group.  :-)
If you want a lot more comment on it, my observation is that it's
usually the UDP traffic involved in some distributed database that
pops the lid on the NAT tables.  Turn that off, clear out the table,
change the port, and the user usually doesn't need to limit the TCP
connections any further than normal.
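
A rough way to see why the UDP side dominates the state table (every
number here is an assumption for illustration, not a measurement):
the long-lived TCP transfer connections hold one entry each, while
every distinct UDP endpoint the distributed database probes leaves a
binding behind for the router's idle timeout.

    tcp_connections  = 60      # assumed swarm connections held open
    udp_probes_per_s = 1.0     # assumed distinct UDP endpoints contacted per second
    udp_idle_timeout = 300     # assumed seconds a router keeps an idle UDP binding

    # Steady state for the UDP side is roughly arrival rate x holding time.
    udp_entries = udp_probes_per_s * udp_idle_timeout

    print("TCP entries: ~%d" % tcp_connections)
    print("UDP entries: ~%d  (falls toward 0 with the UDP chatter turned off)"
          % udp_entries)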


> This document is not about solving the flow fairness problem. People have
> quite different views on this. See Briscoe's papers on this.

After reading the paper, I felt it was very much about asserting a
flow-fairness problem, and it repeated many incorrect and easily
tested assertions related to that problem.  I also felt that it
attributed other problems, such as overall higher bandwidth usage, to
P2P as an architecture or to P2P technologies, when the same usage
would occur if users were restricted to client-server alone, given
the same tasks and content.

The paper failed to recognize that one factor is that video screens
and video files are growing exponentially in size.  Yes, sharing
copyrighted studio-quality content has been a strong draw, and this
will continue, although it is declining as studios and publishers
successfully convert users to their own offerings.  But even "piracy"
isn't a P2P-specific problem: 15 years ago the traffic complaints
were that FTP software pirates were dominating the traffic graphs
during most hours and forcing their dial-up modems to run 24/7.  The
paper needs to recognize that whatever the technology, users no
longer exclusively want to "pull" or "get"; more and more frequently
they also want to participate and contribute.  They are sharing the
stuff that they like, some are personalizing it, and now some are
even producing it!

The last-generation network provisioning of 10:1 download/upload
ratios currently has users topping out their upload pipes for longer
periods of time.  Given the very public and uninhibited way that
today's newer net users live their lives online, how can we not
include that among the factors and practices that are going to drive
tomorrow's network decisions?

Thanks again, Reinaldo!

Robb

-- 
Robb Topolski (robb@funchords.com)
Hillsboro, Oregon USA
http://www.funchords.com/