Re: [conex] byte vs packet counting
Toby Moncaster <toby.moncaster@cl.cam.ac.uk> Wed, 14 December 2011 16:43 UTC
From: Toby Moncaster <toby.moncaster@cl.cam.ac.uk>
In-Reply-To: <201112132012.pBDKCokX014681@bagheera.jungle.bt.co.uk>
Date: Wed, 14 Dec 2011 16:43:17 +0000
Content-Transfer-Encoding: quoted-printable
Message-Id: <6342D66E-6525-49D4-9DD9-3713230F2303@cl.cam.ac.uk>
References: <CAH56bmD2fh3sm4mozh17K2C+K0Pxyw7vRvykCo9Xt-jeEP36ZQ@mail.gmail.com> <201111201956.pAKJuJSQ007421@bagheera.jungle.bt.co.uk> <20111120214012.GE22465@verdi> <201111202327.pAKNRJPT008060@bagheera.jungle.bt.co.uk> <20111121203356.GG22465@verdi> <201111212314.pALNEhvZ013554@bagheera.jungle.bt.co.uk> <20111122001928.GH22465@verdi> <82AB329A76E2484D934BBCA77E9F524924B97E81@Polydeuces.office.hd> <20111202232051.GH31463@verdi> <Prayer.1.3.4.1112031010520.17047@hermes-2.csi.cam.ac.uk> <20111205230153.GC39149@verdi> <9BD81879-81A2-4EF0-A60B-F541D0BA418B@cl.cam.ac.uk> <201112132012.pBDKCokX014681@bagheera.jungle.bt.co.uk>
To: Bob Briscoe <bob.briscoe@bt.com>
X-Mailer: Apple Mail (2.1251.1)
Sender: "T. Moncaster" <tm444@hermes.cam.ac.uk>
Cc: "T. Moncaster" <tm444@cam.ac.uk>, ConEx IETF list <conex@ietf.org>
Subject: Re: [conex] byte vs packet counting
Hi Bob,

On 13 Dec 2011, at 20:12, Bob Briscoe wrote:

> Toby,
>
> Catching up after being away for a week...
> Inline, and I've snipped everything I agree with…

OK

> At 09:50 06/12/2011, Toby Moncaster wrote:
>> On 5 Dec 2011, at 23:01, John Leslie wrote:
>> > T. Moncaster <tm444@cam.ac.uk> wrote:
>
> [snip]
>
>> >>> Auditing every packet is plausible near the sender and receiver,
>> >>> but not in the backbone. I don't believe we have agreement among
>> >>> ourselves what a receiver's uplink will do if their auditing shows
>> >>> "abuse" of ConEx marking: so I don't see what auditing at the
>> >>> sender's uplink can do and be confident it's the right thing.
>> >>
>> >> I agree, it seems hard to see how auditing something when you are
>> >> missing half the information is going to be hard.
>> >
>> > (I'm having trouble parsing that sentence.)
>>
>> I think we are hitting the issue of "what is audit?" If audit is taken to mean accounting for ConEx marks, then that can happen anywhere. However I was taking auditing to mean accounting for the balance of ConEx marks with actual congestion. That function can only work somewhere where you can see both the congestion and the ConEx marks. The only way you can achieve that at the sender side is by inspecting the ACKs on the return path (which may be impossible).
>
> [BB]: I want to take issue with 'the only way'. There's at least one other very plausible scenario. Assume by design there's one major bottleneck on the upstream path from a sender (e.g. a radio access network). Then a network node at the congested point can see all the ConEx and most of the loss (or ECN). It cannot know what 'most' means, but bear with me...
>
> Imagine a sender is trying to cheat. This can even be a really knowledgeable sender - a disgruntled ex-employee of the network operator who knows that the audit function is badly placed - not near the receiver, but near the sender.
>
> Even if there might sometimes be another bottleneck nearer the receiver on the path, when this dishonest sender sees congestion feedback (from loss or ECN), it cannot know which bottleneck each congestion signal was from.
>
> So if this sender is trying to cheat, it will sometimes understate ConEx markings with respect to the first bottleneck alone. The audit at that first bottleneck will detect persistent cheating, so it knows the sender is suspect. An honest sender might accidentally get ConEx marks wrong occasionally, but not persistently badly like this.

It would be interesting to get a feel for how common this scenario is. I think what you are saying is that the audit function doesn't have to be perfect, it just needs to be reasonably likely to catch persistent under-declaration of congestion?

>> >> There is a separate accounting mechanism where you can measure bulk
>> >> flows of congestion, but that has to be based on a belief in the
>> >> accuracy of the marking of the underlying individual flows.
>> >
>> > I don't see how that follows.
>>
>> Measuring the bulk is very easy, but if all the flows making up the bulk are understating their congestion then the bulk will be understated as well. Bob always had in mind a clever mechanism to do spot checking at any border to identify persistent under (or over) declaration, but I never entirely understood how it was meant to work.
>
> [BB]: The bulk auditing idea is primarily designed to remove the motivation for networks to launch attacks on /each other/ using ConEx. I don't expect many networks would even think of doing that, but unless you have a way of detecting breaches of trust, you cannot rely on that trust. You don't necessarily have to use the mechanism; other networks just need to know that there is a feasible mechanism and you might be using it.

A key point to get across.

>> >>> Unless I misread draft-ietf-conex-concepts-uses, our principal
>> >>> use case is Informing Traffic Management.
>> >>> I have posted before that
>> >>> the most important feature of ConEx marking (IMHO) is informing
>> >>> any node along the way that the sender is knowingly sending packets
>> >>> despite having reason to doubt that congestion has cleared.
>> >>>
>> >>> Auditing does nothing to help here. The abusive thing to do is
>> >>> to _clear_ ConEx marks -- and auditing can't detect that.
>> >>
>> >> No. ConEx is about a balance of markings. To cheat you fail to send
>> >> forward into the network sufficient ConEx markings to cover the
>> >> congestion your flow is causing (or has caused).
>> >
>> > Not quite... an ISP can cheat by clearing marks which one sender
>> > placed in his outgoing traffic -- that is what I meant to refer to.
>>
>> OK, yes. That is a separate requirement for audit.
>
> [BB]: Just to ensure we make clear which sort of abuse we're talking about at any one time, I detect three different types of abuse being discussed here:
> a) The sender continuing to send despite expecting that congestion has not cleared
> b) The sender understating ConEx markings relative to path congestion
> c) An ISP clearing the sender's ConEx markings
>
> a-type behaviour could describe normal TCP/IP or UDP. It's only abuse if in excess. The idea of ConEx is to give ISPs the info, so ISPs can draw the line between normal and abuse, rather than the IETF making this judgement call.
>
> (b) and (c) are about validity of ConEx information, whereas (a) is about behaviour measured by ConEx info. The idea is to prevent b-type & c-type behaviour so that ConEx info is reliable enough that ISPs can use it to make judgements about (a).
>
> b-type behaviour is intended to be addressed by audit.
>
> c-type behaviour can be addressed by e2e verification of ConEx info.

I think this is a really good and clear description of the ways to distort ConEx info.
I would like to see something like this as a stand-alone informational draft (at least for now) so we have it captured somewhere and can refer back to it as and when needed.

> We're not doing protocol work on (c) in the w-g, but the info is in the protocol if we wanted to. I can straight-off think of three possible protocols to address this:
> - The source could protect the integrity of the DSOPT extension header with IPsec AH, so if an ISP altered ConEx markings, AH integrity checks would detect that something on-path had altered an immutable header.
> - We could add an e2e feedback field in TCP so that the receiver can feed back the received ConEx markings to the sender (re-re-feedback?). Then the sender can check that an ISP hasn't changed what it sent.
> - The sender could log the packets it marks with ConEx, and the receiver could occasionally send back a log of the packets it receives with ConEx markings for the sender to check (not sure what protocol this would be added to).

I'm afraid I'm being lazy for now and not evaluating these suggestions.

>> >> So the function of auditing is to check that the two numbers
>> >> roughly match.
>> >
>> > (I note that Toby says "roughly match". Others seem to be assuming
>> > a much stricter match. I'm afraid we don't have a clear understanding
>> > here.)
>>
>> Clearly, given the min 1RTT delay between getting congestion info and responding to it, there can never be a completely accurate match. I did have numerous debates with Bob about this topic. Basically you run the risk of a user just terminating his flow rather than making up his losses.
>
> [BB]: That attack is dealt with. That's why I/we decided that the sender has responsibility for keeping ConEx info greater than or equal to congestion, despite RTT delay. Then if the sender runs up a 'loss' and jumps to a new flow, the end of its last flow will have been discarded by the audit function, so the sender won't have gained by this cheat.
I think this may be one of the things people haven't understood. Namely, we need any sender to start by assuming they will cause a little bit of congestion and paying for that up-front.

>> You also have to be cautious that the actual process of dropping packets for audit doesn't simply distort the apparent congestion information (because the sender will respond by retransmitting and hopefully adding more ConEx marks)
>
> [BB]: We (in BT) established that's also not a problem. The sender ought to match these audit losses with the ConEx marks it didn't send in the first place. If it doesn't, the audit node can detect that, because it knows about the losses it introduced.

The only drawback then is that the aggregate over the path is possibly distorted (as in, some of the stated congestion isn't actual congestion) which may have an impact on some future ConEx use cases.

>> >> And yes, this is a very difficult problem to solve if you can't
>> >> assume ECN is being used.
>> >
>> > I absolutely agree with Toby here.
>
> [BB]: I also agree audit is hard without ECN - but I believe it's not a lost cause (excuse the pun). I started to agree with Matt when I realised cases like that discussed earlier ('not the only way' above) are common enough for ConEx without ECN to be useful.

I never said it was impossible, it's just that ECN does give you a lot more information

>> >>> Auditing, to tell truth, may or may not prove useful after we
>> >>> have done enough experiments. IMHO, if it proves useful we'll find
>> >>> that it's useful when done statistically, rather than on every
>> >>> packet.
>> >>
>> >> I think this may point to the heart of the issue. Yes, you can sample
>> >> to do audit, but only if you assume a relatively high density of
>> >> markings.
>> >
>> > I don't follow why that is true.
>> > If auditing is not strict, sampling
>> > should work for a wide range of density; if auditing is strict, it
>> > will fail pretty much regardless of the density…
>>
>> OK, this comes down to the point below
>
> [BB]: Sampling isn't applicable if anyone who is 'caught' can just whitewash their reputation by adopting a new clean identity. The problem is that identity here is just a flow ID and new flow IDs can be created at will.
>
> A dishonest transport can cheat in as many flows as it can get away with, until sampling catches one of its flows, which it just terminates and starts another flow. E.g. a misbehaving receiver could understate congestion feedback to a server. When the receiver detects high loss, it assumes it is being randomly audited, so it just terminates that connection and opens a new one.
>
> "Reputation whitewashing" is easy when your identity is as ephemeral as a flow ID and you're the one who controls the end with the ephemeral ports.

So what you are arguing is that speed camera-style auditing is not sensible because in the Internet you can change your car registration at will with near zero cost, and there is no external mechanism (licensing authority) that can retrospectively apply a sanction to you? (sorry for the convoluted simile)

>> >> If ConEx is a specific number carried in each packet then sparse
>> >> sampling may well give you enough to be able to achieve the aims of
>> >> audit.
>> >
>> > Carrying what I call "congestion-expected" as a multi-bit fraction
>> > is an interesting idea, in which I see potential benefits. I'd be
>> > happy to discuss it, but my impression is that we're not heading in
>> > that direction.
>>
>> That was certainly how I understood a re-ECN like scheme working… The whole core of this argument (should you account for bytes or packets) has always been a major bone of contention between me and Bob.
>> Not least because it can be impossible for a sender to comply (it is very easy to construct pathological sequences of packets where the sender is forced to either over or under declare if they are doing per-byte auditing). I think this is a case where the IETF view has to diverge from the researcher's view. In the IETF we have to make prosaic engineering decisions. Personally I would like ConEx to account on a per-packet basis, BUT with an allowance for it to be more accurate if an operator so wishes
>
> [BB]: that's my position too.
> Assuming "more accurate" means per-byte.

That was my implication, yes ;)

> I don't think this is disagreeing with the wording in abstract-mech, or the proposed wording in the tcp mods draft. If it is, pls say how.
>
>> (in other words MUST as a minimum account per packet, but MAY account per byte)
>
> Except I wouldn't state it in terms of what audit must or may do, because the argument is about what we say the transport should do (to be robust against what audit may do).

OK, but we probably need to have some minimal black-box definition of what an audit function needs to do to be "compliant"

>> >> Which brings me on to the aims of audit. To my mind there are two
>> >> possible aims.
>> >> 1) to guarantee that every single packet, flow, aggregation of traffic
>> >> in the network has accurately declared their congestion; or
>> >> 2) to act as a "speed camera" that catches a sufficiently high
>> >> proportion of infringements to ensure that senders don't risk
>> >> under-declaring their congestion.
>> >
>> > I don't see auditing as either of these; but that may be more a
>> > function of my tendency to think of auditing as gathering information,
>> > not enforcing disciplinary measures based on this information.
>>
>> Perhaps auditing is an unhelpful word to use. To my mind an audit implies an accountancy audit that tries to balance the books.
>> I agree that carries no implication of policing or enforcement, but I think that a lot of this debate has assumed the two concepts are identical
>
> The critical issue is whether the audit function is possible. Once it is, the action as a result of audit can be added.

I agree we clearly have two functions - audit and reaction to audit. Audit in and of itself simply says there is an issue.

> When audit is in a remote network from the sender, the only action available seems to be to 'punish' the packets (e.g. drop) because it's too complex in general to track down and 'punish' a user that might be in another network.

Agreed

>> > IMHO, "guaranteeing" much of anything isn't worth the effort over
>> > organizational boundaries. At significant expense, we could use
>> > cryptography to give arbitrarily high confidence to a particular
>> > "guarantee" having come from a particular entity, but without also
>> > signing the data "guaranteed", the effort is entirely wasted.
>> >
>> > (And processing all that cryptography in backbone routers seems
>> > a non-starter to me.)
>>
>> Agreed, this has to be a lightweight mechanism. All the guarantees have to come from the idea that operators have to have some level of trust among one another (or at least, the realisation that an operator found to have cheated at the expense of others runs the risk of getting sat on heavily by the others),
>
> See point above - in a large distributed system, trust can only be relied on if it would at least be feasible to detect breaches of trust (in the case of trust building, random sampling is highly appropriate).

Absolutely. Among operators you can genuinely rely on the speed camera approach to audit - the risk of getting caught may be low (but has to be finite) but the harm to your reputation makes the risk unjustifiable

>> >> My own view is that the second is more feasible in the real world,
>> >> although the first might well be more desirable.
>> >
>> > I at least partly agree here.
>> > However, my approach to thinking
>> > about such problems simply stops descending into details when I reach
>> > a feasibility blockage.
>>
>> And therein lies the lesson for ConEx. We HAVE to fix on a SIMPLE subset of ConEx that we can all agree on and that seems at least feasible in the real world. Arguments like the recent ones make ConEx appear to be still firmly stuck in the land of research (IRTF). Too many people (on all sides of all arguments) seem to want perfection and that is not sensible in the real world. That is just a recipe for ratholing ourselves to the point where the IESG declares our WG dysfunctional…
>
> See above.

I actually think the disagreements among us are quite small and are (generally) semantic. The problem is that they look much bigger to any outsider.

Toby

> Bob
>
>> Toby
>>
>> > --
>> > John Leslie <john@jlc.net>
>>
>> _______________________________________________
>> conex mailing list
>> conex@ietf.org
>> https://www.ietf.org/mailman/listinfo/conex
>
> ________________________________________________________________
> Bob Briscoe, BT Innovate & Design
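
The balance that audit has to check, and the per-packet versus per-byte question in the subject line, can be illustrated with a small sketch. This is not taken from the thread or from any ConEx draft: `Packet`, `audit_balance`, and the two example flows are hypothetical names and made-up data, and the sketch assumes an audit point that can see both the ConEx marks and the congestion signals for a flow (which, as discussed above, is not guaranteed without ECN).

```python
from dataclasses import dataclass


@dataclass
class Packet:
    size: int           # bytes on the wire
    conex_marked: bool  # sender declared congestion on this packet
    congested: bool     # lost or ECN-marked at the audited bottleneck


def audit_balance(packets, per_byte=False):
    """Running balance of declared minus observed congestion.

    A negative value means the flow has under-declared so far;
    a positive value means it has over-declared (or paid up-front).
    """
    balance = 0
    history = []
    for p in packets:
        # Per-packet counting weighs every packet as 1;
        # per-byte counting weighs it by its size.
        weight = p.size if per_byte else 1
        if p.conex_marked:
            balance += weight
        if p.congested:
            balance -= weight
        history.append(balance)
    return history


# Hypothetical flow A: a full-size packet is hit, and the only packet
# the sender has left to carry the mark is a small one.
flow_a = [Packet(1500, False, True), Packet(40, True, False)]

# Hypothetical flow B: a small packet is hit, and the next packet
# available to carry the mark is full-size.
flow_b = [Packet(40, False, True), Packet(1500, True, False)]

for name, flow in (("A", flow_a), ("B", flow_b)):
    print(name, "per-packet:", audit_balance(flow)[-1],
          "per-byte:", audit_balance(flow, per_byte=True)[-1])
```

Counted per packet, both flows end in balance. Counted per byte, flow A ends 1460 bytes under-declared and flow B ends 1460 bytes over-declared, even though each sender marked exactly one packet per congestion event. In both flows the running balance dips below zero right after the congestion event, which is the gap that marking up-front is meant to cover.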