Re: Is the invariants draft really standards track?

Christian Huitema <huitema@huitema.net> Fri, 19 June 2020 23:51 UTC

To: Martin Duke <martin.h.duke@gmail.com>
Cc: Ted Hardie <ted.ietf@gmail.com>, "Lubashev, Igor" <ilubashe=40akamai.com@dmarc.ietf.org>, IETF QUIC WG <quic@ietf.org>, Lars Eggert <lars@eggert.org>, Kyle Rose <krose@krose.org>, Ian Swett <ianswett=40google.com@dmarc.ietf.org>, Jared Mauch <jared@puck.nether.net>
Message-ID: <f9e2c611-bb4d-bc80-dfe3-e323a08bfc5b@huitema.net>
Date: Fri, 19 Jun 2020 16:51:39 -0700
In-Reply-To: <CAM4esxRWVRjVhxyYuuzwDGq_wfTjQHkY6KHG2rEPErO2aHXA0w@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/-J_bzQZETwtPpWlB_AJ955prjCM>
Precedence: list

On 6/19/2020 4:00 PM, Martin Duke wrote:
> > The problem of the DOS protection box is to drop as much traffic as
> > possible while still letting some good in, and also without causing
> > performance hits when there are no attacks. There are indeed some
> > thorny traffic management issues there. More information helps, and
> > Ben's suggestion to look at DOTS is certainly on point.
>
> I am not sure how any of this can work when inbound 1RTT packets can
> come with essentially random IP/port and CID and potentially be valid,
> given there might have been a migration. Some of the QUIC-LB modes at
> least make it hard to guess your way to a valid CID.

The DOS box does not have to worry about what kind of traffic is coming
in. It just has to open a context for the 5-tuple and check whether it
sees 1-RTT packets coming back from the server. It might then also count
the volume of those 1-RTT packets.
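
As a rough illustration of the per-5-tuple context described above,
here is a minimal Python sketch. It is only a sketch under assumptions:
the names FlowTable and FlowContext, the iw_bytes threshold (10 packets
of 1200 bytes) and the three state labels are invented for the example.
The only protocol fact it relies on is the QUIC invariant that the high
bit of the first byte separates long headers (1) from short headers (0).
A real box would also need timeouts and memory limits.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# (client_ip, client_port, server_ip, server_port, protocol)
FiveTuple = Tuple[str, int, str, int, str]

# Assumed threshold: return traffic above a typical initial window.
IW_BYTES = 10 * 1200


@dataclass
class FlowContext:
    # Bytes of short-header (1-RTT) packets seen coming back from the server.
    short_header_bytes_from_server: int = 0


class FlowTable:
    """Hypothetical context table kept by a DOS protection box."""

    def __init__(self, iw_bytes: int = IW_BYTES) -> None:
        self.iw_bytes = iw_bytes
        self.flows: Dict[FiveTuple, FlowContext] = {}

    def on_client_packet(self, flow: FiveTuple) -> None:
        # Open a context the first time this 5-tuple is seen inbound.
        self.flows.setdefault(flow, FlowContext())

    def on_server_packet(self, flow: FiveTuple, first_byte: int, size: int) -> None:
        ctx = self.flows.get(flow)
        if ctx is None:
            return  # no inbound context: nothing to validate
        if (first_byte & 0x80) == 0:  # short header, i.e. 1-RTT in QUIC v1
            ctx.short_header_bytes_from_server += size

    def state(self, flow: FiveTuple) -> str:
        ctx = self.flows.get(flow)
        if ctx is None or ctx.short_header_bytes_from_server == 0:
            return "not_validated"
        if ctx.short_header_bytes_from_server > self.iw_bytes:
            return "validated"
        return "in_progress"
```

The box would call on_client_packet for inbound traffic, on_server_packet
for return traffic, and use state() to pick a queue for the flow.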

The worry is that one of the bots might start a legitimate connection,
then disclose its 5-tuple to the rest of the botnet. The whole botnet
can then spoof the 5-tuple that was just pin-holed by the DOS box. A
simple "open-close" logic is thus not good enough. The DOS box must also
enforce some kind of rate limiting per 5-tuple.
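
One possible shape for that per-5-tuple rate limiting is a token bucket
per flow. The Python sketch below is an assumption, not a description of
any deployed box: the class name PerFlowRateLimiter and the parameters
rate_bytes_per_s and burst_bytes are hypothetical, and the caller passes
the current time explicitly so the logic is easy to follow.

```python
from typing import Dict, Tuple

# (client_ip, client_port, server_ip, server_port, protocol)
FiveTuple = Tuple[str, int, str, int, str]


class PerFlowRateLimiter:
    """Hypothetical token bucket kept per pin-holed 5-tuple."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float) -> None:
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        # flow -> (tokens remaining, time of last update in seconds)
        self.buckets: Dict[FiveTuple, Tuple[float, float]] = {}

    def allow(self, flow: FiveTuple, size: int, now: float) -> bool:
        # A new flow starts with a full bucket.
        tokens, last = self.buckets.get(flow, (self.burst, now))
        # Refill for the elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if size <= tokens:
            self.buckets[flow] = (tokens - size, now)
            return True
        # Over budget: drop (or deprioritize) without spending tokens.
        self.buckets[flow] = (tokens, now)
        return False
```

Each flow has its own budget, so spoofed packets on one pin-holed
5-tuple cannot consume more than that flow's share.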

This also means that if a botnet can predict the 5-tuple used by a
legitimate connection and then spoof it, it can DOS that connection by
using up its rate limit. Once you start going down that particular
rabbit hole, the joy never stops...

-- Christian Huitema


>
> DOTS is an intriguing idea, but again I'm not sure what signatures one
> could use to configure it.
>
> On Fri, Jun 19, 2020 at 3:43 PM Christian Huitema
> <huitema@huitema.net> wrote:
>
>
>     On 6/19/2020 3:05 PM, Ted Hardie wrote:
>>     Hi Christian,
>>
>>     A question in-line,
>>
>>     On Fri, Jun 19, 2020 at 2:49 PM Christian Huitema
>>     <huitema@huitema.net> wrote:
>>
>>         When under DOS attack, you want to "minimize blowback", i.e.,
>>         as much as possible avoid generating packets in response to
>>         attack traffic. So, yes, a server may choose to not send
>>         stateless resets to anyone when under attack; in fact, my
>>         recommendation would be that a server SHOULD NOT send
>>         stateless resets to anyone when under attack.
>>
>>         That said, Igor raised an interesting point about return
>>         traffic. It would be very nice if DOS protection boxes could
>>         distinguish between "validated traffic" that the server
>>         presumably intends to process, and "unsolicited traffic" that
>>         will just consume resource. The box can then reserve some
>>         share of the resource for validated traffic, and place the
>>         rest of the traffic in a lower priority queue. Fine, but
>>         there needs to be a test. The classic test is that incoming
>>         traffic is "validated" if the protection box can match it
>>         with return traffic coming from the server -- for some
>>         definition of matching.
>>
>>         From that point of view, stateless reset is definitely not
>>         helpful. But problematic traffic goes beyond that. The server
>>         will reply to a client's initial with a server's initial
>>         packet. Does that validate the response traffic? OK, maybe
>>         the protection box can be programmed to only validate traffic if
>>         it sees 1RTT packets. But many servers will send 1-RTT
>>         packets as 0.5 RTT. Does that validate the response traffic?
>>
>>         We might say that traffic is validated when the handshake is
>>         confirmed, but the protection box does not understand the TLS
>>         handshake, it just sees packet types and packet sizes. It
>>         cannot distinguish between 0.5RTT data and 1RTT data, and
>>         thus the closest approximation of "validation" would be
>>         seeing more than an initial window worth of traffic coming
>>         back from the server. That does not sound great.
>>
>>         On the other hand, things get much better if the server under
>>         attack can adopt a defensive posture and help the DOS
>>         protection box do its job. Suppose that a server can detect
>>         that it is under attack -- or be explicitly configured so.
>>         The simplest defensive posture would be to (1) disable
>>         stateless reset and (2) not send any 0.5RTT packet, including
>>         in response to 0-RTT. The protection boxes can at that point
>>         take the 1-RTT packets from the server as indicating validation.
>>
>>     Perhaps I'm misreading you here, but this sounds like you expect
>>     the protection boxes to be able to distinguish when servers see
>>     themselves as under DOS from when they don't, so that they can
>>     tell that the lack of 0.5RTT is an indication of an attack
>>     response.  Given distribution patterns of DOS attacks, I'm
>>     struggling to see whether that will commonly be the case.
>
>     Maybe I wasn't too clear. I don't think that DOS protection boxes
>     are trying to take directions from servers. I also don't believe
>     that DOS protection boxes have any practical means of
>     distinguishing between 0.5 RTT and 1 RTT traffic. That might
>     actually be a problem.
>
>>
>>     You could, of course, always put handshake traffic into
>>     low-priority queues until you see the 1-RTT packets that validate
>>     the server's interest.  That would make 1-RTT traffic effectively
>>     a path signal of validation. 
>
>     Yes, that's more or less what I was considering. But of course it
>     only works if the server refrains from sending 1RTT packets until
>     the handshake is confirmed. Otherwise, the protection boxes will
>     have to count the number of 1RTT packets coming back from the
>     server, or maybe the amount of data coming from the server, and
>     only consider the traffic validated if the number of packets and
>     amount of data are larger than common values of IW. You might end
>     up with 3 classes instead of 2 -- validated if 1RTT > IW,
>     validation in progress if 1RTT seen but < IW, not validated yet if
>     no 1RTT seen.
>
>>     My concern with that is that using those low priority queues
>>     during the handshake phase seems likely to result in worse
>>     latency and increased risk of packet loss (which can be tricky
>>     during that phase).  That seems a heavy price to pay during
>>     non-attack times for better protection when under attack.
>
>     The problem of the DOS protection box is to drop as much traffic
>     as possible while still letting some good in, and also without
>     causing performance hits when there are no attacks. There are
>     indeed some thorny traffic management issues there. More
>     information helps, and Ben's suggestion to look at DOTS is
>     certainly on point.
>
>     I am just doing a thought experiment so far. One immediate
>     observation is that in the absence of other data, DOS protection
>     designers are likely to look at packet types, and certainly reason
>     about 1RTT packets versus long header packets. This is definitely
>     an ossification risk.
>
>     The other observation is that we have to be careful about the
>     realism of thought experiments. If I remember correctly, the
>     payload of the Blaster worm was something like:
>
>         On each bot
>
>             Open 256 threads
>
>                 In each thread, loop on "GET some large page
>                 from the server"
>
>     Reasoning about validated traffic would not stop that. But
>     reasoning about validated traffic forces the attacker to disclose
>     the "real" IP addresses of the attacking bots, which then enables
>     a second line of defense.
>
>     -- Christian Huitema
>
>
>
>>
>>     Am I misunderstanding how this is distinguished?
>>
>>     Clue appreciated,
>>
>>     Ted
>>
>>      
>>
>>         Maybe we should specify that.
>>
>>         -- Christian Huitema
>>
>>         On 6/19/2020 1:42 PM, Lubashev, Igor wrote:
>>>
>>>         > There is no need for servers to decrypt CIDs in QUIC-LB.
>>>         Presumably the server has a lookup table for its CIDs.
>>>
>>>          
>>>
>>>         Sending a stateless reset in response to a junk packet would
>>>         cost more CPU than verifying CID integrity.  But, yes, a
>>>         server may choose to not send stateless resets to anyone
>>>         when under attack.
>>>
>>>         *From:* Martin Duke <martin.h.duke@gmail.com>
>>>         *Sent:* Friday, June 19, 2020 2:44 PM
>>>
>>>         >Unfortunately, Retry system protects only server's memory
>>>         state and some CPU cycles spent on crypto.  (Servers still
>>>         need to decrypt CID to decide it is invalid, and if the
>>>         attacker is clever enough to establish one valid connection
>>>         and use that CID in a flood, the server will also be
>>>         decrypting packets.)  
>>>
>>>          
>>>
>>>         There is no need for servers to decrypt CIDs in QUIC-LB.
>>>         Presumably the server has a lookup table for its CIDs.
>>>
>>>          
>>>
>>>         It is true that Retry Services (and indeed, the Retry
>>>         concept as a whole) do nothing to protect network capacity.
>>>
>>>          
>>>
>>>         On Fri, Jun 19, 2020 at 8:08 AM Lubashev, Igor
>>>         <ilubashe@akamai.com> wrote:
>>>
>>>             It looks like
>>>             https://tools.ietf.org/html/draft-ietf-quic-load-balancers-02#section-5
>>>             is an excellent discussion of Retry mechanics.  It
>>>             definitely deserves a reference from this manageability
>>>             draft.
>>>
>>>              
>>>
>>>             The Retry mechanisms described in LB draft are all
>>>             cooperating boxes, and servers must be aware of them. 
>>>             Unfortunately, Retry system protects only server's
>>>             memory state and some CPU cycles spent on crypto. 
>>>             (Servers still need to decrypt CID to decide it is
>>>             invalid, and if the attacker is clever enough to
>>>             establish one valid connection and use that CID in a
>>>             flood, the server will also be decrypting packets.) 
>>>             Retry does nothing to protect network resources.
>>>
>>>              
>>>
>>>             The PR I opened
>>>             (https://github.com/quicwg/ops-drafts/pull/94)
>>>             is about uncooperating devices that try to mitigate
>>>             volumetric network attacks.
>>>
>>>             *From:* Martin Duke <martin.h.duke@gmail.com>
>>>             *Sent:* Wednesday, June 17, 2020 8:16 PM
>>>
>>>             Hi Igor, you might want to check out the "Retry
>>>             Services" bit of the QUIC-LB draft. This has something
>>>             to do with the DDoS use case you discuss.
>>>
>>>              
>>>
>>>             On Wed, May 27, 2020 at 9:07 AM Lubashev, Igor
>>>             <ilubashe@akamai.com> wrote:
>>>
>>>                 I’m working on a manageability draft PR for this
>>>                 (how to rate limit UDP to reduce disruption to QUIC
>>>                 if you have to rate limit UDP).  ETA end of the week
>>>                 (if I do not get pulled into something again).
>>>
>>>                  
>>>
>>>                 The relevant observation is that DDoS with UDP that
>>>                 is indistinguishable from QUIC will happen.  UDP is
>>>                 already the most prevalent DDoS vector, since it is
>>>                 easy for a compromised non-admin app to send a flood
>>>                 of huge UDP packets (with TCP you get throttled by
>>>                 the congestion controller).  So there WILL be DDoS
>>>                 protection devices out there to try to mitigate the
>>>                 problem, possibly by observing both directions of
>>>                 the flow and deciding whether a packet belongs to a
>>>                 valid flow or not.
>>>
>>>                  
>>>
>>>                 Since such middle boxes will be created, the more
>>>                 explicit and normative Invariants are about what one
>>>                 can expect, the less such middle boxes may decide
>>>                 for themselves.  For example (I did not think long
>>>                 about it), if some elements of path validation could
>>>                 land into Invariants (roughly, “no more than X
>>>                 packets/bytes can be sent on a new path w/o a return
>>>                 packet”), a DDoS middle box may use this fact and
>>>                 active connection migration might still have a
>>>                 chance during an attack (NAT rebinding could be
>>>                 linked by DDoS boxes to an old connection via
>>>                 unchanged CID).
>>>
>>>                  
>>>
>>>                   * Igor
>>>
>>>                 *From:* Christian Huitema <huitema@huitema.net>
>>>                 *Sent:* Wednesday, May 27, 2020 11:34 AM
>>>
>>>                 On 5/27/2020 8:28 AM, Kyle Rose wrote:
>>>
>>>                     On Wed, May 27, 2020 at 10:34 AM Ian Swett
>>>                     <ianswett=40google.com@dmarc.ietf.org> wrote:
>>>
>>>                         I was agreeing with MT, but I'm happy to see
>>>                         some more MUSTs added if people feel that'd
>>>                         be helpful.
>>>
>>>                      
>>>
>>>                     Coincidentally, we were just talking about this
>>>                     internally at Akamai yesterday. IMO, an
>>>                     invariants document isn't really helpful if it
>>>                     isn't normative, and for it to be normative it
>>>                     (or a related practices doc for operators)
>>>                     really needs to spell out clear boundaries for
>>>                     operators with MUSTs.
>>>
>>>                      
>>>
>>>                     The example that came up yesterday was around
>>>                     operators filtering QUIC in the event of a DDoS:
>>>                     one recommendation based on some conversations
>>>                     going back at least to Prague 2019 was to hash
>>>                     packets on 4-tuple and filter those below a hash
>>>                     value chosen for a desired ingress limit instead
>>>                     of doing what most operators do with UDP today,
>>>                     which is to cap UDP throughput and just drop
>>>                     packets randomly or tail drop.
>>>
>>>                 Interesting. Did they consider using the CID, or a
>>>                 fraction of it? This looks entirely like the
>>>                 scenario for which we developed stateless retry.
>>>
>>>                      
>>>
>>>                     This recommendation certainly imposes some
>>>                     constraints on future protocol development that
>>>                     motivate new invariants: for instance, it would
>>>                     preclude sharding a connection across multiple
>>>                     source ports (not that there is necessarily a
>>>                     reason to do this; it's just an example). But
>>>                     more importantly, it goes beyond invariants:
>>>                     it's one among many practices compatible with
>>>                     the current set of invariants, some reasonable
>>>                     and some terrible.
>>>
>>>                 This would break the "preferred address"
>>>                 redirection. Preferred address migration may or may
>>>                 not be spelled out in the invariants.
>>>
>>>                      
>>>
>>>                     Operators are going to do things to QUIC
>>>                     traffic, so it would be good to offer them
>>>                     recommendations that are compatible with broad
>>>                     deployability.
>>>
>>>                  
>>>
>>>                 Yes, we do need the invariants for that.
>>>
>>>                 -- Christian Huitema
>>>
