Re: [Int-area] Adam Roach's Discuss on draft-ietf-intarea-provisioning-domains-10: (with DISCUSS and COMMENT)

Tommy Pauly <tpauly@apple.com> Thu, 23 January 2020 00:18 UTC

Sender: tpauly@apple.com
From: Tommy Pauly <tpauly@apple.com>
Message-id: <B873983A-1327-4388-8B9E-BEC8D2008450@apple.com>
Content-type: multipart/alternative; boundary="Apple-Mail=_0D50D5D0-3D1B-4A3A-9E91-CB7AAE3AB6EF"
MIME-version: 1.0 (Mac OS X Mail 13.0 \(3594.4.17\))
Date: Wed, 22 Jan 2020 16:18:07 -0800
In-reply-to: <4b2b529f-9e67-b6d0-1a9c-b6ad5cd96f01@nostrum.com>
Cc: The IESG <iesg@ietf.org>, ek@loon.com, draft-ietf-intarea-provisioning-domains@ietf.org, int-area@ietf.org, intarea-chairs@ietf.org
To: Adam Roach <adam@nostrum.com>
References: <157967080772.28909.16443816599872682093.idtracker@ietfa.amsl.com> <6AFB6A09-59BF-411D-816F-914BAAF86A9B@apple.com> <a1daf959-3331-e86d-2734-1f63a98d7625@nostrum.com> <BF4953C0-2502-4E08-B8B3-B55D04475416@apple.com> <3c3fb029-be06-02a2-1ac2-d23a3183d09a@nostrum.com> <7BBB92DD-7C0D-4A30-AE5D-3DB6A8424B9A@apple.com> <4b2b529f-9e67-b6d0-1a9c-b6ad5cd96f01@nostrum.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/int-area/2-J-d2H8iYVJQ-LY_3_OsJyBZkA>
Subject: Re: [Int-area] Adam Roach's Discuss on draft-ietf-intarea-provisioning-domains-10: (with DISCUSS and COMMENT)
Precedence: list


> On Jan 22, 2020, at 4:08 PM, Adam Roach <adam@nostrum.com> wrote:
> 
> Thanks again for the quick turn-around on this.
> 
> Using your proposed 2**(Delay + 10) seems to strike an okay balance, if I'm understanding the situation correctly. Double-check my thinking here: an attacker's RAs will reach only a single local link, which deployments typically limit to on the order of 500 clients. If all 500 are triggered at the same time and spread their requests over a one-second window, we're looking at a 500 TPS load on a web server. That's about 25% of the capacity of a relatively low-end web server (e.g., Apache running on an Atom 1.66), which seems small enough to avoid major issues.

Yes, that sounds right to me. That limits the size of a single burst in response to a given RA. Each of these hosts wouldn't request that PvD ID again for another ten seconds, even if RAs keep arriving that tell the host to update. And the limit on the number of different hostnames (for a case in which a wildcard host exists, which presumably also has a higher capacity) is 5 different names in the ten-second period, so that caps the load at 2500 TPS across all fetches to any server from a given network.
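
To make the arithmetic above concrete, here is a rough sketch. The function name `burst_tps` is made up for illustration, and it assumes the worst case in which every client's fetches for all hostnames land in the same one-second window:

```python
def burst_tps(clients: int, window_seconds: float, hostnames: int = 1) -> float:
    """Worst-case requests per second when every client fetches once per
    hostname and the requests are spread evenly over the window."""
    return clients * hostnames / window_seconds


# 500 clients on one link, one PvD ID, requests smoothed over one second.
single_name = burst_tps(clients=500, window_seconds=1.0)              # 500.0

# Same link, but the attacker cycles through the 5 allowed hostnames.
five_names = burst_tps(clients=500, window_seconds=1.0, hostnames=5)  # 2500.0
```

The 500-client figure is the estimate from the quoted message, not a value taken from the draft.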
> 
> So, unless one of my assumptions above is wrong, I think your proposal below is a good solution to the issue. I'll clear my DISCUSS when a new version of the draft comes out (I would propose that you wait for instructions from your AD about when to do so).

Thanks! Yes, I'll wait for the go-ahead from Suresh. I appreciate your helping to work through these important details!

Best,
Tommy
> 
> /a
> 
> On 1/22/20 17:51, Tommy Pauly wrote:
>> Hi Adam,
>> 
>> Thanks for the feedback! The updated paragraph in the retrieval section, to indicate a maximum failure count per attachment, is:
>> 
>> If the request for PvD Additional Information fails due to a TLS error,
>> an HTTP error, or because the retrieved file does not contain valid PvD JSON,
>> hosts MUST close any connection used to fetch the PvD Additional Information,
>> and MUST NOT request the information for that PvD ID again for the duration
>> of the local network attachment. If a host detects 10 or more such failures
>> to fetch PvD Additional Information, the local network is assumed to be
>> misconfigured or under attack, and the host MUST NOT make any further
>> requests for PvD Additional Information, belonging to any PvD ID, for
>> the duration of the local network attachment. For more discussion, see {{security}}.
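
One way a host might track the two limits in the paragraph above is sketched here. This is only an illustration: the class and method names are hypothetical, and the state is assumed to be discarded whenever the host re-attaches to a network.

```python
from dataclasses import dataclass, field

MAX_FAILURES_PER_ATTACHMENT = 10  # threshold from the proposed draft text


@dataclass
class AttachmentFetchState:
    """Hypothetical bookkeeping for one local network attachment."""
    failed_pvd_ids: set = field(default_factory=set)
    failure_count: int = 0

    def may_fetch(self, pvd_id: str) -> bool:
        # After 10 or more failures, stop fetching for every PvD ID.
        if self.failure_count >= MAX_FAILURES_PER_ATTACHMENT:
            return False
        # Otherwise, only PvD IDs that already failed are blocked.
        return pvd_id not in self.failed_pvd_ids

    def record_failure(self, pvd_id: str) -> None:
        # Call on a TLS error, an HTTP error, or invalid PvD JSON.
        # Closing the connection is left to the caller.
        self.failed_pvd_ids.add(pvd_id)
        self.failure_count += 1
```

A caller would check `may_fetch()` before each request and call `record_failure()` on any of the three failure cases.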
>> 
>> I've also expanded the security considerations DoS section as follows:
>> 
>> An attacker generating RAs on a local network can use the H-flag and the PvD ID
>> to cause hosts on the network to make requests for PvD Additional Information
>> from servers. This can become a denial-of-service attack, in which an attacker
>> can amplify its attack by triggering TLS connections to arbitrary servers in response
>> to sending UDP packets containing RA messages. To mitigate this attack, hosts
>> MUST:
>> 
>> - limit the rate at which they fetch a particular PvD's Additional Information;
>> - limit the rate at which they fetch any PvD Additional Information on a given local
>> network;
>> - stop making requests for a PvD ID that does not respond with valid JSON;
>> - stop making requests for all PvD IDs once a certain number of failures is reached
>> on a particular network.
>> 
>> Details are provided in {{retr}}. This attack can be targeted at generic web servers,
>> in which case the host behavior of stopping requesting for any server that doesn't
>> behave like a PvD Additional Information server is critical. Limiting requests for
>> a specific PvD ID might not be sufficient if the attacker changes the PvD ID values
>> quickly, so hosts also need to stop requesting if they detect consistent failure when
>> on a network that is under attack. For cases in which an attacker is pointing hosts at
>> a valid PvD Additional Information server (but one that is not actually associated
>> with the local network), the server SHOULD reject any requests that do not originate
>> from the expected IPv6 prefix as described in {{serverop}}.
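
The server-side check in the last sentence above could look roughly like this. It is a sketch using Python's standard `ipaddress` module; the function name and the example prefix are assumptions, and how a real server learns its expected prefixes is outside what this text specifies.

```python
import ipaddress


def request_allowed(client_addr: str, expected_prefixes: list) -> bool:
    """Return True if the client's source address falls inside one of the
    IPv6 prefixes this PvD Additional Information server expects to serve."""
    addr = ipaddress.ip_address(client_addr)
    return any(addr in ipaddress.ip_network(p) for p in expected_prefixes)


# Documentation prefix (2001:db8::/32) used purely as an example.
expected = ["2001:db8:1::/48"]
```

With `expected` as above, a request from `2001:db8:1::42` would be accepted and one from `2001:db8:2::42` would be rejected.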
>> 
>> For the delay calculation, you make a good point that the larger values get pretty unnecessarily large! I'm a bit concerned about making the minimum fetch range be ~4 seconds, as that could end up being user visible for some valid scenarios. How about making the formula "2**(10 + Delay)":
>> 
>> The target time for the delay is calculated
>> as a random time between zero and 2**(10 + Delay) milliseconds,
>> where 'Delay' corresponds to the 4-bit unsigned integer in
>> the last received PvD Option.
>> 
>> This makes 1 second the smallest upper bound on the delay that an RA can request. This isn't incredibly fast, and with the overall limits for how many requests can be made by a client (which provide the larger portion of the DoS prevention, I'd argue), I think this strikes a good balance between usability and precaution. Thoughts?
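
The delay calculation described above can be sketched as follows. The function name is hypothetical, and a uniform distribution is an assumption, since the text only says "a random time":

```python
import random
from typing import Optional


def fetch_delay_ms(delay_field: int, rng: Optional[random.Random] = None) -> float:
    """Pick a random delay between zero and 2**(10 + Delay) milliseconds,
    where Delay is the 4-bit field from the last received PvD Option."""
    if not 0 <= delay_field <= 15:
        raise ValueError("Delay is a 4-bit unsigned integer (0..15)")
    source = rng if rng is not None else random
    return source.uniform(0, 2 ** (10 + delay_field))


# Upper bounds at the two extremes of the field:
smallest_bound_ms = 2 ** (10 + 0)    # 1024 ms, about 1 second
largest_bound_ms = 2 ** (10 + 15)    # 33554432 ms, about 9.3 hours
```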
>> 
>> I've updated the GitHub text for anyone wanting to see the full flow: https://github.com/IPv6-mPvD/mpvd-ietf-drafts/pull/25
>> 
>> Thanks,
>> Tommy
>> 
>>> On Jan 22, 2020, at 2:58 PM, Adam Roach <adam@nostrum.com> wrote:
>>> 
>>> Thanks for the explanation and the further proposed mitigation.
>>> 
>>> Allowing the RA to specify an arbitrarily small "Delay" parameter seems to still allow for a pretty big burst of traffic. If I read the proposed interpretation of the "Delay" bits correctly (2**(Delay * 2)), the current behavior is specified to allow a delay upper bound selected from one of the following (approximate) values:
>>> 
>>> 1 ms
>>> 4 ms
>>> 16 ms
>>> 64 ms
>>> 256 ms
>>> 1 second
>>> 4 seconds
>>> 16 seconds
>>> 1 minute
>>> 4 minutes
>>> 17 minutes
>>> 70 minutes
>>> 4 hours, 40 minutes
>>> 18 hours, 38 minutes
>>> 3 days, 3 hours
>>> 1 week, 5 days
>>> 
>>> That's a pretty breathtaking scope, and it's hard to imagine that the first six or so are strictly needed, while all six are in a range that might overload a DDoS target. The final several seem a bit questionable as well, given normal operational timelines for network attachment. If the formula were revised to, e.g., "2**(Delay + 12)" instead of the current formula, you would have an enforced lower bound of roughly four seconds (which should be enough to blunt most DDoS attacks), and an upper bound of roughly 37 hours (which still seems excessive, although not quite as much as the previous upper bound).
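
The bounds quoted above can be checked with a short calculation. This is a sketch only: the dictionary and helper names are made up, and all values are in milliseconds.

```python
# Upper bound on the fetch delay, in milliseconds, for a 4-bit Delay value.
FORMULAS = {
    "2**(Delay * 2)": lambda d: 2 ** (d * 2),
    "2**(Delay + 12)": lambda d: 2 ** (d + 12),
}


def bound_range_ms(name: str) -> tuple:
    """Return the (smallest, largest) upper bound over Delay = 0..15."""
    f = FORMULAS[name]
    return f(0), f(15)


MS_PER_HOUR = 3_600_000
MS_PER_DAY = 24 * MS_PER_HOUR

current = bound_range_ms("2**(Delay * 2)")    # 1 ms up to 2**30 ms
revised = bound_range_ms("2**(Delay + 12)")   # 4096 ms up to 2**27 ms

current_max_days = current[1] / MS_PER_DAY    # about 12.4 days
revised_max_hours = revised[1] / MS_PER_HOUR  # about 37.3 hours
```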
>>> 
>>> Assuming the additional mitigation you propose below (10 maximum failures per attachment) as well as some means of achieving a lower-bound for "Delay" on the order of multiple seconds, I think I'm good clearing when a new version comes out.
>>> 
>>> Thanks for your work in thinking through practical solutions to this issue.
>>> 
>>> /a
>>> 
>> 
>
