[sfc] Benjamin Kaduk's Discuss on draft-ietf-sfc-serviceid-header-12: (with DISCUSS and COMMENT)

Benjamin Kaduk via Datatracker <noreply@ietf.org> Wed, 04 November 2020 04:36 UTC

From: Benjamin Kaduk via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-sfc-serviceid-header@ietf.org, sfc-chairs@ietf.org, sfc@ietf.org, Greg Mirsky <gregimirsky@gmail.com>, gregimirsky@gmail.com
Auto-Submitted: auto-generated
Precedence: bulk
Reply-To: Benjamin Kaduk <kaduk@mit.edu>
Message-ID: <160446460674.32614.15877665689972040502@ietfa.amsl.com>
Date: Tue, 03 Nov 2020 20:36:46 -0800
Archived-At: <https://mailarchive.ietf.org/arch/msg/sfc/dR-WEbfJizLEwq_qedU5lNtlgcs>
Subject: [sfc] Benjamin Kaduk's Discuss on draft-ietf-sfc-serviceid-header-12: (with DISCUSS and COMMENT)

Benjamin Kaduk has entered the following ballot position for
draft-ietf-sfc-serviceid-header-12: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-sfc-serviceid-header/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

Retaining my original Discuss position (without the "early warning"
note), as it is the one that was supported by Martin D and Alvaro:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
This document defines (among other things) a mechanism for carrying
subscriber information in an NSH.  RFC 8300 (NSH) notes both that:

                                             Metadata privacy and
   security considerations are a matter for the documents that define
   metadata format.

and that:

      One useful element of providing privacy protection for sensitive
      metadata is described under the "SFC Encapsulation" area of the
      Security Considerations of [RFC7665].  Operators can and should
      use indirect identification for metadata deemed to be sensitive
      (such as personally identifying information), significantly
      mitigating the risk of a privacy violation.  In particular,
      subscriber-identifying information should be handled carefully,
      and, in general, SHOULD be obfuscated.

On the other hand, this document in its security considerations claims
that:

   Data plane SFC-related security considerations, including privacy,
   are discussed in [RFC7665] and [RFC8300].

and does not seem to incorporate any discussion of the privacy and
security considerations of the subscriber information metadata carried
by the new format it conveys.  Yes, it does note that all nodes with
access to the information are part of the same trusted domain, but I do
not think that is sufficient, especially given that personally
identifiable information is often subject to strict compliance regimes.

In short, 8300 and this document are referring to each other for privacy
considerations, and the actual privacy considerations do not seem to be
documented in either place.

Additionally, I did not see any indication of how the
subscriber-identifying information ought to be obfuscated (or an
explanation of why it is okay to violate the SHOULD from 8300).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

To record some additional synthesis of the above (original) remark with
my more thorough reading of the document: we are defining containers
specifically to contain subscriber and performance policy
identifiers/information.  While the specific contents are out of scope
for this document, we still are obligated to describe the general
classes of issues that can arise due to conveying those types of
information within an SFC domain.  We should also give guidance on how to
populate the contents of these context headers in a secure and
privacy-supporting manner, including the use of indirect identification
and obfuscation/encryption.

Furthermore (and this part is not a discuss point but may lead to me
switching my position to Abstain once the discusses are resolved), I
have some misgivings about including subscriber identification
information at all, and would prefer if it could instead be translated
into the relevant policy information element(s) needed by the SFP in
question before being applied to the NSH.  For example, rather than
saying "this packet is from user X" we could say "this packet is part of
quota bucket ABC (with bucket size Z) for time period Y" to enforce
per-user quota.  While in this case the identifier would still
ultimately lead back to an individual, the identifier would be rotated
periodically, and it is possible to achieve some level of de-linkability
as records age out (depending on how the "ABC" is generated, of course).
I do recognize that even for non-quota use cases where a user is part of
multiple distinct policy groups, the combination of those groups might
still identify only a small anonymity set, but the overall privacy
properties of such a design seem superior to consistent use of a
persistent identifier or identifiers, in aggregate.
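
To make the quota-bucket idea concrete, here is a minimal sketch of one way
a rotating bucket label could be derived.  Nothing here comes from the
draft: the function name, the HMAC-SHA-256 construction, the 8-byte
truncation, and the one-day period are all assumptions chosen for
illustration.

```python
import hashlib
import hmac


def quota_bucket_id(secret: bytes, subscriber: str, now: int,
                    period_s: int = 86400) -> str:
    """Derive an opaque bucket label that changes every period_s seconds.

    Hypothetical helper: `secret` is a key known only to the node that
    maps subscribers to buckets, `now` is a Unix timestamp in seconds.
    """
    # Index of the current time period ("time period Y" in the text).
    period_index = now // period_s
    msg = f"{subscriber}|{period_index}".encode("utf-8")
    # Keyed hash: without `secret`, labels from different periods
    # cannot be linked back to the same subscriber.
    digest = hmac.new(secret, msg, hashlib.sha256).digest()
    # Truncate to 8 bytes (an arbitrary choice for this sketch).
    return digest[:8].hex()
```

Within one period the same subscriber always maps to the same label, so
per-user quota can still be enforced; once the period rolls over, the label
changes.  How linkable old labels remain depends on how long the key and
the mapping records are retained.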

I have an additional Discuss point after doing a more thorough review of
the document -- I think there's a (minor) internal inconsistency within
Section 3:

   Intermediary NSH-aware nodes have to preserve Subscriber Identifier
   Context Headers (i.e., the information can be passed to next hop NSH-
   aware nodes), but local policy may require an intermediary NSH-aware
   node to strip a Subscriber Identifier Context Header after processing
   it.

since it seems to say that NSH-aware intermediary nodes both "have to
preserve" and "may strip" a Subscriber Identifier Context Header.
Similar language is used to describe the Performance Policy Identifier
Context Header, in Section 4, which would presumably receive a similar
modification to the Subscriber Identifier case.


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

I am preparing a significant number of (hopefully) editorial suggestions
in a local copy of the XML source; I plan to send that as a diff to the
authors as a response to the ballot mail.

Section 1

   This document does not make any assumption about the structure of
   subscriber or performance policy identifiers; each such identifier is
   treated as an opaque value.  The semantics and validation of these
   identifiers are policies local to an SFC-enabled domain.  This
   document focuses on the data plane behaviour.  Control plane
   considerations are out of the scope.

(Control plane considerations probably touch on the privacy properties
of the system as a whole, but I think I understand the point being made
here.)

Section 3

   The classifier and NSH-aware SFs MAY inject or strip a Subscriber
   Identifier Context Header as a function of a local policy.  In order
   to prevent interoperability issues, the type and format of the
   identifiers to be injected in a Subscriber Identifier Context Header
   should be configured to nodes authorized to inject and consume such
   headers.  [...]

I think this is more of a "needs to" than a "should".

             For example, a node can be instructed to insert such data
   following a type/set scheme (e.g., node X should inject subscriber ID
   type Y).  Other schemes may be envisaged.

Just to check my understanding, the fact that it was node X that
injected a given subscriber ID context header is determined from the
SPI, since the SPI uniquely identifies the SFP?

   Failures to inject such headers should be logged locally while a
   notification alarm may be sent to a Control Element.  The details of
   sending notification alarms (i.e., the parameters affecting the
   transmission of the notification alarms depend on the information in
   the Context Header such as frequency, thresholds, and content of the
   alarm (full header, timestamp, etc.)) should be configurable.

While this is purely an editorial remark (and I will be including a
proposal in my editorial diff), I did want to note that I can't actually
parse how the (outer) parenthetical is supposed to apply, and expect
that my suggestion is going to change the meaning somehow.

   This document adheres to the recommendations in [RFC8300] for
   handling the Context Headers at both ingress and egress SFC boundary
   nodes (i.e., to strip such Context Headers).  Revealing any personal
   and subscriber-related information to third parties is avoided by
   design to prevent privacy breaches in terms of user tracking.

I think s/third parties/parties outside the administrative domain/ may
be more accurate.

I would also end the last sentence after "design" and split the
remaining portion into a new sentence: "Accordingly, the scope for
privacy breaches and user tracking is limited to within the
administrative domain where the NSH is used".

   if the NSH-aware SF is instructed to do so.  For example, an SF that
   expects an internal IP address as subscriber identifier will discard
   Subscriber Identifier Context Headers conveying Mobile Subscriber
   ISDN Number (MSISDN), International Mobile Subscriber Identity
   (IMSI), or malformed IP addresses.

There's a little bit of cognitive dissonance here in that we say that
the content of the context header is opaque, yet are giving a list of
things that would involve peeking inside that opacity (and possibly some
additional structure to identify the type).  (No specific action
requested here, just making an observation.)
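
As an illustration of the kind of check the quoted example implies, the
following sketch shows an SF accepting a value only when it parses as an
internal IP address.  The encoding is an assumption: the value is taken to
be a packed 4- or 16-byte address, and "internal" is approximated by
Python's `is_private` test.  The draft specifies neither.

```python
import ipaddress


def accept_subscriber_id(value: bytes) -> bool:
    """Return True if `value` looks like a packed private IP address.

    Hypothetical check: assumes the opaque value carries a packed
    IPv4 (4-byte) or IPv6 (16-byte) address.
    """
    # Anything that is not 4 or 16 bytes cannot be a packed address,
    # so e.g. an MSISDN or IMSI carried as ASCII digits is rejected.
    if len(value) not in (4, 16):
        return False
    addr = ipaddress.ip_address(value)
    # "Internal" is approximated here by the private-range test.
    return addr.is_private
```

Note that the length test is only a heuristic: any 4- or 16-byte value
parses as some address, so a check like this cannot reliably tell
identifier types apart without additional structure that indicates the
type.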

Section 4

   instance selection, or policy selection at SFs.  Note that the use of
   the Performance Policy Identifier is not helpful if the path
   computation is centralized and a strict SFP is presented as local
   policy to SF Forwarders (SFFs).

I'm not sure I understand why this is the case; wouldn't the performance
policy information still be useful for, e.g., not dropping UHR packets
when buffers are full, regardless of whether the SFP is strict or not?

   performance, or proximity considerations.  For the particular case of
   UHR services, the stand-by operation of back-up capacity or the
   deployment of multiple SF instances may be requested.

I'm not sure that I understand how a per-packet header would request
*deployment* of multiple instances.  Perhaps the intent is to say that
the presence of multiple instances is requested?  (Or perhaps I
misunderstand, of course.)

   In an environment characterised by frequent changes of link and path
   behaviour, for example due to variable load or availability caused by
   propagation conditions on a wireless link, the SFP may have to be
   adapted dynamically by on-the-move SFC path and SF instance
   selection.

(Is "on-the-move selection" a new term here?  I kind of thought we had
used a different term to describe this type of behavior in a previous
document, but couldn't find it quickly.)

   Multiple Performance Policy Identifier Context Headers MAY be present
   in the NSH, each carrying an opaque value for a distinct policy that
   need to be enforced for a flow.  Supplying conflicting policies may
   complicate the SFP computation and SF instance location.
   Corresponding rules to detect conflicting policies may be provided as
   a local policy to the NSH-aware nodes.  When such conflict is
   detected by an NSH-aware node, the default behavior of the node is to
   discard the packet and send a notification alarm to a Control
   Element.

[It seems like we are providing guidance on the "default behavior" of
something that is entirely dependent on there being some local policy,
which is perhaps an unusual thing to do.  I don't object to it, per se,
though.]

Section 6

When we recommend logging on exception conditions, we typically also
include a note about the risk of DoS due to log spew.
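
One common mitigation for log spew is to cap the number of messages
emitted per time window.  The sketch below is illustrative only: the class
name, parameters, and fixed-window scheme are assumptions, not something
the draft describes.

```python
import logging
import time


class RateLimitedLogger:
    """Emit at most `max_per_window` warnings per `window_s` seconds."""

    def __init__(self, logger, max_per_window, window_s=1.0,
                 clock=time.monotonic):
        self._logger = logger
        self._max = max_per_window
        self._window_s = window_s
        self._clock = clock  # injectable so the behavior can be tested
        self._window_start = clock()
        self._count = 0
        self._suppressed = 0

    def warning(self, msg) -> bool:
        """Log `msg` if the budget allows; return whether it was logged."""
        now = self._clock()
        if now - self._window_start >= self._window_s:
            # New window: report what was dropped, then reset counters.
            if self._suppressed:
                self._logger.warning("suppressed %d messages",
                                     self._suppressed)
            self._window_start = now
            self._count = 0
            self._suppressed = 0
        if self._count < self._max:
            self._count += 1
            self._logger.warning(msg)
            return True
        # Over budget: count the message instead of writing it.
        self._suppressed += 1
        return False
```

With such a cap, an attacker who triggers header-injection failures at a
high rate can cause at most a bounded number of log writes per window; the
suppressed count keeps the scale of the event visible.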

With respect to privacy considerations, whenever we have multiple
Subscriber Identifier Context Headers present we are providing the
information that those different (types of) identifiers identify the same
subscriber or individual.  This can be used to correlate other
observations and track that individual more effectively.

If one were able to spoof a Performance Policy Identifier Context
Header, one would be in a position to steal (quality of) service, which
is related to the "service disruption" attack already discussed in the
text, but IMO distinct from it.

   trusted ([RFC8300]).  Means to check that only authorized nodes are
   solicited when a packet is crossing an SFC-enabled domain are out of
   scope of this document.

I'm not entirely sure that I understand what "solicited" is being used
for, here.  Is it something to do with the source/destination (outside
the SFC domain) of the packet, or the nodes on the path within the SFC
domain, or something else?

Section 8.2

[RFC8459] seems to be unused.
