Re: [tsvwg] Benjamin Kaduk's Discuss on draft-ietf-tsvwg-rfc4960-bis-17: (with DISCUSS and COMMENT)

tuexen@fh-muenster.de Sun, 16 January 2022 14:11 UTC

Content-Type: multipart/signed; boundary="Apple-Mail=_CC604499-1102-4A37-ADB7-020EA4B3BA4B"; protocol="application/pkcs7-signature"; micalg="sha-256"
Mime-Version: 1.0 (Mac OS X Mail 15.0 \(3693.40.0.1.81\))
From: tuexen@fh-muenster.de
In-Reply-To: <7EC2B47C-63BA-481A-B8A4-415BE9048767@fh-muenster.de>
Date: Sun, 16 Jan 2022 15:11:34 +0100
Cc: Gorry Fairhurst <gorry@erg.abdn.ac.uk>, tsvwg <tsvwg@ietf.org>, draft-ietf-tsvwg-rfc4960-bis@ietf.org, The IESG <iesg@ietf.org>, tsvwg-chairs <tsvwg-chairs@ietf.org>
Content-Transfer-Encoding: quoted-printable
Message-Id: <F6537B33-5A9B-4A0C-9247-3C96C135A75D@fh-muenster.de>
References: <163966609848.27749.5437634889896180266@ietfa.amsl.com> <7EC2B47C-63BA-481A-B8A4-415BE9048767@fh-muenster.de>
To: Benjamin Kaduk <kaduk@mit.edu>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/b85lWBsQaHnoB961DSXhfT31U7o>
Subject: Re: [tsvwg] Benjamin Kaduk's Discuss on draft-ietf-tsvwg-rfc4960-bis-17: (with DISCUSS and COMMENT)
Precedence: list

Hi Benjamin,

Martin suggested to submit a new version of the ID with suggested resolutions of
issues you raised. So the comments below describe what I changed when trying to
address the issues you raised.

Please let me know if you want additional changes, don't like my suggestions or
if you need more information regarding the questions you raised (in particular
the ones you raised on your DISCUSS section).

Just to be clear: I'm submitting this to try to progress work on this document,
not because I'm not willing to accept any further changes based on an upcoming
discussion.

Best regards
Michael

> On 26. Dec 2021, at 13:00, tuexen@fh-muenster.de wrote:
> 
>> On 16. Dec 2021, at 15:48, Benjamin Kaduk via Datatracker <noreply@ietf.org> wrote:
>> 
>> Benjamin Kaduk has entered the following ballot position for
>> draft-ietf-tsvwg-rfc4960-bis-17: Discuss
>> 
>> When responding, please keep the subject line intact and reply to all
>> email addresses included in the To and CC lines. (Feel free to cut this
>> introductory paragraph, however.)
>> 
>> 
>> Please refer to https://www.ietf.org/blog/handling-iesg-ballot-positions/
>> for more information about how to handle DISCUSS and COMMENT positions.
>> 
>> 
>> The document, along with other ballot positions, can be found here:
>> https://datatracker.ietf.org/doc/draft-ietf-tsvwg-rfc4960-bis/
>> 
>> 
>> 
>> ----------------------------------------------------------------------
>> DISCUSS:
>> ----------------------------------------------------------------------
>> 
>> My apologies for placing a Discuss so close to the telechat.  I believe
>> that both of these topics are comparatively minor and should be easy
>> to resolve, but that it's important for the document to have a clear
>> answer for them.
> No problem. Whatever improves the document is appreciated.
>> 
>> I ask this with nospecific answer in mind that I need to hear -- per my
>> comments on §5.1.3, what are the actual requirements on the
>> (cryptographic) protection of the State Cookie?  I feel like I got
>> different signals from different parts of the document, and it would be
>> good to have consistent messaging throughout.
> There are two things relevant here:
> (1) The sender of the state cookie considers the IP address, it send the
>    state cookie to, belonging to the peer.
> (2) The sender of the state cookie uses the information in it, when
>    receiving the state cookie, to establish the association and wants
>    to be sure that it has not been modified.
> There is no requirement for privacy, since the information in the cookie
> has been on the wire as part of the INIT and/or INIT ACK.
> 
> Does that make sense to you and is consistent with the signals you got?
Do you need more information?
>> 
>> Section 15.5 establishes a registry for payload protocol identifiers,
>> but I am not sure how this registry is supposed to be able to
>> effectively avoid collisions when we do not specify the endianness in
>> which the value is represented on the wire (per §3.3.1).
> Section 3 contains, just before Subsection 3.1:
> All integer fields in an SCTP packet MUST be transmitted in network byte order,
> unless otherwise stated.
> 
> So the PPID is transmitted always in network byte order, since its description
> does not make any other statement. The only statement that is made is
> that any byte order conversions must be done by the application, this is
> not automatically done by the SCTP implementation.
Do you need more information?
>> 
>> 
>> ----------------------------------------------------------------------
>> COMMENT:
>> ----------------------------------------------------------------------
>> 
>> Peeking in the github repo for purposes of preparing an editorial PR, I
>> see that the original RFC 4960 was prepared in nroff -- my thanks to the
>> authors for the thankless work of converting it to XML!
>> 
>> I'm also glad to see the robustness changes made to the procedures for
>> handling INIT (and INIT ACK) chunks with invalid parameters.
>> 
>> I am also sympathetic to Alvaro's point about obsoleting additional
>> documents.
> This document now also obsoletes RFC 4460 and RFC 8540.
>> 
>> I only got a chance to review the diff from RFC 4960 and a small
>> selection of portions of this document, so this review is not as
>> comprehensive as I would typically perform.  That said, I trust that the
>> document has received sufficient review and therefore am comfortable
>> reballoting No Objection once my discuss points are resolved.
>> 
>> I put some editorial suggestions in a PR on github:
>> https://github.com/sctplab/rfc4960bis/pull/79
> Since I guess you want the discussion of these proposed changes in the
> GitHub infrastructure, please see my comments for the pull request
> at Github.
>> 
>> Abstract
>> 
>> I see that Warren already remarked about the length of the Abstract.
>> I have in my bookmarks
>> https://www.rfc-editor.org/policy.html#policy.abstract that indicates
>> "[a]n Abstract of more than 20 lines is generally not acceptable."
> The abstract is based on what is in RFC 4960 and then a first paragraph was
> added to discuss which documents are obsoleted.
> 
> The document you are linking states at the beginning:
> Note: This page has been replaced by the RFC Style Guide. Please see that page for current information.
> Looking at
> https://www.rfc-editor.org/rfc/rfc7322.html#section-4.3
> I don't see this section mentioning the limitation of number of lines.
> 
> I'm also not sure which format should be used for counting lines
> (isn't the .xml format the binding one), and if empty lines count.
> 
> I'm happy to remove the paragraph about the documents being obsoleted,
> tweak the .xml to remove the number of empty lines (when rendered as
> ASCII) or whatever you suggest. Please advise.
Please advise.
>> 
>> Section 3.3.4
>> 
>>  Cumulative TSN Ack: 32 bits (unsigned integer)
>>     The largest TSN, such that all TSNs smaller than or equal to it
>>     have been received and the next one has not been received.
>> 
>> (nit) The "and the next one has not been received" is implied by
>> "largest ... such that", since if the next one had been received, this
>> one would not be the largest such that the condition holds.
>> (This phrasing also seemse to appear in §3.3.8.)  I didn't remove this
>> in my editorial PR because it is plausible that the redundancy is
>> desired for emphasis.
> The text in 3.3.4 and 3.3.8 should be consistent.
> I agree, that the text you cite is redundant, but this redundancy is
> there on purpose as a result of feedback got via WG LC. If you prefer
> not to have the redundancy, please let me know and I will remove the
> "and the next one has not been received" part.
I kept the redundant text. I'll remove it on explicit request from you.
>> 
>> 
>> Section 5.1.3
>> 
>> When would it be permissible to not include a MAC on the State Cookie
>> data?  It seems like we refer to it as essentially mandatory from other
> You need some protection against changes by others. So encryption
> would also be a possibility with some checks. That it why we did not
> specify a MUST here. But we put in the spec how to handle the case
> when using a MAC.
>> parts of the spec, so I was surprised to see it only as a "SHOULD" here.
>> E.g., in §5.1.5 the first step in processing COOKIE ECHO is "compute a
>> MAC [and compare it to the one in the cookie]" -- there's no point in
>> doing that if there's no MAC in the cookie.  Similarly, in §14.1
>> "Parameters Necessary for the SCTP Instance" we list a secret key for
>> computing the MAC.
> Section 14 contains in the first paragraph:
> 
> This section is for illustrative purposes and is not considered to
> be requirements on an implementation...
> 
> So if you think we should make the MAC mandatory using MUST, please
> let me know and I do the change. All SCTP implementations I know of,
> use a MAC and don't encrypt the cookie. But there are implementation
> I don't know (but I'm aware that they exists).
>> 
>>  Since the State Cookie is not encrypted, it MUST NOT contain
>>  information which is not being envisioned to be shared.
>> 
>> What prevents the state cookie from being encrypted (other than
>> performance concerns)?  Perhaps we just need to require that *if* the
>> state cookie is not encrypted, it doesn't contain such information.
> OK. Changed to using "If the State Cookie..."
>> 
>> Section 6.2
>> 
>>  An SCTP receiver MUST NOT generate more than one SACK chunk for every
>>  incoming packet, other than to update the offered window as the
>>  receiving application consumes new data.  When the window opens up,
>>  an SCTP receiver SHOULD send additional SACK chunks to update the
>>  window even if no new data is received.  The receiver MUST avoid
>>  sending a large number of window updates -- in particular, large
>>  bursts of them.  [...]
>> 
>> "a large number" feels somewhat subjective, such that the power of the
>> "MUST"-level requirement is weakened.
> I understand. But I don't want to force any particular implementation
> here. That is why this text is generic. It is hard to specify a number
> or rate...
>> 
>>  If an endpoint receives a DATA chunk with no user data (i.e., the
>>  Length field is set to 16), it MUST send an ABORT chunk with a "No
>>  User Data" error cause.
>> 
>>  An endpoint SHOULD NOT send a DATA chunk with no user data part.
>>  This avoids the need to be able to return a zero-length user message
>>  in the API, especially in the socket API as specified in [RFC6458]
>>  for details.
>> 
>> This is just a COMMENT since the behavior was present in 4960, but why
>> allow sending an empty data chunk when the receiver is required to abort
>> because of it?  This seems like the reverse of the Postel principle.
> I agree.
> I think both should be the same level. If you API does allow handling of
> empty messages, the protocol could do it. So should we change the first
> text to:
> 
>  If an endpoint receives a DATA chunk with no user data (i.e., the
>  Length field is set to 16), it SHOULD send an ABORT chunk with a "No
>  User Data" error cause.
> 
> This improves consistency, still describes what is implemented and
> provides a way for implementations handling empty messages in the API
> (the RECEIVE primitive would allow this).
Revision 18 contains the above text. Please let me know if this is not
appropriate in your view.
>> 
>> Section 6.5
>> 
>>  Every DATA chunk MUST carry a valid stream identifier.  If an
>>  endpoint receives a DATA chunk with an invalid stream identifier, it
>>  SHOULD acknowledge the reception of the DATA chunk following the
>>  normal procedure, immediately send an ERROR chunk with cause set to
>>  "Invalid Stream Identifier" (see Section 3.3.10), and discard the
>>  DATA chunk.  [...]
>> 
>> What other behaviors are permitted (i.e., why is this SHOULD vs MUST)?
> An implementation might decide that sending a SACK, but not being able
> to deliver the user data to the application, is something which triggers
> an ABORT. 
>> 
>>  The Stream Sequence Number in all the outgoing streams MUST start
>>  from 0 when the association is established.  [...]
>> 
>> If we were starting from scratch this would probably be undesirable (a
>> la draft-gont-numeric-ids-sec-considerations), but I guess it's not
>> really changeable now, and we have to rely on the random initial TSN and
>> verifiers to prevent off-path guessing.
> You can't change this without breaking compatibility with existing
> implementations, which is what is not in scope for a -bis document in
> my mind.
> However, you would need to negotiate the initial values which would add
> up to 128 KB to the handshake.
> 
> Looking at
> https://www.rfc-editor.org/rfc/rfc9000.html#section-19.8
> it seems that QUIC does not randomise the initial Offset (i). It
> states "The first byte in the stream has an offset of 0.".
>> 
>> Section 11.2.5
>> 
>> Looking at the diff from RFC 4960, it seems that we no longer claim that
>> there is a "data retrieval id" passed with the communication-lost
>> notification.  I don't have a great sense for whether this makes sense,
>> e.g., due to an expectation that SEND-FAILURE events will be generated
>> separately for all affected messages.
> This was a mistake in the specification. The data retrieval id is used
> by RECEIVE_UNSENT and RECEIVE_UNACKED and is an identifier for a
> user message. If the communication is lost, there can be multiple messages
> to be retrieved. That is why you can't provide a single data retrieval id
> in a communication lost notification. If the communication is lost, the
> upper layer gets a single communication lost notification and a send failure
> notification for each message. That way the mechanism can work.
> 
> This mistake was not detected earlier, since this is a conceptual API.
> Most kernel implementations use the Socket API specified in RFC 6458,
> which does it a bit differently (due to Socket API constraints).
> 20 years ago we did a userland stack with an API based on the one given
> in RFC 2960, but we did not implement the data retrieval ID part.
> So we did not catch it...
>> 
>> Section 12
>> 
>> My apologies if I failed to find some relevant content in my
>> somewhat-piecemeal review, but I note that we require each HEARTBEAT
>> chunk to receive a corresponding HEARTBEAT ACK, so that (according to
>> the protocol rules) an endpoint can force the peer to do work.  Since
>> the heartbeat is expected to only occur once every RTO or so (or less
>> often), should we allow an endpoint to abort the association on receipt
>> of too many heartbeats?
> I would prefer not to specify anything. The sending of the HEARTBEAT
> can also be triggered by the application. And the network could also
> buffer several HEARTBEATs that were sent using an appropriate timing,
> but arrive (due to network buffering) in a bulk.
> 
> Any endpoint is free to abort an association anytime to protect itself.
> But I find it hard to specify appropriate thresholds and specifying
> is not needed for interoperability.
>> 
>> It also looks like it's allowed for a sender to leave gaps in the TSN
>> space when sending (non-fragmented) DATA chunks.  This is also the case
> No, that is not true. The sender needs to assign TSN consecutive.
> Section 3.3.1 contains:
> 
> The TSNs of DATA chunks sent SHOULD be strictly sequential.
> 
>> for QUIC (for the analogous numer space), and in QUIC's security
>> considerations there is a note that a sender who suspects the peer of
>> misbehaving (i.e., sending false/speculative ACKs) can deliberately
>> leave such a gap and abort the association if the non-existent TSN is
>> ACKed.  Do we want to make such a statement here as well?
> Section 12.4 contains:
> 
> An SCTP implementation SHOULD abort the association if it receives
> a SACK chunk acknowledging a TSN that has not been sent.
> 
> I think this is the best we can state...
>> 
>> Section 12.2.3
>> 
>>  Whenever ESP is in use, application-level encryption is not generally
>>  required.
>> 
>> I suggest removing this statement; we now have plenty of examples where
>> double-encryption is actually the right solution, and advances in
>> computing power have removed many reasons to desire avoiding it.
> I removed that statement.
>> 
>> Section 12.2.4.1
>> 
>>  The design of SCTP is resistant to flooding attacks, particularly in
>>  its use of a four-way startup handshake, its use of a cookie to defer
>>  commitment of resources at the responding SCTP node until the
>>  handshake is completed, and its use of a Verification Tag to prevent
>>  insertion of extraneous packets into the flow of an established
>>  association.
>> 
>> I suggest mentioning here that using distinct initial TSN and
>> initiate tag values improves robustness against such attacks, by making
>> attackers guess 32 more bits of values.  (§3.2.2 and 3.2.3 do allow for
>> the values to be the same, with concomitant reduced protection.)
> There are a couple of points here:
> 1. An endpoint can select the Initial TSN in whatever way it wants.
>   It will be able to interoperate.
> 2. Using a randomized Initial TSN makes it harder for an off-path
>   attacker to inject DATA chunks, but not other chunks. So the
>   impact of the selection of the Initial TSN is limited (to user data).
> 3. Selecting the Initial TSN to be the same value as the verification
>   tag is somewhat strange. It is random, but the same value as another
>   one. So it does not really give you increased security, possibly it
>   helps in debugging (since you see in a trace which does not contain
>   the initial hand shake the Initial TSN).
> 4. I am aware of an (old) implementation, which selected the Initial
>   TSN to be 0.
> 5. Major implementations select the Initial TSN as a random number.
> 
> So contemplating about it, I think we could remove the text that
> the verification tag and the initial TSN MAY be set tot the same value and
> add a text stating that the value SHOULD be a random number.
> 
> That reflects reality for a lot of implementations, provides a more
> consistent selection, but still allows other choices (for whatever
> reason).
Revision 18 now replaces

Defines the initial TSN that the sender will use. The valid range is
from 0 to 4294967295. This field MAY be set to the value of the
Initiate Tag field.

by

<t>Defines the initial TSN that the sender of the INIT chunk will use.
The valid range is from 0 to 4294967295 and the Initial TSN SHOULD be set to a
random value in that range.
The methods described in <xref target='RFC4086'/> can be used for the
Initiate TSN randomization.</t>

Similar text for the INIT ACK chunk.
>> 
>>  The IP Authentication Header and Encapsulating Security Payload might
>>  be useful in reducing the risk of certain kinds of denial-of-service
>>  attacks.
>> 
>> Current IPSECME WG guidance is to prefer ESP with NULL encryption over
>> AH, so I'd recommend not mentioning AH here.
> Text is changed to:
> 
> <t>ESP might be useful in reducing the risk of certain kinds of
> denial-of-service attacks.</t>
>> 
>> Section 12.4
>> 
>>  INIT ACK chunk will be larger than the original INIT chunk.  An SCTP
>>  implementation SHOULD attempt to make the INIT ACK chunk as small as
>>  possible to reduce the possibility of byte amplification attacks.
>> 
>> The emerging consensus in QUIC and (D)TLS seems to be to limit the
>> amplification factor to at most three times the received data; do we
>> want to mention that heuristic here?
> I'm not sure...
> 
> From a high level, a factor of 3 is OK. Since you can use the concatenation
> of the INIT chunk and the INIT ACK chunk (without the state cookie parameter)
> as the state cookie and you can assume that size of the INIT ACK chunk
> without the state cookie is about the size of the INIT chunk.
> 
> However, the state cookie also has some overhead, like the MAC, some
> address information, a timestamp, tie tags, some other implementation
> specific stuff.
> 
> Testing with the shortest possible INIT chunk of 20 bytes, FreeBSD responds
> with an INIT ACK chunk of 196 bytes and Linux responds with an INIT ACK chunk
> of 252 bytes. This can be optimised a bit (at least on FreeBSD), but already
> the HMAC-SHA1 gives 20 bytes overhead, which is as large as the input...
> 
> For larger INIT chunks (more supported extensions), the ratio would go against 3...
> 
> So I would keep the text as is.
>> 
>> Section 15.6
>> 
>>  SCTP services can use contact port numbers to provide service to
>>  unknown callers, as in TCP and UDP.  IANA is requested to open the
>>  existing "Service Name and Transport Protocol Port Number Registry"
>>  for SCTP using the following rules, which we intend to mesh well with
>>  existing port-number registration procedures.  [...]
>> 
>> I don't really see anything following that looks like rules for IANA to
>> follow.  Is that phrase stale text from 4960?  (Deferring the actual
>> procedures to RFC 6335 seems like the right approach.)
> What is following is the deferral to RFC 6335. But let us make this clearer.
> What about just using:
> 
> <t>SCTP services can use contact port numbers to provide service to unknown
> callers, as in TCP and UDP.
> An IESG-appointed expert reviewer supports IANA in evaluating SCTP port
> allocation requests, according to the procedure defined in
> <xref target='RFC8126'/>.
> The details of this process are defined in <xref target='RFC6335'/>.</t>
Revision 18 now contains the text suggested by me.
>> 
>> Section 18
>> 
>> Some of these references probably don't need to be classified as
>> normative.  e.g., RFC 3873 is the MIB for SCTP, but we don't require its
>> implementation, and RFC 4301 is only referenced for "operators might
>> consult [RFC4301]".
> Done.
>> 
>> Appendix A
>> 
>> I'd consider updating the note about the modifications made to the
>> sample code; noteworthy changes appear to include the use of fixed-width
>> types and some general cleanup for modern C best practices.
> I changed the sentence to:
> 
> The code is somewhat modified from <xref target='WILLIAMS93'/>, to ensure
> portability between big-endian and little-endian architectures, use fixed
> sized types to allow portability between 32-bit and 64-bit platforms, and
> general C code improvements.
>> 
>> Also, it might be worth checking for undefined behavior -- "1 << (31 -
>> i)" will be UB for i==1 on systems with 32-bit int, since 2**31 is not
>> representable in (signed) int, which is the default type in which the
>> expression will be evaluated, even if the result is going to be assigned
>> to an unsigned type.
> I guess you mean for i == 0. I changed the line to
>      rw |= 1UL << (31 - i);
>> 
>> 
>> 
>

Attachment: smime.p7s

[tsvwg] Benjamin Kaduk's Discuss on draft-ietf-ts… Benjamin Kaduk via Datatracker
Re: [tsvwg] Benjamin Kaduk's Discuss on draft-iet… tuexen
Re: [tsvwg] Benjamin Kaduk's Discuss on draft-iet… tuexen
Re: [tsvwg] Benjamin Kaduk's Discuss on draft-iet… Benjamin Kaduk
Re: [tsvwg] Benjamin Kaduk's Discuss on draft-iet… tuexen
Re: [tsvwg] Benjamin Kaduk's Discuss on draft-iet… Benjamin Kaduk
Re: [tsvwg] Benjamin Kaduk's Discuss on draft-iet… tuexen

Re: [tsvwg] Benjamin Kaduk's Discuss on draft-ietf-tsvwg-rfc4960-bis-17: (with DISCUSS and COMMENT)

Attachment: smime.p7s