[rtcweb] AD Review: draft-ietf-rtcweb-jsep

Adam Roach <adam@nostrum.com> Wed, 19 April 2017 05:42 UTC

From: Adam Roach <adam@nostrum.com>
To: draft-ietf-rtcweb-jsep@ietf.org, "rtcweb@ietf.org" <rtcweb@ietf.org>, Ted Hardie <ted.ietf@gmail.com>
Message-ID: <c76dd321-c854-91ce-b101-36ccf794c213@nostrum.com>
Date: Wed, 19 Apr 2017 00:42:37 -0500
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:52.0) Gecko/20100101 Thunderbird/52.0.1
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="------------AFA8241ED78AB574958D38CB"
Content-Language: en-US
Archived-At: <https://mailarchive.ietf.org/arch/msg/rtcweb/4b5hcQtWX1stjFgWJY5W0t4WVjM>
Subject: [rtcweb] AD Review: draft-ietf-rtcweb-jsep
Precedence: list

RTCWEB working group --

I have completed my AD review for JSEP. I want to start by saying that 
this is an impressive body of work, and that everyone who has 
contributed to getting it to its current state deserves a round of 
applause. The volume of comments below does not represent that the 
document is of poor quality. Rather, the number of comments reflects on 
the breath, depth, and size of the document itself.

All of that said, I'm going to need to see a new version of this 
document before I put it in front of the IETF for last call. The 
document has a number of issues that I suspect will trip people up in 
their LC reviews. Most of these are feedback that the document authors 
should treat as normal last call comments. The feedback that I consider 
to block progressing the document, in my role as AD, is explicitly 
marked with the prefix "BLOCKER", and these will need to be resolved in 
a new version of the document before progressing it further. Note that 
it is entirely possible that something I have marked "BLOCKER" may stem 
from an error on my part; so recognize that these are not demands for 
change, as much as a need to have things either fixed in the document or 
explained to me.

Some general, top-level comments:

BLOCKER: The nits tool reports (and a cursory hand check supports this) 
that six of the references are unused, and RFC4572 has been (recently) 
obsoleted. The remaining references are largely out of date by many 
versions, which I believe the current tools will automatically handle if 
version numbers are simply pulled out of the source. Finally, a large 
number of the referenced working group documents still use the 
individual name (e.g., draft-nandakumar-rtcweb-sdp instead of 
draft-ietf-rtcweb-sdp).

There is a lot of text that should have been removed when the decision 
to prohibit SDP munging was taken. I will call these out individually, 
but wanted to call it out here at the top since it has impacts in many 
places. (In this sentence and elsewhere, I use "munging" exclusively to 
refer to changing SDP between a createOffer or createAnswer and the 
corresponding setLocalDescription. In particular, I *do* *not* use it to 
refer to changes applied to SDP sent to or received from the remote 
party). Although I repeat it below, I want to be clear in my position: 
if an implementation creates SDP, verifies that it has not changed since 
its creation, and then has difficulty processing it, that is a 
programming error, not a protocol error. It is not appropriate to call 
out how to recover from programming errors in a protocol document.

BLOCKER: Several of the SDP-building procedures in this document call 
out addition of "a=fingerprint" to individual media sections, while the 
canonical definition of "a=fingerprint" allows these to be added at 
either the session level or the media level. The same is true of ice-pwd 
and ice-ufrag. While the process of SDP building in this document 
appears to be prescriptive on purpose, one assumes that it is allowed 
some variation so as to accommodate deployment of future SDP extensions. 
Because of this state of affairs, it is unclear whether implementations 
that consolidate these attributes at the session level when they would 
otherwise be identical among all media sections is such an allowable 
variation. Ambiguity in specifications is never a good thing, so I 
believe that the draft needs to do one of: (a) specifically call out 
this kind of aggregation as allowed; (b) specifically call out this kind 
of aggregation as disallowed; or (c) include a blanket statement that 
implementations are allowed to emit SDP that is semantically equivalent 
to that which is generated by the procedures in this document, but which 
may vary in exact structure. I have a strong preference for (c) -- 
implementations will be required to deal with such variations anyway to 
interop with legacy SDP, so prohibiting JSEP clients from generating 
them serves no goal.

BLOCKER: The text in the third paragraph of section 5.7.1 intentionally 
contradicts language in RFC 5245; as such, this document needs to 
formally update that RFC. Note that updated RFCs are expected to be 
mentioned in the Abstract as well as the document metadata.

The rest of my comments are in document order.

Section 1.1 talks about Jingle as a potential transport, but the 
remainder of the document calls out a slew of MTI SDP features that 
don't have analogs for signaling in Jingle. I recommend either 
qualifying the discussion of Jingle with some mention of it requiring 
as-yet undefined extensions, or removing mention of Jingle altogether.

Section 1.2 uses the term "glare" without describing it. This term may 
be unfamiliar to readers, and should be explained or cited.

Section 3.2 has a paragraph starting "Lastly, while the actual media 
parameters are only know after an offer..."; the entire contents of this 
paragraph appear to be actually false if both ICE and DTLS are in use. I 
know there's conversation underway on MMUSIC and the WebRTC mailing 
lists on this topic, but none of them have yet described a situation in 
which this paragraph is actually true. Please remove it or add a 
sequence diagram showing how it arises.

The state labels in Figure 2 ("Remote-Offer") do not match the state 
names used in the document text ("have-remote-offer"). Please make these 
consistent.

Section 3.3, paragraph 3 is an example of text that should have been 
removed with SDP munging. Please remove it.

The same is true of the final three words in the first paragraph of 
section 3.4.

Section 3.5.1 talks about a "recycled m= section" without any context 
for what this means. Please provide a forward reference to the section 
that discusses m= section recycling.

BLOCKER: The final paragraph in section 3.5.1 says that candidates are 
*only* gathered for m= sections referenced in BUNDLE tags. Section 3.5.4 
talks about gathering more candidates than that under certain 
circumstances. Please make these consistent with each other.

Since it has been the cause of actual interop issues in the field, it is 
probably worth mentioning, in section 3.5.2.1, that the IceCandidate 
strings explicitly *do not* start with "a=".

The second paragraph of section 3.6.1 seems odd when talking about 
creating answers, and downright incorrect when talking about offers. 
Minimally, it needs text to explain why  x=0/y=0 would ever make more 
sense than using "a=sendonly/recvonly/inactive."

Section 3.6.2 talks about "original resolution" where it appears to want 
to reference the current resolution. This happens in three different places.

BLOCKER: Section 3.6.2, second paragraph, second sentence says a sender 
MUST scale to match imageattr. The same section, paragraph seven, says 
it SHOULD scale to match imageattr. There needs to be only one normative 
statement on this topic, so as to avoid confusion among implementors.

Section 3.6.2 would ideally include a disclaimer, like we have for many 
other features, indicating that -- despite normative statements in this 
section -- the remote side might not honor imageattr because of legacy 
behavior, and that implementations MUST be prepared to handle such 
situations.

Section 4.1.9 paragraph one includes an enumeration of things that can 
appear in the "type" field, but omits "rollback" from the list.

Section 4.2.6, second paragraph says that codec preferences cannot add 
formats that were negotiated away in previous offer/answer exchanges. My 
recollection is that this conflicts with the decision taken by the 
WebRTC working group; and a quick read of the description of 
setCodecPreferences() in their current spec seems to confirm this (at 
least, by omitting this restriction). Unless I'm mistaken about the 
state of play, this paragraph should be modified to indicate that codec 
preferences *can* add back in codecs were not negotiated in previous 
offer/answer exchanges.

BLOCKER: The first bullet on page 35 has a normative statement about the 
AVT profile strings that must be accepted, but uses an unspecified 
grammar to actually specify the string. If this is normative, it needs 
to be formal. Either specify the string using a formal and cited 
grammar, or provide an exhaustive list.

Section 5.2.1 contains the statement "both 'o=' and 's=' are 
meaningless" -- this should be qualified with "in JSEP." These fields 
are not generally meaningless in all uses of SDP.

BLOCKER: The first bullet on page 37 says that port 9 "MUST be used," 
which contradicts the behavior required of BUNDLE-only sections (for 
which port 0 MUST be used). Clarify.

Page 37 contains a statement that "attributes of category IDENTICAL or 
TRANSPORT should not be repeated in bundled m= sections". Two issues: 
this reads as normative, so consider "SHOULD NOT" rather than "should 
not." Also, I *think* this is true of bundle-only sections, not bundled 
sections in general.

Page 38, second-to-last bullet refers to "an 'a=rtcp-fb' mechanism" 
where it should say "an 'a=rtcp-fb' line".

Page 41 contains a statement reading "This promotes readability," which 
might be read by some as promoting machine parsing rather than 
troubleshooting. I propose replacing "readability" with "troubleshooting."

BLOCKER: The bullet at the bottom of of page 42 ("If any RtpTranceiver 
has been added...") is normative and redundant with the bullet in the 
middle of page 44 ("If any RtpTranceiver has been added..."), which is 
also normative. Remove one.

BLOCKER: The second bullet on page 43 ("If an RtpTranceiver has been 
stopped and is associated..." is normative and redundant with (and 
slightly contradictory to) the bullet on page 44 starting "If any 
RtpTranceiver has been slopped...") which is also normative. Remove one.

The bullets on page 45 use the construct "MUST only" three times. This 
is confusing, and likely to lead to implementors doing the wrong thing. 
Please rewrite in the form "The 'a=rtcp' line MUST NOT be added unless..."

The paragraph that spans pages 46 and 47 claims that the 
"VoiceActivityDetection" parameter has no impact on whether the local 
endpoint does silence suppression for the audio it sends. The paragraph 
immediately preceding it says that JSEP implementations MUST NOT emit 
"CN" codecs if this parameter is set to "false." This only makes sense 
because RFC 3264 allows SDP to specify codecs that you want to receive 
by do not plan to generate (i.e., you can send CN even if your offer 
doesn't have CN in it, as long as the answer *does*). This is slightly 
counter-intuitive to many implementors, so it probably bears mention.

Page 51, bullet starting "For each supported RTCP feedback...": replace 
"an 'a=rtcp-fb' mechanism" with "an 'a=rtcp-fb' line".

Page 52, bullet starting "An 'a=setup' line": add the words "when 
allowed" to the end of the bullet.

Section 5.5, bullet 4 talks about parsing the local session description. 
This is legacy from when SDP munging was allowed, and should be removed 
(especially the normative statement about parsing errors: the check that 
the SDP has not changed will catch changes, and any implementation that 
cannot parse its own SDP is exhibiting an internal error, not a protocol 
error).

Section 5.7, first paragraph: similar to above, change 
"setLocal/RemoteDescription" to "setRemoteDescription."

BLOCKER: Section 5.7.1, first paragraph: this claims that a= lines are 
order insensitive. Nothing in SDP's base definition makes this true, and 
nothing precludes specific attributes from assigning meaning to the 
order in which they appear. See 
draft-ietf-slim-negotiating-human-language-08 for an example of an SDP 
extension that *does* apply meaning to the order in which attributes 
appear (for better or worse).

Section 5.7.2 should include a statement (probably at the end) that 
implementations may know and process attributes other than those 
described in this section.

Page 62 has a bullet ending "If this is a local description, the 
'ice-lite' attribute MUST NOT be specified." This sentence is leftover 
from before SDP munging was prohibited and should be removed.

Section 5.8 describes the behavior of an implementation processing its 
own unmodified SDP. It contains significant error language that should 
have been removed when SDP munging was removed. In particular, please 
remove all of the following:

  * "If an error is returned, the session MUST be restored to the state
    it was in before performing these steps."
  * "If RTCP mux is not indicated, but was previously negotiated, i.e.,
    the RTCP ICE component no longer exists, this MUST result in an error."
  * "If any indicated RTP header extension is not supported, this MUST
    result in an error."
  * "If any indicated media format is not supported, this MUST result in
    an error."
  * "If any referenced primary payload types are not present, this MUST
    result in an error."

The first bullet of section 5.8 includes: "...unless it has been marked 
as bundle-only...," and the second includes "...and it has not been 
marked as bundle-only..." -- in both cases, I believe the indicated 
behavior is supposed to happen for any bundled section (e.g., in a 
re-offer or re-answer), regardless of whether it is bundle-only.

Section 5.8 contains the sentence: "If there is no RtpTransceiver 
associated with this m= section (which will only happen when applying an 
offer)..." -- it is grammatically ambiguous whether this says that the 
*absence* of an RtpTranceiver can only happen with an offer, or if the 
*presence* of an RtpTranceiver can only happen with an offer. Suggest 
rephrasing as: "If applying an offer and there is no RtpTransceiver 
associated with this m= section, find one..."

BLOCKER: Section 5.9 discusses remote description handling. This means 
that the first bullet on page 65 talks about remote description handling 
in the "have-local-offer" state. This means that the second sentence of 
that bullet talks about remote offer handling when in the 
"have-local-offer" state. That... doesn't seem right. I'm particularly 
concerned that the language regarding ICE restart probably belongs 
somewhere else, since it doesn't appear to belong where it is, but I 
can't quickly figure out where it's missing from.

The bullet in the middle of page 66 (starting "For each specified media 
format...") states that unsupported media formats must be ignored. The 
following bullet specifies that "rtx" media formats that refer to 
"primary payload types [that] are not present" MUST result in an error. 
Read naïvely, implementors may very well discard a payload type for lack 
of understanding, and then discover an "rtx" for that discarded PT, and 
generate an error ("it's not present!"). These bullets need additional 
text regarding how they interact with each other.

It's nice that you describe how you get from AS to TIAS, but as the 
formula here isn't intended to be tuned, please simplify it for 
imlementors as "TIAS = (950 * AS) - 16000". Minimally, include 
parentheses around the first three terms and the final two; while most 
people will assume proper order of operations, leaving these ungrouped 
is asking for trouble. (I once had an argument with an implementor 
reading an RFC who insisted that 96 wasn't an even multiple of 32; they 
asserted, with great vehemence, that 96 is 32 times three, and three is 
odd, so 96 is an *odd* multiple of 32. Don't underestimate the ingenuity 
of some of the people who will end up implement these specifications.)

Page 67 contains the acronym "DTX" without expanding it. Please expand it.

BLOCKER: Page 68 contains a sub-bullet that says "If the m= section 
references any media formats, RTP header extensions, or RTCP feedback 
mechanisms that were not present in the corresponding m= section in the 
offer, this indicates a negotiation problem and MUST result in an 
error." No. This is a violation of RFC 3264. Section 6.1 of that 
document, when talking about generating answers, specifically stipulates:

    The stream MAY indicate additional media formats, not listed in the
    corresponding stream in the offer, that the answerer is willing to
    send or receive (of course, it will not be able to send them at this
    time, since it was not listed in the offer).

Note, for example, that this is *exactly* the behavior that the CN 
handling I describe above relies on. General handling for header 
extensions and (I believe) feedback mechanisms is intended to operate 
the same way. This sub-bullet needs to be remove in its entirety.

BLOCKER: The final bullet of page 68 refers to answer directionality of 
"sendrecv" and "sendonly." These are incorrect when applying remote 
answers, for which these should be "sendrecv" and "recvonly." Please 
rephrase in terms of whether media is to be sent, or fix in some other 
way that doesn't get the sense reversed for remote answers.

BLOCKER: The first bullet on page 70 exhibits the same error.

Section 7 does not acknowledge IPv6. The IAB statement on IPv6 
<https://www.iab.org/2016/11/07/iab-statement-on-ipv6/> has a 
recommendation that all protocol specifications have IPv6 examples in 
them. Since we have a separate example document, and Suhas has indicated 
that he will be including some IPv6 examples in it, I think you can get 
away in this document by amending the final paragraph of section 7 as: 
"More examples of SDP for WebRTC call flows, including examples of IPv6 
addresses, can be found in [I-D.ietf-rtcweb-sdp]." I make no promise 
that other IESG members will find this sufficient, but I'm personally 
okay with it.

For the examples, please expand the acronyms "JS" and "UA".

BLOCKER: This is an issue with all of the examples. Page 37 contains a 
bullet that asserts that (unless excluded by rare circumstances which 
the examples do not stipulate), media formats MUST include those 
specified in [draft-ietf-rtcweb-video], section 5. None of the examples 
comply with this normative requirement.

All examples include telephone-event audio codecs, but omit a 
corresponding "a=fmtp" line. RFC4733 specifies:

    SDP descriptions using the event payload MUST contain an fmtp format
    attribute that lists the event values that the receiver can process.

While this is a little hinky for implementations that plan to send, but 
not to receive, DTMF (and that's the only reason I'm not marking this as 
a blocker), I don't believe that such implementations are exempted from 
the normative requirement (as SDP recipients may not be expecting such a 
situation). Including such fmtp lines in the examples is probably the 
best way to avoid resulting interop issues.

BLOCKER: The example on page 83 includes "a=rid" attributes, but omits a 
corresponding "a=extmap" attribute to indicate which value is to be used 
for the associated header extensions.

/a

[rtcweb] AD Review: draft-ietf-rtcweb-jsep Adam Roach
Re: [rtcweb] AD Review: draft-ietf-rtcweb-jsep Adam Roach
Re: [rtcweb] AD Review: draft-ietf-rtcweb-jsep Justin Uberti