Re: [secdir] Secdir last call review of draft-ietf-perc-private-media-framework-08

Vincent Roca <> Wed, 20 February 2019 07:13 UTC


Hello Paul,

Thanks for your answer and the long explanation of the use of the term "picture". I was not aware of this
evolution of vocabulary. Yes, please submit the -09 version and I’ll have a new look at it.



> On 20 Feb 2019, at 04:58, Paul E. Jones <> wrote:
> Vincent,
>>>> The same comment applies to the remaining two paragraphs. I suggest the authors
>>>> explain that the proposal prevents an attacker from claiming to be a regular Media
>>>> Distributor, and therefore from fooling endpoints, because ...
>>> The "false Media Distributor" example could happen. For example, if a hacker had access to the network where the Media Distributor is located and claims the Media Distributor's address while the conference is operating, the endpoints might not know. However, the worst result of such an attack is similar to just unplugging the box: packets just don't flow. The fake box could attempt to forward packets it receives, but they would fail to authenticate at the receiving endpoints and be discarded. However, this false device could never gain access to the media and would not be given HBH keys since it could not authenticate with the Key Distributor.
>> VR: you’ve said that a false Media Distributor will fail to get access to the media
>> and fail to forward the traffic. In the original text, you also say that:
>> "They Key Distributor will fail
>> to establish the secure association with the endpoint if the Media
>> Distributor cannot be authenticated."
>> I agree that an attacker may try to impersonate a valid MD, but this attempt will
>> fail. And when you say that:
>> "false Media Distributor may cascade to another legitimate Media
>> Distributor creating a false version of the real conference"
>> this is just wrong. The attacker won’t succeed.
>> This is why I’m saying that you should re-write this part to say clearly that
>> thanks to the provided security mechanism, an attacker will fail to impersonate
>> a MD. Don’t pretend attacks can succeed if this is not the case.
> OK, I've substantially re-worked this section. When version -09 is published, hopefully it will look better. I'll try to get that out soon and welcome further comments on that text.
>>>> ** Section 8.2.2.
>>>> Is the following sentence correct:
>>>> The mitigation for a replay attack is to prevent old packets beyond a
>>>> small-to-modest jitter and network re-ordering sized window to be
>>>> rejected.
>>>> Is "prevent [...] to be rejected" correct? I'd say "... to be delivered"
>>>> instead.
>>> "prevent... rejected" is definitely an error. However, we cannot stop delivery. If replayed packets are received by either the Media Distributor or the endpoint, they should be discarded. So, perhaps this is better:
>>> "The mitigation for a replay attack is to discard old packets beyond a
>>> small-to-modest jitter and network re-ordering sized window."
>> VR: Better, if we assume there's a timing based replay protection mechanism.
>> But the following item below indicates a totally different approach to replay protection
>> (ID based rather than time based), hence a major contradiction in this text. Please fix it.
>>>> Another comment. Replay protection seems to be based on timing considerations
>>>> rather than on the use of unique sequence numbers that must not be replayed
>>>> (except if a wrapping to zero occurs of course). Is that correct? Additionally,
>>>> is this mechanism carefully described in this document? Since it is explained
>>>> that E2E replay protection MUST be provided, it's essential to be very clear on
>>>> how to perform this. Failing to do so is a big issue.
>>> The mechanism underlying this is SRTP, which defines in 3.3.2/RFC3711 an "SRTP window size". We felt it was best to not introduce conflicting language. Perhaps we should just change the paragraph more substantially and refer to SRTP.
>>> Would you prefer this as the second paragraph?
>>> "The mitigation for a replay attack is to implement replay protection as
>>> described in Section 3.3.2 of [RFC3711].
>>> End-to-end replay protection **MUST** be provided for the
>>> whole duration of the conference."
>> VR: You cannot mandate replay protection without giving details on the replay
>> protection mechanism to use. Here you mention SRTP replay protection. If you
>> read the section you mention you’ll see that this replay protection is not based
>> on packet timing but a packet sequence number:
>> "When message authentication is provided, SRTP protects against such attacks through a Replay List."
>> and:
>> "Packet indices which lag behind the packet index in the
>> context by more than SRTP-WINDOW-SIZE can be assumed to have been
>> received…"
>> That’s totally different. If replay protection is a MUST, you need to be very clear
>> on what you expect from developers.
> The previous text (that you referred to as timing-based) didn't go into the document, but the text just above did. Rather than prescribe how to do it, I think it's best to defer to 3.3.2 of RFC 3711. That is the way to do it with SRTP, and PERC utilizes SRTP. We are not inventing any new mechanisms.
>>>> ** Section 8.2.3
>>>> It is said that "a Media Distributor can select to not deliver a particular
>>>> stream for a while." That's perfectly true, yet is this a "Delayed playout
>>>> attack"? I'd rather call it a Media Distributor censorship attack, or something
>>>> along this line. The main idea behind the attack is not to delay a stream but
>>>> to censor a source.
>>> This attack is not to censor, but to delay. For example, at time "T" Bob might say "I agree with your proposal". However, the "evil" Media Distributor could opt to not forward those packets and hold them. At some time "T+delta", the Media Distributor then forwards them. The receiving endpoint might not know that the packets were an hour old, so the receiver Alice thinks Bob is agreeing with a proposal that Bob actually doesn't agree with.
>>> However, a censorship attack is also possible. But I think we covered that in the Denial of Service section. The Media Distributor could always elect to not forward, which is in effect censoring the conversation.
>> VR: delaying a packet when at the same time it is mandatory (i.e., MUST) to
>> "discard old packets beyond a small-to-modest jitter and network re-ordering sized window"
>> means it will never be used. Hence the idea of "censoring a source". That being said,
>> I don’t absolutely insist on using this term (censor) if you feel uncomfortable with it.
> I prefer the current language (which I didn't author), as I find it very descriptive.
>>>> In the second paragraph I don't understand why it is said that:
>>>> "the receiver believing that it is what was said just now, and only
>>>> delayed due to transport delay."
>>>> Any RTP packet contains a timestamp (whose integrity is protected end-to-end if
>>>> I understand correctly), and this timestamp is used by the receiver to identify
>>>> timing issues. The fact a packet is delayed (significantly) by a Media
>>>> Distributor cannot be misinterpreted by a receiver as a "what was said just
>>>> now". The receiver immediately identifies this delay.
>>> While that might be true, I'd guess most implementations would not maintain this timing information for media held for a substantial period of time. And media held for a short duration really might be considered late only due to network delays. That actually does happen when network congestion builds. So, short delays might easily be attributed to network delays and long delays likely result in a "reset" of the flow at the endpoint. I think all we can do here is provide a warning about this, but I'm happy to make adjustments if you have specific words you think would make this clearer.
>> VR: please propose something, this is your document.
> I'm not really sure what to add. There's really nothing different here than with any other RTP conference-- nothing unique to PERC, per se. If I were to propose a solution, I think it's dangerous. Some codecs don't necessarily create monotonically increasing timestamps. Some endpoints don't have accurate clocks (sender or receiver), so drift can result in miscalculations over time. I think it's good to note this issue, but leave it for implementation.
>>>> ** Section 8.2.4:
>>>> I don't like the way this section is written. It first explains what a Media
>>>> Distributor could do if it could alter a certain header field (in this case
>>>> SSRC), it details the consequences, to finally explain that this is not
>>>> possible. This Security Discussion section is essentially meant to discuss
>>>> remaining security issues or highlight specific aspects, not what could happen
>>>> with a different, non secure, design. This text could also be written the other
>>>> way round: "By including the SSRC field into the integrity check, PERC prevents
>>>> splicing attacks where...".
>>> I assume you (and Ben) are looking to change that last sentence from "not allowing" to "by including"? I'll change it that way, but if that wasn't your meaning, please let me know.
>> VR: no, I just don’t like (and I made the same comment above) the way you introduce
>> things, by explaining there’s a problem, and later explaining there’s in fact no problem
>> because it’s not possible. Please change it as suggested.
> I did change it to include the sentence you suggested (it will be in the -09), but we can come back to this if you want. This particular issue could be ignored entirely, but as Ben noted, this one is getting a lot of discussion. Yes, splicing attacks are mitigated, but the splicing attack concept was introduced the way it was introduced because the meaning isn't immediately obvious. This text was originally authored by John Mattsson:
> <>
> Of course, it changed a little as the entity names and solution to the problem changed. It still has things in reverse order from what you want, I think, but look at -09 when it is published shortly and tell me if that's still objectionable.
>>>> ** Missing in 8.2
>>>> The RTCP flows are not encrypted end-to-end (unlike data flows) but only
>>>> hop-by-hop. Consequently, a malicious Media Distributor may corrupt an RTCP
>>>> packet content (e.g., reception statistics in RR) or forge malicious RTCP
>>>> packets which may trigger various effects at a sender. There are other types of
>>>> RTCP packets that may be attacked as well with various consequences. None of
>>>> this is explained in section 8.2. "Media Distributor Attacks".
>>> This is true, though it wasn't a concern since the objective of PERC was to secure the media E2E. Nonetheless, you're correct. How about this as a new section called "RTCP attacks" under the Media Distributor attacks section?
>>> PERC does not provide end-to-end protection of RTCP messages. This allows
>>> a malicious Media Distributor to impact any message that might be transmitted
>>> via RTCP, including media statistics, picture requests, or loss indication.
>>> It is also possible for the Media Distributor to forge requests, such as
>>> requests to the endpoint to send a new picture. Such requests can consume
>>> significant bandwidth and impair conference performance.
>> VR: yes, please add it. This is a significant limit of PERC and it needs to be clarified.
>> That’s also the goal of any Security Considerations section to highlight limits.
>> If you have another limit in mind, this is where it should go.
> OK. I've added some text in a new section titled "RTCP attacks".
>> That being said, can RTCP request the endpoint to send a new picture?
> Yes
>> Isn’t it an RTP packet rather than a picture (I’m not totally sure of it)?
> If a receiving device lost key packets and cannot reproduce an accurate image, it will send a request for a new picture. On reception of a picture request, the sender will send a new picture (i.e., the full image that should be displayed on the screen), which will consume several RTP packets for large images.
>> Additionally, "picture" is not appropriate here anyway; "video frame" is
>> probably better suited to video flows.
> Years ago, this was called "video fast update" in H.323. In SIP, that term is used, as is "Full Intra-frame Request" (FIR). Both avoided the use of "frame" or "picture". On this subject, I was talking with Gary Sullivan (chair of the group standardizing H.264, H.265, etc.) and he suggested using the terms "I-picture" and "P-picture" (for another document I was writing). Indeed, the term "picture" is used throughout H.265, as is "frame", but they have different meanings. As he was explaining, he also said:
> I did not use the word “frame” above. It has a different meaning than “picture”. The difference
> is becoming less important over time, but if we want to be strict, we should not use the terms
> casually.
> And with that, I was left perplexed as to what to use when. However, I think we should take cues from the latest specs. In the H.265 RTP payload format spec, there is a "picture loss indicator":
> <>
> In section 8.4 of that spec it says, "Upon reception of a FIR, a sender MUST send an IDR picture."
> And in <> (from 2006) the authors used "intra-picture".
> Anyway, that's the reason I used "picture". I think the issue is that the meaning of these words has changed over time. That, or some folks just prefer "picture" now. I don't know.
>>>> ** General comments about 8.1 and 8.2
>>>> Insider attacks are a powerful form of attacker model with severe consequences.
>>>> This is not a big surprise. I'd be more interested in a detailed 8.1 section, since those attacks are
>>>> more likely to happen (weaker attacker model).
>>> I'm not sure exactly what you're looking for. You want a new section (e.g., 8.3) that details insider attacks or something related to each of the existing 8.1/8.2 sections? Insider attacks could be disastrous, definitely. They could range from turning off the power to stealing the KEKs stored in the Key Distributor. The latter is "scary" in that, if a rogue individual were to steal the KEKs, he or she could decrypt media off-line and at a later date. And, if the Key Distributor stored the keys for a long time (e.g., so as to enable conference recording and playback -- something not really considered in PERC, but certainly implementable), then they could first capture the media flows and then decrypt them a year later once they have access to the KEKs. The two attacks do not have to be carried out concurrently and there would be no defense against theft of KEKs.
>>> We could scare people with some words about keeping the Key Distributor secure, but I'm not sure what we need to convey.
>> VR: no, this is not what I mean.
>> Attacks of section 8.1 seem more realistic to me than attacks of section 8.2 because
>> of a weaker attacker model: the attacker is outside of the systems, and not necessarily on
>> the path.
>> Section 8.2 is all about attacks that are launched from a corrupted MD, i.e., they are
>> all some form of insider attacks. This is less likely.
>> Therefore I would have liked to see more details in section 8.1, that’s all.
> In the interest of getting a new revision published so folks can provide more input, I didn't add anything here. However, I'm happy to do so, but I'm at a loss for what to add. If by "insider" you mean a rogue individual managing the service elements (e.g., key distributor), I definitely can see issues. However, I don't understand how that would apply to third-party attacks. Maybe you mean something different for "insider" than what I have in mind? If we share the same meaning, then are you wondering what third parties can do if an insider helps facilitate an attack (e.g., by stealing keys out of the KD and sending them to some third-party hacker)?
> Paul
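
For readers following the Section 8.2.2 exchange above, the difference between timing-based and index-based replay protection may be easier to see in code. The following is only a minimal sketch of a sliding-window replay check in the spirit of Section 3.3.2 of RFC 3711, not the SRTP algorithm itself. It assumes the caller already has a monotonically growing integer packet index, and the class and method names (ReplayWindow, check_and_update) are hypothetical. The default window of 64 mirrors the minimum SRTP-WINDOW-SIZE in RFC 3711. A real SRTP stack would also authenticate the packet before updating the window, which this sketch leaves out.

```python
class ReplayWindow:
    """Sliding-window replay check keyed on packet index, not on arrival time.

    Illustrative only: names and structure are assumptions, not taken from
    RFC 3711 or from the PERC draft.
    """

    def __init__(self, size: int = 64) -> None:
        self.size = size      # number of recent indices tracked
        self.highest = -1     # highest packet index accepted so far
        self.bitmap = 0       # bit i set => index (highest - i) was seen

    def check_and_update(self, index: int) -> bool:
        """Return True if the packet is fresh, False if it must be discarded."""
        if index > self.highest:
            # Newer than anything seen: slide the window forward.
            shift = index - self.highest
            mask = (1 << self.size) - 1
            self.bitmap = ((self.bitmap << shift) | 1) & mask
            self.highest = index
            return True

        offset = self.highest - index
        if offset >= self.size:
            # Lags behind the window: treated as already received.
            return False
        if self.bitmap & (1 << offset):
            # Index already recorded inside the window: a replay.
            return False

        # Late but new (re-ordered) packet inside the window.
        self.bitmap |= 1 << offset
        return True
```

With this structure, a re-ordered packet that is still inside the window is accepted once, a second copy of the same index is rejected, and anything lagging the highest index by the window size or more is rejected regardless of when it arrives. No wall-clock time is consulted, which is the point Vincent raises.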