Re: [secdir] Secdir last call review of draft-ietf-perc-private-media-framework-08

Vincent Roca <> Mon, 18 February 2019 19:04 UTC

To: "Paul E. Jones" <>

Hello Paul, all,

> Thanks for the review.  Please see comments inline.

You’re welcome.

>> ** Section 8.1
>> Why is it said that:
>>        "A successful attacker might be able to get the Media Distributor to
>>        forward such packets."
>> Is it really possible? That would be a big design issue! In fact the following
>> sentence suggests the opposite and I think this is essentially an erroneous
>> manner to present things. Please see comments on Section 8.2.4 on saying things
>> the other way round.
> Indeed, this kind of attack would not be possible since the Media Distributor must perform HBH authentication. I'll re-word this.

VR: thanks.

>> The same comment applies to the remaining two paragraphs. I suggest the authors
>> explain that the proposal prevents an attacker from claiming to be a regular Media
>> Distributor, and therefore from fooling endpoints, because ...
> The "false Media Distributor" example could happen. For example, if a hacker had access to the network where the Media Distributor is located and claims the Media Distributor's address while the conference is operating, the endpoints might not know. However, the worst result of such an attack is similar to just unplugging the box: packets just don't flow. The fake box could attempt to forward packets it receives, but they would fail to authenticate at the receiving endpoints and be discarded. However, this false device could never gain access to the media and would not be given HBH keys since it could not authenticate with the Key Distributor.

VR: you’ve said that a false Media Distributor will fail to get access to the media
and fail to forward the traffic. In the original text, you also say that:
  "They Key Distributor will fail
   to establish the secure association with the endpoint if the Media
   Distributor cannot be authenticated."
I agree that an attacker may try to impersonate a valid MD, but this attempt will
fail. And when you say that:
  "false Media Distributor may cascade to another legitimate Media
   Distributor creating a false version of the real conference"
this is just wrong. The attacker won’t succeed.

This is why I’m saying that you should re-write this part to say clearly that,
thanks to the security mechanisms provided, an attacker will fail to impersonate
an MD. Don’t pretend attacks can succeed if this is not the case.

>> ** Section 8.2.2.
>> Is the following sentence correct:
>>   The mitigation for a replay attack is to prevent old packets beyond a
>>   small-to-modest jitter and network re-ordering sized window to be
>>   rejected.
>> Is "prevent [...] to be rejected" correct? I'd say "... to be delivered"
>> instead.
> "prevent... rejected" is definitely an error. However, we cannot stop delivery. If replayed packets are received by either the Media Distributor or the endpoint, they should be discarded. So, perhaps this is better:
> "The mitigation for a replay attack is to discard old packets beyond a
> small-to-modest jitter and network re-ordering sized window."

VR: Better, if we assume there's a timing-based replay protection mechanism.
But the next item below indicates a totally different approach to replay protection
(ID-based rather than time-based), hence a major contradiction in this text. Please fix it.

>> Another comment. Replay protection seems to be based on timing considerations
>> rather than on the use of unique sequence numbers that must not be replayed
>> (except if a wrapping to zero occurs of course). Is that correct? Additionally,
>> is this mechanism carefully described in this document? Since it is explained
>> that E2E replay protection MUST be provided, it's essential to be very clear on
>> how to perform this. Failing to do so is a big issue.
> The mechanism underlying this is SRTP, which defines in 3.3.2/RFC3711 an "SRTP window size". We felt it was best to not introduce conflicting language. Perhaps we should just change the paragraph more substantially and refer to SRTP.
> Would you prefer this as the second paragraph?
> "The mitigation for a replay attack is to implement replay protection as
> described in Section 3.3.2 of [RFC3711].
> End-to-end replay protection **MUST** be provided for the
> whole duration of the conference."

VR: You cannot mandate replay protection without giving details on the replay
protection mechanism to use. Here you mention SRTP replay protection. If you
read the section you mention, you’ll see that this replay protection is not based
on packet timing but on a packet sequence number:
  "When message authentication is provided, SRTP protects against such attacks through a Replay List."
  "Packet indices which lag behind the packet index in the
   context by more than SRTP-WINDOW-SIZE can be assumed to have been
   received…"

That’s totally different. If replay protection is a MUST, you need to be very clear
on what you expect from developers.
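
To make the difference concrete, the index-based check described in Section 3.3.2 of RFC 3711 can be sketched roughly as follows. This is only an illustration under stated assumptions, not text from the draft and not a complete SRTP implementation: the class name ReplayWindow and the method check_and_update are hypothetical, the packet index is assumed to be already computed (rollover counter combined with the sequence number), and the packet is assumed to have passed message authentication before the window is updated.

```python
class ReplayWindow:
    """Sliding-window replay check keyed on the packet index (hypothetical sketch)."""

    def __init__(self, window_size=64):
        # RFC 3711 requires SRTP-WINDOW-SIZE to be at least 64.
        if window_size < 64:
            raise ValueError("window_size must be at least 64")
        self.window_size = window_size
        self.highest = None  # highest packet index accepted so far
        self.bitmap = 0      # bit i set => index (highest - i) already seen

    def check_and_update(self, index):
        """Return True if the packet is new, False if it is a replay or too old.

        Assumption: the caller invokes this only for authenticated packets.
        """
        if self.highest is None:
            self.highest = index
            self.bitmap = 1
            return True

        if index > self.highest:
            # The window slides forward; bits that fall off the end are forgotten.
            shift = index - self.highest
            mask = (1 << self.window_size) - 1
            self.bitmap = ((self.bitmap << shift) | 1) & mask
            self.highest = index
            return True

        offset = self.highest - index
        if offset >= self.window_size:
            return False  # lags too far behind: treated as already received
        if self.bitmap & (1 << offset):
            return False  # index already seen: replay
        self.bitmap |= 1 << offset
        return True       # late but new packet inside the window


window = ReplayWindow(64)
print(window.check_and_update(100))  # first packet is accepted
print(window.check_and_update(100))  # same index again is rejected
print(window.check_and_update(99))   # reordered but new packet is accepted
```

Nothing in this check depends on packet arrival time: a packet is rejected either because its index was already seen or because it lags the highest accepted index by at least the window size.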

>> ** Section 8.2.3
>> It is said that "a Media Distributor can select to not deliver a particular
>> stream for a while." That's perfectly true, yet is this a "Delayed playout
>> attack"? I'd rather call it a Media Distributor censorship attack, or something
>> along this line. The main idea behind the attack is not to delay a stream but
>> to censor a source.
> This attack is not to censor, but to delay. For example, at time "T" Bob might say "I agree with your proposal". However, the "evil" Media Distributor could opt to not forward those packets and hold them. At some time "T+delta", the Media Distributor then forwards them. The receiving endpoint might not know that the packets were an hour old, so the receiver Alice thinks Bob is agreeing with a proposal that Bob actually doesn't agree with.
> However, a censorship attack is also possible. But I think we covered that in the Denial of Service section. The Media Distributor could always elect to not forward, which is in effect censoring the conversation.

VR: delaying a packet when at the same time it is mandatory (i.e., MUST) to
"discard old packets beyond a small-to-modest jitter and network re-ordering sized window"
means it will never be used. Hence the idea of "censoring a source". That being said,
I don’t absolutely insist on using this term (censor) if you feel uncomfortable with it.

>> In the second paragraph I don't understand why it is said that:
>>        "the receiver believing that it is what was said just now, and only
>>        delayed due to transport delay."
>> Any RTP packet contains a timestamp (whose integrity is protected end-to-end if
>> I understand correctly), and this timestamp is used by the receiver to identify
>> timing issues. The fact a packet is delayed (significantly) by a Media
>> Distributor cannot be misinterpreted by a receiver as a "what was said just
>> now". The receiver immediately identifies this delay.
> While that might be true, I'd guess most implementations would not maintain this timing information for media held for a substantial period of time. And media held for a short duration really might be considered late only due to network delays. That actually does happen when network congestion builds. So, short delays might easily be attributed to network delays and long delays likely result in a "reset" of the flow at the endpoint. I think all we can do here is provide a warning about this, but I'm happy to make adjustments if you have specific words you think would make this clearer.

VR: please propose something, this is your document.
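
As an illustration of the timestamp argument, here is a minimal sketch of how a receiver might compare media time (from the RTP timestamp) with wall-clock arrival time. It is not taken from the draft: the function names playout_lag_seconds and looks_held, the 2-second threshold, and the choice of the first received packet as the reference point are all assumptions made for the example.

```python
RTP_TS_MODULUS = 2 ** 32  # the RTP timestamp is a 32-bit field and wraps around


def playout_lag_seconds(first_rtp_ts, first_arrival, rtp_ts, arrival, clock_rate):
    """Return how far arrival time lags behind media time, in seconds.

    first_rtp_ts / first_arrival: RTP timestamp and local arrival time (seconds)
    of the reference packet. rtp_ts / arrival: same for the packet under test.
    clock_rate: RTP clock rate of the payload in Hz (e.g. 8000 for narrowband audio).
    """
    media_elapsed = ((rtp_ts - first_rtp_ts) % RTP_TS_MODULUS) / clock_rate
    wall_elapsed = arrival - first_arrival
    return wall_elapsed - media_elapsed


def looks_held(lag_seconds, max_lag=2.0):
    """Hypothetical policy: flag media that arrives much later than it was sampled."""
    return lag_seconds > max_lag


# A packet sampled 10 s after the reference packet, arriving 10.05 s after it.
normal = playout_lag_seconds(1000, 0.0, 1000 + 8000 * 10, 10.05, 8000)
# The same packet, but delivered an hour late.
held = playout_lag_seconds(1000, 0.0, 1000 + 8000 * 10, 3610.0, 8000)
print(looks_held(normal), looks_held(held))
```

The sketch has clear limits. The lag is only relative to the reference packet, so it cannot reveal a delay applied uniformly from the very first packet, and the modulo arithmetic is only meaningful for gaps shorter than one wrap of the 32-bit timestamp at the given clock rate.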

>> I now understand the title ("delayed playout") but I really suspect this is a
>> mistake as (too much) delayed packets will not be played at all.
> I bet they would :) Seriously, I have seen far stranger things. And implementers might consider doing that as a "resilient" implementation.

VR: you may be right if all packets are delayed, I agree.

>> ** Section 8.2.4:
>> I don't like the way this section is written. It first explains what a Media
>> Distributor could do if it could alter a certain header field (in this case
>> SSRC), it details the consequences, to finally explain that this is not
>> possible. This Security Discussion section is essentially meant to discuss
>> remaining security issues or highlight specific aspects, not what could happen
>> with a different, non secure, design. This text could also be written the other
>> way round: "By including the SSRC field in the integrity check, PERC prevents
>> splicing attacks where...".
> I assume you (and Ben) are looking to change that last sentence from "not allowing" to "by including"? I'll change it that way, but if that wasn't your meaning, please let me know.

VR: no, I just don’t like (and I made the same comment above) the way you introduce
things, by explaining there’s a problem, and later explaining there’s in fact no problem
because it’s not possible. Please change it as suggested.

>> ** Missing in 8.2
>> The RTCP flows are not encrypted end-to-end (unlike data flows) but only
>> hop-by-hop. Consequently, a malicious Media Distributor may corrupt an RTCP
>> packet content (e.g., reception statistics in RR) or forge malicious RTCP
>> packets which may trigger various effects at a sender. There are other types of
>> RTCP packets that may be attacked as well with various consequences. None of
>> this is explained in section 8.2. "Media Distributor Attacks".
> This is true, though it wasn't a concern since the objective of PERC was to secure the media E2E. Nonetheless, you're correct. How about this as a new section called "RTCP attacks" under the Media Distributor attacks section?
> PERC does not provide end-to-end protection of RTCP messages. This allows
> a malicious Media Distributor to impact any message that might be transmitted
> via RTCP, including media statistics, picture requests, or loss indication.
> It is also possible for the Media Distributor to forge requests, such as
> requests to the endpoint to send a new picture. Such requests can consume
> significant bandwidth and impair conference performance.

VR: yes, please add it. This is a significant limitation of PERC and it needs to be clarified.
Highlighting limitations is also the goal of any Security Considerations section.
If you have another limitation in mind, this is where it should go.

That being said, can RTCP request the endpoint to send a new picture?
Isn’t it an RTP packet rather than a picture (I’m not totally sure about this)?
Additionally, "picture" is not appropriate here anyway; "video frame" is
probably better suited to video flows.

>> ** General comments about 8.1 and 8.2
>> Insider attacks are a powerful form of attacker model with severe consequences.
>> This is not a big surprise. I'd be more interested in a detailed 8.1 section,
>> whose attacks are more likely to happen (weaker attacker model).
> I'm not sure exactly what you're looking for. You want a new section (e.g., 8.3) that details insider attacks or something related to each of the existing 8.1/8.2 sections? Insider attacks could be disastrous, definitely. They could range from turning off the power to stealing the KEKs stored in the Key Distributor. The latter is "scary" in that, if a rogue individual were to steal the KEKs, he or she could decrypt media off-line and at a later date. And, if the Key Distributor stored the keys for a long time (e.g., so as to enable conference recording and playback -- something not really considered in PERC, but certainly implementable), then they could first capture the media flows and then decrypt them a year later once they have access to the KEKs. The two attacks do not have to be carried out concurrently and there would be no defense against theft of KEKs.
> We could scare people with some words about keeping the Key Distributor secure, but I'm not sure what we need to convey.

VR: no, this is not what I mean.
The attacks of Section 8.1 seem more realistic to me than those of Section 8.2 because
of a weaker attacker model: the attacker is outside of the system, and not necessarily on
the path.
The attacks of Section 8.2 are all launched from a corrupted MD, i.e., they are
all some form of insider attack. This is less likely.
Therefore I would have liked to see more details in Section 8.1, that’s all.


>> Other comments:
>> ** Section 6, intro: (it's a detail, but...)
>> I don't think that the use of "and so forth" is adequate in a specification
>> that aims to be exhaustive. The list of items addressed in section 6 is
>> finished.
> Agreed. I'll just cut the sentence short:
> "This section describes the various keys employed by PERC."
> The sub-section titles are sufficient, I think.

VR: Yes.