Re: [Tsv-art] Tsvart last call review of draft-ietf-suit-architecture-11

Bob Briscoe <ietf@bobbriscoe.net> Thu, 17 September 2020 22:45 UTC

To: Hannes Tschofenig <Hannes.Tschofenig@arm.com>, "tsv-art@ietf.org" <tsv-art@ietf.org>
Cc: "last-call@ietf.org" <last-call@ietf.org>, "draft-ietf-suit-architecture.all@ietf.org" <draft-ietf-suit-architecture.all@ietf.org>, "suit@ietf.org" <suit@ietf.org>
References: <159701600789.9734.7112047200124687933@ietfa.amsl.com> <AM0PR08MB3716602CB3B458A1D9A12DD3FA3E0@AM0PR08MB3716.eurprd08.prod.outlook.com>
From: Bob Briscoe <ietf@bobbriscoe.net>
Message-ID: <276c27d6-b32c-925e-9660-8edec8b67b17@bobbriscoe.net>
Date: Thu, 17 Sep 2020 23:45:14 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0
MIME-Version: 1.0
In-Reply-To: <AM0PR08MB3716602CB3B458A1D9A12DD3FA3E0@AM0PR08MB3716.eurprd08.prod.outlook.com>
Content-Type: multipart/alternative; boundary="------------4BAED82BF4CA3623F9BB2964"
Content-Language: en-GB
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsv-art/c9YMwcDaD_45TjFiPaueaVRWAYU>
Subject: Re: [Tsv-art] Tsvart last call review of draft-ietf-suit-architecture-11

Hannes, [BTW, I just noticed a bug with the draft-12 HTML under 
tools.ietf - it is now missing links to versions 09, 10, 11 ]

Pls see inline responses, tagged [BB]:

On 17/09/2020 07:56, Hannes Tschofenig wrote:
> Hi Bob,
>
> thanks for the time you spent on this review. I know myself how much time it takes to write a detailed review.
> You can find my responses inline; search for [Hannes].
>
> -----Original Message-----
> From: Bob Briscoe via Datatracker <noreply@ietf.org>
> Sent: Monday, August 10, 2020 1:33 AM
> To: tsv-art@ietf.org
> Cc: last-call@ietf.org; draft-ietf-suit-architecture.all@ietf.org; suit@ietf.org
> Subject: Tsvart last call review of draft-ietf-suit-architecture-11
>
> Reviewer: Bob Briscoe
> Review result: Ready with Issues
>
> This document has been reviewed as part of the transport area review team's ongoing effort to review key IETF documents. These comments were written primarily for the transport area directors, but are copied to the document's authors and WG to allow them to address any issues raised and also to the IETF discussion list for information.
>
> When done at the time of IETF Last Call, the authors should consider this review as part of the last-call comments they receive. Please always CC tsv-art@ietf.org if you reply to or forward this review.
>
> This review is long. For the benefit of busy readers, it is structured with 7 important issues listed first (and tagged either as technical or editorial), followed by minor editorial comments for the authors.
>
> Altho' it is ostensibly from the Transport Area Review Team, this review identifies only one transport-related issue (see item #6a). Most of the major discussion points are offered with a security hat on.
>
> First I want to say that there's a lot of useful stuff in the draft. So I'd like to apologize that the review comments raise issues, and do not dwell on praising all the good stuff.
>
> == Important Issues ==
>
> 1. Motivation for publication by the IETF [Editorial]
>
> Until I reached the summary of the recent IoT IAB workshop in the first para of the Security Considerations section, I was wondering why the IETF needed to publish this. It seemed to be a description of what is already done in the industry, but framed as an architecture. Most of this first para of the Security Considerations section motivates this work, and ought to be moved to the Introduction.
>
> Even then, a document that describes what the industry already does isn't a sufficient response to a security problem. Given (I believe) the intention is to encourage the industry to systematically cater for firmware updates, perhaps the draft needs to be a little more hard-hitting (without being patronizing of course). Rather than giving the impression (except in the abstract) that it is just describing current industry practice. For instance, see item #2 below about saying what not to do. I would also suggest that it should highlight the simplest architecture, only giving optional more complex extras later (see item #4 below).
>
> [Hannes] You write: "I was wondering why the IETF needed to publish this"
>
> The IESG chartered the group to work on three documents and this is one of them. It is not uncommon to work on architecture documents in the IETF.
>
> The group wanted an architecture document to explain what problem we are trying to solve and how the standardization work of the manifest fits into the bigger picture. The document outlines requirements, describes terminology, and goes into details about the type of IoT systems we are looking at (Section 7) and explains the relationship between the secure boot and the firmware update process.

[BB] I'm not saying "Why did you write this?" I'm saying "The document 
needs to state why the IETF felt the need to publish this." You've said 
more in the para above than the whole document says about this. Pls try 
to understand that, as a reader coming to this document without being an 
insider to the WG (whether now or some time much later), you wouldn't 
know that the manifest was the real driver for this work.

It's natural when a reader is confronted with an architecture document 
to assume that the architecture is different in some way from current 
practice. But, even with just the above para, I now know that this was 
primarily about providing context for other work. If it isn't new, it 
ought to say that, otherwise the presumption will be that it is.

>
> [Hannes] You write: "It seemed to be a description of what is already done in the industry".
>
> I am not sure what you are specifically referring to. Firmware and software updates are obviously not a new invention.

[BB] I'm not referring to anything, other than how I believe secure 
firmware update is already done. Unless it says up front that it's 
primarily describing current best practice, the reader's natural 
assumption will be that some new practice is being proposed.


>
> [Hannes] You write: "Given (I believe) the intention is to encourage the industry to systematically cater for firmware updates, perhaps the draft needs to be a little more hard-hitting (without being patronizing of course). Rather than giving the impression (except in the abstract) that it is just describing current industry practice."
>
> The purpose of an architecture document is not to recommend a specific solution. Where do we give the impression that we are describing industry practice? What do you believe is the industry practice when it comes to firmware updates for IoT devices?

[BB] Let me explain. You don't /say/ that you are describing industry 
practice. I believe you just /are/ describing industry best practice - 
for firmware update of any type of device (not specifically IoT). I 
might be wrong. But if I'm right, it would be worth telling a 
prospective reader that up front - so they can better decide whether 
this is a document they need to read - which is one of the primary 
purposes of an abstract or introduction.

>
>
> 2. Is Anything Not Allowed by this Architecture? [Technical+Editorial]
>
> a) A good architecture precludes as well as includes. Would it be useful to list some common practices that are insecure, and perhaps some common misconceptions about secure firmware update?
>
> [Hannes] You write: "Would it be useful to list some common practices that are insecure, and perhaps some common misconceptions about secure firmware update?"
>
> There are, of course, many ways to write an architecture document or any document. Since nobody asked for a compilation of the 10 things to get wrong with IoT firmware updates we didn't include it. I will reach out to the folks on the list regarding that idea and see what the feedback is.

[BB] It might be worth asking the list to brainstorm common 
misconceptions, before deciding whether it would be worth the effort of 
writing them up.

>
> b) I could hardly find anything in this draft that did not equally apply to firmware update of "Non-Things". It would indeed be useful to define a 'Thing'
> (at least what this document means by it). I suggest:
> * unattended operation
> * not within the operator's physical security control
>
> [Hannes] This is a common remark to IoT work and there are indeed fuzzy boundaries. Luckily, this challenge has been tackled by others in the IETF, see RFC 7228. We reference that RFC to avoid having to define our own IoT definition.
> Since you got the impression that there is nothing special about IoT in this document, we might need to add text that highlights it. Certainly worth pointing out is the special attention to the small size of the manifest and its ease of use by constrained devices.

[BB] Yes. I can understand why you don't want to rat-hole into the 
definition of a Thing. However, I still maintain it would be useful to 
state that unattended operation and a physically unsecured environment 
define the context of the draft (given context is what you've now told 
me is the purpose of this draft).

You could certainly also mention that many Things are constrained, but 
that doesn't really define what is distinctive about firmware update for 
IoT (particularly because there is no feature of the architecture that 
helps constrained devices - the requirements to pay careful attention to 
software size are aspirational, so they don't count).


> Additionally, the integration with secure boot is something I believe is uncommon in software updates done at higher layers.

[BB] I think you might have taken what I wrote as being about update of 
/higher layer/ software.

I said "I could not find anything ... that did not equally apply to any 
/firmware/ update...". The ability to constrain a device to only boot 
from an image signed with a recognized key has been common for general 
PCs and laptops for over a decade now.

>
> c) On the subject of ruling things out, I felt the list of items ruled out of scope in the Security Considerations include some items that are so central to IoT that they should not have been ruled out of scope, and in the first two cases quoted below, they didn't need to be ruled out of scope, because the document addresses them: "
>    - installing firmware updates in a robust fashion so that the update
>      does not break the device functionality of the environment this
>      device operates in.
>    - the distribution of the actual firmware update, potentially in an
>      efficient manner to a large number of devices without human
>      involvement
>    - energy efficiency and battery lifetime considerations.
> "
> And, wouldn't it be better to move scoping statements to just after the Intro, rather than in Security Considerations? (And, yes, I know that not all Things are energy-challenged, but the size of the subset that are is significant.)
>
> [Hannes] There are many ways to arrange the text in a document. We can see whether moving the list improves readability. Having the items listed later in the document allows the reader to better understand the context we are talking about.

[BB] I try not to ask authors to follow my preferred style. Indeed, 
there are many ways to arrange a doc, but there are certain ways not to 
arrange a document, and this is one of them. As you say, the scope of a 
document helps the reader understand the context of the document. So 
it's of little use to read context after reading the document (as I 
myself found out, when I finally got to the Security Considerations 
section).

>
> More important is the question of why they are out of scope for the work. Let's look at them one-by-one:
>
>     -  installing firmware updates in a robust fashion so that the update
>        does not break the device functionality of the environment this
>        device operates in.
>
> [Hannes] This concerns the use of multiple images so that you can fall back to an older version in case the newly provided firmware update fails.
> How many firmware images to store on a device depends on many factors and does not concern the interoperability of the manifest specification.

[BB] My point was that there is a section about this in the architecture 
(3.5), so it's not out of scope (or if it is out of scope, that section 
shouldn't be there).

>
>     -  installing firmware updates in a timely fashion considering the
>        complexity of the decision making process of updating devices,
>        potential re-certification requirements, and the need for user
>        consent to install updates.
>
> [Hannes] This is a classical policy decision and, for certain types of devices, also a privacy decision.

[BB] This one wasn't in my list of the 3 bullets that I was concerned about.

>
>     -  the distribution of the actual firmware update, potentially in an
>        efficient manner to a large number of devices without human
>        involvement.
>
> [Hannes] This is out of scope because there are protocols available already that enable the distribution of manifests + firmware updates. We can use them without having to standardize new protocols again.

[BB] Again, my point was that a large part of the draft, particularly in 
the sections listed below, is already about the efficiency and scaling 
of distribution so it's surely wrong to say it's not in scope:
3.2.  Friendly to broadcast delivery
3.10.  Operating modes
5.  Communication Architecture
9.  Example
(I've used draft-12 for the section numbering here).


>
>     -  energy efficiency and battery lifetime considerations.
>
> [Hannes] Energy efficiency is a system design aspect that puts a heavy emphasis on hardware power management features. This is something outside the scope of the IETF and interoperability.

[BB] Hardware power management is certainly outside the IETF's scope{1}, 
but energy efficiency is about much more than hardware power management. 
Aspects of firmware update that impact energy efficiency should surely 
be in scope.

My point is that, if you're going to write an RFC about firmware update 
of 'Things', energy efficiency is one of the few distinguishing features 
of constrained devices, which are a large subclass of the set of 
'Things'. And, there are energy efficient and inefficient ways of doing 
firmware update (e.g. naive sender-initiated broadcast requires the 
receivers' radios to be constantly powered up - see later).

Note {1}: Firmware update is also outside the IETF's scope :)



>
>     -  key management required for verifying the digital signature
>        protecting the manifest.
>
> [Hannes] This is something that has been addressed by other working groups in the IETF already. We don't want to re-invent the wheel.
>
>     -  incentives for manufacturers to offer a firmware update mechanism
>        as part of their IoT products.
>
> [Hannes] This is an economics & business aspect that cannot be mandated by an IETF specification.

[BB] These two also weren't in my list of the 3 bullets that I was 
concerned about.

>
> I hope you agree with my assessment.
>
>
> 3. Relying on Software with Security Vulnerabilities to Patch Security Vulnerabilities [Technical]
>
> The Intro only mentions 'software updates' generally, and doesn't explicitly mention patching security vulnerabilities (altho the abstract does). Only having read the Security Considerations section, do I discover that the draft is primarily meant to be about patching firmware vulnerabilities.
>
> [Hannes] What the new firmware image contains will vary from case-to-case. Sometimes it will fix a non-security bug, sometimes a security bug and in other cases it enhances the functionality. From the point of view of the manifest specification it does not matter.
>
> That raises the question of how secure it is to download new firmware from a device booted from firmware that is potentially already compromised. As a minimum, surely the draft needs to mention this point.
>
>   And preferably:
> * whether anything can be trusted once firmware is compromised, and if so what.
> * whether it is still worth updating firmware, even once a vulnerability in the firmware update process has been identified, given:
>    o identification of a vulnerability does not necessarily imply it has been
>    exploited, or not prevalently exploited
>    o a vulnerability might not make the firmware update process itself vulnerable (with an
>    explanation of how to tell)
> * describe which aspects of the firmware update process need to be run within a TEE (and which not if any)
> * should the TEE lock the device against booting if a firmware authentication or integrity check fails
>    o how to prevent tampering with firmware integrity being used as an attack in itself, e.g.
>      - by ensuring that, once a device is locked against booting, firmware
>      re-update is never completely disabled
>      - by ensuring firmware updates are not immediately retried without an exponentially
>      increasing timer back-off, otherwise retries could lead to the devices flooding their
>      own network with fruitless update traffic.
>
>
> [Hannes] We can add a discussion about this point in the document. Whether there will be a security improvement in your case depends on the extent of the compromise.

[BB] Thanks.
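
As a rough illustration of the exponentially increasing retry back-off mentioned in the last bullet of item 3 above, here is a minimal Python sketch. The function name `retry_delay` and the default timings (60 s base, doubling, 1 day cap) are assumptions made for the example, not anything taken from the draft. The cap is there so that retries slow down but re-update is never completely disabled.

```python
def retry_delay(attempt, base_s=60.0, factor=2.0, cap_s=86400.0):
    """Seconds to wait before retry number `attempt` (0-based).

    Hypothetical helper: the name and the default timings are
    illustrative assumptions, not values from the draft.
    """
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    try:
        # Geometric growth: base, base*factor, base*factor^2, ...
        delay = base_s * factor ** attempt
    except OverflowError:
        # Very large attempt counts simply saturate at the cap.
        delay = cap_s
    # Capping (rather than giving up) keeps update traffic sparse
    # without ever disabling re-update entirely.
    return min(cap_s, delay)
```

A device would sleep for `retry_delay(n)` seconds after its n-th consecutive failed authentication or integrity check, and reset n to zero after a successful update. A real deployment would probably also add random jitter so that many devices failing at once do not retry in lock-step.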


>
>
> 4. Please Focus More on the Simplest Architecture [Technical]
>
> All the following increase system complexity, but are not /essential/ for strong security:
> a) Status Tracking Per Device
> b) Confidentiality of the firmware binary
> c) Robustness against rendering the device unbootable
> d) Supporting both Message Authentication and Object Authentication (see item #5)
> e) Broadcast Friendly (see item #6)
>
> This draft is meant to be persuading the 'industry of Things' to provide built-in secure firmware update. It tends to fall into the common trap of setting the security bar so high that practitioners might give up in despair.
>
> [Hannes] This draft was written for implementers, who want to understand the background on the IETF SUIT manifest. IETF documents are in general not useful to persuade a wider industry audience because that audience does not read any of our documents. I can tell you lots of stories about developers not reading IETF specifications from my experience as OAuth working group co-chair. Luckily, regulators are taking care of mandating the use of firmware updates for IoT devices.

[BB] Thanks. Please don't just correct me on who the doc was written for 
and why - pls write it in the document. Again this returns to my point 
that the document doesn't clearly say why the IETF is publishing it. If 
the document had described its intended readership, I wouldn't have had 
to guess. The only place I found anything about the document's 
motivation was in the Security Considerations, where it said it was as a 
result of the IoTSU workshop, which I took to mean that the IETF was 
trying to help the industry fix this problem by contributing to secure 
firmware update.

>
> a) Per-device status tracking certainly might be preferred by many operators, but the alternative of the operator not knowing the status of each individual device might be acceptable (as in the example in Figure 5). Per-device status tracking introduces the following complexity:
> * a need to separately identify each device, both on each device, and in the status tracker.
> * a need to securely identify each separate device (to prevent compromised devices masquerading as all the other devices to give a false sense of security), requiring management of separate public or shared keys
>
> [Hannes] This functionality was introduced by the group in an attempt to align with other architectures in the industry for IoT device management. In particular, we had a longer interaction with the ITU-T on their firmware update architecture and we aligned terminology and architecture with them. Both architectures were informed by what is currently happening in the industry.

[BB] If the purpose of the doc is not to persuade anyone that secure 
firmware update need not be complex, my point is moot anyway.

Nonetheless, I'm not saying exclude status tracking from the 
architecture. I'm saying start with a minimal but secure architecture. 
Then, as you add extra capabilities, the implications in terms of 
complexity will be clearer.

>
>
> b) Confidentiality certainly might provide defence in depth against reverse engineering the binaries, but it is ultimately security by obscurity, and so ultimately optional. By definition (see item #2b) 'Things' are not in a physically secure environment. So, unless all devices decrypt all downloaded binaries within a TEE and store them in tamper-proof memory, once the binaries are stored on each device, they will be accessible to external inspection anyway. So the document should be less dogmatic about confidentiality protection (3rd para of Intro), and at least explain that, with IoT, confidentiality on the wire is moot unless there is also confidential device storage as well.
>
> [Hannes] This is a requirement from industry and while not everyone offers support for this functionality it is getting more common. The manifest specification needs to offer a solution for it and it happens to be optional to implement.

[BB] I'm not saying don't include confidentiality in the architecture - 
the draft gives good reasons for wanting to protect against reverse 
engineering. Nonetheless, I said "So the document should be /less 
dogmatic/ about confidentiality protection (3rd para of Intro)". I'm 
telling you that, on my reading of the document, it currently does /not/ 
say it's optional. Quoting:
     "The firmware update process, among other goals, has to ensure that 
... the firmware image can be confidentiality protected..."


>
> Take a regular Cortex M MCU, one with no hardware security IP and no TEE. The flash and the RAM are on chip, which means that an attacker needs to decap the chip and access the bus lines. This is not an easy attack. Is it impossible? No, it can be done. Are we seeing more MCUs with hardware security IP and with TEEs? Certainly.

[BB] Don't tell me - the draft is the place for this sort of point. You seem 
to have slipped into treating my review as if I was arguing about the 
merits of these optional parts of the architecture. Not so. I was 
suggesting that you divide the architecture between aspects that are 
essential for security and those that are optional.

But again, if the purpose of the doc is not to persuade anyone, my point 
is moot anyway.


>
>
> c) Robustness against rendering the device unbootable
> Often, when I initiate an (attended) firmware update, the OS warns me that this is a sensitive process that could render the device useless if the power fails part-way through. So clearly, this is a cost-tradeoff that device designers are willing to compromise on. Therefore, I don't think the IETF is entitled to pronounce a requirement against this practice. I would rather see this text moved from Requirements to somewhere else in the doc, as a commentary on the implementation issues, rather than stating it as a requirement. Climbing down a bit at the end by saying it is only an implementation requirement doesn't help.
>
> [Hannes] Are you talking about this requirement:
>
> "-  installing firmware updates in a robust fashion so that the update
>        does not break the device functionality of the environment this
>        device operates in."
>
> or this one
>
> "3.5.  High reliability
>
>     A power failure at any time must not cause a failure of the device.
>     A failure to validate any part of an update must not cause a failure
>     of the device.  One way to achieve this functionality is to provide a
>     minimum of two storage locations for firmware and one bootable
>     location for firmware.  An alternative approach is to use a 2nd stage
>     bootloader with build-in full featured firmware update functionality
>     such that it is possible to return to the update process after power
>     down.
> "
>
> The firmware update in the past was a bit clumsy as well and had the ability to ask the user. With IoT devices you cannot assume that there is a user sitting in front of the device nor can you even assume that the device has a display. The main reason why you have seen this message in firmware updates (e.g. BIOS updates) on your desktop PC was that you should be warned not to disconnect power because there was no backup firmware image available on the device. You interrupt the firmware update process and you are in trouble.

[BB] Again, if the purpose of the doc is not to persuade anyone, my 
point about distinguishing essential from optional parts is moot.

Whatever, you seem to have slipped into telling me why these optional 
aspects are important. I'm not saying they aren't. I was only suggesting 
that the draft avoids giving every nice-to-have bell and whistle the 
same level of importance as the core functions of a secure firmware 
update architecture: authenticity and integrity verification.

My point was that even expensive PCs often didn't / don't provide 
firmware update that is robust to power failure - the clunky warning 
message was merely mentioned as evidence of this.

BTW, it was me that suggested that the draft articulates that unattended 
operation is one of the two distinguishing features of IoT.
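
For readers unfamiliar with the "two storage locations" approach in the Section 3.5 text quoted above, the following Python sketch shows the general idea. It is only an illustration under stated assumptions: the `Device` class and its method names are hypothetical, and a bare SHA-256 digest comparison stands in for the signature check on the manifest that a real update process would perform.

```python
import hashlib


class Device:
    """Toy model of a device with two firmware slots (hypothetical names)."""

    def __init__(self, image):
        self.slots = [image, None]  # two storage locations for firmware
        self.active = 0             # index of the slot the device boots from

    def running(self):
        return self.slots[self.active]

    def install(self, image, expected_sha256):
        spare = 1 - self.active
        # The new image is written to the spare slot only, so a power
        # failure during the write leaves the running image untouched.
        self.slots[spare] = image
        # Assumption: a digest check stands in for manifest verification.
        if hashlib.sha256(image).hexdigest() != expected_sha256:
            self.slots[spare] = None  # discard; keep booting the old image
            return False
        # Switching the active index is the single commit point.
        self.active = spare
        return True
```

The cost of this robustness is the extra storage for a second complete image, which is the trade-off under discussion in this item.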

>
>
>
> 5. Both Message Authentication and Whole Object Authentication? [Technical]
>
> Message authentication codes aren't specifically mentioned, until sections 7 & 8, where they are mentioned as if they might be used, without saying why or how. The document needs to discuss the merits of MACs vs. authentication of the whole manifest and/or the whole firmware binary.
>
> Ultimately, if an object's authenticity and integrity will be verified once it is fully delivered, there is no need for MACs as well. However, using message authentication reduces the risk that the device is talking with an imposter at an early stage in the transmission, rather than having to wait until it is complete. And it is easy to arrange message authentication to cumulatively authenticate the whole object, without additional infrastructure for whole-object verification. Therefore using MACs could avoid the need to provide enough storage for a complete update of the firmware as well as the current version - after verifying the manifest and the first message, the device could even start to overwrite the firmware it is currently booted from.
>
> The above strategy would not be without risk, but my point is not just to suggest this particular strategy. The document ought to at least discuss the trade-offs between MACs and whole-object authentication, and whether both are really necessary.
>
> [Hannes] We can do that.

[BB] Thx
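
To make the idea in item 5 of "arranging message authentication to cumulatively authenticate the whole object" more concrete, here is one possible sketch in Python using chained HMACs. This is an assumed construction for illustration, not something specified in the draft: the function names are hypothetical, the sender and device are assumed to already share a symmetric key (so this only suits unicast), and detection of a truncated stream is left out (a real design would bind the chunk count or the final tag into the signed manifest).

```python
import hashlib
import hmac

INITIAL = bytes(32)  # fixed all-zero starting value for the chain


def tag_chunks(key, chunks):
    """Sender side: one tag per chunk, each covering all earlier chunks."""
    tags, prev = [], INITIAL
    for chunk in chunks:
        # Feeding the previous tag into the next MAC chains the chunks,
        # so every tag also authenticates everything sent before it.
        prev = hmac.new(key, prev + chunk, hashlib.sha256).digest()
        tags.append(prev)
    return tags


def verify_stream(key, tagged_chunks):
    """Receiver side: yield each chunk only once it has been verified."""
    prev = INITIAL
    for chunk, tag in tagged_chunks:
        expected = hmac.new(key, prev + chunk, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, tag):
            # Stop at the first bad, reordered or substituted chunk.
            raise ValueError("chunk failed authentication")
        prev = expected
        yield chunk
```

Because each chunk is checked on arrival, a device could reject an imposter after the first message rather than after downloading the whole object. When the last tag verifies, every chunk that was received has been authenticated in order.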

>
>
> 6. Friendly to Broadcast Delivery? [Technical]
>
> Section 3. states this as one of the "Requirements", although the text softens it to "may be desirable for some networks". However, broadcast delivery introduces the three significant problems below, wrt a) reliable transport; b) device energy efficiency; and c) broadcast message authentication.
>
> a) Reliable Broadcast Transport
> Delivery of binary objects needs to recover lost or corrupt packets. Reliable broadcast delivery at scale is extremely challenging. It needs either fountain coding [1] or reliable multicast.
> * Fountain coding delivers an object in a continually repeating stream and ensures that the data in any missing packet can be reconstructed from data in a subsequent different packet. But this would increase device complexity.
> * For broadcast delivery, per-packet acknowledgements (ACKs) from each device do not scale. Negative ACKs (NACKs) can be used but they also do not scale. If a loss is experienced close to the root of the broadcast/multicast, it still causes an implosion of NACKs on the sender. Reliable multicast (e.g. PGM [RFC3208]) arranges a spreading tree of delivery nodes each of which handles NACKs solely from its next-degree downstream neighbours. Clearly this increases network or CDN complexity.
>
> b) Broadcast Energy Efficiency
> If the IoT device is wireless and needs to take care with its energy consumption, it will need to initiate all communications, rather than have to sit with its radio powered up listening for an incoming message. However, of course, it is not possible for each device to independently initiate an incoming broadcast. It would be possible for a broadcast to be scheduled, and for each device to poll for the schedule. But this would add complexity, particularly because all the device clocks would have to be fairly closely synchronized.
>
> c) Broadcast Message Authentication
> Message authentication has potential advantages over whole-object authentication (see #5). When MACs are used over unicast, typically the cost of asymmetric crypto for each message is avoided by using asymmetric crypto just once to transmit a shared key, which is then used to verify each MAC. However, that process is only secure for unicast. For broadcast or multicast delivery, the sender only sends each message once, using one key for the MAC that would therefore have to be shared with every receiver. Then any receiver could masquerade as the genuine sender. TESLA is a solution to this [RFC4082], but it would again increase the complexity of each device and the servers, not least because it requires loose clock synch (nonetheless, uTESLA has been implemented for challenged devices [2]).
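
The masquerading problem in 6c can be shown in a few lines. This is a minimal sketch, assuming HMAC-SHA256 as the MAC and a made-up group key; the key, the messages and the helper names are hypothetical and are not taken from the draft.

```python
import hashlib
import hmac

# Assumption: a single MAC key shared by the sender and every receiver.
GROUP_KEY = b"hypothetical-shared-group-key"


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over one message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Constant-time check that the tag matches the message."""
    return hmac.compare_digest(make_tag(key, message), tag)


# The genuine sender tags a broadcast message.
genuine = b"firmware block 1"
genuine_tag = make_tag(GROUP_KEY, genuine)

# Any receiver must hold GROUP_KEY to verify, so it can also create tags.
forged = b"block injected by a compromised receiver"
forged_tag = make_tag(GROUP_KEY, forged)

genuine_ok = verify(GROUP_KEY, genuine, genuine_tag)
forged_ok = verify(GROUP_KEY, forged, forged_tag)

print("genuine accepted:", genuine_ok)
print("forged accepted: ", forged_ok)
```

Both messages verify, so the tag proves only that the sender is some holder of the group key, not that it is the genuine sender.
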
>
> Aside regarding broadcast encryption:
> In section 3.3. "Use state-of-the-art security mechanisms", it says:
>    "The information that is encrypted individually for each device must
>    maintain friendliness to Content Distribution Networks, bulk storage,
>    and broadcast protocols."
> That implies a magic encryption scheme that is beyond any state-of-the-art that I am aware of! If information is encrypted individually for each device, surely by definition it will not be friendly to broadcast protocols. Actually, I suspect the authors did not mean to say "encrypted individually for each device", because a shared group key is adequate for confidentiality - a shared group key is only problematic for message or source authentication (see above).
>
> [Hannes] You write "implies a magic encryption scheme that is beyond any state-of-the-art"
>
> As you correctly mentioned, a symmetric group key is established and the firmware image is protected with that group key. This requirement only says that we shouldn't design a manifest solution that avoids using such a broadcast scheme rather than designing one on our own.

[BB] I meant: The contradiction in the draft text is between "encrypted 
individually for each device" and "must maintain friendliness to 
broadcast protocols".


>
> 7. Missing Security Concerns [Technical]
>
> a) Avoiding Reliance on the Device's System Clock
>
> I suggest that the document makes the point that it is preferable for the firmware update process not to rely on the device's system clock.
>
> Reasoning: Even if the TEE maintains the system clock, protection against attacks on this clock relies on voting between multiple time sources. No amount of authentication provides any proof of message timing. So, it is hard for a TEE to protect against tampering with the timing of its messages, given they pass via the untrusted execution environment of the rest of the device, similar to the problem of a secure time source for virtualized functions [3].
>
> I think IoT developers can be reassured that none of the requirements for firmware update need to rely on the system clock. For instance roll-back attack prevention (section 3.4) only requires comparison between version numbers, not comparison between a release time and the clock.
>
> However, I think not relying on the clock is worth mentioning, because key expiry and key revocation have to be designed carefully to avoid relying on secure time, and this is a subtle point that might not be appreciated by IoT device designers.
>
> [Hannes] Sounds useful although other IoT groups in the IETF assume that the device has a reliable clock.
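
The roll-back point in 7a can be sketched without any clock at all. This is an illustration only, assuming the manifest carries an integer version that increases with each release; the class and field names are hypothetical and are not the SUIT manifest format.

```python
class FirmwareConsumer:
    """Hypothetical consumer that stores only the installed version number."""

    def __init__(self, installed_version: int) -> None:
        self.installed_version = installed_version

    def accept(self, manifest_version: int) -> bool:
        """Accept only strictly newer versions; no timestamp is consulted."""
        if manifest_version <= self.installed_version:
            return False  # replayed or older manifest: possible roll-back
        self.installed_version = manifest_version
        return True


consumer = FirmwareConsumer(installed_version=5)

newer_ok = consumer.accept(6)    # newer release is installed
replay_ok = consumer.accept(6)   # same manifest replayed later
older_ok = consumer.accept(4)    # older, possibly vulnerable release

print(newer_ok, replay_ok, older_ok)
```

The decision depends only on state the device already holds, so tampering with the device clock does not change the outcome.
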
>
>
> b) Key revocation
>
> When keys are in tamper-resistant storage but otherwise not within a physically secure site, the question of revocation surely has to be addressed. In particular, there should be a discussion about the advisability or otherwise of pre-loading the same keys into multiple devices.
>
> [Hannes] Fine with me.
>
>
> == Minor Editorial Issues ==
>
> 1. Intro
>    "Updates to the firmware of an IoT device are done to fix bugs in software..."
> This would be a good place to highlight the focus on patching security vulnerabilities.
>
> [Hannes] As mentioned previously, it is equally fine to update code for other reasons.
>
> "This version of the document assumes... Future versions may also describe..."
> I assume this aspiration needs to be deleted now?
>
> [Hannes] OK.
>
>
> 2. Terminology
>
> There are ~22 occurrences of lower case 'must' in this document, and one 'should' (excluding multiple uses in rhetorical questions). I'm not sure whether it is intentional to make it seem like this is an RFC that is mandating behaviour, perhaps for readers who don't understand the subtleties of the IETF informational track. I would prefer it to be clear that this document is not mandating anything, by using alternatives to 'must' like 'ought to' or 'has to'. Otherwise it could be considered disingenuous.
>
> [Hannes] We could add the following text to the terminology section:
>
> "
>
>     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
>     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
>     document are to be interpreted as described in [RFC2119], with the
>     qualification that unless otherwise stated, they apply to the design
>     of the SUIT manifest, not its implementation or application.
>     "

[BB] The new template wording in RFC8174 adds:

       when, and only when, they
       appear in all capitals, as shown here

which would be another way to address my concern.


>
>
>    "The term ’system on chip (SoC)’ is often used for these types of devices."
> Perhaps more useful:
>    "The term ’system on chip (SoC)’ is often used interchangeably with MCU, but
>    MCU tends to imply more limited peripheral functions."
>
> [Hannes] OK.
>
>
>    "The following entities are used:"
> The list is a mix of stakeholders and functions, which tends to show that the authors themselves might not be clear about the distinction. It would be useful to split into two lists.
>
> [Hannes] We can split the list.
>
>
>    "The terms device and
>    firmware consumer are used interchangeably since the firmware
>    consumer is one software component running on an MCU on the
>    device."
> I didn't notice them being used interchangeably. If they are anywhere, why not just edit to use whichever term is more appropriate and delete this sentence?
>
> [Hannes] Ok.
>
> Status Tracker
>    "While the IoT device itself runs the client-
>    side of the status tracker it will most likely not run a status
>    tracker itself unless it acts as a proxy for other IoT devices in
>    a protocol translation or edge computing device node."
> The client-side of a status tracker surely does run a status tracker itself (the clue is in the name). I know what is intended, but the writer was clearly in two minds as to whether a status tracker is the combination of client and server or just the server.
>
> [Hannes] We will clarify the client and the server component of the status tracker.
>
>
> 3. Requirements
>
> 3.5 "High reliability" -> 'Robust against becoming unbootable'.
> The title for this requirement otherwise implies a much more general requirement than the description under it.
>
> [Hannes] I am fine with the alternative title.
>
>
> 3.6 Small bootloader
> "...again using firmware updates over serial, USB or even wireless connectivity like a limited version of Bluetooth Smart."
> Don't see why it has to be "...a limited version of...". Suggest these words are deleted.
>
> [Hannes] Fine.
>
>
> s/poses a risk in reliability/
>   /poses a reliability risk/
>
> [Hannes] OK
>
> s/must fit in the available RAM/
>   /must fit in the available memory/
> (not necessarily RAM)
>
> [Hannes] OK
>
>
> s|there are not other task/processing running|
>   |there are not other tasks/processes running|
>
> [Hannes] OK
>
>
> s/unlike it may be the case/
>   /unlike that which may be the case/
>
> [Hannes] OK
>
>
> s/Note: This is an implementation requirement./
>   /Note: This last paragraph is an implementation requirement./ (Otherwise, 'this' could ambiguously refer to the whole requirement)
>
> [Hannes] OK
>
>
>
> 3.7 Small Parsers
> "Since parsers are known sources of bugs they must be minimal."
> To be honest, I suspect the target audience will find this sentence and others like it rather pious. Given the purpose of this document is meant to be to encourage implementers to provide secure firmware update, I think these peripheral "requirements" will just serve to make any implementers reading this feel they are being patronized.
>
> As with the earlier requirement about 'robustness against becoming unbootable', I think many of these 'requirements' would be easier to stomach within a discussion of tradeoffs, rather than as a list of pronouncements that demand perfection.
>
> [Hannes] The need to keep the parser small was the main discussion point in the group. If you could only imagine how concerned the IoT community is about picking the appropriate encoding formats so that the parsers are as small as possible. For most IETF IoT people, this is the key design aspect. We even have entire IETF working groups that re-do the work done previously just to make the encoding format (and the parser) smaller.

[BB] I think you misunderstand me. I am not disagreeing with the need 
for parsers to be minimal. I am saying that just pronouncing that 
parsers must be minimal does not capture the true issue, which is not a 
straightforward requirement, but rather a tradeoff - between how minimal 
a parser should be and how expressive a language should be.



Bob

>
>
> 3.8
> s/Minimal impact on existing firmware formats/  /No impact on existing firmware formats/
> Reason: This is what the text underneath says.
>
> [Hannes] Correct. Will fix it.
>
>
> 3.9 Robust permissions
>
>    "...the authorization policy is separated from the
>    underlying communication architecture. This is accomplished by
>    separating the entities from their permissions."
> I'm not sure whether either of these sentences makes much sense (at least not to me). Perhaps the first sentence means to say that
>    "...the authorization policy is separated from the
>    firmware it applies to"
> And then the second sentence could be deleted. I'm not sure the second sentence would ever be necessary, because entities are always separate from their permissions (otherwise you would have to access an entity to find out you weren't allowed to access it). To be honest, I don't really see the point of the whole requirement. So if it is important, maybe its meaning needs to be clarified for people like me. Otherwise, if it's just stating the obvious, maybe it's not necessary at all.
>
> [Hannes] We will re-word it to improve readability
>
>
> 3.10. Operating modes
> Later, in S.5. the term 'delivery modes' is used. If these are meant to mean the same thing, then the same term should be used consistently. In my experience, the term 'interaction model' is used to describe things like polled request-reply, push, publish-subscribe, etc.
>
> [Hannes] The delivery modes refer to the ability to attach the firmware image to the manifest vs. to keep the two separate. The operating modes refer to how the client learns about the update. We will figure out whether there is a better way to label them.
>
>
>
> "The pre-authorisation step involves verifying..."
> When describing a distributed system, pls avoid passive sentences like this, which don't specify which entity is performing the action. It is followed up later by "...the firmware consumer must also...", which implies the subject is the firmware consumer, but it's best not to rely on implication, especially not if it requires two passes to understand.
>
> [Hannes] We can change the sentence to improve readability
>
>    "Pushing a manifest and firmware image to the transfer to
>    the Package resource of the LwM2M Firmware Update object"
> Garbled?
>
> [Hannes] Yes, garbled. Will fix it.
>
>
>    "...it may need to wait for a trigger from the
>    status tracker to initiate the installation, may trigger the update
>    automatically, or may go through a more complex decision making
>    process to determine the appropriate timing for an update"
> I had to read this a few times before realizing it was a list.
> How about:
>    "... to initiate the installation, it may either need to wait for a trigger
>    from the status tracker; or trigger the update automatically; or go through a
>    more complex decision making process to determine the appropriate timing for
>    an update"
>
> [Hannes] Sounds good to me.
>
> 3.11.
> s/Suitability to software and personalization data/  /Suitability for software and personalization data/
>
> The document suddenly jumps into a different style at the start of 3.11, more like a log of WG activity than a requirement. Pls consider making the style consistent, especially given it switches back after the first sentence of the 2nd para.
>
> [Hannes] OK. Will change the text.
>
>
> 4. Claims
> s/Only install firmware with a matching vendor/  /Only install firmware with a matching author/ ?
>
> [Hannes] In this example, the attempt is to avoid installing an update on an incorrect device.
> We can add the intention of the claim example.
>
>
> 5. Communication Architecture
>
> The document often repeats that it's agnostic to the communication architecture, then this section starts with the phrase:
>    "Figure 1 shows the communication architecture..."
> Perhaps it means 'firmware update architecture'?
> Or, possibly this implies that the authors are in two minds as to what 'communications architecture' means. Or the heading was intended to be 'Communications Architectures' (plural) and the first phrase was meant to say
>    "Figure 1 shows an example communication architecture..."
>
> [Hannes] The text in the body says that SUIT is agnostic to how firmware images are distributed and the intention is not to require a specific protocol, like HTTP, CoAP, MQTT, BLE, etc., to be used. A communication architecture is something different for me, namely how the different entities interact with each other on a higher level.
>
> The text needs to make it clear that a status tracker is optional in the client pull case but not in the server push case (see item #4a earlier).
>
> [Hannes] OK.
>
>
> It would be useful for the doc to say what it means for an operator circle to enclose a function. For instance the 'Device Operator' in Fig 1 encloses the status tracker, which to me implies it controls the status tracker. However, the network operator encloses the device, which probably doesn't imply it operates the device. Perhaps an enclosing circle means 'within the physical security control of'? The network operator isn't mentioned in the text - why is it in the diagram, given it has no role in the firmware update, other than as a common carrier of opaque bits?
>
>    "The following assumptions are made to allow the firmware consumer to
>    verify the received firmware image and manifest before updating
>    software:"
> The following three bullets aren't really assumptions. Perhaps 'statements about the verification process' would be a better phrase. Would another reference to suit-information-model here be useful, to explain why the details are not given here?
>
> [Hannes] The term "assumption" is indeed a bit odd. Statements sounds better. I am also OK to put a reference to the information model here, if it helps readers.
>
> See item #4b) above about highlighting that confidentiality is optional, not just 'deployment specific'.
>
> [Hannes] We can state that it is optional to implement but may be required in some deployments.
>
>
>
>    "There are different types of delivery modes, which are illustrated
>    based on examples below."
> Shouldn't this sentence start section 5? (Also see my earlier point about 'operating modes' / 'interaction modes' terminology).
>
> [Hannes] Not really. As explained before, the delivery modes here refer only to the bundling of the firmware image with the manifest vs. keeping it separate. Maybe the term "delivery mode" isn't the best.
>
> Fig 3 is inconsistent with Fig 1, in that it omits the firmware consumer function.
>
> [Hannes] We can add the firmware consumer to the figure but the main point of the figure was something else.
>
> Fig 4 is inconsistent with Figs 1 & 3, in that there is also an arrow from the status tracker to the author. What does this imply?
>
> [Hannes] We should add the arrow to be consistent. Thanks for catching this. The author needs to inform the status tracker that a firmware image is available.
>
>    "This architecture does not mandate a specific delivery mode but a
>    solution must support both types.
> Whatever for? This requirement surely over-plays the IETF's hand, which is not in a position to make such a demand? Is the intention really that being agnostic to the delivery mode means every solution must support all delivery modes?
>
> [Hannes] The IETF SUIT manifest specification has to allow: (a) bundling the firmware with the manifest and (b) keeping them separate.
> Maybe we should be more explicit about what we are trying to require here from the solution specification.
>
>
> 6. Manifest
>
> Given each of the items in the second bullet list addresses one of the questions in the first bullet list, it would be useful to tabulate them side-by-side and to put them in a more meaningful order, e.g. in the order they occur during firmware update. Also, the first question bullet (author trust) is not specifically addressed in the second list - implied within the last bullet, but not explicitly stated.
>
> [Hannes] A tabular representation is fine for me.
>
> 7.1
> s/Combined with the non-relocatable nature of the code/  /Due to the non-relocatable nature of the code/
>
> [Hannes] OK.
>
> 7.3
>    "This configuration has two or more CPUs in a single SoC that share
>    memory (flash and RAM). Generally, they will be a protection
>    mechanism to prevent one CPU from accessing the other’s memory."
> I know what is intended, but it reads as if line 1 contradicts line 3. Perhaps:
>   "...
>    mechanism to prevent one CPU from unintentionally accessing memory currently
>    allocated to the other."
>
> [Hannes] OK.
>
>
> 9. Example
>
> In at least one example figure, it would be useful to show the initial pre-loading of keys, policy logic and trust anchor into the firmware consumer / bootloader.
>
> [Hannes] I can do that.
>
>
>
> s/starting with an author uploading the new firmware to firmware server/  /starting with an author uploading the new firmware to the firmware server/
>
> [Hannes] Ok.
>
>    "This setup does
>    not use a status tracker and the firmware consumer component is
>    therefore responsible for periodically checking whether a new
>    firmware image is available for download."
> It needs to be much clearer that the status tracker has both a monitoring function and an update triggering function.
>
> [Hannes] OK. I was hoping that the definition of the status tracker would do that job. We can, however, repeat this aspect in Section 9 (example) again.
>
>   So, altho it is essential in the server push model - to trigger updates, its monitoring function means it is not ruled out for the client pull model.
>
> [Hannes] Correct. Most IoT devices today are operated with a status tracker.
>
> Fig 5 & 6 are inconsistent, in that the former omits the IoT device box around the Firmware consumer and bootloader.
>
> [Hannes] Good catch. I will fix that.
>
> s/Figure 6 shows an example follow with the device using a status tracker./  /Figure 6 shows an example with the device using a status tracker./
>
>    "For editorial reasons the author publishing the manifest at
>    the status tracker and the firmware image at the firmware server is
>    not shown."
> How about:
>    "Depiction of the author publishing the manifest at
>    the status tracker and the firmware image at the firmware server would
>    be the same as in Figure 5. So for brevity they are not shown."
>
> [Hannes] OK.
>
>
> 11. Security Considerations
>
> Between
>    "A report about this workshop can be found at [RFC8240]."
> and
>    "A standardized firmware manifest format..."
> there either needs to be some glue text to explain that the initial manifest format was an output of the workshop (if it was), or a new para if the second sentence really doesn't follow from the first.
>
> [Hannes] You are right. We need to glue the two sentences together.
>
> Note also that I suggest (item #1) that the motivating text about the workshop should be moved to the introduction. I also say (in item 2c) that the scoping bullets would be better at the end of the Intro too. However, I can also see a case for them remaining under Security Considerations; to admit that the document does not fully address all possible security concerns.
>
> [Hannes] I need to see what the resulting text looks like.
>
> Given this could leave nothing in the Security Considerations section, it would be appropriate to merely point to all the sections of the document that already cover security matters.
>
> == References ==
> [1] Byers, J.; Luby, M.; Mitzenmacher, M. & Rege, A. A Digital Fountain Approach to Reliable Distribution of Bulk Data Proc. ACM SIGCOMM'98, Computer Communication Review, 1998, 28
>
> [2] Perrig, A.; Szewczyk, R.; Wen, V.; Culler, D. E. & Tygar, J. D. SPINS: Security Protocols for Sensor Networks Proc. ACM International Conference on Mobile Computing and Networks (Mobicom'01), 2001, 189-199
>
> [3] Briscoe (Ed.), B. & others Network Functions Virtualisation; Security; Problem Statement ETSI NFV Industry Specification Group (ISG), 2014
>
>
> Ciao
> Hannes

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/