Re: [Suit] On device resets during manifest processing

"Rønningstad, Øyvind" <Oyvind.Ronningstad@nordicsemi.no> Thu, 07 July 2022 13:04 UTC

From: "Rønningstad, Øyvind" <Oyvind.Ronningstad@nordicsemi.no>
To: Brendan Moran <Brendan.Moran@arm.com>
CC: "suit@ietf.org" <suit@ietf.org>
Thread-Topic: [Suit] On device resets during manifest processing
Thread-Index: AdiPr99Tl0KFMg/lQV6mT+U8i/TDfgBgohtAABV+N4AAG+fEAAAB86IQ
Date: Thu, 07 Jul 2022 13:04:46 +0000
Message-ID: <AM9PR05MB766849CF85C47B8C6B1A26C688839@AM9PR05MB7668.eurprd05.prod.outlook.com>
References: <AM9PR05MB766851037AC1FA6D542713A588809@AM9PR05MB7668.eurprd05.prod.outlook.com> <AM9PR05MB766827642091BA812BAC891188809@AM9PR05MB7668.eurprd05.prod.outlook.com> <1EBCBBDE-3CED-476B-BE26-3A6D096A69BF@arm.com> <0B50DD7D-9AEA-42D2-9EAD-B63DEBF6920F@arm.com>
In-Reply-To: <0B50DD7D-9AEA-42D2-9EAD-B63DEBF6920F@arm.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/suit/hOxNTSUTZR0-NgtmYWqhp2kor5c>
Subject: Re: [Suit] On device resets during manifest processing

Looks good in general, some comments on the numbered points:


  1.  "try again" could be expanded upon a bit: which actions are to be tried again? Also, "fallback/recovery image" is not defined in the document (AFAICT).
  2.  Maybe specify that this applies to both update and boot sequences. Also "repeated invocation" could be something like "(possibly incomplete) repeated invocation".
  3.  "has no side-effects up until the point of failure" could be "can reproduce the processor state at the time of failure."
  4.  "must be resumable" can be "must be resumable or revertible".

Thanks, Øyvind

From: Brendan Moran <Brendan.Moran@arm.com>
Sent: Thursday, July 7, 2022 13:52
To: suit@ietf.org
Cc: Rønningstad, Øyvind <Oyvind.Ronningstad@nordicsemi.no>
Subject: Re: [Suit] On device resets during manifest processing

Dear suit wg,

I have drafted some text regarding resilience to power failures for SUIT. Please let me know if this has adequate detail.

Best Regards,
Brendan



Proposed Manifest Text:

Section 6.8: Resilience to Disruption

As required in Section 3 of RFC9019, devices must not fail when a disruption, such as a power failure or network interruption, occurs during the update process.

The manifest processor must be resilient to these faults. In order to enable this resilience, systems implementing the manifest processor MUST make the following guarantees:

Either:
1. A fallback/recovery image is provided so that a disrupted system can try again
2. Manifests are constructed so that repeated invocation of the manifest always results in a correct system configuration
3. A journal of manifest operations is stored in nonvolatile memory so that a repeated invocation has no side-effects up until the point of failure. This journal can be, for example, a SUIT Report. This report can be used to resume processing of the manifest from the point of failure.

AND

4. Destructive replacement of an image MUST be resumable. For example, an incomplete swap or in-place differential patching operation must be resumable since it can render a system unusable otherwise.






On 6 Jul 2022, at 23:33, Brendan Moran <Brendan.Moran@arm.com> wrote:

Hi Øyvind,

Thank you for taking the time to work through this. There are two broad approaches that can be used here: progress tracking in non-volatile memory and partial completion detection in the manifest itself. We also need application-specific command-resumption.

I think we have come to a similar set of alternatives:
The best solution might be that the manifest author has to manually specify via a directive when the current state (current parameter values, current instruction) should be cached.
An alternative would be to always store at certain points, e.g. at the end of a sequence. This is awkward since it will interfere with the other reasons one has to group sequences in a certain way. This will also embed the decision of when to cache into the structure of the manifest, so there might as well be an explicit directive.
I don't think that either of these is strictly necessary. Using a journal (like the SUIT Report) obviates the need for dedicated directives. The reporting policy that is already available allows us to construct a manifest in a way that stores the correct data, though we may wish to override this or separate the concerns of manifest resumption and manifest reporting. See below.
Another alternative is to say that all update procedures must be constructed in such a way that they are intrinsically safe from resets. This places the restriction that a component cannot serve as a copy destination after it has been a copy source, unless it was also a destination before it was a source. It also precludes the use of e.g. the swap directive (unless it has its own revert machinery). This will certainly be ok for a lot of applications, but may be too restrictive as a general rule.
There's a different way to achieve this (more below). We use image match checks to determine how far through the manifest we have gotten. This allows a manifest author to work around all of these problems (except swap and delta patching) without support from the manifest parser, but it comes at the cost of more hash calculations and larger manifests.
(Certain directives like swapping and delta patching need their own internal state-saving in any case)
Definitely.

The final alternative is to have the manifest processor infer from the manifest which points it should perform caching at. This requires finding a simple and robust method of making this inference. It likely requires the manifest processor to be aware of when different memory components are used as copy sources, and also the lifetime of volatile components. This might become tricky if e.g. multiple component IDs refer to the same memory areas.
I think this is actually moderately simple to do and it's my preferred alternative. The rules are:

  1.  Condition Failures are recorded because they alter the flow of the manifest
  2.  Nonvolatile state-altering commands (Copy, Fetch, Swap) are recorded because we need to know which failed if there is more than one between other records
These are stored in a journal, like the SUIT Report, and used to guide the manifest parser until it runs out of journal entries. Because a Copy/Fetch with a volatile target is a potential source of problems in this approach, we may need to perform those again. This might become a challenge. I think it might take some more consideration to work through this question.


More on this below.


Progress tracking in non-volatile memory
===============================

Essentially, this is a journal approach. The manifest parser uses the journal (a SUIT Report, for example) along with the manifest to recreate the parser state immediately before failure.

On startup, the manifest processor checks for a SUIT Report matching the manifest in its report storage area. If there is not one, it creates one and starts processing the manifest.

If there is already a SUIT Report matching the manifest, then it begins processing the manifest, using the SUIT Report to determine what state it should hold (since that is the purpose of the SUIT Report), effectively treating each command as a "dry run" if it has a corresponding entry in the SUIT Report.

When it comes to the end of the SUIT Report, it knows that all subsequent commands did not execute to completion. There's a small hiccup here: it's not currently mandatory for commands that modify non-volatile system state to append to the SUIT Report. We can fix that by ensuring that they have the correct reporting policy when constructing the manifest (https://datatracker.ietf.org/doc/html/draft-ietf-suit-manifest-17#section-8.4.7) or we can override the default behaviour of reporting policy for those commands in the Manifest Processor.
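
As a rough illustration of the replay described above, the following Python sketch models the journal as a plain list standing in for a SUIT Report. The command tuples, `run_manifest`, the `execute` callback, and `demo` are hypothetical names made up for this sketch; they are not taken from the SUIT drafts or from any implementation, and condition results are left out for brevity.

```python
# Hypothetical sketch of journal-guided replay. The names and data layout
# are illustrative assumptions, not part of any SUIT specification.

def run_manifest(commands, journal, execute):
    """Process `commands`, skipping those already recorded in `journal`.

    commands: list of (name, alters_nonvolatile) tuples.
    journal:  list of indices of completed commands; stands in for a
              SUIT Report kept in non-volatile storage.
    execute:  callback that performs one command for real.
    """
    done = set(journal)
    for index, (name, alters_nonvolatile) in enumerate(commands):
        if index in done:
            # "Dry run": the journal says this command already completed.
            continue
        execute(name)
        if alters_nonvolatile:
            # Record only after the command has run to completion.
            journal.append(index)


def demo():
    """Run a three-command manifest with one simulated power loss."""
    commands = [("set-parameters", False), ("fetch", True), ("copy", True)]
    journal, executed = [], []
    crashed = []

    def execute(name):
        if name == "copy" and not crashed:
            crashed.append(True)
            raise RuntimeError("simulated power loss")
        executed.append(name)

    try:
        run_manifest(commands, journal, execute)
    except RuntimeError:
        pass  # the device resets here; `journal` survives
    run_manifest(commands, journal, execute)
    return journal, executed
```

In this sketch the second invocation repeats the command that does not touch non-volatile state, skips the fetch that is already in the journal, and then completes the interrupted copy.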

Since this is an application-specific mitigation for power loss/reset, I would prefer to have the Manifest Processor override the specified reporting policy for any commands that alter non-volatile system state.

QUESTION:
Should we add some language around using a SUIT Report for manifest resumption in scenarios where the manifests may have enough complexity that simply re-running the manifest will not result in a correct installation?


This doesn't have to be a SUIT Report, but it is a convenient way to express the idea. There is still the problem of an incomplete, irreversible update, such as applying a differential update to an image in place or executing an image swap. That will need support from an Application-Specific Command-Resumption feature.

Application-Specific Command-Resumption
=================================

We have discussed this in connection with MCUBoot support for image swap in the past. MCUBoot uses a resumable image swap algorithm that will continue even if it is interrupted. This could potentially be applied to other operations that modify non-volatile state, such as applying a differential update. I'm not sure how applicable it is to non-diff, non-swap operations.

QUESTION:
Perhaps we need some language around this?

PROPOSED:
A command that alters system state in a way that prevents a repeated execution of the command from completing correctly MUST provide a journal that it can use to resume execution correctly.
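
One way to picture such a per-command journal is the sketch below: a sector-by-sector swap that uses one scratch sector and a single persisted step counter. This is an illustrative assumption only, not MCUBoot's actual algorithm; `resumable_swap`, `progress`, and `fail_after_step` are hypothetical names, and the counter update is assumed to be atomic.

```python
# Hypothetical sketch of a resumable swap. `progress` is a one-element
# list standing in for a step counter kept in non-volatile memory.

def resumable_swap(a, b, scratch, progress, fail_after_step=None):
    """Swap the sector lists `a` and `b` in place, resuming from `progress`.

    Each sector takes three steps, and each step is safe to repeat:
      phase 0: a[sector] -> scratch
      phase 1: b[sector] -> a[sector]
      phase 2: scratch   -> b[sector]
    fail_after_step simulates a power loss after the write for that step
    but before the counter is updated.
    """
    total_steps = 3 * len(a)
    while progress[0] < total_steps:
        step = progress[0]
        sector, phase = divmod(step, 3)
        if phase == 0:
            scratch[0] = a[sector]
        elif phase == 1:
            a[sector] = b[sector]
        else:
            b[sector] = scratch[0]
        if step == fail_after_step:
            raise RuntimeError("simulated power loss")
        progress[0] = step + 1  # assumed to be an atomic update
```

Because the counter is advanced only after a write has finished, and a repeated write reads from a location that the step itself has not modified, calling the function again after a reset continues the swap instead of corrupting it.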

Partial Completion Detection in the Manifest
=================================

Using the Try-Each block we can build a number of constructs that allow us to resume in the correct state. For example, in the sequence you provided:

  1.  Copy from Component ID 100 to 200
  2.  Copy from Component ID 300 to 100

This can be replaced in a reasonably straight-forward way:

  1.  If not Image Match Component ID 200

     *   Copy from Component ID 100 to 200

  2.  If not Image Match Component ID 100

     *   Copy from Component ID 300 to 100

This uses the existing manifest structure to check for a partial completion in a repeatable way.
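
The guarded sequence above can be modelled in a few lines of Python. This is only a sketch: components are modelled as a dict of byte strings, the image match is modelled as a SHA-256 comparison, and `guarded_copy` and `run_sequence` are hypothetical helper names rather than SUIT commands.

```python
# Hypothetical model of "If not Image Match, then Copy".
import hashlib


def digest(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def guarded_copy(components, src, dst, expected_digest):
    """Copy src to dst unless dst already holds the expected image.

    Returns True if a copy was performed.
    """
    if digest(components[dst]) == expected_digest:
        return False  # image match: this step already completed
    components[dst] = components[src]
    return True


def run_sequence(components, expected):
    """Run both guarded copies; `expected` maps component ID to digest."""
    return [
        guarded_copy(components, 100, 200, expected[200]),
        guarded_copy(components, 300, 100, expected[100]),
    ]
```

Running `run_sequence` a second time performs no copies, and running it after an interruption between the two steps performs only the second copy. Without the guards, a repeated run would copy the already-updated component 100 into component 200.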


This second example, however, is more challenging.

  1.  Copy from component Foo (non-volatile) to component Bar (volatile)
  2.  Copy from somewhere to Foo
  3.  Copy from Bar to Baz (non-volatile)
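
To make the hazard in this sequence concrete, here is a minimal Python sketch of a reset after step 2 followed by a naive restart from the top. The dict-based memory model, the "Source" component (standing in for "somewhere"), and the function names are assumptions made purely for illustration.

```python
# Hypothetical model: `nv` is non-volatile memory, `vol` is volatile memory.

def step1(nv, vol):
    vol["Bar"] = nv["Foo"]      # Foo (non-volatile) -> Bar (volatile)


def step2(nv, vol):
    nv["Foo"] = nv["Source"]    # "somewhere" -> Foo


def step3(nv, vol):
    nv["Baz"] = vol["Bar"]      # Bar -> Baz (non-volatile)


STEPS = [step1, step2, step3]


def run(nv, vol, steps=STEPS):
    for step in steps:
        step(nv, vol)


def uninterrupted():
    nv = {"Foo": "foo-v1", "Source": "foo-v2", "Baz": None}
    run(nv, {})
    return nv["Baz"]


def naive_rerun_after_reset():
    nv = {"Foo": "foo-v1", "Source": "foo-v2", "Baz": None}
    vol = {}
    run(nv, vol, STEPS[:2])     # steps 1 and 2 complete...
    vol = {}                    # ...then a reset wipes volatile memory
    run(nv, vol)                # naive restart from the top
    return nv["Baz"]
```

In this model the uninterrupted run leaves the original Foo contents in Baz, while the naive rerun leaves the new contents there, because the only copy of the original data was in volatile memory when the reset happened.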
I understand this conceptually, however I'm having difficulty seeing whether we should consider it as a use case that dictates how we construct the specification. The actions themselves are inherently risky, regardless of whether or not SUIT is involved.

Regardless, with an approach where the manifest is constructed in a way that supports repeated invocation in the way described above, this can be worked around by a manifest author who knows what the system state should be. Maybe we need a tool for correctly generating manifests under these conditions?
Rather, command sequences should be constructed such that, at most, the current parameter values and current instruction need to be cached.
Even with well-constructed sequences, caching between every instruction is unnecessary, and would make it close to impossible to use volatile memory components as these always need to be re-initialized if a reset happens.
I think this is a great example of reconstruction using a SUIT Report (progress tracking in non-volatile memory). By replaying the manifest, but using the SUIT Report to guide the result of any conditions, and using the SUIT Report to skip any non-volatile state altering commands, we can perfectly reconstruct the state of the parser, prior to any resets or power failures, in-situ, which allows us to continue processing wherever the previous run left off.

Regardless, I propose that we add some text about this in the manifest document, about the usage of volatile vs. non-volatile memory components, and procedures for recovering from resets.
All of this applies most strongly to the fetch, install, and process dependencies steps. The validate, load, and run steps already obviously need to handle device resets, but it might be good to mention that this includes resets that happen in the middle of any of the steps.

Yes, I think this makes sense. Perhaps we should define the two overarching options: journaling and resilient manifest construction.

It might also make sense in the information model, if there is ever an update to that. Something like the following:
                THREAT.DEVICE.RESET: Forcing a reset of the device
                Classification: Tampering with data
                An attacker sends a valid update, then cuts power or otherwise forces a reset of the device while the update is being processed, leaving the device in an invalid state. Depending on how the manifest processor handles such resets, this could force unintended copying of data from a component that is in the wrong state, causing destruction of data, or booting or transmission of bad data.
There's already a requirement in the Architecture document, but perhaps a Threat in the Threat Model would help.

   Devices must not fail when a disruption, such as a power failure or
   network interruption, occurs during the update process.

https://datatracker.ietf.org/doc/html/rfc9019#section-3


Thanks again for looking into this.

Best Regards,
Brendan


IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.