Re: [Blockchain-interop] Gateway Crash Recovery Discussion #1

Martin Hargreaves <martin.hargreaves@quant.network> Wed, 02 December 2020 14:00 UTC

From: Martin Hargreaves <martin.hargreaves@quant.network>
To: Rafael Belchior <rafael.belchior=40tecnico.ulisboa.pt@dmarc.ietf.org>, Thomas Hardjono <hardjono@mit.edu>
CC: "blockchain-interop@ietf.org" <blockchain-interop@ietf.org>
Date: Wed, 02 Dec 2020 14:00:48 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/blockchain-interop/TRgLKfC5BRGlfP6F-wGAvrDSqcA>
Subject: Re: [Blockchain-interop] Gateway Crash Recovery Discussion #1

Hi both,

In terms of protocol messaging, how should we support this?

It sounds like, on recovery, the recovered gateway scans its logs, finds a set of uncompleted transactions, and then needs to send some kind of "recovery message" to its counterparties: an offer to continue processing, with a list of transactions and the phase in each transaction to pick up from.
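
As a rough sketch of the sending side (Python; the log entry fields, the "done" marker and the message layout are all made up for illustration and are not taken from any draft), the scan and the per-counterparty recovery message could look like this:

```python
from dataclasses import dataclass


@dataclass
class LogEntry:
    """Hypothetical log record: one per protocol message the gateway handled."""
    tx_id: str
    counterparty: str
    phase: int
    step: str  # assumed labels, e.g. "init", "commit-prepare", "done"


def uncompleted(log):
    """Return the latest entry of every transaction whose last step is not 'done'."""
    latest = {}
    for entry in log:  # the log is assumed to be in append order
        latest[entry.tx_id] = entry
    return [e for e in latest.values() if e.step != "done"]


def recovery_messages(gateway_id, log):
    """Build one recovery message per counterparty, listing resumable transactions."""
    by_peer = {}
    for e in uncompleted(log):
        by_peer.setdefault(e.counterparty, []).append(
            {"tx_id": e.tx_id, "phase": e.phase, "last_step": e.step}
        )
    return {
        peer: {"type": "recovery", "from": gateway_id, "transactions": txs}
        for peer, txs in by_peer.items()
    }


log = [
    LogEntry("tx1", "G2", 1, "init"),
    LogEntry("tx1", "G2", 3, "done"),
    LogEntry("tx2", "G2", 2, "commit-prepare"),
    LogEntry("tx3", "G3", 1, "init"),
]
messages = recovery_messages("G1", log)
```

Here tx1 is finished and is left out, so G2 is only offered tx2 and G3 is only offered tx3.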

The gateways that previously saw it time out as it crashed can then evaluate these offers and respond with whether they wish to proceed with or back out of the transactions, or indeed do not recognise them.
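
A minimal sketch of the receiving side (Python again; the three decision names, the shape of the counterparty's local state and the "phases must match" rule are assumptions for illustration, since the real acceptance policy is still open):

```python
from enum import Enum


class Decision(Enum):
    PROCEED = "proceed"
    BACK_OUT = "back-out"
    UNKNOWN = "unknown"


def evaluate(recovery_msg, local_state):
    """Answer a recovery message, one decision per offered transaction.

    local_state is a hypothetical map of tx_id -> (phase, still_open),
    i.e. what this gateway recorded before it saw the peer time out.
    """
    replies = []
    for tx in recovery_msg["transactions"]:
        known = local_state.get(tx["tx_id"])
        if known is None:
            decision = Decision.UNKNOWN  # never seen this transaction
        else:
            phase, still_open = known
            # Assumed policy: resume only if we have not already rolled back
            # and both sides agree on the phase to pick up from.
            if still_open and phase == tx["phase"]:
                decision = Decision.PROCEED
            else:
                decision = Decision.BACK_OUT
        replies.append({"tx_id": tx["tx_id"], "decision": decision.value})
    return {"type": "recovery-response", "transactions": replies}


offer = {
    "type": "recovery",
    "from": "G1",
    "transactions": [
        {"tx_id": "tx2", "phase": 2},
        {"tx_id": "tx4", "phase": 1},
        {"tx_id": "tx9", "phase": 1},
    ],
}
state = {"tx2": (2, True), "tx4": (1, False)}
reply = evaluate(offer, state)
```

A per-transaction answer lets the recovered gateway continue some transfers while rolling back others, rather than treating the whole offer as all-or-nothing.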

What do you think?

Thanks

Martin

> -----Original Message-----
> From: Blockchain-interop <blockchain-interop-bounces@ietf.org> On Behalf
> Of Rafael Belchior
> Sent: Wednesday, December 2, 2020 1:07 PM
> To: Thomas Hardjono <hardjono@mit.edu>
> Cc: blockchain-interop@ietf.org
> Subject: Re: [Blockchain-interop] Gateway Crash Recovery Discussion #1
>
>
> Thomas,
> Yes, those assumptions are reasonable. (i) We can also consider the case
> where a public-private key pair is lost, and thus a new pair needs to be
> generated. (ii) I think we can assume a new SSL connection is created. Do
> you envision problems with this?
>
> I think it makes sense for the gateway to resume operations from the last
> message before the crash because this mode is, in principle, blocking, so
> we can resume from the last event.
>
> If anyone thinks this could be improved, please do not hesitate to provide
> feedback :)
>
> Cheers,
> Rafael
>
>
> On 2020-12-02 00:17, Thomas Hardjono wrote:
> > Thanks Rafael.
> >
> > Inline:
> >
> >>>> > -- Which mode (self-healing mode, or primary-backup mode) do you
> >>>> > recommend?  (Which one would be the simplest approach for now, and
> >>>> > what assumptions would we need to make).
> >>>>
> >>>> The self-healing mode is simpler, as the same machine eventually
> >>>> recovers and continues its operations from the latest log entry. In
> >>>> principle, it does not need to read from the log storage API.
> >>>> However, we are assuming it eventually recovers, and while it is
> >>>> recovering the system is down, which hurts availability.
> >
> > For this scenario, there may need to be some assumptions about the
> > gateway node.
> >
> > For example, (i) the gateway recovers to 100% without any internal
> > losses (e.g. loss of private keys); (ii) the SSL connection has a
> > longer lifetime than the duration of the crash (unavailability); etc.
> >
> > For short-duration unavailability, would the gateway pick up
> > (restart) from the last message before the crash, or does it start from
> > the beginning of the Phase?
> >
> > Best
> >
> >
> > -- thomas --
> >
> >
> >
> >
> >> ________________________________________
> >> From: Blockchain-interop [blockchain-interop-bounces@ietf.org] on
> >> behalf of Rafael Belchior
> >> [rafael.belchior=40tecnico.ulisboa.pt@dmarc.ietf.org]
> >> Sent: Monday, November 30, 2020 11:49 AM
> >> To: blockchain-interop@ietf.org
> >> Subject: [Blockchain-interop] Gateway Crash Recovery Discussion #1
> >>
> >> Dear All,
> >> Attached are the slides from the first discussion on the crash recovery
> >> mechanism for gateways, which took place during the last meeting.
> >>
> >>
> >> Cheers,
> >>
> >> --
> >> Rafael Belchior
> >> Ph.D. student in Computer Science and Engineering, Blockchain -
> >> Técnico Lisboa https://rafaelapb.github.io/
> >> https://www.linkedin.com/in/rafaelpbelchior/