Re: [Blockchain-interop] Gateway Crash Recovery Discussion #1

Thomas Hardjono <hardjono@mit.edu> Mon, 07 December 2020 16:40 UTC

From: Thomas Hardjono <hardjono@mit.edu>
To: Rafael Belchior <rafael.belchior@tecnico.ulisboa.pt>
CC: Martin Hargreaves <martin.hargreaves@quant.network>, "blockchain-interop@ietf.org" <blockchain-interop@ietf.org>
Date: Mon, 07 Dec 2020 16:39:41 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/blockchain-interop/pOgJmxKUAWPPhk06P75zHNX7Wg4>

Thanks Rafael,

>>> I would only consider option (a), with a new TLS session, unless there
>>> are major reasons for resuming a session.

Yes, this makes sense (just assume a new TLS session after a crash).


Best


-- thomas -- 



________________________________________
From: Rafael Belchior [rafael.belchior@tecnico.ulisboa.pt]
Sent: Monday, December 7, 2020 11:25 AM
To: Thomas Hardjono
Cc: Martin Hargreaves; blockchain-interop@ietf.org
Subject: Re: [Blockchain-interop] Gatweay Crash Recovery Discussion #1

Hey,

I would only consider option (a), with a new TLS session, unless there
are major reasons for resuming a session.

The protocol we are creating works both ways, but with differences
depending on whether the crashed party is the participant or the
coordinator.

Cheers,
Rafael

On 2020-12-07 14:06, Thomas Hardjono wrote:
> We may have to consider the resumption at 2 levels, namely:
>
> (a) Asset-transfer protocol (ODAP) resumption (i.e. which step of the
> transfer protocol do G1 and G2 restart from);
>
> (b) TLS session resumption (i.e. can G1 and G2 re-use the existing TLS
> parameters, or should they establish a new TLS session).
>
>
> A sketch would look something like this (assuming G2 crashed):
>
> G2 ---> G1:  Transfer-Resume [SessionID; crash-error-code]
>
> G1 ---> G2:  Transfer-Continue[SessionID; protocol-step-X; TLS-resume]
>
> G2 <---> G1: (re-establish TLS channel)
>
> G2 ---> G1:  (protocol-step-X)
>
>
> It'd be nice if the recovery protocol can work both ways (reversible),
> in that it does not matter whether it's G1 or G2 that crashes (the
> recovery protocol runs the same).
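
A minimal Python sketch of the exchange above, assuming a fresh TLS
session is always established after a crash. The names used here
(Gateway, TransferResume, TransferContinue, last_step) are hypothetical
and not ODAP definitions, and restarting from the step the surviving
gateway last logged is an assumption, not something this thread has
settled. Both gateways run the same class, which is one way to get the
"works both ways" property.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferResume:
    session_id: str
    crash_error_code: int  # hypothetical numeric code; no values are defined yet


@dataclass(frozen=True)
class TransferContinue:
    session_id: str
    protocol_step: int     # "protocol-step-X": the step to restart from
    new_tls_session: bool  # True = do not reuse the pre-crash TLS parameters


class Gateway:
    """Same logic for G1 and G2, so recovery does not depend on who crashed."""

    def __init__(self, name):
        self.name = name
        self.last_step = {}  # session_id -> last step this gateway logged

    def log_step(self, session_id, step):
        self.last_step[session_id] = step

    def build_resume(self, session_id, crash_error_code):
        # Sent by the gateway that crashed, once it is back up.
        return TransferResume(session_id, crash_error_code)

    def handle_resume(self, msg):
        # Sent by the surviving gateway. Assumption: restart from the step
        # this gateway last logged for the session.
        if msg.session_id not in self.last_step:
            raise KeyError(f"unknown session {msg.session_id}")
        return TransferContinue(msg.session_id, self.last_step[msg.session_id], True)


# G2 crashed; G1 tells it where to pick up.
g1, g2 = Gateway("G1"), Gateway("G2")
g1.log_step("session-1", 3)
reply = g1.handle_resume(g2.build_resume("session-1", crash_error_code=1))
print(reply)
```

Swapping the roles of g1 and g2 exercises the same two methods, so no
extra message types are needed for the reverse case in this sketch.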
>
>
>
> -- thomas --
>
>
> ________________________________________
> From: Rafael Belchior [rafael.belchior@tecnico.ulisboa.pt]
> Sent: Monday, December 7, 2020 2:17 AM
> To: Martin Hargreaves
> Cc: Thomas Hardjono; blockchain-interop@ietf.org
> Subject: Re: [Blockchain-interop] Gatweay Crash Recovery Discussion #1
>
> Hello Martin,
> Yes - on the crash recovery protocol we ought to support recovery
> messages that let all parties establish the current state of an
> asset transfer.
>
> We can specify this in the "3.3 Recovery procedure" section.
>
> Cheers,
> Rafael
>
>
> On 2020-12-02 14:00, Martin Hargreaves wrote:
>> Hi both,
>>
>> In terms of protocol messaging, how should we support this?
>>
>> It sounds like, on recovery, the recovered gateway scans its logs and
>> finds a set of uncompleted transactions, then needs to send some kind
>> of "recovery message" to its counterparties - offers to continue
>> processing, with a list of transactions and the phase in each
>> transaction to pick up from.
>>
>> The gateways that previously saw it time out as it crashed can then
>> evaluate these and respond as to whether they wish to proceed with the
>> transactions, back out of them, or indeed don't recognise them.
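
The two steps described above could be sketched in Python roughly as
follows. This is only an illustration: the log format (ordered tuples of
tx_id, phase, event), the "completed" event name, the "pending" and
"rolled_back" states, and both function names are assumptions made for
the example, not part of any draft.

```python
from enum import Enum


class Decision(Enum):
    PROCEED = "proceed"
    BACK_OUT = "back out"
    UNKNOWN = "not recognised"


def uncompleted_transactions(log):
    """Recovered gateway: return {tx_id: last phase seen} for every
    transaction that has no 'completed' entry in the log."""
    last_phase, completed = {}, set()
    for tx_id, phase, event in log:
        last_phase[tx_id] = phase
        if event == "completed":
            completed.add(tx_id)
    return {tx: ph for tx, ph in last_phase.items() if tx not in completed}


def evaluate_offers(offers, local_state):
    """Counterparty: answer each offered transaction with a Decision.

    local_state maps tx_id -> "pending" or "rolled_back" (hypothetical)."""
    decisions = {}
    for tx_id in offers:
        state = local_state.get(tx_id)
        if state is None:
            decisions[tx_id] = Decision.UNKNOWN
        elif state == "rolled_back":
            decisions[tx_id] = Decision.BACK_OUT
        else:
            decisions[tx_id] = Decision.PROCEED
    return decisions


log = [("tx1", 1, "started"), ("tx1", 2, "started"),
       ("tx2", 1, "started"), ("tx2", 1, "completed"),
       ("tx3", 1, "started"), ("tx4", 2, "started")]
offers = uncompleted_transactions(log)  # the body of the "recovery message"
decisions = evaluate_offers(offers, {"tx1": "pending", "tx4": "rolled_back"})
```

Here the counterparty backs out of anything it already rolled back
after the timeout; a real policy could just as well depend on the phase
offered.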
>>
>> What do you think?
>>
>> Thanks
>>
>> Martin
>>
>>> -----Original Message-----
>>> From: Blockchain-interop <blockchain-interop-bounces@ietf.org> On Behalf Of Rafael Belchior
>>> Sent: Wednesday, December 2, 2020 1:07 PM
>>> To: Thomas Hardjono <hardjono@mit.edu>
>>> Cc: blockchain-interop@ietf.org
>>> Subject: Re: [Blockchain-interop] Gatweay Crash Recovery Discussion #1
>>>
>>> Thomas,
>>> Yes, those assumptions are reasonable; i) we can also consider the
>>> case where a public-private key pair is lost, and thus a new pair
>>> needs to be generated. ii) I think we can assume a new SSL connection
>>> is created. Do you envision problems with this?
>>>
>>> I think it makes sense for the gateway to resume operations from the
>>> last message before the crash because this mode is blocking, in
>>> principle - we can resume from the last event.
>>>
>>> If anyone thinks this could be improved, please do not hesitate to
>>> provide feedback :)
>>>
>>> Cheers,
>>> Rafael
>>>
>>>
>>> On 2020-12-02 00:17, Thomas Hardjono wrote:
>>> > Thanks Rafael.
>>> >
>>> > Inline:
>>> >
>>> >>>> > -- Which mode (self-healing mode, or primary-backup mode) do you
>>> >>>> > recommend?  (Which one would be the simplest approach for now, and
>>> >>>> > what assumptions would we need to make).
>>> >>>>
>>> >>>> The self-healing mode is simpler, as the same machine eventually
>>> >>>> recovers and continues its operations from the latest log entry. It
>>> >>>> does not, in principle, need to read from the log storage API.
>>> >>>> However, we are assuming it eventually recovers, and the system is
>>> >>>> down until it does, which hurts availability.
>>> >
>>> > For this scenario, there may need to be some assumptions about the
>>> > gateway node.
>>> >
>>> > For example, (i) the gateway recovers to 100% without any internal
>>> > losses (e.g. loss of private keys); (ii) the SSL connection has a
>>> > longer lifetime than the duration of the crash (unavailability); etc.
>>> >
>>> > For short-duration unavailabilities, would the gateway pick up
>>> > (restart) from the last message before the crash, or does it start
>>> > from the beginning of the Phase?
>>> >
>>> > Best
>>> >
>>> >
>>> > -- thomas --
>>> >
>>> >
>>> >
>>> >
>>> >> ________________________________________
>>> >> From: Blockchain-interop [blockchain-interop-bounces@ietf.org] on
>>> >> behalf of Rafael Belchior
>>> >> [rafael.belchior=40tecnico.ulisboa.pt@dmarc.ietf.org]
>>> >> Sent: Monday, November 30, 2020 11:49 AM
>>> >> To: blockchain-interop@ietf.org
>>> >> Subject: [Blockchain-interop] Gatweay Crash Recovery Discussion #1
>>> >>
>>> >> Dear All,
>>> >> Attached, the slides of the first discussion on the crash recovery
>>> >> mechanism for gateways, that took place during the last meeting.
>>> >>
>>> >>
>>> >> Cheers,
>>> >>
>>> >> --
>>> >> Rafael Belchior
>>> >> Ph.D. student in Computer Science and Engineering, Blockchain -
>>> >> Técnico Lisboa https://rafaelapb.github.io/
>>> >> https://www.linkedin.com/in/rafaelpbelchior/