Re: [Blockchain-interop] Gateway Crash Recovery Discussion #1

Rafael Belchior <rafael.belchior@tecnico.ulisboa.pt> Mon, 07 December 2020 16:25 UTC

Date: Mon, 07 Dec 2020 16:25:12 +0000
From: Rafael Belchior <rafael.belchior@tecnico.ulisboa.pt>
To: Thomas Hardjono <hardjono@mit.edu>
Cc: Martin Hargreaves <martin.hargreaves@quant.network>, blockchain-interop@ietf.org
In-Reply-To: <2dc519dbd48f404bbd01c4d96d13f646@oc11expo23.exchange.mit.edu>
References: <666e283e0d7a452fbf31dc7a42ec71b6@tecnico.ulisboa.pt>, <a1666b75233e112cd7d828ea4fa4fada@tecnico.ulisboa.pt> <a40dc7708df646b385e5ebbdcab43781@oc11expo23.exchange.mit.edu>, <a87a56a2e6e85666e32145d1c83e892e@tecnico.ulisboa.pt> <c0564c70e6a44fd29798e1ee6b4db5ae@oc11expo23.exchange.mit.edu> <baae5633d50058666b7b71bc45ec9ea8@tecnico.ulisboa.pt> <LO2P123MB3872DFE8357DA5B8737F80B8FCF30@LO2P123MB3872.GBRP123.PROD.OUTLOOK.COM>, <b73189779f4244f2bc71a424200ea32d@tecnico.ulisboa.pt> <2dc519dbd48f404bbd01c4d96d13f646@oc11expo23.exchange.mit.edu>
Message-ID: <2e7b04de865951ac131c8410d6b05262@tecnico.ulisboa.pt>
Archived-At: <https://mailarchive.ietf.org/arch/msg/blockchain-interop/uZXYJXs6eGojMqjOKoZU2C6vtHQ>
Subject: Re: [Blockchain-interop] Gateway Crash Recovery Discussion #1

Hey,

I would only consider option (a), with a new TLS session, unless there 
are major reasons for resuming a session.

The protocol we are creating works both ways, but with differences
depending on whether the crashed party is the participant or the coordinator.

Cheers,
Rafael

On 2020-12-07 14:06, Thomas Hardjono wrote:
> We may have to consider the resumption at 2 levels, namely:
> 
> (a) Asset-transfer protocol (ODAP) resumption (i.e. which step of the
> transfer protocol do G1 and G2 restart from);
> 
> (b) TLS session resumption (i.e. can G1 and G2 re-use the existing TLS
> parameters, or should they establish a new TLS session).
> 
> 
> A sketch would look something like this (assuming G2 crashed):
> 
> G2 ---> G1:  Transfer-Resume [SessionID; crash-error-code]
> 
> G1 ---> G2:  Transfer-Continue[SessionID; protocol-step-X; TLS-resume]
> 
> G2 <---> G1: (re-establish TLS channel)
> 
> G2 ---> G1:  (protocol-step-X)
> 
> 
> It'd be nice if the recovery protocol can work both ways (reversible),
> in that it does not matter whether it's G1 or G2 that crashes (recovery
> protocol runs the same).
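The exchange sketched above could be modelled roughly as follows. This is a minimal Python sketch, not part of ODAP: the names `Gateway`, `TransferResume`, `TransferContinue` and `TlsAction` are hypothetical, and it assumes that protocol-step-X is the step after the last one the surviving gateway recorded as completed. The same code runs whichever gateway crashed.

```python
from dataclasses import dataclass
from enum import Enum


class TlsAction(Enum):
    RESUME = "TLS-resume"  # re-use the existing TLS parameters
    NEW = "TLS-new"        # hypothetical label: establish a new TLS session


@dataclass(frozen=True)
class TransferResume:
    """Sent by the recovered gateway: [SessionID; crash-error-code]."""
    session_id: str
    crash_error_code: int


@dataclass(frozen=True)
class TransferContinue:
    """Reply from the surviving gateway: [SessionID; protocol-step-X; TLS action]."""
    session_id: str
    protocol_step: int
    tls_action: TlsAction


class Gateway:
    """Hypothetical gateway that remembers the last completed step per session."""

    def __init__(self, name):
        self.name = name
        self.last_step = {}  # session_id -> last completed protocol step

    def record_step(self, session_id, step):
        self.last_step[session_id] = step

    def announce_recovery(self, session_id, crash_error_code):
        # Run by whichever gateway crashed and came back.
        return TransferResume(session_id, crash_error_code)

    def handle_resume(self, msg, tls_action=TlsAction.NEW):
        # Run by the gateway that stayed up. Assumption: restart from the
        # step after the last one this gateway saw complete.
        if msg.session_id not in self.last_step:
            raise LookupError(f"unknown session {msg.session_id!r}")
        next_step = self.last_step[msg.session_id] + 1
        return TransferContinue(msg.session_id, next_step, tls_action)


g1, g2 = Gateway("G1"), Gateway("G2")
g1.record_step("s-1", 3)
g2.record_step("s-1", 3)

# G2 crashed and recovered; G1 tells it where to continue.
cont = g1.handle_resume(g2.announce_recovery("s-1", crash_error_code=7))
print(cont.protocol_step, cont.tls_action.value)  # 4 TLS-new
```

Defaulting to `TlsAction.NEW` reflects the preference for a new TLS session stated in this thread; passing `TlsAction.RESUME` would model option (b).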
> 
> 
> 
> -- thomas --
> 
> 
> ________________________________________
> From: Rafael Belchior [rafael.belchior@tecnico.ulisboa.pt]
> Sent: Monday, December 7, 2020 2:17 AM
> To: Martin Hargreaves
> Cc: Thomas Hardjono; blockchain-interop@ietf.org
> Subject: Re: [Blockchain-interop] Gateway Crash Recovery Discussion #1
> 
> Hello Martin,
> Yes, the crash recovery protocol ought to support recovery messages
> that let all parties establish the current state of an asset
> transfer.
> 
> We can specify this in the "3.3 Recovery procedure" section.
> 
> Cheers,
> Rafael
> 
> 
> On 2020-12-02 14:00, Martin Hargreaves wrote:
>> Hi both,
>> 
>> In terms of protocol messaging, how should we support this?
>> 
>> It sounds like, on recovery, the recovered gateway scans its logs and
>> finds a set of uncompleted transactions, then needs to send some kind
>> of "recovery message" to its counterparties: offers to continue
>> processing, with a list of transactions and the phase in each
>> transaction to pick up from.
>> 
>> The gateways that previously saw it time out as it crashed can then
>> evaluate these and respond as to whether they wish to proceed with,
>> back out of, or indeed don't recognise the transactions.
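The recovery message described above could be sketched as follows. This is a minimal Python illustration under stated assumptions, not a proposed wire format: the log is taken to be an ordered list of `(tx_id, phase, completed)` tuples, and the function names, the `Decision` values and the back-out policy (back out when the phases disagree or the counterparty has already abandoned the transaction) are all hypothetical.

```python
from enum import Enum


class Decision(Enum):
    PROCEED = "proceed"
    BACK_OUT = "back-out"
    UNKNOWN = "unknown"  # counterparty does not recognise the transaction


def build_recovery_message(log):
    """Scan the log and list (tx_id, phase) for every uncompleted transaction.

    `log` is an ordered list of (tx_id, phase, completed) tuples; the
    latest entry for each transaction wins.
    """
    latest = {}
    for tx_id, phase, completed in log:
        latest[tx_id] = (phase, completed)
    return [(tx_id, phase)
            for tx_id, (phase, completed) in latest.items()
            if not completed]


def evaluate_offers(offers, known_phases, abandoned=frozenset()):
    """Counterparty side: decide per transaction how to answer the offers.

    `known_phases` maps tx_id to the phase the counterparty last saw;
    `abandoned` holds tx_ids it gave up on after the timeout.
    """
    decisions = {}
    for tx_id, phase in offers:
        if tx_id not in known_phases:
            decisions[tx_id] = Decision.UNKNOWN
        elif tx_id in abandoned or known_phases[tx_id] != phase:
            decisions[tx_id] = Decision.BACK_OUT
        else:
            decisions[tx_id] = Decision.PROCEED
    return decisions


log = [("tx1", 1, False), ("tx2", 2, False), ("tx2", 3, True),
       ("tx3", 1, False), ("tx1", 2, False)]
offers = build_recovery_message(log)
print(offers)  # [('tx1', 2), ('tx3', 1)]
print(evaluate_offers(offers, known_phases={"tx1": 2}))
```

Here tx2 is left out of the offer because its latest entry is marked completed, and tx3 comes back as unrecognised because the counterparty has no record of it.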
>> 
>> What do you think?
>> 
>> Thanks
>> 
>> Martin
>> 
>>> -----Original Message-----
>>> From: Blockchain-interop <blockchain-interop-bounces@ietf.org> On Behalf Of Rafael Belchior
>>> Sent: Wednesday, December 2, 2020 1:07 PM
>>> To: Thomas Hardjono <hardjono@mit.edu>
>>> Cc: blockchain-interop@ietf.org
>>> Subject: Re: [Blockchain-interop] Gateway Crash Recovery Discussion #1
>>> 
>>> Thomas,
>>> Yes, those assumptions are reasonable. (i) We can also consider the
>>> case where a public-private key pair is lost, and thus a new pair
>>> needs to be generated. (ii) I think we can assume a new SSL connection
>>> is created. Do you envision problems with this?
>>> 
>>> I think it makes sense for the gateway to resume operations from the
>>> last message before the crash because this mode is blocking, in
>>> principle: we can resume from the last event.
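Resuming from the last logged event could look roughly like this. It is a minimal Python sketch assuming the gateway keeps a local append-only log file and records each step before acting on it; `GatewayLog`, the JSON-lines format and the field names are hypothetical and not something the draft specifies.

```python
import json
import os
import tempfile


class GatewayLog:
    """Hypothetical append-only log, one JSON object per line."""

    def __init__(self, path):
        self.path = path

    def append(self, session_id, step):
        # Flush and fsync so the entry survives a crash right after the write.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps({"session_id": session_id, "step": step}) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def last_entry(self):
        # Return the most recent entry, or None if nothing was logged yet.
        if not os.path.exists(self.path):
            return None
        last = None
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                if line.strip():
                    last = json.loads(line)
        return last


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "gateway.log")
    log = GatewayLog(path)
    log.append("s-1", 1)
    log.append("s-1", 2)
    # A fresh object stands in for the restarted gateway process.
    print(GatewayLog(path).last_entry())  # {'session_id': 's-1', 'step': 2}
```

Because the mode is blocking, the last entry is enough to know where to pick up; a non-blocking gateway would need to track several in-flight steps instead.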
>>> 
>>> If anyone thinks this could be improved, please do not hesitate to
>>> provide feedback :)
>>> 
>>> Cheers,
>>> Rafael
>>> 
>>> 
>>> On 2020-12-02 00:17, Thomas Hardjono wrote:
>>> > Thanks Rafael.
>>> >
>>> > Inline:
>>> >
>>> >>>> > -- Which mode (self-healing mode, or primary-backup mode) do you
>>> >>>> > recommend?  (Which one would be the simplest approach for now, and
>>> >>>> > what assumptions would we need to make?)
>>> >>>>
>>> >>>> The self-healing mode is simpler, as the same machine eventually
>>> >>>> recovers, continuing its operations from the latest log entry. In
>>> >>>> principle, it does not need to read from the log storage API.
>>> >>>> However, we are assuming it eventually recovers, and until it does
>>> >>>> the system is down, which hurts availability.
>>> >
>>> > For this scenario, there may need to be some assumptions about the
>>> > gateway node.
>>> >
>>> > For example, (i) that the gateway recovers fully without any internal
>>> > losses (e.g. loss of private keys); (ii) that the SSL connection has a
>>> > longer lifetime than the duration of the crash (unavailability); etc.
>>> >
>>> > For short-duration unavailabilities, would the gateway pick up
>>> > (restart) from the last message before the crash, or does it start
>>> > from the beginning of the Phase?
>>> >
>>> > Best
>>> >
>>> >
>>> > -- thomas --
>>> >
>>> >
>>> >
>>> >
>>> >> ________________________________________
>>> >> From: Blockchain-interop [blockchain-interop-bounces@ietf.org] on
>>> >> behalf of Rafael Belchior
>>> >> [rafael.belchior=40tecnico.ulisboa.pt@dmarc.ietf.org]
>>> >> Sent: Monday, November 30, 2020 11:49 AM
>>> >> To: blockchain-interop@ietf.org
>>> >> Subject: [Blockchain-interop] Gateway Crash Recovery Discussion #1
>>> >>
>>> >> Dear All,
>>> >> Attached are the slides from the first discussion on the crash
>>> >> recovery mechanism for gateways, which took place during the last
>>> >> meeting.
>>> >>
>>> >>
>>> >> Cheers,
>>> >>
>>> >> --
>>> >> Rafael Belchior
>>> >> Ph.D. student in Computer Science and Engineering, Blockchain - Técnico Lisboa
>>> >> https://rafaelapb.github.io/
>>> >> https://www.linkedin.com/in/rafaelpbelchior/
> 

-- 
Rafael Belchior
Ph.D. student in Computer Science and Engineering, Blockchain - Técnico
Lisboa
https://rafaelapb.github.io/
https://www.linkedin.com/in/rafaelpbelchior/