Re: [Blockchain-interop] Gatweay Crash Recovery Discussion #1

Rafael Belchior <rafael.belchior@tecnico.ulisboa.pt> Mon, 07 December 2020 07:18 UTC

Date: Mon, 07 Dec 2020 07:17:57 +0000
From: Rafael Belchior <rafael.belchior@tecnico.ulisboa.pt>
To: Martin Hargreaves <martin.hargreaves@quant.network>
Cc: Thomas Hardjono <hardjono@mit.edu>, blockchain-interop@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/blockchain-interop/8QZgaI1-I68l1UggeMbu_N8VV4M>
Subject: Re: [Blockchain-interop] Gatweay Crash Recovery Discussion #1

Hello Martin,
Yes. The crash recovery protocol should support recovery messages that
let all parties establish the current state of an asset transfer.

We can specify this in the "3.3 Recovery procedure" section.
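
As a rough illustration of the exchange Martin describes in the quoted
message, here is a minimal Python sketch. It is only a sketch under
stated assumptions, not a proposed wire format: the names LogEntry,
Decision, build_recovery_message and evaluate_recovery_message, the
dictionary layout of the message, and the counterparty policy (back out
when the transfer was already rolled back or the phases disagree) are
all hypothetical and do not come from the draft.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    # Hypothetical set of counterparty answers to a recovery offer.
    PROCEED = "proceed"
    BACK_OUT = "back_out"
    UNRECOGNISED = "unrecognised"


@dataclass(frozen=True)
class LogEntry:
    # Hypothetical log record; the real log format is not defined here.
    transaction_id: str
    phase: int
    completed: bool  # True once the transfer reached its final step


def build_recovery_message(gateway_id, log):
    """Recovered gateway: list each uncompleted transaction with its last logged phase."""
    last = {}
    for entry in log:  # the log is assumed to be in chronological order
        last[entry.transaction_id] = entry  # later entries overwrite earlier ones
    pending = {tx: e.phase for tx, e in last.items() if not e.completed}
    return {"type": "recovery", "sender": gateway_id, "pending": pending}


def evaluate_recovery_message(message, known_phases, rolled_back):
    """Counterparty: decide per transaction whether to proceed, back out, or reject."""
    decisions = {}
    for tx, phase in message["pending"].items():
        if tx not in known_phases:
            decisions[tx] = Decision.UNRECOGNISED
        elif tx in rolled_back or known_phases[tx] != phase:
            # Assumed policy: never continue a transfer that was already
            # abandoned after the timeout, or whose phase does not match.
            decisions[tx] = Decision.BACK_OUT
        else:
            decisions[tx] = Decision.PROCEED
    return decisions


# Example run with made-up transactions.
log = [
    LogEntry("tx-1", 1, False),
    LogEntry("tx-1", 2, False),
    LogEntry("tx-2", 1, False),
    LogEntry("tx-2", 3, True),
    LogEntry("tx-3", 1, False),
    LogEntry("tx-4", 2, False),
]
message = build_recovery_message("gateway-A", log)
decisions = evaluate_recovery_message(
    message,
    known_phases={"tx-1": 2, "tx-3": 1},
    rolled_back={"tx-3"},
)
```

In this run tx-2 is left out of the offer because its log shows it
completed; the counterparty proceeds with tx-1, backs out of tx-3, and
reports tx-4 as unrecognised.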

Cheers,
Rafael


On 2020-12-02 14:00, Martin Hargreaves wrote:
> Hi both,
> 
> In terms of protocol messaging, how should we support this?
> 
> It sounds like, on recovery, the recovered gateway scans its logs and
> finds a set of uncompleted transactions, then needs to send some kind
> of "recovery message" to its counterparties: an offer to continue
> processing, with a list of transactions and the phase in each
> transaction to pick up from.
> 
> The gateways that previously saw it time out as it crashed can then
> evaluate these and respond as to whether they wish to proceed with,
> back out of, or indeed do not recognise the transactions.
> 
> What do you think?
> 
> Thanks
> 
> Martin
> 
>> -----Original Message-----
>> From: Blockchain-interop <blockchain-interop-bounces@ietf.org> On Behalf Of Rafael Belchior
>> Sent: Wednesday, December 2, 2020 1:07 PM
>> To: Thomas Hardjono <hardjono@mit.edu>
>> Cc: blockchain-interop@ietf.org
>> Subject: Re: [Blockchain-interop] Gatweay Crash Recovery Discussion #1
>> 
>> Thomas,
>> Yes, those assumptions are reasonable. (i) We can also consider the case
>> where a public-private key pair is lost, and thus a new pair needs to be
>> generated. (ii) I think we can assume a new SSL connection is created. Do
>> you envision problems with this?
>> 
>> I think it makes sense for the gateway to resume operations from the last
>> message before the crash. Because this mode is blocking, in principle we
>> can resume from the last event.
>> 
>> If anyone thinks this could be improved, please do not hesitate to provide
>> feedback :)
>> 
>> Cheers,
>> Rafael
>> 
>> 
>> On 2020-12-02 00:17, Thomas Hardjono wrote:
>> > Thanks Rafael.
>> >
>> > Inline:
>> >
>> >>>> > -- Which mode (self-healing mode, or primary-backup mode) do you
>> >>>> > recommend?  (Which one would be the simplest approach for now, and
>> >>>> > what assumptions would we need to make).
>> >>>>
>> >>>> The self-healing mode is simpler, as the same machine eventually
>> >>>> recovers and continues its operations from the latest log entry.
>> >>>> In principle, it does not need to read from the log storage API.
>> >>>> However, this assumes the machine eventually recovers; until it
>> >>>> does, the system is down, which hurts availability.
>> >
>> > For this scenario, there may need to be some assumptions about the
>> > gateway node.
>> >
>> > For example, (i) the gateway recovers to 100% without any internal
>> > losses (e.g. loss of private keys); (ii) the SSL connection has a
>> > longer lifetime than the duration of the crash (unavailability); etc.
>> >
>> > For short periods of unavailability, would the gateway pick up
>> > (restart) from the last message before the crash, or does it start
>> > from the beginning of the Phase?
>> >
>> > Best
>> >
>> >
>> > -- thomas --
>> >
>> >
>> >
>> >
>> >> ________________________________________
>> >> From: Blockchain-interop [blockchain-interop-bounces@ietf.org] on
>> >> behalf of Rafael Belchior
>> >> [rafael.belchior=40tecnico.ulisboa.pt@dmarc.ietf.org]
>> >> Sent: Monday, November 30, 2020 11:49 AM
>> >> To: blockchain-interop@ietf.org
>> >> Subject: [Blockchain-interop] Gatweay Crash Recovery Discussion #1
>> >>
>> >> Dear All,
>> >> Attached are the slides from the first discussion on the crash
>> >> recovery mechanism for gateways, which took place during the last
>> >> meeting.
>> >>
>> >>
>> >> Cheers,
>> >>
>> >> --
>> >> Rafael Belchior
>> >> Ph.D. student in Computer Science and Engineering, Blockchain -
>> >> Técnico Lisboa https://rafaelapb.github.io/
>> >> https://www.linkedin.com/in/rafaelpbelchior/

-- 
Rafael Belchior
Ph.D. student in Computer Science and Engineering, Blockchain - Técnico
Lisboa
https://rafaelapb.github.io/
https://www.linkedin.com/in/rafaelpbelchior/