Re: [nwcrg] RG Last Call for "BATS Coding Scheme for Multi-hop Data Transport"

Vincent Roca <vincent.roca@inria.fr> Fri, 03 December 2021 16:56 UTC

Return-Path: <vincent.roca@inria.fr>
X-Original-To: nwcrg@ietfa.amsl.com
Delivered-To: nwcrg@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 836EC3A0C3C for <nwcrg@ietfa.amsl.com>; Fri, 3 Dec 2021 08:56:26 -0800 (PST)
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id kTAlPKNhek-d for <nwcrg@ietfa.amsl.com>; Fri, 3 Dec 2021 08:56:21 -0800 (PST)
Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 83DD43A0C17 for <nwcrg@irtf.org>; Fri, 3 Dec 2021 08:56:20 -0800 (PST)
Received: from unknown (HELO smtpclient.apple) ([109.190.253.14]) by mail2-relais-roc.national.inria.fr with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 03 Dec 2021 17:56:18 +0100
From: Vincent Roca <vincent.roca@inria.fr>
Message-Id: <3C2B371C-7B8A-4772-A0D1-CB2B586CA758@inria.fr>
Content-Type: multipart/alternative; boundary="Apple-Mail=_F55596D7-D262-4308-ACA3-1969CC08CB05"
Mime-Version: 1.0 (Mac OS X Mail 14.0 \(3654.120.0.1.13\))
Date: Fri, 03 Dec 2021 17:56:14 +0100
In-Reply-To: <E70E0ECF-4D61-4419-8B0D-E073997765A2@cuhk.edu.cn>
Cc: Vincent Roca <vincent.roca@inria.fr>, "nwcrg@irtf.org" <nwcrg@irtf.org>, Marie-Jose Montpetit <marie@mjmontpetit.com>
To: "Prof. Yang Shenghao (SSE)" <shyang@cuhk.edu.cn>, "draft-irtf-nwcrg-bats.authors@ietf.org" <draft-irtf-nwcrg-bats.authors@ietf.org>
References: <993F22CE-FE37-4C90-B8A1-C2934D714179@inria.fr> <89960E4C-E2DC-4D8B-9BC8-6C30CD1B5A1B@inria.fr> <E70E0ECF-4D61-4419-8B0D-E073997765A2@cuhk.edu.cn>
X-Mailer: Apple Mail (2.3654.120.0.1.13)
Archived-At: <https://mailarchive.ietf.org/arch/msg/nwcrg/Diu-rFfaqyI4q_TSRsM1QknanMg>
Subject: Re: [nwcrg] RG Last Call for "BATS Coding Scheme for Multi-hop Data Transport"
X-BeenThere: nwcrg@irtf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: IRTF Network Coding Research Group discussion list <nwcrg.irtf.org>
List-Unsubscribe: <https://www.irtf.org/mailman/options/nwcrg>, <mailto:nwcrg-request@irtf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/nwcrg/>
List-Post: <mailto:nwcrg@irtf.org>
List-Help: <mailto:nwcrg-request@irtf.org?subject=help>
List-Subscribe: <https://www.irtf.org/mailman/listinfo/nwcrg>, <mailto:nwcrg-request@irtf.org?subject=subscribe>
X-List-Received-Date: Fri, 03 Dec 2021 16:56:27 -0000

Dear Shenghao, dear authors,

Thanks a lot for this major revision of your I-D that significantly improves its quality.
You’ll find below my answers.
I only have two comments that may need a minor and quick revision of the I-D.
Almost ready!

Cheers,

  Vincent


> On 1 Dec 2021, at 04:52, Prof. Yang Shenghao (SSE) <shyang@cuhk.edu.cn> wrote:
> 
> Dear Vincent, 
> 
> Thanks for the comments. We just updated the document: https://datatracker.ietf.org/doc/draft-irtf-nwcrg-bats/02/
> 
> Please see our response below: 
> 
> 
> Best, 
> 
> Shenghao
> 
>> On Sep 14, 2021, at 4:01 PM, Vincent Roca <vincent.roca@inria.fr> wrote:
>> 
>> Dear Authors, everybody,
>> 
>> As promised (a bit late however, sorry), here is my review of version -01 (draft-irtf-nwcrg-bats-01).
>> 
>> There is still some work to be done on this I-D before I can say it’s ready for IRSG review, but we are converging.
>> Authors, could you update your document accordingly? Thanks in advance.
>> 
>> 
>> # Main comments:
>> 
>> * Section 2.1
>>   Could Fig. 1 be made more explicit? Just to give you an idea of what I have in mind
>>   (to be improved, completed, and probably corrected):
>> 
>>   |
>>   | {set of source packets}
>>   v
>> +-+-+-+-+-+-+-+
>> | source node | outer coding: create batches of M BATS/coded packets each
>> |             | inner coding: recode each BATS/coded packets to form DDP packets
>> |             | transmit DDP packets
>> +-+-+-+-+-+-+-+
>>   |
>>   |     {set of DDP packets}
>>   v
>> +-+-+-+-+-+-+-+-+-+-+
>> | intermediate node | inner coding: recode DDP packets (if needed)
>> |                   | transmit incoming and/or recoded DDP packets
>> +-+-+-+-+-+-+-+-+-+-+
>>   ...
>> 
>>   The textual description of what happens is not crystal clear I'd say; the above figure could help.
>>   One question I have is about the partitioning of the incoming set of source packets into batches: is there a fixed number of source packets per batch, and do consecutive batches overlap or not?
>>   It also depends on the "degree", but I understand the degree varies according to a certain distribution, so different BATS packets in a given batch will depend on a different number of source packets…
>>   I am confused and don't have the answer to my question; could you clarify?
>>   Also, in section 1.2, the "degree" definition is ambiguous: it could suggest that this degree is fixed for a given batch.
>>   It did not help me.
>> 
> We have adjusted Fig. 1 according to your advice, with more text to illustrate the process. Moreover, we also revised the text after Fig. 1 to better explain the functions of the encoder/recoder/decoder.

VR: thank you, section 2.1 is now crystal clear.

> The concern about "degree" and "overlapping" stems from the lack of a clear description, in the previous version, of how a batch is generated by the outer encoder; the text only referred to Section 3.2. To improve clarity and readability, we add a brief description of the batch encoding procedure in Section 2.2.2:
> "..., the outer encoder generates M coded packets for each batch ID using the following steps to be described in details at Section 3.2:
>    *  Obtain a degree d by sampling DD.
>    *  Choose d source packets uniformly at random from all the K source packets.
>    *  Generate M coded packets using the d source packets."

VR: okay
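
For illustration, the three quoted steps can be sketched as follows. This is a minimal toy version, not the draft's Section 3.2 procedure: the function name `encode_batch`, the dictionary form of the degree distribution, and the use of random GF(2) coefficients (plain XOR) are all assumptions made for the example.

```python
import random

def encode_batch(source_packets, degree_dist, M, rng):
    """Toy batch encoder: returns (chosen source indices, M coded packets)."""
    K = len(source_packets)
    T = len(source_packets[0])
    degrees, weights = zip(*degree_dist.items())
    # Step 1: obtain a degree d by sampling the degree distribution (DD).
    d = min(rng.choices(degrees, weights=weights)[0], K)
    # Step 2: choose d distinct source packets uniformly among all K.
    chosen = rng.sample(range(K), d)
    # Step 3: generate M coded packets, each a random GF(2) combination
    # (XOR) of the d chosen packets. Assumption: GF(2) coefficients only.
    batch = []
    for _ in range(M):
        coded = bytes(T)
        for idx in chosen:
            if rng.getrandbits(1):
                coded = bytes(a ^ b for a, b in zip(coded, source_packets[idx]))
        batch.append(coded)
    return chosen, batch

rng = random.Random(2021)
src = [bytes([i] * 4) for i in range(10)]      # K = 10 packets of T = 4 octets
chosen, batch = encode_batch(src, {2: 0.5, 3: 0.5}, M=4, rng=rng)
```

Because each batch draws its own degree and its own subset from all K source packets, two batches may well share source packets, and the degree is a per-batch quantity.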

>> * Section 2.4
>>   Strange to see a DDP that defines neither a protocol version number nor a session identifier (needed if there are multiple flows between the same endpoints).
>>   A full DDP should also define what happens when the small 12-bit BID space reaches its maximum value (wrapping to zero I guess) and whether it is an issue (e.g., with a high throughput link that consumes this space very rapidly), etc.
>>   I understand this is only a minimum example of DDP, not a full featured DDP and that this I-D is above all meant to specify the BATS scheme, not the DDP.
>>   The reader should be reminded that several simplifications have been made.
>> 
> We agree that the purpose of this section is not to define a DDP packet format, but to show how to embed the BATS coding parameters and a coded packet into a DDP packet. So we modify the section title to "Coding Parameters in DDP Packets" and also revise the content according to this purpose.
> 
> We add a comment in Section 2.4.2 about how to use the BID field. 

VR: okay, I now understand.
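
To give an order of magnitude for the BID wrap-around concern, here is a back-of-the-envelope sketch. The function name and all numeric values (batch size, packet size, link rate) are hypothetical, and it assumes every batch is sent as M full-size packets back to back.

```python
def bid_wrap_seconds(bid_bits: int, M: int, packet_bytes: int, link_bps: float) -> float:
    """Time until the BID counter wraps if batches are sent back to back."""
    batches = 1 << bid_bits                  # size of the BID space
    bits_per_batch = M * packet_bytes * 8    # M packets per batch
    return batches * bits_per_batch / link_bps

# Hypothetical setting: M = 16, 1500-octet packets, 1 Gbit/s link.
print(bid_wrap_seconds(12, 16, 1500, 1e9))   # 12-bit BID
print(bid_wrap_seconds(13, 16, 1500, 1e9))   # 13-bit BID
```

Under these assumed values the 12-bit space wraps in well under a second (about 0.79 s), and one extra bit only doubles that.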

A new comment, section 2.4.2: it is said "A coded packet has TO octets,"; shouldn't it be "T+O octets"?

>> * Section 3.2:
>>   Function DegreeSampler() computes a degree distribution table in line with a predefined distribution, and returns a random degree that matches this distribution.
>>   However, looking at Fig. 7, I have the feeling that if K < MAX_DEG, there could be a bias in the actual distribution because of:
>>         return min(d,K)
>>   d matches the desired distribution but not: min(d,K) that will over-represent K in that case.
>>   I don't know if it's an issue.
>> 
> It is a nice observation. We usually use a degree distribution obtained from the asymptotic analysis of belief propagation decoding, where MAX_DEG could be a couple of hundred. Theoretically, this degree distribution has nearly optimal belief propagation decoding performance when the number of source packets K is very large (possibly larger than MAX_DEG). When K is small, we usually employ inactivation decoding. We have observed that inactivation decoding is not sensitive to the degree distribution, and hence the bias generated when K<MAX_DEG causes no practical issue.
> 
> We add a remark in the first paragraph of Section 3.2 to resolve the concern of readers. We also modify Section 3.4 to add a discussion of inactivation decoding.

VR: okay
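
For readers following the thread, the bias introduced by `return min(d,K)` can be made concrete with a small sketch. The degree distribution below is hypothetical (it is not the one in the I-D) and `clamped_distribution` is an illustrative helper, not a function from the draft.

```python
from collections import Counter

def clamped_distribution(degree_dist, K):
    """Distribution of min(d, K) when d is drawn from degree_dist."""
    out = Counter()
    for d, p in degree_dist.items():
        out[min(d, K)] += p      # all mass above K collapses onto K
    return dict(out)

# Hypothetical distribution with MAX_DEG = 8, used with only K = 4 source packets.
dd = {1: 0.1, 2: 0.4, 4: 0.3, 8: 0.2}
clamped = clamped_distribution(dd, K=4)
print(clamped)
```

Here the probability of degree 4 rises from 0.3 to 0.5 because the mass of degree 8 is folded onto K, which is the over-representation of K mentioned in the review.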

>> * Last paragraph of Section 3.3:
>>   I fully agree, especially as forwarding systematic recoded packets immediately will reduce transmission latency.
>>   On the opposite, if only the linear combination approach is used by each intermediate node, latency will accumulate linearly with the number of nodes.
>>   Is it realistic? I'm surprised it's not discussed.
>> 
> We rewrite Section 3.3 to discuss random linear recoding and systematic recoding separately. In a common scenario of unicast communications with one path, systematic recoding has advantages over random linear recoding without sacrificing coding performance. 

VR: okay
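
The latency argument can be illustrated with a deliberately simplified model. It assumes lossless links, a fixed per-packet transmission time, and that a node using only random linear recoding waits for the whole batch before emitting anything; none of these assumptions come from the I-D, and `batch_latency` is a hypothetical helper.

```python
def batch_latency(hops: int, M: int, t_pkt: float, systematic: bool) -> float:
    """Time for the last packet of one batch to cross `hops` links (toy model)."""
    if systematic:
        # Each node forwards a packet as soon as it arrives (pipelined).
        return (M + hops - 1) * t_pkt
    # Each node receives the full batch of M packets before recoding and sending.
    return hops * M * t_pkt

for hops in (1, 5, 10):
    print(hops, batch_latency(hops, 16, 1.0, True), batch_latency(hops, 16, 1.0, False))
```

In this model the non-systematic latency grows as hops × M, whereas immediate forwarding adds only one packet time per extra hop.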

>> * Section 3: interoperability considerations
>>   The I-D should specify clearly which GF(256) to consider, i.e., its irreducible polynomial.
>>   This is what we did (we received this comment) in RFC 8681, Section 3.7.1, "Finite Field Definitions".
>>   Do not hesitate to refer to (if meaningful):
>>         https://www.rfc-editor.org/rfc/rfc8681.html#name-finite-field-operations
>> 
> We add the description of the finite field operations for GF(2) and GF(256) in Section 3.1.

VR: great. 
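
To show why the irreducible polynomial matters for interoperability, here is a small sketch of GF(2^8) multiplication. The default polynomial 0x11D (x^8 + x^4 + x^3 + x^2 + 1) is the one used by RFC 8681; whether the BATS I-D uses the same one is not stated in this thread, so treat it as an assumption. The function name `gf256_mul` is illustrative.

```python
def gf256_mul(a: int, b: int, poly: int = 0x11D) -> int:
    """Multiply two GF(2^8) elements modulo the irreducible polynomial `poly`."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in GF(2^8) is XOR
        a <<= 1                  # multiply a by x
        if a & 0x100:
            a ^= poly            # reduce when the degree reaches 8
        b >>= 1
    return result

# Same operands, two different fields (0x11B is the polynomial used by AES).
print(hex(gf256_mul(0x02, 0x80, 0x11D)))
print(hex(gf256_mul(0x02, 0x80, 0x11B)))
```

The two calls return different values for the same operands, so an encoder and a decoder that disagree on the polynomial cannot interoperate.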

>> * Section 4.1:
>>   - What is meant by throughput in: "The BATS code specification in Section 3 has nearly optimal throughput"?
>>     I guess: BATS approaches ideal codes.
>> 
>>   - The sentence is a bit strange: "The belief propagation decoder in Section 3.4 guarantees the recovery..."
>>     There is no guarantee. If the loss rate is too high the decoder will face problems even to recover a subset of the source packets.
>>     Maybe: "the BP usually enables the recovery of (at least) a fraction of the source packets".
>> 
> - We revised the sentence with "throughput" and explained the meaning of "throughput" in words. 
> - We add the condition such that the decoding can be successful with a high probability: the total rank of all the batches used for decoding should be slightly larger than the number of source packets. 

VR: okay

>> ** Section 6: intro.
>>   I find the use of the term « confidentiality » excessive.
>>   If an eavesdropper can collect a sufficient number of DDP packets, they can decode them and recover the source packets.
>>   There is no hidden, encrypted, info (e.g., the coefficients or some key info) that could prevent it.
>>   Okay if the eavesdropper only captures a few DDP packets, but this is not a realistic attacker model.
>>   Even if we admit random coefficients are encrypted, it remains that the "confidentiality" depends on the message content (imagine a long message that contains a 32-bit secret and which is padded by thousands of null bytes).
>>   This is not in line with what the security community understands by confidentiality.
>>   This is also why [Bhattad05] that you refer to uses the adjective « Weakly » in the title.
>>   Obfuscation is preferable IMHO.
>> 
> We rewrite the first part of Section 6.1 to give further information about security under the condition that the eavesdropper does not collect a sufficient number of packets. We also discuss the research problem about how to enhance the BATS code scheme to provide security when the eavesdropper can collect a sufficient number of packets for decoding.

VR: thanks. One comment: "[…] is to encrypt some of the crucial information used in decoding. Such information can be, for example, the batch ID and the batch generator matrix."
The batch ID is a small field (13 bits in the example DDP), with values that evolve in a predictable manner in some situations.
I am not sure encrypting it provides any real security (brute-force attacks are trivial).
And if the BID value is replaced by its encrypted value, make sure the encryption also covers some value that changes across packets, otherwise all packets of the same batch will contain the same encrypted(BID) value!
I'd be in favor of removing the sentence "Such information can be, for example, the batch ID and the batch generator matrix." because it will raise concerns during the SecDir review IMHO (at least I would raise it with my SecDir reviewer hat).
And what about encrypting a subset of the DDP packet payloads? It’s more costly, but it’s easier to assess the property you are interested in.
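
The two remarks about the BID can be illustrated with a short sketch. HMAC-SHA256 is used here only as a convenient stand-in for a deterministic keyed transform; it is not encryption and not something the I-D proposes, and the function name `mask_bid`, the 4-octet truncation, and the nonce are all assumptions for the example.

```python
import hashlib
import hmac

def mask_bid(key: bytes, bid: int, nonce: bytes = b"") -> bytes:
    """Deterministic keyed masking of a batch ID (stand-in for encryption)."""
    msg = bid.to_bytes(2, "big") + nonce
    return hmac.new(key, msg, hashlib.sha256).digest()[:4]

key = b"\x01" * 16

# Without a per-packet value, every packet of batch 5 carries the same field,
# so an observer can still group packets by batch.
same = mask_bid(key, 5) == mask_bid(key, 5)

# Mixing in a value that changes per packet removes that linkability.
differ = mask_bid(key, 5, b"\x00") != mask_bid(key, 5, b"\x01")

# A 13-bit BID leaves only this many candidates to try exhaustively.
candidates = 1 << 13
print(same, differ, candidates)
```

With only 8192 candidate values, an attacker who has the rest of the decoding information can simply try every BID, which is why hiding the BID alone adds little.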


>> * Section 6.1: there are two instances of "must" (lower case).
>>   Since this is not normative language as per RFC 2119, I don't understand what is meant.
>>   Is a strong MUST more appropriate, or should it be "SHOULD"?
>> 
> We change must to MUST. 

VR: okay

>> * Section 6.2, item 3:
>>   - Typo in "Original authentication". I guess it's "source" or "origin" authentication.
>>   - Additionally, I think you mean "origin authentication and message integrity".
>> 
> Changed "original" to "origin", "message origin" to "message integrity", and "communication peer authentication" to "origin authentication".

VR: okay

>> * It's good practice to have an "Acknowledgement" section ;-)
>> 
> We add the acknowledgment section

VR: thank you ;-)

>> * A reference to RFC 8406 "Taxonomy of Coding Techniques for Efficient Network Communications" would be meaningful since it lays the NWCRG foundations.
>>         https://www.rfc-editor.org/rfc/rfc8406.html
>> 
> A reference to RFC 8406 is added.

VR: okay

>> Minor comments:
>> 
>> * Section 2.4.1:
>>   - Mq also refers the O value. To be added to:
>>       "Mq: 4-bit unsigned integer to specify the value of M and q as Table 1."
>>   - Also, I don't understand why the table is not organized with increasing values of Mq? It’s not wrong, I’m just surprised.
>>   - You could also say if value 0 is invalid (?) and whether values 8 and above are unused in this version but left for future evolutions.
>> 
> The O value in Table 1 is eliminated, since the calculation of O is given in Section 2.2.1. We change the Mq value from 4 bits to 3 bits, and add one more bit to BID.

VR: okay

>> * section 3.2: s/return/returns/ in "Define a function called DegreeSampler that return an integer d"
>> 
>> * section 3.4: s/batches/batch/ in   "Find a batches j that is decodable."
>> 
>> * section 4.1: s/call/called/ in: "which is also call multicast"
>> 
>> * section 4.3: s/techinques/techniques/
>> 
> The typos are fixed.
> 
>> 
>> I hope it will help.
>> Regards,
>> 
>> 
>>    Vincent
>> 
>> 
>>> On 30 Jul 2021, at 12:44, roca <vincent.roca@inria.fr> wrote:
>>> 
>>> Dear all,
>>> 
>>> Following the recent update of the I-D and in line with IETF111 discussion, we would like to officially start a RG Last Call for:
>>> "BATS Coding Scheme for Multi-hop Data Transport" / draft-irtf-nwcrg-bats-01
>>> https://datatracker.ietf.org/doc/draft-irtf-nwcrg-bats/
>>> 
>>> Since many participants may be on vacation, the call will end on Monday September 6th (5 weeks).
>>> 
>>> Please read it and provide feedback on the mailing list. Thanks in advance.
>>> 
>>> Regards,
>>> 
>>>     Marie-Jose and Vincent
>>> 
>> 
>