Re: [nwcrg] RG Last Call for "BATS Coding Scheme for Multi-hop Data Transport"

Shenghao Yang <shenghao.yang@gmail.com> Thu, 09 December 2021 00:48 UTC

References: <993F22CE-FE37-4C90-B8A1-C2934D714179@inria.fr> <89960E4C-E2DC-4D8B-9BC8-6C30CD1B5A1B@inria.fr> <E70E0ECF-4D61-4419-8B0D-E073997765A2@cuhk.edu.cn> <3C2B371C-7B8A-4772-A0D1-CB2B586CA758@inria.fr>
In-Reply-To: <3C2B371C-7B8A-4772-A0D1-CB2B586CA758@inria.fr>
From: Shenghao Yang <shenghao.yang@gmail.com>
Date: Thu, 09 Dec 2021 08:47:45 +0800
Message-ID: <CAMGveSWxQHJkMzFhMxXdM=mE=k8OK8PE4TBW2Wrz_TEp3xDQ5A@mail.gmail.com>
To: Vincent Roca <vincent.roca@inria.fr>
Cc: "Prof. Yang Shenghao (SSE)" <shyang@cuhk.edu.cn>, "draft-irtf-nwcrg-bats.authors@ietf.org" <draft-irtf-nwcrg-bats.authors@ietf.org>, Marie-Jose Montpetit <marie@mjmontpetit.com>, "nwcrg@irtf.org" <nwcrg@irtf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nwcrg/3Q5VVp-6eDUH8gFVYkd3HnYbeAY>

Dear Vincent,

Thanks for the comments. Please check whether the following changes address
these issues. If okay, we will submit a new version.

A new comment, section 2.4.2: it is said "A coded packet has TO octets";
shouldn't it be "T+O octets"?


Authors: Right, TO is equal to T+O. TO is defined in 2.2.1. We have added
TO=T+O in 2.4.2 as a reminder to readers.
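
As a quick check of the sizes involved, here is a minimal sketch. It
assumes O is the coefficient vector length in octets, taken here as
ceil(M*log2(q)/8); that formula, the function names and the sample values
are assumptions for illustration and should be checked against Section
2.2.1.

```python
def coefficient_vector_octets(M: int, q: int) -> int:
    # Assumed formula: M coefficients of log2(q) bits each, rounded up
    # to whole octets. q is expected to be a power of two (2 or 256).
    bits_per_coefficient = q.bit_length() - 1
    return (M * bits_per_coefficient + 7) // 8


def coded_packet_octets(T: int, M: int, q: int) -> int:
    # TO = T + O: payload octets plus coefficient vector octets.
    return T + coefficient_vector_octets(M, q)


# Illustrative values only: T = 1024, M = 16, q = 256 gives O = 16.
print(coded_packet_octets(1024, 16, 256))
```
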


One comment: « [...] is to encrypt some of the crucial information used in
decoding. Such information can be, for example, the batch ID and the batch
generator matrix. »
The batch ID is a small field (13 bits in the example DDP), with values
that evolve in a predictable manner in some situations.
I am not sure encrypting it provides any valid security (brute-force
attacks are trivial).
And if the BID value is replaced by its encrypted value, make sure the
encryption also covers some value that changes across packets, otherwise
all packets of the same batch will contain the same encrypted(BID) value!
I’d be in favor of removing the sentence « Such information can be, for
example, the batch ID and the batch generator matrix. » because it will
raise concerns during the SecDir review IMHO (at least I would raise it
with my SecDir reviewer hat on).
And what about encrypting a subset of the DDP packet payloads? It’s more
costly, but it’s easier to assess the property you are interested in.
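
To make the concern above concrete, here is a minimal sketch. It uses
HMAC-SHA256 as a stand-in for a deterministic keyed transform; this is an
assumption for illustration only (it is not an encryption scheme and not
something the draft specifies). The key, the BID value 42 and the name
protected_bid are hypothetical.

```python
import hashlib
import hmac

BID_BITS = 13  # BID width in the example DDP, per the comment above


def protected_bid(key: bytes, bid: int, per_packet: bytes = b"") -> bytes:
    # Deterministic keyed transform of the BID. When per_packet is empty,
    # the output depends only on (key, bid).
    message = bid.to_bytes(2, "big") + per_packet
    return hmac.new(key, message, hashlib.sha256).digest()


key = b"illustrative key only"

# Without a per-packet value, all 16 packets of batch 42 carry one token,
# so an observer can still group packets by batch.
same_token = {protected_bid(key, 42) for _ in range(16)}

# Mixing in a value that changes across packets (here a packet counter)
# gives each packet a different token.
varied_tokens = {protected_bid(key, 42, i.to_bytes(2, "big")) for i in range(16)}

# The BID space itself is tiny: an attacker can simply try every value.
candidate_bids = 1 << BID_BITS

print(len(same_token), len(varied_tokens), candidate_bids)
```
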



Authors: Thanks for the comments; they are insightful. We have removed that
sentence and now state that this is a research issue. Below is the new
content of this paragraph:

"If the eavesdropper can collect a sufficient number of coded packets for
correctly decoding, the native security of BATS code is ineffective. One
solution in this case is to encrypt the whole data before using the BATS
code scheme. Better schemes are desired towards reducing the computation
cost of the whole data encryption solution. This is a research issue that
depends on specific BATS code schemes, and will not be further discussed
here."


Best

Shenghao

On Sat, Dec 4, 2021 at 12:56 AM Vincent Roca <vincent.roca@inria.fr> wrote:

> Dear Shenghao, dear authors,
>
> Thanks a lot for this major revision of your I-D that significantly
> improves its quality.
> You’ll find below my answers.
> I only have two comments that may need a minor and quick revision of the
> I-D.
> Almost ready!
>
> Cheers,
>
>   Vincent
>
>
> On 1 Dec 2021, at 04:52, Prof. Yang Shenghao (SSE) <shyang@cuhk.edu.cn>
> wrote:
>
> Dear Vincent,
>
> Thanks for the comments. We just updated the document.
> https://datatracker.ietf.org/doc/draft-irtf-nwcrg-bats/02/
>
> Please see our response below:
>
>
> Best,
>
> Shenghao
>
> On Sep 14, 2021, at 4:01 PM, Vincent Roca <vincent.roca@inria.fr> wrote:
>
> Dear Authors, everybody,
>
> As promised (a bit late however, sorry), here is my review of version -01
> (draft-irtf-nwcrg-bats-01).
>
> There is still some work to be done on this I-D before I can say it’s
> ready for IRSG review, but we are converging.
> Authors, could you update your document accordingly? Thanks in advance.
>
>
> *# Main comments:*
>
> * Section 2.1
>   Could Fig. 1 be made more explicit? Just to give you an idea of what I
> have in mind
>   (to be improved, completed, and probably corrected):
>
>   |
>   | {set of source packets}
>   v
> +-+-+-+-+-+-+-+
> | source node | outer coding: create batches of M BATS/coded packets each
> |             | inner coding: recode each BATS/coded packet to form
> |             |               DDP packets
> |             | transmit DDP packets
> +-+-+-+-+-+-+-+
>   |
>   |     {set of DDP packets}
>   v
> +-+-+-+-+-+-+-+-+-+-+
> | intermediate node | inner coding: recode DDP packets (if needed)
> |                   | transmit incoming and/or recoded DDP packets
> +-+-+-+-+-+-+-+-+-+-+
>   ...
>
>   The textual description of what happens is not crystal clear I'd say,
> the above figure could help.
>   One question I have is about the partitioning of the incoming set of
> source packets into batches: is there a fixed number of source packets per
> batch, and do consecutive batches overlap or not?
>   It also depends on the "degree", but I understand the degree varies
> according to a certain distribution, so different BATS packets in a given
> batch will depend on a different number of source packets…
>   I am confused and don't have the answer to my question; could you
> clarify?
>   Also, in section 1.2, the "degree" definition could (ambiguously)
> suggest that this degree may be fixed for a given batch.
>   It did not help me.
>
> We have adjusted Fig. 1 according to your advice, with more text to
> illustrate the process. Moreover, we have also revised the text after
> Fig. 1 to better explain the functions of the encoder/recoder/decoder.
>
>
> VR: thank you, section 2.1 is now crystal clear.
>
> For the concern about "degree" and "overlapping", it is due to the lack of
> a clear description of how a batch is generated by the outer encoder in the
> previous version, which only referred to Section 3.2. To improve clarity
> and readability, we have added a brief description of the batch encoding
> procedure in Section 2.2.2:
> "..., the outer encoder generates M coded packets for each batch ID using
> the following steps, described in detail in Section 3.2:
>    *  Obtain a degree d by sampling DD.
>    *  Choose d source packets uniformly at random from all the K source
> packets.
>    *  Generate M coded packets using the d source packets."
>
>
> VR: okay
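
The three steps quoted above can be sketched as follows. This is a minimal
illustration, not the procedure of Section 3.2: it assumes GF(2)
coefficients (plain XOR) for simplicity, assumes the batch ID seeds the
pseudo-random generator, and applies the min(d, K) clipping mentioned in the
Fig. 7 comment. The function name encode_batch and its parameters are
hypothetical.

```python
import random


def encode_batch(source_packets, M, degree_dist, batch_id):
    """Return (chosen indices, M coded packets) for one batch."""
    rng = random.Random(batch_id)  # assumption: batch ID seeds the randomness
    K = len(source_packets)
    T = len(source_packets[0])

    # Step 1: obtain a degree d by sampling DD; degree_dist[i] = P(d = i + 1).
    degrees = range(1, len(degree_dist) + 1)
    d = min(rng.choices(degrees, weights=degree_dist, k=1)[0], K)

    # Step 2: choose d source packets uniformly at random from all K.
    chosen = rng.sample(range(K), d)

    # Step 3: generate M coded packets from the d chosen packets.
    batch = []
    for _ in range(M):
        coded = bytearray(T)
        for idx in chosen:
            if rng.getrandbits(1):  # random GF(2) coefficient (assumption)
                for pos, byte in enumerate(source_packets[idx]):
                    coded[pos] ^= byte
        batch.append(bytes(coded))
    return chosen, batch
```

In this sketch the degree is drawn once per batch, so all M coded packets of
a batch combine the same d source packets, and different batches may pick
overlapping sets of source packets.
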
>
> * Section 2.4
>   Strange to see a DDP that defines neither a protocol version number nor
> a session identifier (in case there are multiple flows between the same
> endpoints).
>   A full DDP should also define what happens when the small 12-bit BID
> space reaches its maximum value (wrapping to zero I guess) and whether it
> is an issue (e.g., with a high-throughput link that consumes this space
> very rapidly), etc.
>   I understand this is only a minimal example of DDP, not a full-featured
> DDP, and that this I-D is above all meant to specify the BATS scheme, not
> the DDP.
>   Readers should be reminded that several simplifications have been made.
>
> We agree that the purpose of this section is not to define a DDP packet
> format, but to show how to embed the BATS coding parameters and a coded
> packet into a DDP packet. So we have changed the section title to "Coding
> Parameters in DDP Packets" and revised the content accordingly.
>
> We have added a comment in Section 2.4.2 about how to use the BID field.
>
>
> VR: okay, I now understand.
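
To give a feel for the wrap-around question, here is a rough
back-of-the-envelope sketch. It assumes BIDs are assigned sequentially and
wrap to zero, and that exactly M packets are sent per batch with no recoding
redundancy; the link rate, packet size and function names are illustrative
assumptions, not values from the draft.

```python
def next_bid(bid: int, bid_bits: int) -> int:
    # Sequential assignment with wrap-around to zero (assumption).
    return (bid + 1) % (1 << bid_bits)


def bid_space_lifetime_seconds(bid_bits, M, packet_octets, link_bps):
    # Time to send one packet-batch for every BID value at full link rate.
    total_bits = (1 << bid_bits) * M * packet_octets * 8
    return total_bits / link_bps


# 12-bit BID, M = 16, 1500-octet packets, 1 Gbit/s link (all illustrative).
print(bid_space_lifetime_seconds(12, 16, 1500, 10**9))
```

Under these assumptions the 12-bit space is consumed in well under a second,
which is why the behaviour on wrap-around matters on fast links.
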
>
> A new comment, section 2.4.2: it is said "A coded packet has TO
> octets"; shouldn't it be "T+O octets"?
>
> * Section 3.2:
>   Function DegreeSampler() computes a degree distribution table in line
> with a predefined distribution, and returns a random degree that matches
> this distribution.
>   However, looking at Fig. 7, I have the feeling that if K < MAX_DEG,
> there could be a bias in the actual distribution because of:
>         return min(d,K)
>   d matches the desired distribution but not: min(d,K) that will
> over-represent K in that case.
>   I don't know if it's an issue.
>
> It is a nice observation. We usually use a degree distribution obtained
> from the asymptotic analysis of belief propagation decoding, where
> MAX_DEG can be a couple of hundred. Theoretically, this degree
> distribution has nearly optimal belief propagation decoding performance
> when the number of source packets K is very large (possibly larger than
> MAX_DEG). When K is small, we usually employ inactivation decoding. We
> have observed that inactivation decoding is not sensitive to the degree
> distribution, and hence the bias generated when K<MAX_DEG causes no
> practical issue.
>
> We have added a remark in the first paragraph of Section 3.2 to address
> this concern. We have also modified Section 3.4 to add a discussion of
> inactivation decoding.
>
>
> VR: okay
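
The bias discussed above can be computed exactly. The sketch below is an
illustration only: the uniform distribution over degrees 1..8 and K = 4 are
made-up values, and clipped_distribution is a hypothetical helper, not part
of DegreeSampler.

```python
from fractions import Fraction


def clipped_distribution(degree_dist, K):
    # degree_dist[i] = P(d = i + 1). Returns the distribution of min(d, K):
    # all probability mass of degrees above K is moved onto K.
    clipped = [Fraction(0)] * K
    for i, p in enumerate(degree_dist):
        clipped[min(i + 1, K) - 1] += p
    return clipped


# Illustrative input: uniform over degrees 1..8 (MAX_DEG = 8), K = 4.
uniform = [Fraction(1, 8)] * 8
print(clipped_distribution(uniform, 4))
```

With these values, degree 4 ends up with probability 5/8 instead of 1/8,
which is the over-representation of K described in the comment.
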
>
> * Last paragraph of Section 3.3:
>   I fully agree, especially as forwarding systematic recoded packets
> immediately will reduce transmission latency.
>   Conversely, if only the linear combination approach is used by each
> intermediate node, latency will accumulate linearly with the number
> of nodes.
>   Is it realistic? I'm surprised it's not discussed.
>
> We have rewritten Section 3.3 to discuss random linear recoding and
> systematic recoding separately. In a common scenario of unicast
> communications with one path, systematic recoding has advantages over
> random linear recoding without sacrificing coding performance.
>
>
> VR: okay
>
> * Section 3: interoperability considerations
>   The I-D should specify clearly which GF(256) to consider, i.e., its
> irreducible polynomial.
>   This is what we did in RFC 8681 (where we received the same comment),
> Section 3.7.1, "Finite Field Definitions".
>   Do not hesitate to refer to it (if meaningful):
>
> https://www.rfc-editor.org/rfc/rfc8681.html#name-finite-field-operations
>
> We have added the description of the finite field operations for GF(2)
> and GF(256) in Section 3.1.
>
>
> VR: great.
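
To illustrate why the irreducible polynomial matters for interoperability,
here is a minimal sketch of GF(256) multiplication. The default polynomial
x^8 + x^4 + x^3 + x^2 + 1 (0x11D) is the one used in RFC 8681; whether the
draft's Section 3.1 adopts the same one is an assumption here. The second
polynomial, 0x11B, is the AES one and is shown only for contrast.

```python
def gf256_mul(a: int, b: int, poly: int = 0x11D) -> int:
    # Shift-and-add multiplication of two GF(256) elements, reducing by
    # the given degree-8 irreducible polynomial whenever a overflows 8 bits.
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result


# The same operands give different products under different polynomials,
# so two implementations must agree on the polynomial to interoperate.
print(hex(gf256_mul(0x80, 0x02, 0x11D)), hex(gf256_mul(0x80, 0x02, 0x11B)))
```
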
>
> * Section 4.1:
>   - What is meant by throughput in: "The BATS code specification in
> Section 3 has nearly optimal throughput"?
>     I guess: BATS approaches ideal codes.
>
>   - The sentence is a bit strange: "The belief propagation decoder in
> Section 3.4 guarantees the recovery..."
>     There is no guarantee. If the loss rate is too high, the decoder will
> have trouble recovering even a subset of the source packets.
>     Maybe: "the BP usually enables the recovery of (at least) a fraction
> of the source packets".
>
> - We have revised the sentence with "throughput" and explained the
> meaning of "throughput" in words.
> - We have added the condition under which decoding succeeds with high
> probability: the total rank of all the batches used for decoding
> should be slightly larger than the number of source packets.
>
>
> VR: okay
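
The total-rank condition above can be sketched as a simple check. This is an
illustration under assumptions: coefficients are taken over GF(2) and each
received coefficient vector is stored as an integer bitmask; the margin
parameter is hypothetical, since the text only says "slightly larger".

```python
def gf2_rank(rows):
    # Rank over GF(2) of a matrix whose rows are integer bitmasks.
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot row
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank


def total_rank(batches):
    # Sum of the ranks of the coefficient matrices of all received batches.
    return sum(gf2_rank(batch) for batch in batches)


def decoding_likely(batches, K, margin=0):
    # Necessary-looking condition from the text: total rank at least K plus
    # a small margin. It does not by itself guarantee successful decoding.
    return total_rank(batches) >= K + margin
```
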
>
> ** Section 6: intro.
>   I find the use of the term « confidentiality » excessive.
>   If an eavesdropper can collect a sufficient number of DDP packets, he
> can decode them and recover the source packets.
>   There is no hidden, encrypted, info (e.g., the coefficients or some key
> info) that could prevent it.
>   Okay if the eavesdropper only captures a few DDP packets, but this is
> not a realistic attacker model.
>   Even if we admit random coefficients are encrypted, it remains that the
> "confidentiality" depends on the message content (imagine a long message
> that contains a 32-bit secret and which is padded by thousands of null
> bytes).
>   This is not in line with what the security community understands by
> confidentiality.
>   This is also why [Bhattad05] that you refer to uses the adjective
> « Weakly » in the title.
>   Obfuscation is preferable IMHO.
>
> We have rewritten the first part of Section 6.1 to give further
> information about security under the condition that the eavesdropper does
> not collect a sufficient number of packets. We also discuss the research
> problem of how to enhance the BATS code scheme to provide security when
> the eavesdropper can collect a sufficient number of packets for decoding.
>
>
> VR: thanks. One comment: « […] is to encrypt some of the crucial
> information used in decoding. Such information can be, for example, the
> batch ID and the batch generator matrix. »
> The batch ID is a small field (13 bits in the example DDP), with values
> that evolve in a predictable manner in some situations.
> I am not sure encrypting it provides any valid security (brute-force
> attacks are trivial).
> And if the BID value is replaced by its encrypted value, make sure the
> encryption also covers some value that changes across packets,
> otherwise all packets of the same batch will contain the same
> encrypted(BID) value!
> I’d be in favor of removing the sentence « Such information can be, for
> example, the batch ID and the batch generator matrix. » because it will
> raise concerns during the SecDir review IMHO (at least I would raise it
> with my SecDir reviewer hat on).
> And what about encrypting a subset of the DDP packet payloads? It’s more
> costly, but it’s easier to assess the property you are interested in.
>
>
> * Section 6.1: there are two instances of "must" (lower case).
>   Since this is not normative language as per RFC 2119, I don't understand
> what is meant.
>   Is a strong MUST more appropriate, or should it be "SHOULD"?
>
> We change must to MUST.
>
>
> VR: okay
>
> * Section 6.2, item 3:
>   - Typo in "Original authentication". I guess it's "source" or "origin"
> authentication.
>   - Additionally, I think you mean "origin authentication and message
> integrity".
>
> Changed "original" to "origin", "message origin" to "message integrity",
> and "communication peer authentication" to "origin authentication".
>
>
> VR: okay
>
> * It's good practice to have an "Acknowledgement" section ;-)
>
> We have added the Acknowledgment section.
>
>
> VR: thank you ;-)
>
> * A reference to rfc 8406 "Taxonomy of Coding Techniques for Efficient
> Network Communications" would be meaningful since this is the NWCRG
> foundations.
>         https://www.rfc-editor.org/rfc/rfc8406.html
>
> A reference to RFC 8406 is added.
>
>
> VR: okay
>
> *Minor comments:*
>
> * Section 2.4.1:
>   - Mq also refers the O value. To be added to:
>       "Mq: 4-bit unsigned integer to specify the value of M and q as Table
> 1."
>   - Also, I don't understand why the table is not organized with
> increasing values of Mq? It’s not wrong, I’m just surprised.
>   - You could also say if value 0 is invalid (?) and whether values 8 and
> above are unused in this version but left for future evolutions.
>
> The O value in Table 1 has been removed, since the calculation of O is
> given in Section 2.2.1. We have changed the Mq field from 4 bits to 3
> bits and added one more bit to BID.
>
>
> VR: okay
>
> * section 3.2: s/return/returns/ in "Define a function called
> DegreeSampler that return an integer d"
>
> * section 3.4: s/batches/batch/ in   "Find a batches j that is decodable."
>
> * section 4.1: s/call/called/ in: "which is also call multicast"
>
> * section 4.3: s/techinques/techniques/
>
> The typos are fixed.
>
>
> I hope it will help.
> Regards,
>
>
>    Vincent
>
>
> On 30 Jul 2021, at 12:44, roca <vincent.roca@inria.fr> wrote:
>
> Dear all,
>
> Following the recent update of the I-D and in line with IETF111
> discussion, we would like to officially start an RG Last Call for:
> "BATS Coding Scheme for Multi-hop Data Transport"
> / draft-irtf-nwcrg-bats-01
> https://datatracker.ietf.org/doc/draft-irtf-nwcrg-bats/
>
> Since many participants may be on vacation, the call will *end on Monday
> September 6th (5 weeks)*.
>
> Please read it and provide feedback on the mailing list. Thanks in advance.
>
> Regards,
>
>     Marie-Jose and Vincent
>