Re: [nwcrg] Some comments on draft-heide-nwcrg-rlnc-background-00

"Kerim Fouli" <fouli@codeontechnologies.com> Wed, 27 March 2019 23:02 UTC

From: Kerim Fouli <fouli@codeontechnologies.com>
To: "'David R. Oran'" <daveoran@orandom.net>, 'nwcrg' <nwcrg@irtf.org>
References: <D191CD6F-21FC-4165-8854-0BE358146469@orandom.net>
In-Reply-To: <D191CD6F-21FC-4165-8854-0BE358146469@orandom.net>
Date: Wed, 27 Mar 2019 19:02:31 -0400
Message-ID: <001101d4e4f1$28d22920$7a767b60$@codeontechnologies.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nwcrg/G9WURRd4IqqynFjsdodCXAbK28c>
List-Id: IRTF Network Coding Research Group discussion list <nwcrg.irtf.org>

Dave,

Thanks again for your constructive comments. We will attempt to cover some of them in our presentation tomorrow.

Quick responses to your comments are embedded below.

Best,

+K

 

 

From: nwcrg <nwcrg-bounces@irtf.org> On Behalf Of David R. Oran
Sent: Wednesday, February 20, 2019 3:02 PM
To: nwcrg <nwcrg@irtf.org>
Subject: [nwcrg] Some comments on draft-heide-nwcrg-rlnc-background-00

 

First, thanks for splitting this draft off from the symbol representation specification and providing a nice background document that can serve as an “entry point” to what will hopefully evolve into a coherent set of encoding and algorithmic specifications for ongoing standardization work.

<KF> Thank you. This is precisely our intention for the document, and feedback is much appreciated.

I found it nicely structured and easy to read, especially compared to the earlier material. I have a few general comments, and then some detailed technical comments and nits which I’ve embedded in a snipped copy of the associated text from the draft.

General comments:

1.	The document points well to the need for concrete encoding specifications for symbol representation and gives useful guidance on the technical tradeoffs to consider in picking a given representation for integration into an enclosing protocol. I wish you had done the same for representation and encoding in a protocol of coding parameters, where the document basically waffles by saying there are lots of ways to do it. You mention the possibility of out-of-band conveyance and conveyance together with the encoded symbols, but really don’t give good advice on the tradeoffs, as you do with symbol representation. There’s some discussion, but I’d really like to see this fleshed out in a future revision.

<KF> This is a good point. We may have avoided that topic as it depends on the protocol involved, and the possibilities are numerous. We do mention that the coding vector can be transmitted out-of-band or in-band, among a few other protocol-related remarks. The aim there was to point out the scope of possibilities rather than enumerate them. That said, it’s a good idea to emphasize the tradeoffs related to some coding parameters in the next version.

2.	As a general terminology issue, I found the term “connection” creeping in at various places (I point out some specific instances in my detailed comments) where it is either unnecessary or in some cases inappropriate, as with multicast. In reading through, I think it’s adequate to just talk about “senders” and “receivers” without any baggage of two-party state that typically defines a connection. This is particularly evident when the material on feedback comes into play, as feedback needs to make it to one (or more) senders but does not necessarily happen “inside” some connection state (although it might in the case of embedding RLNC into connection-oriented transport protocols like TCP or QUIC).

<KF> Great point. We will weed those out for the next iteration of the document.

3.	The discussion of security is not adequate, in my opinion. At the very least, some discussion in the security considerations is needed to address whether cryptographic integrity is assumed from the lower-level protocol to protect the coded symbols (and possibly coding vectors) from manipulation by an attacker. Conversely, if one assumes that RLNC provides some degree of robustness against manipulation of coded data and parameters by attackers (including foiling both memory and computational DoS attacks on the decoders in receivers), one can move cryptographic integrity to the enclosing protocol and check only the decoded data. In terms of confidentiality, it may be that exposing coding parameters and encoded symbols does not represent any privacy leakage, but that still needs some discussion, with appropriate justification for whether encryption is needed below RLNC to prevent leakage, even if the encoded data is encrypted at a higher layer.

<KF> The initial assumption was that the document dealt with a separate "coding layer", where other layers took care of security, hence our generic security section. Some coding- and FEC-related drafts do that. However, we are now discussing updating the security section with an overview of the security implications of RLNC. 

Detailed comments and nits:

 Random Linear Network Coding (RLNC): Background and Practical
                         Considerations
              draft-heide-nwcrg-rlnc-background-00

Abstract

This document describes the use of Random Linear Network Coding
(RLNC) schemes for reliable data transport. Both block and sliding
window RLNC code implementations are described. By providing erasure
correction using randomly generated repair symbols, such RLNC-based
schemes offer advantages in accommodating varying frame sizes and
dynamically changing connections,

<DO> don’t say connection - say “dynamically changing communication conditions”

<KF> Agreed.

reducing the need for feedback, and
lowering the amount of state information needed at the sender and
receiver. 

<DO> In the main document, I didn’t actually see material showing the state reduction. Certainly there is the valuable reduction in overhead in communicating the state changes, but it would be very helpful to give justification (with example or two) of what state is actually reduced at the sender and/or receiver.

<KF> The typical example is TCP ACKs of received packets, and we can add such an example somewhere in the text. The general concept is that RLNC reduces the need to convey node and system state in various topologies (e.g., PTP/E2E, mesh, multipath, multicast) by allowing coded symbols to represent a dynamic set of systematic symbols, where the identity of the represented symbols is self-contained (i.e., the coding vector). This, at least, should be clear from the document.  

The practical considerations' section identifies RLNC-

<DO> s/considerations’/considerations/

<KF> OK

encoded symbol representation as a valuable target for
standardization.

[snip]

1.1. Random Linear Network Coding (RLNC) Basics

Unlike conventional communication systems based on the "store-and-
forward" principle, RLNC allows network nodes to independently and
randomly combine input source data into coded symbols over a finite
field [HK03]. Such an approach enables receivers to decode and
recover the original source data as long as enough linearly
independent coded symbols, with sufficient degrees of freedom, are
received. At the sender, RLNC can introduce redundancy into data
streams in a granular way. 

<DO> Granular could mean either fine or coarse. I’d try to be more specific - there are two degrees of freedom here - both symbol size and amount of redundancy.

<KF> A finer granularity. I think it is clear we're dealing with amount of redundancy here.

 

At the receiver, RLNC enables progressive
decoding and reduces feedback necessary for retransmission.
Collectively, RLNC provides network utilization and throughput
improvements, high degrees of robustness and decentralization,
reduces transmission latency, and simplifies feedback and state
management.

<DO> Compared to what? You might get less argument if this were stated in non-comparative terms, like “RLNC enables progressive decoding and low overhead feedback to manage retransmissions. RLNC is highly efficient in network utilization thus improving throughput, while providing strong robustness properties and decentralized control. RLNC further can improve end-to-end latency while simplifying feedback and state management.”

<KF> Total agreement, especially on the inexact PHY term.

[Aside: RLNC can’t actually reduce transmission latency; that’s a property of the channel]

To encode using RLNC, original source data are divided into symbols
of a given size and linearly combined. Each symbol is multiplied
with a scalar coding coefficient drawn randomly from a finite field,
and the resulting coded symbol is of the same size as the original
data symbols.

Thus, each RLNC encoding operation can be viewed as creating a linear
equation in the data symbols, where the random scalar coding
coefficients can be grouped and viewed as a coding vector.
Similarly, the overall encoding process where multiple coded symbols
are generated can be viewed as a system of linear equations with
randomly generated coefficients. Any number of coded symbols can be
generated from a set of data symbols, similarly to expandable forward
error correction codes specified in [RFC5445] and [RFC3453]. Coding
vectors must be implicitly or explicitly transmitted from the sender
to the receiver for successful decoding of the original data. For
example, sending a seed for generating pseudo-random coding
coefficients can be considered as an implicit transmission of the
coding vectors. In addition, while coding vectors are often
transmitted together with coded data in the same data packet, it is
also possible to separate the transmission of coding coefficient
vectors from the coded data, if desired.

<DO> why would I desire this? See my general comment above.

<KF> I got mixed responses on this one. Some of us agree that there are no mainstream use cases. However, why remove the possibility in an informational document?
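
To make the encoding step and the implicit (seed-based) conveyance of coding vectors in the quoted text concrete, here is a minimal Python sketch. It is not taken from the draft: the field GF(2^8) with reduction polynomial 0x11B, the use of Python's `random.Random` as the pseudo-random generator, and the names `gf_mul`, `coding_vector`, and `encode` are all assumptions chosen for illustration.

```python
import random

def gf_mul(a, b):
    """Multiply two GF(2^8) elements (assumed reduction polynomial 0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def coding_vector(seed, k):
    """Derive k pseudo-random coefficients from a seed known to both ends."""
    rng = random.Random(seed)
    return [rng.randrange(256) for _ in range(k)]

def encode(symbols, coeffs):
    """Return one coded symbol: a GF(2^8) linear combination of equal-sized symbols."""
    coded = bytearray(len(symbols[0]))
    for c, sym in zip(coeffs, symbols):
        for i, byte in enumerate(sym):
            coded[i] ^= gf_mul(c, byte)
    return bytes(coded)

# Sender side: only the seed has to accompany the coded symbol.
source = [b"abcd", b"efgh", b"ijkl"]
seed = 42
coded = encode(source, coding_vector(seed, len(source)))

# Receiver side: the same seed regenerates the same coding vector.
receiver_vector = coding_vector(seed, len(source))
```

The coded symbol has the same size as each source symbol. In this sketch, sending the explicit vector instead of the seed would cost one byte per source symbol, which is one of the tradeoffs raised in the general comments.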

To reconstruct the original data from coded symbols, a network node
collects a finite but sufficient number of degrees of freedom for
solving the system of linear equations. This is beneficial over
conventional approaches as the network node is no longer required to
gather each individual data symbol. 

<DO> Isn’t this misleading? No decent coding scheme requires you to “gather each individual data symbol”. I think you mean that the network node is not required to get specific data symbols; any combination with enough degrees of freedom will do.

<KF> Yes. I'll clarify that point.

In general, the network node
needs to collect slightly more independent coded symbols than there
are original data symbols, where the slight overhead arises because
coding coefficients are drawn at random, with a non-zero probability
that a coding vector is linearly dependent on another coding vector,
and that one coded symbol is linearly dependent on another coded
symbol. This overhead can be made arbitrarily small, provided that
the finite field used is sufficiently large.
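
The field-size claim in the quoted paragraph can be checked numerically. The sketch below assumes dense coding with coefficients drawn uniformly and independently from GF(q), zero included. Under that assumption, the probability that the first k received coding vectors of length k are linearly independent is the product of (1 - q^-i) for i = 1..k. The function name `full_rank_probability` is hypothetical.

```python
def full_rank_probability(k, q):
    """Probability that k uniformly random length-k vectors over GF(q)
    are linearly independent: prod_{i=1..k} (1 - q**-i)."""
    p = 1.0
    for i in range(1, k + 1):
        p *= 1.0 - float(q) ** -i
    return p

# Compare a few field sizes for a generation of 32 symbols.
for q in (2, 16, 256):
    print(q, round(full_rank_probability(32, q), 4))
```

Under these assumptions the probability is well below one half for the binary field and above 0.99 for GF(2^8), so the expected number of extra coded symbols shrinks as the field grows.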

A unique advantage of RLNC is the ability to re-encode or "recode"
without first decoding. Recoding can be performed jointly on
existing coded symbols, partially decoded symbols, or uncoded
systematic data symbols. This feature allows intermediate network
nodes to re-encode and generate new linear combinations on the fly,
thus increasing the likelihood of innovative transmissions to the

<DO> “Innovative”? I know that one can use the word this way but perhaps it would be better to say “increasing likelihood that the receiver obtains enough linearly independent symbols”

<KF> Most of us feel this should stay as it is established in RLNC literature ([SS09] has a good definition). I’ll make sure it is defined and referenced in the next version, though.

receiver. Recoded symbols and recoded coefficient vectors have the
same size as before and are indistinguishable from the original coded
symbols and coefficient vectors.
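
Recoding without decoding can be sketched as follows. This is an illustration under stated assumptions, not the draft's procedure: the field is GF(2^8) with polynomial 0x11B, a coded packet is modeled as a `(coding_vector, payload)` pair, and `gf_mul` and `recode` are hypothetical names.

```python
import random

def gf_mul(a, b):
    """Multiply two GF(2^8) elements (assumed reduction polynomial 0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def recode(packets, rng):
    """Combine buffered coded packets into one new coded packet.

    Each packet is (coding_vector, payload). The same random scalar is
    applied to a packet's coding vector and to its payload, so the output
    vector still describes the output payload in terms of the source symbols.
    """
    k = len(packets[0][0])
    size = len(packets[0][1])
    new_vector = [0] * k
    new_payload = bytearray(size)
    for vector, payload in packets:
        r = rng.randrange(256)  # fresh local coefficient for this input packet
        for j in range(k):
            new_vector[j] ^= gf_mul(r, vector[j])
        for i in range(size):
            new_payload[i] ^= gf_mul(r, payload[i])
    return new_vector, bytes(new_payload)
```

The output has the same vector length and payload size as its inputs, matching the statement that recoded symbols are indistinguishable from originally coded ones.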

In practical implementations of RLNC, the original source data are
often divided into multiple coding blocks or "generations" where
coding is performed over each individual generation to lower the
computational complexity of the encoding and decoding operations.
Alternatively, a convolutional approach can be used, where coding is
applied to overlapping spans of data symbols, possibly of different
spanning widths, viewed as a sliding coding window. In generation-
based RLNC, not all symbols within a single generation need to be
present for coding to start. Similarly, a sliding window can be
variable-sized, with more data symbols added to the coding window as
they arrive. Thus, innovative coded symbols can be generated as data
symbols arrive. This "on-the-fly" coding technique reduces coding
delays at transmit buffers, and together with rateless encoding
operations, enables the sender to start emitting coded packets as
soon as data is received from an upper layer in the protocol stack,
adapting to fluctuating incoming traffic flows. Injecting coded
symbols based on a dynamic transmission window also breaks the

<DO> s/breaks/reduces/

<KF> A block code with a fixed block size requires a fixed time to assemble and decode its symbols (a delay lower bound). Sliding windows with flexible window sizes can remove that limitation, hence the use of "breaks". So while I agree delay is "reduced", the lower bound is "removed". 

decoding delay lower bound imposed by traditional block codes and is
well suited for delay-sensitive applications and streaming protocols.

When coded symbols are transmitted through a communication network,
erasures may occur, depending on channel conditions and interactions
with underlying transport protocols. 

<DO> it may be worth saying that RLNC assumes that bit or burst errors, when they occur either on a communication channel, via memory corruption, or by security attacks, are converted into erasures by lower-layer error detection procedures.

<KF> Correct. We could simply say “RLNC operates on an erasure channel”, but I prefer your formulation. 
Check out this paper on correcting bit errors by combining RLNC with redundancy-check mechanisms:

https://ieeexplore.ieee.org/document/7742414 

RLNC can efficiently repair
such erasures, potentially improving protocol response to erasure
events to ensure reliability and throughput over the communication
network. For example, in a point-to-point connection, RLNC can

<DO> It might be better to say “two-party” rather than “point-to-point” as the latter tends to describe a channel rather than a communication relationship which might occur over a multi-hop, multi-path network. Similarly, as noted in general comments, avoiding the word “connection” by saying “for example, in two-party communication scenarios…”

<KF> Agreed. However, we feel “two-party” is not that common. We’re discussing “communication between two nodes” or “[RLNC] as an end to end code”. 

proactively compensate for packet erasures by generating Forward
Erasure Correcting (FEC) redundancy, especially when a packet erasure
probability can be estimated. As any number of coded symbols may be
generated from a set of data symbols, RLNC is naturally suited for
adapting to network conditions by adjusting redundancy dynamically to
fit the level of erasures, and by updating coding parameters during a
session. Alternatively, packet erasures may be repaired reactively
by using feedback requests from the receiver to the sender, or by a
combination of FEC and retransmission. RLNC simplifies state and
feedback management and coordination as only a desired number of
degrees of freedom needs to be communicated from the receiver to the
sender, instead of indications of the exact packets to be
retransmitted. The need to exchange packet arrival state information
is therefore greatly reduced in feedback operations.

<DO> in order to quantify “greatly”, an example with actual numbers would be useful.

<KF> Actually, we prefer to remove "greatly" and add a reference.
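
As a rough illustration of degree-of-freedom feedback, a receiver can report a single number: how many more independent coded symbols it needs. The sketch assumes the binary field GF(2) and coding vectors packed into Python integers, one bit per source symbol. The names `rank_gf2` and `missing_dof` are hypothetical.

```python
def rank_gf2(vectors):
    """Rank over GF(2) of coding vectors given as integer bitmasks."""
    basis = []
    for v in vectors:
        for b in basis:
            # XOR with b only if that clears b's leading bit in v.
            v = min(v, v ^ b)
        if v:
            basis.append(v)
    return len(basis)

def missing_dof(received_vectors, k):
    """Number of additional independent coded symbols needed to decode k symbols."""
    return k - rank_gf2(received_vectors)

# Four source symbols; three coded packets arrived, one of them redundant.
received = [0b1000, 0b0100, 0b1100]
feedback = missing_dof(received, 4)
```

Here the feedback is one small integer, whereas packet-level feedback would have to identify each missing packet.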

The advantages of RLNC in state and feedback management are apparent
in a multicast setting. In this one-to-many setup, uncorrelated
losses may occur, and any retransmitted data symbol is likely to
benefit only a single receiver. By comparison, a transmitted RLNC
coded symbol is likely to carry a new degree of freedom that may
correct different errors at different receivers simultaneously.
Similarly, RLNC offers advantages in coordinating multiple paths,
multiple sources, mesh networking and cooperation, and peer-to-peer
operations.

A more detailed introduction to network coding including RLNC is
provided in the books [MS11] and [HL08].

1.2. Generation-Based RLNC

This section describes a generation-based RLNC scheme.

[snip]

For any protocol that utilizes generation-based RLNC, a setup process
is necessary for establishing a connection and conveying coding
parameters from the sender to the receiver. 

<DO> I wouldn’t couch this as a “setup process”, nor use the word “connection” as noted in my general comment. Instead say something like “For any protocol that utilizes generation-based RLNC, the coding parameters used to create the coded symbols must be conveyed to the receiver(s) before decoding can occur. This can be done either as part of the packets containing the coded symbols, via separate packets sent prior to the packets with coded symbols in the same packet flow, or out-of-band using a separate protocol”

<KF> Agreed.

Such coding parameters
can include one or more of field size, code specifications, index of
the current generation being encoded at the sender, generation size,
code rate, and desired feedback frequency or probability.

<DO> this is nice, but just as with symbol representation, more guidance and hopefully pointers to one or more actual specifications (to be produced) would really make this more complete. Additionally, just as with the symbol representation, this is a good place to describe the options and tradeoffs for how the conveyance of coding parameters are embedded into a containing transport or application protocol. Section 2.1.2 covers only part of this - the part dealing with the interaction of coding parameters with the matrix size (and some MTU considerations which probably belong separately since they apply to both symbol size and coding parameters).

<KF> We don't think that giving complete coverage of how coding parameters are conveyed (whether in a dedicated section or a separate I-D) is realistic. We expect topology- and architecture-specific protocols to devise ways of setting the parameters they need. As stated above, a discussion of tradeoffs for some coding parameters is adequate. We can link to it here too.

Some
coding parameters are updated dynamically during the transmission

<DO> s/are/may be/ ?

<KF> OK.

process, reflecting the coding operations over sequences of
generations, and adjusting to channel conditions and resource
availability. For example, an outer header can be added to the
symbol representation specified in [Symbol-Representation] to
indicate the current generation encoded within the symbol
representation. Such information is essential for proper recoding
and decoding operations, but the exact design of the outer header is
outside the scope of the current document. 

<DO> Well, so is the actual symbol representation since we decided to split it off, so this needs a bit of a rewrite. Also, casting this as an “outer header” is only one option, as it may be prepended, interleaved, or appended when creating coded packets, sent separately, or sent out-of-band.

<KF> I think this addition makes sense. You're right about "out of scope" being a holdover from the previous version, so the part about “the exact design” is to be removed.

At the minimum, an outer
header should indicate the current generation, generation size,
symbol size, and field size. Section 2 provides a detailed

<DO> Might be more specific and say section 2.1.2 instead of just section 2.

<KF> Good point.

discussion of coding parameter considerations.

1.3. Sliding Window RLNC

This section describes a sliding-window RLNC scheme. Sliding window
RLNC was first described in [SS09].

In sliding-window RLNC, input data as received from an upper layer in
the protocol stack is segmented into equal-sized data symbols for
encoding. In some implementations, the sliding encoding window can
expand in size as new data packets arrive, until it is closed off by
an explicit instruction, such as a feedback message that re-initiates
the encoding window. 

<DO> I’m clearly missing something unless sliding window is only intended to work two party, and not with multicast. If that’s the case, say so. If multicast is supposed to work, we need some more description of how window closure is supposed to be managed across multiple receivers (especially in the presence of stragglers).

<KF> OK, I agree that it would be good to clearly state that sliding window schemes may be implemented with or without feedback, in end-to-end, multipath, multicast, and even cooperative mesh (coding) topologies. Explaining how sliding window feedback is managed in complex topologies is one of those topics that may fall "outside the scope" of a general background document, though. (See https://web.mit.edu/medard/www/papers2011/SpeedingMulticast.pdf for an interesting take on multicast feedback).

In some implementations, the size of the
sliding encoding window is upper bounded by some parameter, fixed or
dynamically determined by online behavior such as packet loss or
congestion estimation. Figure 3 below provides an example of a
systematic finite sliding window code with rate 2/3.

[snip]

For any protocol that utilizes sliding-window RLNC, a setup process
is necessary for establishing a connection and conveying coding
parameters from the sender to the receiver. Such coding parameters
can include one or more of field size, code specifications, symbol
ordering, encoding window position, encoding window size, code rate,
and desired feedback frequency or probability.

<DO> Repeat same comments I made above in the generational RLNC material.

<KF> OK. 

Some coding
parameters can also be updated dynamically during the transmission
process in accordance to channel conditions and resource
availability. For example, an outer header can be added to the
symbol representation specified in [Symbol-Representation] to
indicate an encoding window position, as a starting index for current
data symbols being encoded within the symbol representation. Again,
such information is essential for proper recoding and decoding
operations, but the exact design of the outer header is outside the
scope of the current document. At the minimum, an outer header
should indicate the current encoding window position, encoding window
size, symbol size, and field size. Section 2 provides a detailed
discussion of coding parameter considerations.

<DO> Ditto - repeat same comments I made above in the generational RLNC material.

<KF> We're removing "outside the scope" and specifying the subsection.

Once a connection is established, RLNC coded packets comprising one
or more coded symbols are transmitted from the sender to the
receiver. 

<DO> Ditto - repeat same comments I made above in the generational RLNC material with respect to the word “connection”.

<KF> Thanks. We're removing "connection".

The sender can transmit in either a systematic or coded

<DO> coding cognoscenti will know what you mean here, but somewhere (probably earlier in the general RLNC Intro), you should have a few sentences about how RLNC can work as either a systematic or non-systematic code. It comes out of the blue here (and interestingly wasn’t brought up in the generational RLNC material).

<KF> Good point. We can add an explanation of "systematic" and "coded" transmission earlier in the document and make sure the generational RLNC material refers to that distinction. 

fashion, with or without receiver feedback. In progressive decoding
of RLNC coded symbols, the notion of "seen" packets can be utilized
to provide degree of freedom feedbacks. Seen packets are those
packet that have contributed to a received coded packet, where
generally the oldest such packet that has yet to be declared seen is
declared as seen [SS09].

<DO> [SS09] talks about TCP. What about other protocols? I’m mostly reacting to the word “oldest” in the above text, which might mean different things in different protocols, particularly ones that do not rigorously enforce in-order delivery like TCP does.

<KF> Good point. I would do away with the second part of the sentence, as the first part defines "seen" clearly, is sufficient for feedback purposes, and is not related to in-order delivery. 

[snip]

2.	Practical Considerations

This is an open section describing various practical considerations
such as standardization approaches and implementation-related topics.

2.1. Symbol Representation

This sub-section argues for the specification of symbol
representation as a starting point for network coding standardization

<DO> s/as a starting point for/as one critical element of/

<KF> I think it’s both, as it's a common requirement for otherwise diverse protocols. However, that point is made below, so we’ll go with “critical element”.  

and provides relevant coding parameter design considerations.

2.1.1. Symbol Representation as a Standardization Approach

Symbol representation specifies the format of the symbol-carrying
data unit that is to be coded, recoded, and decoded. In other words,
symbol representation defines the format of the coding-layer data
unit, including header format and symbol concatenation.

Network Coding has fundamentally different requirements from
conventional point-to-point codes. 

<DO> I find this as just asking for somebody to argue. Reword? Maybe “Network coding has a multi-dimensional structure in terms of coding field size, symbol size, dynamic modification, etc. This leads to requirements for a highly reconfigurable symbol set.”

<KF> Point taken.

Network coding owes its distinct
requirements to its dynamic structure, leading to a highly
reconfigurable symbol set. For example:

o Coefficient Location: RLNC's encoding, recoding, and decoding
process requires coefficients and payload to go through identical
coding operations. 

<DO> I found this a bit confusing - why do the coefficients go through “identical coding operations”?

<KF> Once the coding is performed and the coefficients are in place, we run the same linear operations on the data structure containing the coefficients and coded symbols in order to carry out recoding and decoding. Theoretically, coefficient location information is not needed before decoding (i.e., not needed at intermediate nodes). 

 

  These operations are independent from the
  location of the coefficients.  As a consequence, coefficient
  location is flexible.  While some designs cluster coefficients
  together, other designs may distribute them throughout the payload
  in a manner that is specific to a given protocol.  [SS09]

o Number of coefficients: RLNC is designed to allow coding and
recoding even when the number of input symbols is dynamic, leading
to varying code density. As a consequence, the number of
coefficients and source data symbols need not be fixed.

o Payload Size: Although an identical size of symbols is desirable
when performing coding operations, padding and fragmentation are
viable not only at the source but also throughout the network, as
illustrated in the example of Figure 5. This allows flexibility
in the payload size.

o Field: Although the finite field is typically a fixed system
variable, 

<DO> saying “system” here begs the question of what exactly is the “system”. Maybe say “fixed for a given application instance and protocol embedding”?

<KF> Agreed.

 

  this is not necessarily the case.  Network coding need
  not specify a single field for all payload components, as
  different symbols may belong to different fields (e.g., packet
  concatenation).  This feature does not necessarily complicate
  coding, since finite field operations defined in a given field are
  typically valid in multiple other fields.
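
To illustrate the coefficient-location point in the list above: the following is a minimal sketch, not taken from the draft, of recoding over GF(2). The packet layouts, the `combine` helper, and the `order` permutation are assumptions made for illustration only. It shows that the same linear operation is applied to every position of a coded packet, so the result does not depend on where the coefficients sit.

```python
def combine(packets, local_coeffs):
    """Linear combination over GF(2): XOR the packets whose local coefficient is 1.

    The same operation is applied to every position, whether that position
    holds a coding coefficient or a payload bit.
    """
    out = [0] * len(packets[0])
    for c, pkt in zip(local_coeffs, packets):
        if c:
            out = [a ^ b for a, b in zip(out, pkt)]
    return out


# Two 4-bit source symbols, generation size 2 (illustrative values).
s1 = [1, 0, 1, 1]
s2 = [0, 1, 1, 0]

# Layout A (assumed): coefficients clustered in front, [c1, c2 | payload].
p1 = [1, 0] + s1                                  # carries s1
p2 = [1, 1] + [a ^ b for a, b in zip(s1, s2)]     # carries s1 + s2

recoded = combine([p1, p2], [1, 1])               # recoding at an intermediate node

# Layout B (assumed): the same fields spread through the packet.
order = [2, 0, 3, 4, 1, 5]


def spread(pkt):
    """Reorder packet positions according to the hypothetical layout B."""
    return [pkt[i] for i in order]


recoded_spread = combine([spread(p1), spread(p2)], [1, 1])

print(recoded)                             # coefficients [0, 1] followed by s2
print(recoded_spread == spread(recoded))   # recoding commutes with the layout
```

The recoding node never needs to know which positions are coefficients. Only a node that wants to decode, or to inspect the coding vector, needs the layout.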

[snip]

Useful symbol representations should include provisions for the major
coding functions that are relevant to the application, such as
recoding, feedback, or inter-session network coding. For example,
recoding requires the coefficients to be accessible at the
intermediate recoding nodes. Hence, architectures and protocols
requiring recoding must specify coefficient location.

<DO> Here’s a particular place (recoding by intermediaries) where interaction with crypto may come into play, not necessarily to discuss here, but maybe to forward-point to some security considerations. It seems the privacy issues around what in the end-to-end communication is exposed to intermediaries need to be spelled out.

<KF> Agreed. There are a number of contributions on the implications of exposing coefficients and coded payload at intermediate nodes (and allowing said nodes to modify them). Here's an example: https://www.mit.edu/~medard/mpapers/2010-jsac.pdf 

We will discuss an adequate update to the security section. 

[snip]

The absence of information on coefficient location has important
implications. One such implication is that any additional coding
needs to be carried out within a new coding layer, potentially
leading to higher computational and transport overheads.

<DO> use of the term “layer” here can be confusing - you could be referring to “layer” in the sense protocol people do as a protocol encapsulation, or “layer” as coding people do in the sense of “layered coders”.

<KF> How about the following? "One such implication is that any additional coding needs to be carried out within an encapsulating (outer) coding layer, potentially leading to higher computational and transport overheads."

The elements discussed above demonstrate that the design choices
related to symbol representation have a direct impact on the
viability of protocols, topologies, and architecture. The importance
of symbol representation is illustrated in Figure 5, where the term
"architecture" includes coding architecture (e.g., generation or
sliding window), the layer placement of coding operations, and coding
objectives (e.g., erasure correction, multisourcing, etc.).

                +---------------+
                |Architecture   |
                |               |     Symbol
                |               |     Representation
                |               |
    +-------------------+       |          ^
    |Topology   |       |       |          |
    |           |  +-------------------+   |
    |           |  |----|       |      |   |
    |           |  |----| <----------------+
    |           |  |----|       |      |
    |           +---------------+      |
    |              |    |              |
    +-------------------+              |
                   |                   |
                   |           Protocol|
                   +-------------------+

Figure 5: The specification of symbol representation has major
implications on system architecture, topology, and protocol.

Since symbol representation has implications on core design elements,
it is expected that coding implementations that share protocol,
architecture, and topology elements 

<DO> I found this pretty confusing. What is the “topology” impact? In the case of topology are you saying I need to know the network topology in order to choose a good coding symbol representation? Or conversely that if I choose the “wrong” symbol representation coding won’t work well in my specific topology? The diagram above didn’t help me…

<KF> It's both. The diagram above is a simple illustration of the fact that symbol representation matters to the three indicated areas. In the case of topology, for example, the absence of access to coefficients means that we can't recode, hence no collaborative mesh (recoding) protocol or coded mesh topology. In Figure 5, topology refers to the coding topologies (e.g., E2E, multicast, multipath, mesh) accessible using a given symbol representation. We will clarify that point.

are likely to reuse the same
symbol representation. For example, implementations with security
requirements can reuse a common symbol representation that hides
coefficient locations.

<DO> sorry… I’m having trouble seeing the connection between hiding coefficient locations and security requirements. This would seem to be saying you can hide coefficient locations and that makes your security considerations different? More basically, how do you “hide” the coefficient locations without using crypto in the first place?

<KF> It is more akin to scrambling. Hiding here means not sharing with intermediate nodes, so that only a node with access to coefficient location (e.g., destination node) can actually decode. We’re discussing a reformulation, also in the context of an updated security section.

Another example can be found in [Symbol-Representation], which
specifies symbol representation designs for generation-based and
sliding window RLNC implementations. These designs introduce highly
reusable formats that concatenate multiple symbols and associate them
with a single symbol representation header.

2.1.2. Coding Parameter Design Considerations

[snip]

The generation size or coding window size is a tradeoff between the
strength of the code and the computational complexity of performing
the coding operations. With a larger generation/window size, fewer
generations or coding windows are needed to enclose a data message of
a given size, thus reducing protocol overhead for coordinating
individual generations or coding windows. In addition, a larger
generation/window size increases the likelihood that a received coded
symbol is innovative with respect to previously received symbols,
thus amortizing retransmission or FEC overheads. Conversely, when
coding coefficients are attached, larger generation/window sizes also
lead to larger overheads per packet. The generation/window size to
be used can be signaled between the sender and receiver when the
connection is first established.

<DO> s/when the connection/when communication/

<KF> OK
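
To make the generation/window size tradeoff above concrete, here is a rough back-of-the-envelope sketch. It assumes a dense coding vector with one coefficient per symbol attached to every packet; the function names and the example message, symbol, and field sizes are hypothetical and not taken from the draft.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def coefficient_overhead_bytes(generation_size, field_bits):
    """Bytes per packet for an attached dense coding vector
    (assumption: one coefficient of field_bits bits per symbol)."""
    return ceil_div(generation_size * field_bits, 8)


def generations_needed(message_bytes, generation_size, symbol_bytes):
    """Number of generations required to enclose the whole message."""
    return ceil_div(message_bytes, generation_size * symbol_bytes)


# Illustrative numbers: 1 MB message, 1000-byte symbols, GF(2^8).
for g in (16, 128):
    print(
        g,
        generations_needed(1_000_000, g, 1000),
        coefficient_overhead_bytes(g, 8),
    )
```

With these assumed numbers, moving from 16 to 128 symbols per generation cuts the number of generations to coordinate from 63 to 8, while the attached coefficient vector grows from 16 to 128 bytes per packet.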

Lastly, to successfully decode RLNC coded symbols, sufficient degrees
of freedom are required at the decoder. The maximum number of
redundant symbols that can be transmitted is therefore limited by the
number of linearly independent coding coefficient vectors that can be
supported by the system. For example, if coding vectors are
constructed using a pseudo-random generator, the maximum number of
redundant symbols that can be transmitted is limited by the number of
available generator states. [RFC5445]
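
As a small illustration of the generator-state limit described above, the sketch below derives each coding vector from a seed. The seed width, generation size, and use of Python's `random` module are assumptions for illustration; they stand in for whatever pseudo-random generator a real protocol would specify.

```python
import random

SEED_BITS = 4          # hypothetical seed width carried with each coded symbol
GENERATION_SIZE = 8    # symbols per generation (illustrative)
FIELD_SIZE = 256       # GF(2^8), one byte per coefficient


def coding_vector(seed):
    """Regenerate the coding vector from its seed (stand-in generator)."""
    rng = random.Random(seed)
    return tuple(rng.randrange(FIELD_SIZE) for _ in range(GENERATION_SIZE))


# Enumerate every seed: the sender can never produce more distinct coding
# vectors than there are seeds, no matter how large the field is.
distinct = {coding_vector(seed) for seed in range(2 ** SEED_BITS)}
print(len(distinct), "distinct coding vectors from", 2 ** SEED_BITS, "seeds")
```

The number of distinct coded symbols is bounded by the number of seeds, which in turn bounds how many redundant symbols can usefully be sent for one generation.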

3. Security Considerations

This document does not present new security considerations.

<DO> New compared to what?

<KF> See point above. We are discussing an updated security section for the next iteration of this document.

[snip]

DaveO