Re: [icnrg] Comments on draft-irtf-icnrg-ccninfo-02

Hitoshi Asaeda <asaeda@ieee.org> Tue, 20 August 2019 08:32 UTC

From: Hitoshi Asaeda <asaeda@ieee.org>
In-Reply-To: <FDD9172E-D6C5-4A4C-B8B1-C7787BCA7D4B@orandom.net>
Date: Tue, 20 Aug 2019 17:32:36 +0900
Cc: ICNRG <icnrg@irtf.org>
Content-Transfer-Encoding: quoted-printable
Message-Id: <F9139187-D018-4105-B816-217AE45D342C@ieee.org>
References: <FDD9172E-D6C5-4A4C-B8B1-C7787BCA7D4B@orandom.net>
To: "David R. Oran" <daveoran@orandom.net>
X-Mailer: Apple Mail (2.3445.104.11)
Archived-At: <https://mailarchive.ietf.org/arch/msg/icnrg/d5DXk4DHr4Odgc9h42xO_XUIRp4>
Subject: Re: [icnrg] Comments on draft-irtf-icnrg-ccninfo-02

Hi Dave,

Thank you very much for your review (and sorry for the delay).

> On Aug 2, 2019, at 23:16, David R. Oran <daveoran@orandom.net> wrote:
> 
> I did a full re-read of the latest CCNInfo draft. These comments are with <Chair Hat off>. I’ll be sending a separate message with my <chair hat on> comments shortly.
> 
> General comments:
> 
> 	• The specification is a lot more complete and easy to understand. It’s already implemented (by the authors), and I suspect another implementer could get pretty close to an interoperable implementation from the material in the current version.
> 
> 	• The design seems fundamentally solid. It’s not the design I would have chosen, but given this is for research purposes and heading toward experimental status rather than a standard, I think progressing this approach is fine. We can experiment with it (and other approaches) to learn more about how to instrument and manage networks based on CCNx. There is of course other complementary work in ICNRG (e.g. ping and traceroute) so researchers would be encouraged to do both qualitative and quantitative evaluation of the tradeoffs in the various approaches. There is also work under submission to the ICN’19 conference that takes yet another approach so soon we may have a rich set of things to compare and contrast.

I agree.

> 	• The forwarding model is pretty much separate from the Interest/Data forwarding procedures. I understand the authors’ motivation for this, but it does mean a lot of extra code in all the forwarders in order to have appropriate coverage. It also raises a number of possibly tricky resource allocation issues, since CCNInfo will be competing for both memory (i.e. PIT) and link bandwidth with Interest/Data while having somewhat different dynamic characteristics. For example, the Request messages will be large compared to regular Interests (especially so if they are signed), and the set of state stored in the PIT somewhat different, meaning you either have a separate data structure or a more complex joint PIT data structure.

I don't deny that CCNinfo imposes additional implementation costs on forwarders. One (big) advantage is that we have already implemented CCNinfo, as specified in the current draft, in Cefore, which is open source and can be consulted by developers. I hope it helps with implementations in other forwarders.
In addition, the current CCNinfo specification allows null values to be returned for several fields in the Reply sub-block, such as First/Last Seqnum or Elapsed Cache Time. (Section 3.2.1.1 says these values MAY be null.) This means that a forwarder can not only hide these values for privacy/security policy reasons, but can also skip implementing the complex functions needed to report them.

> 	• It was somewhat difficult for me to get a holistic grasp of the security properties of the CCNInfo Request/Reply protocol. For example:
> 
> 		• Requests can be signed by the initiator, but the request blocks are not individually signed. This seems to result in a fully transitive trust model upstream toward the FHR (CCNInfo terminology identifying the forwarder directly attached to a producer). Is this a problem? I’m not sure.

CCNinfo inherits the regular CCN approach of signing messages only with publisher/consumer keys. CCNinfo additionally provides lightweight access control using node IDs. Apart from this access control, the current CCNinfo does not provide additional security mechanisms of its own.

Requiring every router to verify and authenticate all of the request blocks is a heavy burden. However, if people prefer that CCNinfo verify whether request/reply messages are sent from adjacent routers, I can add that requirement as a SHOULD (or MAY? MUST is too strong) in the revision.
Note that HopAuth, which we have proposed in a separate draft, is a general mechanism for verifying and authenticating CCN messages along the forwarding path (as well as consumers/publishers). CCNinfo can cooperate with such an external security mechanism. I think mentioning such cooperation is another option.

What do you think?

> 		• Conversely, recording the identity of every upstream forwarder may represent a privacy problem. Again, the tradeoffs are difficult to assess, since there is the ability in the protocol for information hiding, and administrative controls at domain boundaries. However, the anonymity characteristics of Interest handling (the further you are away from the consumer, the harder it is to localize the consumer in the topology, discover the identity, or establish linkability) are compromised. Does this matter? I’m not sure.

Thank you for your comments.

Regarding "router/forwarder" privacy, Section 10.1 says:
"according to the policy configuration, the Node Identifier field in the Report block MAY be null (i.e., all-zeros), but the Request Arrival Time field SHOULD NOT be null."
This means that CCNinfo allows forwarders to hide their Node IDs (e.g., node names, IP addresses) if they wish. However, if a forwarder sets its Node ID to null, its upstream routers cannot recognize or verify the forwarder by its Node ID. This is a dilemma between authentication and privacy (or anonymity).
Note that, as described in Section 10.2, consumers (i.e., CCNinfo users) are currently not allowed to hide their Node IDs. This means that the Node Identifier field in the Request block MUST NOT be null.
Section 10.3 describes "topology" privacy and Section 10.4 describes "content (or publisher)" privacy. However, I agree that all of these sections should be improved.
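
The null-field rules quoted above from Sections 10.1 and 10.2 can be summarized as a small check. This is only a sketch: the function names, the byte-string representation of the fields, and the choice to treat an empty field as null are assumptions made for illustration, not something taken from the draft or from Cefore.

```python
def is_null(field: bytes) -> bool:
    """Assumption: a field counts as null when it is empty or all-zeros."""
    return all(b == 0 for b in field)


def check_request_block(node_id: bytes) -> list:
    """Request block: the consumer's Node Identifier MUST NOT be null."""
    if is_null(node_id):
        return ["MUST NOT: Request block Node Identifier is null"]
    return []


def check_report_block(node_id: bytes, arrival_time: bytes) -> list:
    """Report block: Node Identifier MAY be null, Request Arrival Time SHOULD NOT be."""
    problems = []
    # A null Node Identifier is permitted by policy, so node_id is never flagged here.
    if is_null(arrival_time):
        problems.append("SHOULD NOT: Report block Request Arrival Time is null")
    return problems
```

A forwarder that hides its Node ID but still reports its arrival time passes the Report block check, which is exactly the authentication vs. privacy trade-off described above: the report is valid, but the upstream router has no identifier to verify.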

As I said above, I can add a requirement to verify messages from adjacent routers, but I believe we don't need to define other CCNinfo-specific security mechanisms in this document, as the expected threats are not specific to CCNinfo but reside in CCN itself.
What do you think?

> 		• On a similar note, it appears the individual reply blocks in Reply messages are not signed so again CCNInfo appears to rely on transitive trust to protect the returned information. Even if they were to be signed, deciding whether the right key was used is not at all clear (this is related to another comment later on the question of how forwarders are named/identified).

Right, as discussed above, this is currently addressed by neither regular CCN nor CCNinfo. Should I add a requirement or method for message verification? Or can we rely on some other mechanism, e.g., HopAuth?

> 	• Some of the statistics reported in CCNInfo are likely to be expensive to compute, and possibly quite algorithmically difficult to implement efficiently in high-performance implementations that shard the PIT & CS. It isn’t clear whether these ought to be done by CCNInfo or obtained instead by a direct application-layer management protocol talking to the management application in a forwarder node. It may be worth scrubbing these to only return basic information about an individual Data Object in a cache, repo or producer, and only return a return handle to use (via a separate protocol) to obtain the details for groupings like prefixes.

As I replied above, a CCNinfo implementation is not obliged to implement several complex functions, e.g., those that report values such as First/Last Seqnum or Elapsed Cache Time in the Reply sub-block. These values can be reported as null.
The current draft may not state clearly that implementing the complex functions can be omitted or simplified in order to prioritize higher throughput.
We'll add some clarification in the revision. Thanks.

> 	• CCNInfo creates a non-CCNx namespace for naming/identifying forwarders. Why? This seems to me both fraught with management complexity (allocation, duplication, security, etc.), at odds with the naming architecture of CCNx (or NDN) and frankly unnecessary. If instead CCNinfo used the existing namespace structure to name forwarders, things would (at least in my opinion) be much better:

I'm not sure I understand your question.
The draft mentions the possibility of using an IP address to identify a node, but this is not required at all. The current CCNinfo can use any identifier, say a node name, for a forwarder/publisher/consumer.
Or are you saying we should not allow IP addresses to be used for node identification for some reason? I would prefer fewer restrictions, though.

> 		• name allocation for forwarders is no different from that for any other data producer
> 		• by providing these names in CCNInfo Replies to the initiator, you now have a direct and easy coupling to a more comprehensive management protocol driven by regular interest/data exchanges (or for more sophisticated uses, by RICE).
> 		• security becomes a lot better integrated, as you can use all the existing trust schema work to decide what to sign, what to encrypt, and using what keys.

I'm sorry, but I don't fully understand your points. Are they addressed if CCNinfo uses a "node name" in the request/reply messages? Or are you saying something else?

> 	• It appears flow balance is seriously violated by the discovery form of CCNInfo request/reply exchanges, as a single request can generate multiple replies (in pathological cases, an exponential number of replies since Request aggregation is turned off explicitly). We really need to deal with this. Some possible alternatives:
> 
> 		• quarantine replies at intermediate forwarders and combine them; returning only one “aggregate” reply on each downstream link.

We previously considered reply aggregation, but aggregating multiple replies requires forwarders to hold replies for some period and merge them into a single message. The concern is that intermediate forwarders would have to handle the message aggregation, which is an additional complex task and delays the reply. In addition, some replies may be lost or delayed for various reasons, and lost or delayed replies cannot be recovered in any case.
There are papers (mainly on gathering sensor data within IoT networks) that adopt in-network CCN/NDN message aggregation, but I couldn't find a perfect answer for in-network aggregation. Do you think we should mention some (maybe optional) mechanism for message aggregation in the revision?

> 		• Explicitly bound the number of replies as well as their size and communicate this in the protocol so the resource allocators can account for this in congestion management.
> Here are some more detailed comments on things that I found in reading the specification:
> 
> 	• Report blocks can get pretty big. The recommendation is to just give up and not report anything more if your reply would exceed an IPv6 MTU. Why? CCNx supports Data objects as large as 64K and current NDN restricts to 4K. Admittedly this has all sort of “interesting” fragmentation and congestion control implications (which are the subject of a new I.D. I plan to submit soon…but I digress). Given that there is no exclusion capability, nor steering capability it’s not possible to get beyond the “horizon” created by this limitation. I’m not advocating necessarily adding either of these, however the small MTU for replies does seem to be problematic for a network management tool.

Do you mean a big "request" or a big "reply"? In the current spec, CCNinfo returns an error for a fragmented "request" message, while it does not prohibit a fragmented "reply".
Honestly speaking, I have forgotten the precise reason why I prohibited fragmented "request" messages.
I guess I thought the "request" message must not be fragmented because the "reply" message includes all request blocks of the original request message, which would cause additional fragmentation on its way back. But I'm not sure.

Anyway, since CCNinfo does not rely on an underlay, say IP networks, and I cannot remember the precise reason, we can remove the fragmentation restriction for all CCNinfo messages (request/reply), along with the NO_SPACE error code, in the revision. What do you think?

> 	• In describing the information returned from caches, it wasn’t entirely clear if the these sum all the things in the cache matching a given prefix or something else. If they are in fact prefix-matched, this really needs to be done in the background and not in the forwarding path given the expense of traversing the cache data structures. (Folks will recall we got rid of prefix-match on data in CCNx for exactly this reason, and NDN recently changed to prefer exact match as well). Once you do this it may be appropriate to ask why not just layer these capabilities of CCNInfo as an application (as does the NIST work for NDN) rather than bake it into the basic network layer packet forwarding. I don’t necessarily advocate this; in fact I would prefer to simplify CCNInfo to make it feasible to implement at high performance (see my general comment earlier about only returning simple stuff in CCNInfo and doing the complex stuff on top).

The exact name includes the chunk or segment number of the content, such as ccn:/abc/xyz/Chunk=10. CCNinfo supports exact match for chunk-level tracing, e.g., ccn:/abc/xyz/Chunk=10. In addition, it supports prefix match, but the request can "only" omit the chunk or segment number, e.g., ccn:/abc/xyz/. CCNinfo does not trace using a FIB-style search that finds the longest prefix match.

For content retrieval, you may want to allow only exact match (i.e., the chunk/sequence number must be specified). However, for discovering content/cache location and status, I thought we could allow the chunk number to be omitted, for usability.
I have a question: what do you think about making prefix matching optional? I still believe prefix matching is highly beneficial, especially for research. How about changing the last paragraph of Section 3.2.1.1 to the following:
  "CCNinfo allows a user to specify an exact name of content (such as
   "ccn:/news/today/Chunk=10"). It is OPTIONAL to support a request
   with a content prefix name, which omits a chunk or segment number
   (such as "ccn:/news/today"). When a CCNinfo user specifies an exact
   name, s/he will obtain information only about the specified content
   object in the content forwarder. When a CCNinfo user specifies a
   prefix name, s/he will obtain the summary information of the matched
   content objects in the content forwarder."
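
To make the two matching modes concrete, here is a minimal sketch of how a cache lookup could distinguish them. It assumes names are plain strings in the "ccn:/.../Chunk=N" form used in the examples above; real CCNx names are TLV-encoded, and the helper names below are hypothetical, not from the draft or Cefore.

```python
import re
from typing import List, Optional, Tuple

# Assumption: the chunk number is the final name component, written "Chunk=N".
CHUNK_RE = re.compile(r"^(?P<base>.*)/Chunk=(?P<num>\d+)$")


def split_name(name: str) -> Tuple[str, Optional[int]]:
    """Split a name into (content name, chunk number or None)."""
    name = name.rstrip("/")
    m = CHUNK_RE.match(name)
    if m:
        return m.group("base"), int(m.group("num"))
    return name, None


def lookup(cache: List[str], request: str) -> List[str]:
    """Return the cached names that a request name matches."""
    req_base, req_chunk = split_name(request)
    hits = []
    for name in cache:
        base, chunk = split_name(name)
        if base != req_base:
            continue  # no FIB-style search: the content name itself must be identical
        if req_chunk is None or chunk == req_chunk:
            hits.append(name)  # prefix request matches every chunk, exact request only one
    return hits
```

With a cache holding ccn:/abc/xyz/Chunk=9, ccn:/abc/xyz/Chunk=10 and ccn:/abc/xyz/extra/Chunk=1, a request for ccn:/abc/xyz/Chunk=10 returns one object, while a request for ccn:/abc/xyz/ returns the first two and ignores the third, because only the chunk component may be omitted.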

Regarding the next comment: application layer vs. network layer is a fundamental discussion, and both have pros and cons. Previously, we developed an application-layer CCN measurement tool and published an article ("Contrace: A Tool for Measuring and Tracing Content-Centric Networks", IEEE ComMag, Mar. 2015), which is ref [6] in the draft. CCNinfo inherits the major concepts of Contrace, yet we chose to make it a network-layer measurement tool. In short, the reasons are as follows. First, a standardized approach embedded in the standardized protocol is advantageous for deployment and interoperability. Second, type value reservation is better than name reservation or port number reservation for this kind of tool, because sequentially assigned type values are more manageable and because, conceptually, CCN forwards data under any name and does not rely on TCP/UDP port numbers.

Regarding the last comment, I agree that measurement should be high performance (or at least minimize the negative performance impact), and hence this draft allows implementations to skip several complex (or heavy-duty) functions.

> 	• I didn’t see use of the existing T_MTU_TOO_LARGE error when you run out of space (page 11). Did I miss it?

Instead of MTU_TOO_LARGE, we defined the NO_SPACE error code. But as I said above, we can completely remove this fragmentation error and its code in the revision, if you agree.

> 	• Is 16 bits enough entropy for RequestID (page 12)? it may be ok if you don’t do any aggregation across consumers (which you currently don’t) and lifetimes are bounded very small as you currently do (4 seconds). It still makes me nervous though.

CCNinfo is a tool for discovering the path information toward the caching forwarder/publisher along the FIB, and the cache status in the caching forwarder. It could be used by researchers and operators to measure and understand CCN conditions. I don't expect CCNinfo requests/replies to occur as frequently as ordinary interest/data exchanges. Hence, IMO, a 16-bit Request ID is enough for now. But if people agree on enlarging it, I can do so in the revision. The default Reply timeout for full discovery is now 3 seconds.
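
As a rough way to reason about whether 16 bits is enough, the following sketch computes the birthday-style collision probability for a given number of requests outstanding at the same time. It assumes Request IDs are drawn uniformly and independently at random, which the draft may or may not require; the function name is illustrative only.

```python
def collision_probability(outstanding: int, id_bits: int = 16) -> float:
    """Probability that at least two of `outstanding` random IDs are equal."""
    space = 1 << id_bits
    if outstanding > space:
        return 1.0  # pigeonhole: more requests than distinct IDs
    p_unique = 1.0
    for i in range(outstanding):
        # The (i+1)-th ID must avoid the i IDs already in use.
        p_unique *= (space - i) / space
    return 1.0 - p_unique
```

Under this assumption, the collision probability reaches about one half at around 300 requests outstanding within the same lifetime window. With infrequent requests and a short timeout the number outstanding at once should stay far below that, which is consistent with the expectation above.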

> 	• On page 14 you RECOMMEND routers have synchronized clocks. This is too strong in my opinion, for three reasons:
> 		• you are throwing away the low order 16 bits of the NTP timestamp anyway

We truncate the lower 16 bits of the 32-bit fraction part of a second. This method (see the formula on page 14) is used by standard protocols, e.g., Mtrace2 (RFC 8487, ref [8]). More precisely, it truncates at most about 15 microseconds. We don't need microsecond-level precision for RTT measurement.
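
For illustration, the truncation can be sketched as follows. This assumes the 32-bit timestamp is the middle 32 bits of the 64-bit NTP timestamp (low 16 bits of the seconds, high 16 bits of the fraction), as RFC 8487 describes for Mtrace2; the function and constant names are made up for this example and are not from the draft.

```python
# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_OFFSET = 2_208_988_800

# Resolution left after dropping the low 16 fraction bits, in microseconds.
RESOLUTION_US = 1_000_000 / 65536


def to_ntp32(unix_sec: int, nsec: int) -> int:
    """Return the middle 32 bits of the 64-bit NTP timestamp."""
    ntp_sec = unix_sec + NTP_UNIX_OFFSET
    # High 16 bits of the 32-bit fraction; the low 16 bits are truncated away.
    frac16 = (nsec << 16) // 1_000_000_000
    return ((ntp_sec & 0xFFFF) << 16) | frac16
```

One tick of the resulting 16-bit fraction is 1/65536 of a second, roughly 15.26 microseconds, so two events closer together than that can receive the same timestamp. The seconds part wraps every 65536 seconds (about 18 hours), which is harmless for RTT differences.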

> 		• loose synchronization is sufficient for the kinds of uses I think you want from CCNInfo.

To measure one-way latency or end-to-end RTT, time synchronization among routers is not needed. For per-hop RTT measurement, however, time synchronization is required.
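
The difference can be sketched in a few lines. This assumes 32-bit timestamps in 16.16 fixed-point seconds, as in the truncated NTP format discussed above; the function names are hypothetical and the code is not taken from any implementation.

```python
from typing import List


def ntp32_diff_seconds(later: int, earlier: int) -> float:
    """Signed difference of two 16.16 fixed-point timestamps, tolerant of wraparound."""
    d = (later - earlier) & 0xFFFFFFFF
    if d >= 0x80000000:
        d -= 1 << 32  # interpret as a negative difference
    return d / 65536.0


def end_to_end_rtt(sent: int, received: int) -> float:
    """Both timestamps come from the initiator's own clock, so any offset cancels."""
    return ntp32_diff_seconds(received, sent)


def per_hop_delays(arrival_times: List[int]) -> List[float]:
    """Each timestamp comes from a different router's clock.

    Any offset between neighbouring clocks is added directly to the result,
    which is why per-hop values are only meaningful with synchronized clocks.
    """
    return [ntp32_diff_seconds(b, a) for a, b in zip(arrival_times, arrival_times[1:])]
```

If a downstream router's clock runs behind its neighbour's by more than the real hop delay, the computed per-hop value even becomes negative, while the end-to-end RTT measured at the initiator is unaffected.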

> 		• If you expect CCNInfo to give you accurate per-hop and total RTT estimation using these clocks this isn’t terribly helpful given that the whole protocol runs on a different forwarding model, so you can’t use the CCNInfo RTT measurements to say much of anything about what real Interest/Data exchanges will experience. The alternative Ping and Traceroute proposals should do a much better job of this.

As I said above, time synchronization is not mandatory for total (end-to-end) RTT measurement.
I don't reject the alternative approaches, but it is very useful for CCNinfo to have this functionality. If the word "RECOMMEND" used in the current draft is too strong for you/people, I suggest changing the statement to the following (it uses MUST, but reads as optional overall because of the "if").
  "CCNinfo measures one-way latency and end-to-end RTT; however, if one
   wants to measure per-hop RTT as well, all the routers on the path MUST
   have synchronized clocks."
Is this statement acceptable?

> 	• Is the count of “received interests” (page 18) all those received or only those satisfied?
> 
> 	• I didn’t see much value in the Reply block information on First/Last Seqnum or Elapsed Cache time. For these and some others it might help to give some examples of how one might use this.

First/Last Seqnum are values for roughly estimating how consecutive the in-network cached chunks are. They give a hint for better cache allocation in the network. You may be interested in the paper, "Consecutive Caching and Adaptive Retrieval for In-Network Big Data Sharing," Proc. IEEE ICC, May 2018. In the revision, we will add some text about these values and their usage, and explain the situation with that reference. BTW, returning these values is a "MAY", so some CCN implementations can omit reporting them (by filling them with null).
Elapsed Cache Time is used to design cache algorithms. We will explain this value a bit more, too. Note that this value is also allowed to (MAY) be null.

> 	• in Section 4.2 you say you have to compare the number of report blocks with the hop limit. I think this means you have to remember the received hop limit in the PIT entry for the request, but I didn’t see that in the list of state you have to keep.

We mentioned it in Section 4.1:
  "CCNinfo user's program MUST keep the following information; Request
   ID and Flags specified in the Request block, Node Identifier and
   Request Arrival Time specified in the Report block, and HopLimit
   specified in the fixed header."

> Minor stuff:
> 
> 	• move the actual packet type and TLV allocations to the IANA considerations section, and mark the values as “TBS” rather than picking them (yes…this might force you to change the implementation, but them’s the rules for RFCs…).

In my experience with IANA sections in I-Ds, we can point to the relevant sections with statements such as "Initial values for the TLV Types are given in the table at the beginning of Section 3.2" in the IANA section, without moving the statements or tables themselves. In fact, I prefer this style since it is much more readable.
As for the values, OK, I will mark them as TBS. (What's TBS? Not TBA?)

> 	• page 23 s/the fill discovery request/the full discovery request/

Will be done in the revision, thanks.

Thank you very much for your careful review.
It'd be great to hear your follow-up thoughts.

Best regards,

Hitoshi



> [End of comments]
> 
> DaveO