Re: [cso] NSQuery BOF

Young Lee <ylee@huawei.com> Wed, 15 September 2010 13:21 UTC

Date: Wed, 15 Sep 2010 08:22:21 -0500
From: Young Lee <ylee@huawei.com>
In-reply-to: <691ACCAACCD44CAAB381F5D10164D573@23FX1C1>
To: 'David Harrington' <ietfdbh@comcast.net>, 'Hares Susan' <shares@huawei.com>
Cc: 'IAB' <iab@iab.org>, iesg@ietf.org, cso@ietf.org
Subject: Re: [cso] NSQuery BOF
Precedence: list

Dave,

Thanks for your comments on the NSQuery BOF. Please see our responses
in-line, and let us know if you need further clarification.

Best Regards,
Young

-----Original Message-----
From: David Harrington [mailto:ietfdbh@comcast.net] 
Sent: Monday, September 13, 2010 4:27 PM
To: 'Hares Susan'; ylee@huawei.com
Cc: iesg@ietf.org; 'IAB'
Subject: NSQuery BOF

Hi,

A few comments on this BOF proposal.

1) it would be nice to have this posted in a form that is easy to access and
read.
I don't know what is special about this pdf format, but the link doesn't
resolve nicely on my computer;
http://trac.tools.ietf.org/bof/trac/attachment/wiki/WikiStart/NS_query%20Charter%20v2%20_2_.pdf
yields
"HTML preview not available, since no preview renderer could handle it. Try
downloading the file instead. "

I had to download the document rather than read it online.

The BOF Chairs don't have email addresses on the BOF wiki page. 

>> We will provide text or html format of the charter document shortly to 
>> Ron. 

>> We will also update with BOF chairs' email addresses 

Is there an archive of mailing lists discussions?

>> Yes, there is an archive of mailing lists discussions:
>> http://www.ietf.org/mail-archive/web/cso/current/maillist.html

2) The document mentions interim meetings. Since "interim meeting" is an
official IETF designation, it would be good if this document did not
overload the term. IETF official terms should not be used for
non-IETF/pre-WG events.

>> We will revise the charter description to avoid the term, "interim 
>> meeting." 

3) I would expect it might be difficult to survey the existing protocols
before getting the problem statement written. This type of charter item is
usually for a charter where the conclusion of such a survey is already
decided. The fact that the description takes the approach of "... survey of
existing protocols, and requirements document in order to verify with the
Network Management and OPS IETF community that no mechanisms exist." Verify
that none exist certainly sounds like a decision has been made.

>> Our initial survey suggests that no mechanisms exist. However,
>> the first thing to validate in a BOF or a WG is whether our initial
>> investigation is correct.
>> 
>> If there are mechanisms to "re-use", then we should "re-use" these
>> instead of creating new mechanisms.  We see this as being a good
>> IETF citizen, and planning for it in the process seems prudent.
>>
>> Is there a reason we should not take this prudent step?  

 
4) Interested Parties: are those parties that have committed to editing and
reviewing the work, or are these parties the WG proponents hope to interest?
The IESG and IAB will really want to know who has committed to doing work,
not just who attended a barBOF.

>> Interested parties are those that have committed to editing and reviewing
>> the work. Based on private conversations after a Bar-BOF in Maastricht, we
>> have strong support from three carriers (Verizon, Telefonica, NTT). They
>> are committed to writing requirements, application scenarios, etc. We also
>> have support from vendors and research institutes/academia: Huawei,
>> Ericsson, ETRI, and Univ. of Texas for now.


5) There are lots of unexpanded acronyms in this document, which makes it
hard to assess what is being proposed. It also gives the feeling that some
work has already been done and is now being brought to IETF to be blessed.
That typically doesn't go over well in IETF. The addition of "potential
collaborations" with other organizations really heightens this impression.
An implementation report promised for November 2011 feels premature
unless a spec is already defined. Is this work that has already been done in
other organizations? or is it proprietary technology that somebody is
proposing be standardized?

>> We will revise the charter document to expand all the acronyms.
>> There is no proprietary technology that we are aware of. 
>> We will reevaluate the date for the implementation report. 
>> Our implementation report date is based on commitments to do
>> prototyping, and early tests on this protocol. 

6) PCP/PEP are terminology that was used during COPS-PR development, and
COPS-PR left a very bad taste in the mouths of some OPS leaders.
Recent attempts to add extensions to COPS-PR were rejected by IETF, and
recent inclusion of COPS-PR in a network management survey doc led to a
firestorm of protest and calls that COPS-PR be declared Historic.
Where do these terms come from?

>> When we used the terms PCP/PEP, we meant to avoid reinventing
>> the existing policy work in the IETF.
>> 
>> If you have a politically correct term for the same functionality 
>> practiced in networks, we will be glad to switch to this term.
>> 

7) "(as shown in diagram 1)" - where is diagram 1? Obviously, this must have
been taken out of a larger document, one that probably defines all these
acronyms and has diagrams as well. See point#5.

>> We will clean that up.
>> We appreciate your careful read to detect this error. 

8) "includes transport to physical layer" - The IETF typically works in
layers 3 through 7. Other SDOs work in layers below 3 and higher than 7. Why
is this work appropriate for the IETF to take on?

>> When we say "network stratum", we mean the underlying data transport
>> layers (L1/2/3) that carry the application data. Network conditions such
>> as latency, congestion, loss, bandwidth, etc. are all affected by these
>> components. 

>> CCAMP is an example that addresses control and measurements of MPLS/GMPLS
>> based networks and its scope extends to packet (L3), MPLS (L2.5),
>> SDH/OTN/Lambda (L1/2). 
>>
>> If you have another "architectural" term for this grouping of layers, 
>> which is called "Network Stratum", we would be glad to use it.  
>> 

9) a protocol that "will allow the application to do a whole network query
of information at the network-stratum so that synchronized monitoring would
be enabled across application and network." A whole network query? does this
include the whole global Internet? in one query?

>> A whole network query means a query of an Administrative Domain
>> (RFC 1136), that is, a single AS or multiple ASes under the
>> control of a single administrative entity (business or equivalent).
>>
>> This is not the whole Internet - it is the area under a Carrier's domain
>> such as Verizon, AT&T, FT, DT, etc.
>> Or it could be under a grouping of collaborating Carriers -- but that
>> would take a policy that makes these an Administrative domain. 
>>
>> As few people use the clear language of RFC 1136's Administrative
>> Domains, this may need more explanation. 
>> Please let us know if we can answer in more detail. 


10) "across many boxes across multiple domains"? I think you're going to
need to present enough information to make people believe this proposed
solution is somehow going to be scalable. I don't see anything here that
discusses scalability, or how this is in any way likely to be scalable.

>> We will begin with a single Administrative Domain first
>> and then extend to multiple Administrative Domains.
>> 
>> We are not querying all the information in a node, but 
>> just a set of information at a particular time.
>> For example, to get data flows - you could query 10 variables
>> + network topologies across all the nodes in a Carrier network 
>> to know you have bandwidth to add a new video server.
>>
>> The use within a single Administrative Domain (i.e., a carrier such
>> as Telefonica, Verizon, or NTT) requires
>> the query to go across multiple ASes in a non-proprietary query.
>> 
>> Carriers wish to have NM that is multi-vendor, multi-domain within
>> their networks. 
>>

11) I don't see any discussion of security across multiple domains.
Typically network management information is not shared across domains, since
that information is considered important for competitive advantage, and
contains details about a providers network architecture that they don't want
others to know about (security by obscurity).

>> This is an issue we need to address clearly. 
>> 
>> We will address the security issue across multiple domains
>> within a single Administrative domain (many ASes same Carrier,
>> or multiple ASes same Data-Center entity),
>> and multiple Administrative domains (across multiple carriers,
>> or across multiple data centers (Google, Yahoo), or
>> any combination).
>> 
>> Security of a network query has the sense of a user + capabilities.
>> For example, who can query and who can consume the information. 
>> A single entity may query but multiple people may consume the
>> information.
>>   
>> All Carriers or Data Center providers feel security is critical.  
>> 
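
For illustration only, the "user + capabilities" idea above (who can query
versus who can consume the results) could be sketched as two separate
permission sets. This is a minimal sketch, not anything defined in the
charter; the QueryPolicy class and the principal names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class QueryPolicy:
    """Separate 'may issue a query' from 'may read results' (hypothetical)."""
    may_query: set = field(default_factory=set)    # principals allowed to query
    may_consume: set = field(default_factory=set)  # principals allowed to read

    def can_query(self, user: str) -> bool:
        return user in self.may_query

    def can_consume(self, user: str) -> bool:
        return user in self.may_consume


# A single entity may query, while several parties may consume the answer.
policy = QueryPolicy(
    may_query={"cso-gateway"},
    may_consume={"video-app", "capacity-planner"},
)
```

Keeping the two sets apart reflects the point that one entity may issue the
query while several others are allowed to read what comes back.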

12) I have some concern about "application protocols will use this network
protocol mechanism to manage the network". My impression is network
operators want to manage the network, and don't want to delegate that
responsibility to individual applications. 

>> We will clarify this further in the document.
>> 
>> The application is simply trying to manage itself based on
>> network information from the whole network.
>>
>> The application needs a minimum level of network
>> information to be able to perform server selection in the "Application
>> Stratum."
>> 

13) How does this compare to the decade WG and the alto WG, which also focus
on trying to coordinate between optimization of the underlying physical
resources and optimization at the application layer?

>> ALTO focuses on collecting some network data from the Internet/ISP,
>> primarily in a P2P context.
>> 
>> Decade WG focuses on resource allocation of L3 networks. Here our focus
>> is not resource allocation in networks. 
>>
>> This NSQuery wants to query information at a specific time from
>> a group of nodes across a whole administrative domain (see above). 
>> 
>> Of course, we want to re-use existing work. 
>> After our initial survey (which included ALTO and DECADE), we feel
>> the whole network query is not available from either.
>> 

14) "Currently Network Management only queries network elements at a
particular layer" seems incorrect. I can easily use SNMP to simultaneously
query MIB modules that reflect instrumentation at multiple layers. One SNMP
GET can carry varbinds from multiple MIB modules.

Netconf is designed to query a complete configuration, or a subset of a
configuration. While the IETF doesn't have multiple data models
standardized, vendors can easily include information from multiple layers in
the same XML response.

>> We agree you can use an SNMP overlay of multiple 
>> to query multiple network elements.  However, SNMP query of 500 nodes
>> for data at a particular time requires 500 connections or 
>> traps.
>>
>> Instead we propose a single query that says "Give me a set of
>> information (topology, latency, bandwidth, etc.) at a specific time."
>> This proposal is like the security passwords in OSPF which take effect
>> at a specific time.  

>> A single query could say query 50 nodes (11:00 GMT 9/10/10) 
>> to send information regarding these 10 parameters + 
>> network topology to node X. 
>>
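
For illustration only, the single timed query described above could be
modelled as one message. This is a minimal sketch: the NetworkQuery class,
its field names, and the node and parameter names are hypothetical and do
not come from the charter. Reading "9/10/10" as 10 September 2010 is also
an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class NetworkQuery:
    """One request covering many nodes (hypothetical structure)."""
    nodes: tuple            # identifiers of the nodes to be sampled
    parameters: tuple       # names of the values each node reports
    sample_at: datetime     # instant at which every node takes its sample
    include_topology: bool  # also ask for network topology
    report_to: str          # where the answers are sent ("node X")

    def expected_values(self) -> int:
        """Total number of parameter values the query should yield."""
        return len(self.nodes) * len(self.parameters)


query = NetworkQuery(
    nodes=tuple(f"node-{i}" for i in range(1, 51)),        # 50 nodes
    parameters=tuple(f"param-{i}" for i in range(1, 11)),  # 10 parameters
    sample_at=datetime(2010, 9, 10, 11, 0, tzinfo=timezone.utc),
    include_topology=True,
    report_to="node-X",
)
print(query.expected_values())  # 50 nodes x 10 parameters = 500
```

The point of the sketch is that the node list, the parameter list, and the
sampling time travel together in one request rather than in one request
per node.
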
>> This query goes from the application to Network CSO Gateway. Once
>> the network CSO gateway receives the query, it may employ existing
>> mechanisms (SNMP queries) or a new mechanism (NETCONF).

>> The current netconf and netmod charters do not provide a place
>> to contribute this work. This work is focused on querying 
>> the whole network for a set of data at the same time. 
>> 


15) How will a protocol query "across many boxes across multiple domains"?
Is this a multicast protocol? or is it a unicast protocol where the queries
are coordinated by an application? SNMP is often used to look at the
packets-in/packets-out stats for connected interfaces, and an application
coordinates the results. Netconf is designed to use locks and rollback to
support configuring multiple boxes at the same time.

>> Again our initial focus is on a single Carrier domain. We envision this
>> query protocol to be a unicast protocol where the application queries
>> the network about several conditions that may affect its decisions in
>> choosing its application sources to end-users or between Data Centers for
>> load balancing purposes, etc.

>>
>> The key point here is that the application speaks to one set of entities
>> in the network (Network CSO controller (in our document)). 
>>
>> This Network CSO Controller uses a query management to query 
>> the network.  Some portions may only support SNMP and/or netconf
>> with lots of instrumentation overlay.
>> 

>> Our focus is to trigger queries to the network stratum so that the
>> application would be able to make application decisions that are aware
>> of various network conditions.

>> The Network CSO Controller is in charge of either using existing means
>> (SNMP queries or traps) or utilizing a new mechanism (NETCONF/YANG or
>> another new mechanism) to provide whole-network data.
>>
>> The key part of this is to enable the application to request the NCG
>> (Network CSO Gateway) to start this query for the whole network.
>> 

  
How is this proposal different than those existing approaches to network
management of many boxes?

>> Network management approaches do not have a single query that
>> asks for a synchronized query across multiple nodes.

>> The point is that the application nominates a single set of servers:
>> Network CSO Gateways to do the query. These Network CSO Gateways return
>> the answer to the application. 

>> This simplifies the application/network interface. The Network CSO
>> Gateways can then operate using existing protocols or new protocols
>> to accomplish this query. (no flag day).
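
For illustration only, the gateway arrangement described above (the
application talks to one gateway, which then uses whatever protocol each
node supports) could be sketched as a simple dispatcher. This is a minimal
sketch under stated assumptions: the Gateway class and both backend
functions are hypothetical stand-ins that return placeholder values, no
real SNMP or NETCONF library is used, and the synchronized sampling time is
not modelled.

```python
from typing import Callable, Dict, List

# A backend takes (node, parameters) and returns {parameter: value}.
Backend = Callable[[str, List[str]], Dict[str, float]]


def existing_backend(node: str, parameters: List[str]) -> Dict[str, float]:
    """Stand-in for an existing mechanism; returns placeholder values."""
    return {p: 0.0 for p in parameters}


def new_backend(node: str, parameters: List[str]) -> Dict[str, float]:
    """Stand-in for a newer mechanism; returns placeholder values."""
    return {p: 1.0 for p in parameters}


class Gateway:
    """Single entry point that hides per-node protocol differences."""

    def __init__(self, backends: Dict[str, Backend], default: Backend):
        self.backends = backends  # per-node override
        self.default = default    # used for every other node

    def query(self, nodes: List[str], parameters: List[str]):
        results = {}
        for node in nodes:
            backend = self.backends.get(node, self.default)
            results[node] = backend(node, parameters)
        return results


gateway = Gateway({"node-2": new_backend}, default=existing_backend)
answer = gateway.query(["node-1", "node-2"], ["latency", "bandwidth"])
```

Because the application only sees the gateway, individual nodes can move
from one backend to another without any change on the application side,
which is the "no flag day" property mentioned above.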



16) Much of this proposal talks about querying. But the initial paragraph
talks about optimizing where services should be instantiated. Will this
protocol also do the instantiation of services, or just the query?

>> This proposal for BOF comes as part of a larger body of work
>> doing cross-stratum optimization so that applications
>> can manage themselves. One of the ways modern applications can manage
>> themselves is to instantiate their servers at a particular
>> physical location.

>> The instantiation of services by the application is not included 
>> in this charter's work. 


17) "Synchronized full-network network by application monitoring is critical
to make the application placement work effectively." I heard similar
concerns in the decade WG, where the proponents wanted the application layer
to control the placement of blocks or files of data.

Storage protocols, such as NFS and SCSI, already have mechanisms to
coordinate and optimize distributed storage. The problem with delegating
such optimization to the application is that a single application doesn't
see what all the other applications are doing. 

>> We are not delegating whole-network optimization to the application.
>> The application is simply managing itself using information learned from
>> the network stratum. 


18) How will a protocol that queries "across many boxes across multiple
domains" address congestion control?

>> Dave, MPLS OAM and TE are two management protocols that go
>> operationally across multiple domains/ASes to tune pipes for 
>> transmission.

>> The same is true at application layer for applications going across 
>> multiple ASes for video transmission. 


David Harrington
Director, IETF Transport Area
ietfdbh@comcast.net (preferred for ietf) dbharrington@huaweisymantec.com
+1 603 828 1401 (cell)
