Re: [IPFIX] [Sender: ipfix-bounces@ietf.org] Re: Semantic and structured data

Paul Aitken <paitken@cisco.com> Wed, 17 March 2010 13:22 UTC

Message-ID: <4BA0D6E4.6030509@cisco.com>
Date: Wed, 17 Mar 2010 13:19:32 +0000
From: Paul Aitken <paitken@cisco.com>
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-GB; rv:1.8.1.22) Gecko/20090605 SeaMonkey/1.1.17 (Ubuntu-1.1.17+nobinonly-0ubuntu0.9.04.1)
MIME-Version: 1.0
To: Gerhard Muenz <muenz@net.in.tum.de>
References: <4AF73525.8050009@net.in.tum.de> <4AF8F999.3000207@cisco.com> <F60CA342-F488-4179-8AB8-079D32D26BCD@tik.ee.ethz.ch> <4B9E33BE.7070907@cisco.com> <7A1F2B11-B407-4BAF-8481-F867CCDEF5FC@tik.ee.ethz.ch> <4B9F520E.8060507@cisco.com> <DB6E59D9-B373-4919-BB58-00EB26014564@tik.ee.ethz.ch> <4BA0C26D.9070901@cisco.com> <4BA0D177.1070506@net.in.tum.de>
In-Reply-To: <4BA0D177.1070506@net.in.tum.de>
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
Content-Transfer-Encoding: 7bit
Cc: ipfix@ietf.org
Subject: Re: [IPFIX] [Sender: ipfix-bounces@ietf.org] Re: Semantic and structured data

Gerhard,

> Some general thoughts from my side:
> 
> - I appreciate that you want to add a basic notion of semantic to the
>   structured data.
> 
> - Up to now, semantic was not in the protocol but in the info model.
> 
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> |0|               Field ID    |       Element Length            |
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> | Semantic  |             BasicList Content ...                 |
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> |                           ...                                 |
> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> 
>   Can't we encode Semantic in an IE?
>   Then, we could do without a new IANA registry.

We'd still need a (sub)registry to allow the IE to be extensible.
eg, see the "MPLS label type" registry.


>   And: This IE could be used for other purposes (e.g., in Templates)

How do you see that being used?


> - I'm not sure whether the bitvector encoding has an advantage over an
>   enumeration type.

A bitvector would allow multiple semantics at once, but most combinations
would not be useful, e.g. "AND OR". Only "NOT" + "second semantic" seems useful.
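
A rough sketch of that trade-off in Python. All names and numeric values
here are made up for illustration; nothing comes from a draft or an IANA
registry. With an enumeration, every useful combination needs its own code
point; with a bitvector, combinations come for free but many are meaningless
and have to be rejected.

```python
from enum import IntEnum, IntFlag

# Hypothetical code points, for illustration only.
class SemanticEnum(IntEnum):
    NONE = 0
    OR = 1
    AND = 2
    NOT_OR = 3    # each useful combination gets its own value
    NOT_AND = 4

# Hypothetical bit assignments, for illustration only.
class SemanticBits(IntFlag):
    OR = 0x1
    AND = 0x2
    NOT = 0x4

# Assumption: only a single semantic, or NOT plus one semantic, is useful.
USEFUL = {
    SemanticBits.OR,
    SemanticBits.AND,
    SemanticBits.NOT | SemanticBits.OR,
    SemanticBits.NOT | SemanticBits.AND,
}

def is_useful(bits: SemanticBits) -> bool:
    """Return True only for the combinations listed in USEFUL."""
    return bits in USEFUL

# Every non-zero bit pattern the three flags can form.
all_combos = [SemanticBits(v) for v in range(1, 8)]
useless = [c for c in all_combos if not is_useful(c)]
```

The `useless` list collects patterns such as "AND OR" that a receiver would
have to treat as errors; an enumeration cannot express them at all.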


> Unfortunately, I do not have the time to participate in a deep 
> discussion - I'm busy with IPFIX-CONFIG and other stuff.

Thanks for taking the time.

P.


> Benoit Claise wrote:
>> Hi Brian,
>>> Hi, Benoit,
>>>
>>> Replies inline...
>>>
>>> On Mar 16, 2010, at 2:40 AM, Benoit Claise wrote:
>>>
>>>   
>>>> Hi Brian,
>>>>
>>>> Thanks for your feedback.
>>>>     
>>>>> hi, Benoit, all...
>>>>>
>>>>> Agreed that we should do something about this (i.e., that solution 
>>>>> 1 is no solution.); that said, a few comments in no particular order:
>>>>>
>>>>> 1. In considering adding explicit semantics to structured data, we 
>>>>> as a WG are taking on the task of defining semantics for IPFIX as a 
>>>>> whole. Semantics, as I understand them, are in IPFIX largely 
>>>>> contextual and template dependent, but in almost all cases I can 
>>>>> think of it seems like these are implicitly "AND" semantics (this 
>>>>> flow has source IP A _AND_ destination IP B _AND_...).
>>>>>
>>>>>        
>>>> Agreed.
>>>>     
>>>>> We will need to make an explicit statement on this.
>>>>>        
>>>> Not sure why.
>>>>     
>>>>> We will need to determine whether these implicit semantics are a 
>>>>> property of the protocol (in which case Structured Data is really a 
>>>>> protocol-level extension), or a property of each information 
>>>>> element (in which case all 5103 IEs have implicit AND semantics; 
>>>>> question 2: will we need to add semantics to the IANA registry in 
>>>>> this case?). We will need to be quite careful about this. It's not 
>>>>> as simple as defining semantics within structured data then calling 
>>>>> it done.
>>>>>
>>>>>
>>>>>        
>>>> I'm not sure why each individual IE should have a semantic ... in 
>>>> the IANA registry.
>>>>      
>>> I'm not either. On reflection it seems like overkill. I'm just saying 
>>> that if we, as a WG, are moving from stating that semantics are 
>>> explicitly out of scope to defining them as in-scope, we need to have 
>>> a consistent approach, and answers to all the questions that arise 
>>> when we consider moving the protocol from a simple framing mechanism 
>>> to a framing mechanism with some logic behind it, so we answer them 
>>> once, so that all future efforts having to do with semantics are 
>>> consistent. This, I acknowledge, is an argument in favor of solution 
>>> 3...)
>>>
>>> One very simple new question that arises here, to illustrate my 
>>> point: Is it legal to export a record that has sourceIPAddress X AND 
>>> NOT sourceIPAddress X?
>>>    
>>  From a protocol point of view, yes
>>  From a semantic point of view, I don't see a use case for that.
>> Now, this question is no different from: with RFC5101, is it legal 
>> to export a record that has two instances of sourceIPAddress?
>>>   
>>>> As you wrote, "in almost all cases I can think of it seems like 
>>>> these are implicitly "AND" semantics", so the case we try to solve 
>>>> is when there are multiple instances of a single IE. We know that 
>>>> RFC 5101 foresaw that case " The Collector MUST support the use of 
>>>> Templates containing multiple occurrences of the similar Information 
>>>> Elements", but the idea is not to change RFC5101.  If we put some 
>>>> semantic in the IPFIX structured data, that would solve the vast 
>>>> majority of our cases. Also, we could say: if you want some semantic 
>>>> when exporting multiple IEs, then you SHOULD use IPFIX structured data.
>>>>
>>>> Also, the default value for the semantic field in the IPFIX 
>>>> structured data SHOULD be "NONE", to express that the flow record 
>>>> doesn't include any semantic... like in RFC5101. You might draw 
>>>> your own conclusion, maybe because you know your network, maybe 
>>>> because you have configured the exporter, but then it's your decision.
>>>> The way I see the proposed solution is: in IPFIX structured data, 
>>>> you MAY use the semantic field as a way to express the relationship 
>>>> between IEs within the structure.
>>>>     
>>>>> 2. I don't consider draft-sommer-ipfix-mediator-ext-01 a valid 
>>>>> argument against Solution 2, as it's trying to solve a somewhat 
>>>>> different and more limited problem than structured data. Solution 2 
>>>>> _might_ cause a problem for this draft, but certainly not the other 
>>>>> way around (unless we as a WG want to subsume that work into this 
>>>>> draft, which would probably require rechartering...). Also, I don't 
>>>>> think we need semantics for all of the list types, just basicList 
>>>>> (Illustrative question: How do I meaningfully interpret two "or" 
>>>>> subTemplateMultiLists with disjoint IE sets? Two "and" 
>>>>> subTemplateMultiLists? Nestings thereof?)... So, we don't really 
>>>>> have an explosion to deal with if we do Solution 2 correctly: 
>>>>> andBasicList, orBasicList, xorBasicList, notBasicList; Four IEs, 
>>>>> nestable, done. What can't we do with those four IEs? In this case, 
>>>>> we could even step back and say that semantics outside these four 
>>>>> are within the protocol explicitly _undefined_, and to be 
>>>>> interpreted within the context of each Template.
>>>>>
>>>>>
>>>>>        
>>>> If I translate some more what I wrote "While it's solved the router 
>>>> and most mediation function needs today", this would be "Me, myself, 
>>>> and I don't need more than OR and AND now ;-)".
>>>> However, who am I to tell that others don't need it now... and that 
>>>> the logical solution is to use the IPFIX structured data
>>>> Furthermore, if we think a little bit longer term, the next big step 
>>>> in IPFIX is the mediation function. In my company, every feature 
>>>> wants to export its own data with NetFlow/IPFIX... up to the point 
>>>> where a CPE would not have enough bandwidth across the WAN to export 
>>>> all the "management" information. So we'll need more and more of 
>>>> aggregated flow records (both in time and space) even in the router. 
>>>> Again, the logical solution will be to use the IPFIX structured 
>>>> data. At this point, we will most probably need something other than 
>>>> OR and AND, e.g. RANGE, ORDERED, etc.
>>>>
>>>> As an example of subTemplateMultiLists with disjoint IE sets, let's 
>>>> imagine that you have to export an aggregated observation point, 
>>>> composed of multiple template records:
>>>>      template 1: exporterIPaddress
>>>>      template 2: exporterIPaddress, basicList of interfaces
>>>>      template 3: exporterIPaddress, LC
>>>>      
>>> and then you'd want to OR these... Okay, makes sense...
>>>
>>>   
>>>>     
>>>>> 3. If we really _do_ want ranges and so on (which, again, we'd need 
>>>>> to get WG consensus on; this is explicitly out of scope in my 
>>>>> reading of the present charter), then we could do them in the scope 
>>>>> of Solution 3.
>>>>>        
>>>> Btw, acknowledging that one day we will have to solve this is good 
>>>> enough for solution 3. I mean, we don't have to populate the 
>>>> semantic IANA now... even if that would be more efficient.
>>>>     
>>>>> However, this seems a little not-quite-fleshed-out-enough for me to 
>>>>> say whether I like it or not. Could you present an example of how 
>>>>> you would use the proposed Semantic field to model your example? 
>>>>> "(eth1 OR eth2) AND (NOT (eth3 OR eth4)) OR linecard2"
>>>>>
>>>>> (FWIW, I would do this with my proposal to Solution 2 as follows:)
>>>>>
>>>>> (orBasicList (andBasicList (orBasicList ingressInterface eth1 eth2) 
>>>>> (notBasicList (orBasicList ingressInterface eth3 eth4))) 
>>>>> (andBasicList lineCardID 2))
>>>>>
>>>>>
>>>>>        
>>>> (basicList, OR, (basicList, AND, (basicList, OR, eth1, eth2), 
>>>> (basicList, NOR, eth3, eth4)), (basicList, NONE, linecard2))
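
The nested-list form of the example can be read as a small expression tree.
A minimal Python sketch, under stated assumptions: each list is modelled as
a tuple (semantic, item, ...), the function name `matches` is hypothetical,
and NONE is evaluated like OR so that a one-element list means "that
element". The thread itself does not define how NONE should be evaluated.

```python
# Each list is (semantic, item, item, ...); an item is either a name
# (string) or another nested list. This models the quoted example only.
def matches(node, present):
    """Evaluate a nested semantic list against a set of observed names."""
    if isinstance(node, str):
        return node in present
    semantic, *items = node
    results = [matches(item, present) for item in items]
    if semantic in ("OR", "NONE"):   # assumption: NONE behaves like OR here
        return any(results)
    if semantic == "AND":
        return all(results)
    if semantic == "NOR":
        return not any(results)
    raise ValueError(f"unknown semantic: {semantic}")

# (eth1 OR eth2) AND (NOT (eth3 OR eth4)) OR linecard2
expr = ("OR",
        ("AND", ("OR", "eth1", "eth2"), ("NOR", "eth3", "eth4")),
        ("NONE", "linecard2"))
```

For instance, `matches(expr, {"eth1"})` holds, while adding "eth3" to the
set makes the AND branch fail.
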
>>>>      
>>> Hm. Okay. This makes sense. A couple of questions then, about 
>>> solution 3:
>>>
>>> 1. Would we want to try and bitfield this, to define the semantics we 
>>> _know_ we have, then leave the rest of it reserved? Something like:
>>>
>>> MSb                      LSb
>>> +---+------+---------------+
>>> | ! | multi| reserved      |
>>> +---+------+---------------+
>>>
>>> ! = negate sense of semantics if 1 (this is the NOT flag)
>>> multi (multiplicity) = 00: undefined, 01: or/oneOrMore, 10: 
>>> xor/exactlyOne, 11: and/exactlyAll
>>> reserved = place for adding other bells and whistles like RANGE and 
>>> so on in the future.
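
The bitfield layout quoted above could be packed and unpacked roughly as
follows. This is a sketch only: the field widths (1 bit for the NOT flag,
2 bits for multi, 5 reserved bits) are inferred from the two-bit multi codes
and the "32 possible extensions" mentioned in the thread, and the function
names are made up.

```python
# Assumed layout, MSb first: bit 7 = NOT flag, bits 6-5 = multi,
# bits 4-0 = reserved.
MULTI = {0b00: "undefined", 0b01: "or", 0b10: "xor", 0b11: "and"}

def encode_semantic(negate: bool, multi: int, reserved: int = 0) -> int:
    """Pack the three fields into one octet."""
    if multi not in MULTI:
        raise ValueError("multi must fit in 2 bits")
    if not 0 <= reserved < 32:
        raise ValueError("reserved must fit in 5 bits")
    return (int(negate) << 7) | (multi << 5) | reserved

def decode_semantic(octet: int):
    """Unpack one octet into (negate, multiplicity name, reserved)."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("expected a single octet")
    negate = bool(octet >> 7)
    multi = (octet >> 5) & 0b11
    reserved = octet & 0b11111
    return negate, MULTI[multi], reserved
```

Under this layout, "NOT or" is `encode_semantic(True, 0b01)` and plain
"and" is `encode_semantic(False, 0b11)`.
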
>>>    
>> Maybe we want to start by asking the question: which semantic do we 
>> need now?
>> There is a need for NONE, OR, AND, ORDERED for now. Anything else?
>>> 2. Are we sure we want to do this in one byte? If we do bitfielding 
>>> as above this gives us 32 possible extensions, which seems like _way_ 
>>> more than enough, but does stick another odd offset in there, which 
>>> slows things down on machines that need aligned access. Probably one 
>>> byte is okay and we let the implementation use paddingOctets and set 
>>> padding to fix this...
>>>    
>> With structured data, is the alignment still important? When I look at 
>> the examples throughout the draft... As you wrote, paddingOctets might 
>> be the solution in this case.
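
To make the alignment point concrete: a one-byte semantic field shifts what
follows to an odd offset, and padding restores alignment. A small Python
sketch of the arithmetic; the function name and the default 4-octet
alignment are assumptions, not something the thread specifies.

```python
def padding_needed(offset: int, alignment: int = 4) -> int:
    """Number of padding octets needed to reach the next aligned offset."""
    if offset < 0 or alignment <= 0:
        raise ValueError("offset must be >= 0 and alignment > 0")
    return (-offset) % alignment
```

For example, content that would start at offset 5 needs three padding
octets to reach a 4-octet boundary, and none if it already starts at 8.
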
>>> 3. Would it make sense maybe to have two sets of structured data 
>>> elements, one with the semantics byte, and one without (which is then 
>>> explicitly undefined)? Then exporters who don't need it don't have to 
>>> bother sticking an extra odd-aligned zero in the stream for every list.
>>>
>>> I'm sure I'll have more questions, but none come to mind now... But it 
>>> seems like we're converging on the least-unnecessarily-complex 
>>> solution here, which is good. :)
>>>    
>> Happy about that. ;-)
>> Note: I was envisioning something even simpler: one byte, 
>> containing all the semantic possibilities, administered by IANA.  So 
>> no reserved, no !
>>
>> Regards, Benoit.
> 
> 


-- 
Paul Aitken
Cisco Systems Ltd, Edinburgh, Scotland.
http://www.cisco.com/web/about/doing_business/legal/cri/index.html