Re: [IPFIX] Semantic and structured data

Benoit Claise <bclaise@cisco.com> Tue, 16 March 2010 09:40 UTC

Message-ID: <4B9F520E.8060507@cisco.com>
Date: Tue, 16 Mar 2010 10:40:30 +0100
From: Benoit Claise <bclaise@cisco.com>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3
MIME-Version: 1.0
To: Brian Trammell <trammell@tik.ee.ethz.ch>
References: <4AF73525.8050009@net.in.tum.de> <4AF8F999.3000207@cisco.com> <F60CA342-F488-4179-8AB8-079D32D26BCD@tik.ee.ethz.ch> <4B9E33BE.7070907@cisco.com> <7A1F2B11-B407-4BAF-8481-F867CCDEF5FC@tik.ee.ethz.ch>
In-Reply-To: <7A1F2B11-B407-4BAF-8481-F867CCDEF5FC@tik.ee.ethz.ch>
Content-Type: multipart/alternative; boundary="------------040705010903090204000402"
Cc: ipfix@ietf.org
Subject: Re: [IPFIX] Semantic and structured data

Hi Brian,

Thanks for your feedback.
> hi, Benoit, all...
>
> Agreed that we should do something about this (i.e., that solution 1 is no solution.); that said, a few comments in no particular order:
>
> 1. In considering adding explicit semantics to structured data, we as a WG are taking on the task of defining semantics for IPFIX as a whole. Semantics, as I understand them, are in IPFIX largely contextual and template dependent, but in almost all cases I can think of it seems like these are implicitly "AND" semantics (this flow has source IP A _AND_ destination IP B _AND_...).
Agreed.
> We will need to make an explicit statement on this.
Not sure why.
> We will need to determine whether these implicit semantics are a property of the protocol (in which case Structured Data is really a protocol-level extension), or a property of each information element (in which case all 5103 IEs have implicit AND semantics; question 2: will we need to add semantics to the IANA registry in this case?). We will need to be quite careful about this. It's not as simple as defining semantics within structured data then calling it done.
>    
I'm not sure why each individual IE should have a semantic in the
IANA registry.
As you wrote, "in almost all cases I can think of it seems like these
are implicitly "AND" semantics", so the case we are trying to solve is
the one where there are multiple instances of a single IE. We know that
RFC 5101 foresaw that case ("The Collector MUST support the use of
Templates containing multiple occurrences of the similar Information
Elements"), but the idea is not to change RFC 5101. If we put some
semantic in the IPFIX structured data, that would solve the vast
majority of our cases. Also, we could say: if you want some semantic
when exporting multiple IEs, then you SHOULD use IPFIX structured data.

Also, the default value for the semantic field in the IPFIX structured
data SHOULD be "NONE", to express that the Flow Record doesn't include
any semantic, as in RFC 5101. You might draw your own conclusion, maybe
because you know your network, maybe because you have configured the
exporter, but then it's your decision.
The way I see the proposed solution is: in IPFIX structured data, you
MAY use the semantic field as a way to express the relationship between
IEs within the structure.
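
To make the "default is NONE, semantic is optional" idea a bit more
concrete, here is a minimal sketch (not text from the draft). It models a
list carrying a semantic value and shows how a Collector could read the
egress-interface example. The names Semantic, BasicList and
interpret_egress are hypothetical, and only NONE, AND and OR are modelled.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Semantic(Enum):
    # Hypothetical registry values; the real list is still up for discussion.
    NONE = "none"
    AND = "and"
    OR = "or"


@dataclass
class BasicList:
    element: str                        # Information Element name
    values: List[str] = field(default_factory=list)
    semantic: Semantic = Semantic.NONE  # default: no relationship is stated


def interpret_egress(lst: BasicList) -> str:
    """Describe what a Collector may conclude about a list of egress interfaces."""
    if lst.semantic is Semantic.AND:
        return "every counted packet was sent on every interface (multicast case)"
    if lst.semantic is Semantic.OR:
        return "every counted packet was sent on one of the interfaces (load balancing case)"
    return "no relationship stated; interpretation is left to the Collector"


multicast = BasicList("egressInterface", ["eth1", "eth2"], Semantic.AND)
unspecified = BasicList("egressInterface", ["eth1", "eth2"])
print(interpret_egress(multicast))
print(interpret_egress(unspecified))
```

With NONE the exporter states nothing, so any conclusion is the
Collector's own, which matches the RFC 5101 behaviour described above.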
> 2. I don't consider draft-sommer-ipfix-mediator-ext-01 a valid argument against Solution 2, as it's trying to solve a somewhat different and more limited problem than structured data. Solution 2 _might_ cause a problem for this draft, but certainly not the other way around (unless we as a WG want to subsume that work into this draft, which would probably require rechartering...). Also, I don't think we need semantics for all of the list types, just basicList (Illustrative question: How do I meaningfully interpret two "or" subTemplateMultiLists with disjoint IE sets? Two "and" subTemplateMultiLists? Nestings thereof?)... So, we don't really have an explosion to deal with if we do Solution 2 correctly: andBasicList, orBasicList, xorBasicList, notBasicList; Four IEs, nestable, done. What can't we do with those four IEs? In this case, we could even step back and say that semantics outside these four are within the protocol explicitly _undefined_, and to be interpreted within the context of each Template.
>    
If I translate what I wrote about Solution 2 covering "the router and
most mediation function needs today" a bit further, it becomes: "Me,
myself, and I don't need more than OR and AND _now_ ;-)".
However, who am I to say that others don't need more now? For them, the
logical solution is to use the IPFIX structured data.
Furthermore, if we think a little bit longer term, the next big step in
IPFIX is the mediation function. In my company, every feature wants to
export its own data with NetFlow/IPFIX, up to the point where a CPE
would not have enough bandwidth across the WAN to export all the
"management" information. So we'll need more and more aggregated flow
records (both in time and space), even in the router. Again, the logical
solution will be to use the IPFIX structured data. At this point, we
will most probably need something other than OR and AND, i.e. RANGE,
ORDERED, etc.
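
To illustrate the growth argument, a small sketch comparing how many IEs
each approach needs as semantics are added. The function names are made
up for this example, and the "ordered" and "range" entries are only
illustrative additions, not agreed registry values.

```python
from itertools import product

LIST_TYPES = ["BasicList", "SubTemplateList", "SubTemplateMultiList"]


def solution2_ies(semantics):
    """Solution 2: one dedicated IE per (semantic, list type) pair."""
    return [sem + lt for sem, lt in product(semantics, LIST_TYPES)]


def solution3_ies(semantics):
    """Solution 3: the three list IEs stay fixed; semantics live in a registry."""
    return [lt[0].lower() + lt[1:] for lt in LIST_TYPES]


today = ["and", "or"]
later = today + ["ordered", "range"]   # illustrative additions only

print(len(solution2_ies(today)), len(solution3_ies(today)))   # 6 3
print(len(solution2_ies(later)), len(solution3_ies(later)))   # 12 3
print(solution2_ies(today)[:3])
```

Under these assumptions the Solution 2 count grows with every new
semantic, while Solution 3 only adds a registry entry.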

As an example of subTemplateMultiLists with disjoint IE sets, let's
imagine that you have to export an aggregated observation point,
composed of multiple template records:
     template 1: exporterIPaddress
     template 2: exporterIPaddress, basicList of interfaces
     template 3: exporterIPaddress, LC


> 3. If we really _do_ want ranges and so on (which, again, we'd need to get WG consensus on; this is explicitly out of scope in my reading of the present charter), then we could do them in the scope of Solution 3.
Btw, acknowledging that one day we will have to solve this is good
enough for solution 3. I mean, we don't have to populate the semantic
IANA registry now... even if that would be more efficient.
> However, this seems a little not-quite-fleshed-out-enough for me to say whether I like it or not. Could you present an example of how you would use the proposed Semantic field to model your example? "(eth1 OR eth2) AND (NOT (eth3 OR eth4)) OR linecard2"
>
> (FWIW, I would do this with my proposal to Solution 2 as follows:)
>
> (orBasicList (andBasicList (orBasicList ingressInterface eth1 eth2) (notBasicList (orBasicList ingressInterface eth3 eth4))) (andBasicList lineCardID 2))
>    
(basicList, OR, (basicList, AND, (basicList, OR, eth1, eth2), (basicList, NOR, eth3, eth4)), (basicList, NONE, linecard2))
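
As a sanity check of that nesting, here is a rough sketch that evaluates
it against a set of observed names. The tuple layout, the matches()
helper, and the rule that NONE is only evaluated for a single-element
list are all assumptions made for this example, not part of the proposal.

```python
# Each node is (semantic, children); a child is either another node or a
# plain string naming an interface or line card.
EXPR = ("OR", [
    ("AND", [
        ("OR", ["eth1", "eth2"]),
        ("NOR", ["eth3", "eth4"]),
    ]),
    ("NONE", ["linecard2"]),
])


def matches(node, observed):
    """Return True if the set of observed names satisfies the nested list."""
    if isinstance(node, str):
        return node in observed
    semantic, children = node
    results = [matches(child, observed) for child in children]
    if semantic == "AND":
        return all(results)
    if semantic == "OR":
        return any(results)
    if semantic == "NOR":
        return not any(results)
    if semantic == "NONE":
        # Assumption: NONE is only evaluated for a single-element list.
        if len(results) != 1:
            raise ValueError("NONE with several elements has no defined meaning")
        return results[0]
    raise ValueError(f"unknown semantic: {semantic}")


print(matches(EXPR, {"eth1"}))            # True
print(matches(EXPR, {"eth1", "eth3"}))    # False
print(matches(EXPR, {"linecard2"}))       # True
```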

Thanks for your valuable feedback, as always.

Regards, Benoit.

> Best regards,
>
> Brian
>
> On Mar 15, 2010, at 6:18 AM, Benoit Claise wrote:
>
>    
>> Dear all,
>>
>> We've been thinking about this one, and we see 3 solutions. We believe that solution 3 is the way to go, as we show below.
>>
>> Solution 1.
>> Consider that the semantic is out of scope for this document.
>> This is the easiest solution. However, we understand it's not right: we would not do a complete job with respect to IPFIX structured data.
>> As Gerhard was expressing, the collector has no clue how to treat the example of a BasicList of egress interfaces in a Flow Record.
>>      - Has every counted packet been sent on every egress interface?
>>        =>  multicast case, AND semantic
>>      - Has every counted packet been sent on any one of the egress interfaces?
>>       =>  load balancing case, OR semantic
>>
>> Solution 2
>> We could focus on the logical AND and OR semantics only, by defining semantic lists, such as andBasicList, andSubTemplateList, andSubTemplateMultiList, orBasicList, orSubTemplateList, and orSubTemplateMultiList.
>> So 6 IEs in total, describing AND and OR semantics.
>> While it solves the router and most mediation function needs today, we understand that this is not a complete solution.
>> Gerhard, in one of his old drafts, draft-sommer-ipfix-mediator-ext-01, proposed ADTs for orderedList, orderedPair, and portRanges, which allow the definition of new IEs for port ranges etc.
>> Even if this solution 2 is extensible, in the long term it will lead to an explosion of IEs: 3 list types * the number of semantic types, where the semantic can be (and, or, orderedList, orderedPair, portRanges, random, etc...)
>>
>> Solution 3
>> We propose to add a semantic field in the 3 list types.
>> Something such as:
>>
>>      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>>      |0|          Field ID           |        Element Length         |
>>      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>>      | Semantic  |             BasicList Content ...                 |
>>      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>>      |                           ...                                 |
>>      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>>
>> The values of this semantic field would come from a new IANA registry, which could be populated initially with NONE, OR, AND, ORDERED, etc. This is up for discussion.
>> The advantage of this solution is that it's extensible, and it doesn't need new IEs for each semantic.
>> If ever required (mainly in a mediation function), we can model very complex semantics, e.g. "(eth1 OR eth2) AND (NOT (eth3 OR eth4)) OR linecard2", which could be the new observation point of an aggregated Flow Record.
>>
>>
>>
>> Conclusion:
>> After much debate internally, we believe that we should make the effort to include this semantic now, and not postpone the problem, which would lead to an explosion of IEs in the future.
>> Personally, I was initially against solution 3, mostly due to the effort required to modify the complete specs.
>> We're now ready to modify the specifications in the IPFIX structured data, but we would like to get your feedback and agreement in advance, as this is not a small piece of work.
>>
>> Please comment.
>>
>> Regards, Paul, Stan, Gowri, and Benoit.
>>      
>>> hi Benoit, Gerhard,
>>>
>>> In this case I'm strongly in favor of leaving semantics out of structured data. Structured data defines containers. The semantics of the containers as a whole and the elements change based upon the information elements within the container and within the record containing it. Wedging semantics into the structured data elements 1. risks further proliferation of (potentially non-interoperable) ways to represent the same thing, 2. gives us more ways to represent nonsensical things (an OR basicList of MPLS stack entries...means...what?), and 3. risks defining an inadequate semantic representation mechanism (what about semantics for records not using structured data? what about ordered versus unordered sets? what about OR basicList vs AND basicList vs nested AND and OR basicLists vs just three identical IEs?). If we really want an unambiguous semantic framework for IPFIX (and here I'm not convinced either way) that's best done on its own, addressing things at the information element and record level. Doing it here confuses the issue.
>>>
>>> Regards,
>>>
>>> Brian
>>>
>>> On Nov 10, 2009, at 6:26 AM, Benoit Claise wrote:
>>>
>>>        
>>>> Gerhard,
>>>>
>>>> Thanks for your email.
>>>> I have no strong feelings about the two solutions you proposed.
>>>>
>>>> From a pure router point of view, I don't see any use cases for logical OR in exporting flow records.
>>>> However, from an IPFIX Mediator point of view, I see some use cases.
>>>> I mean that it requires an Intermediate Aggregation Process or Intermediate Correlation Process to express: I've seen this flow record OR that flow record.
>>>> Now, it's true that even routers will have mediation functions...
>>>>
>>>> I'm inclined to add orBasicList, orSubTemplateList, orSubTemplateMultiList to the draft (this is a small addition after all) and to express that, by default, a logical AND is assumed... specifically if the structured data is used in the IPFIX Mediation Protocol.
>>>>
>>>> I'm not convinced by the NOT in the context of structured data, as we don't even have a concept of NOT for a single information element!
>>>>
>>>> I would like to get some more feedback from others.
>>>>
>>>> Regards, Benoit.
>>>>
>>>>          
>>>>> Dear all,
>>>>>
>>>>> Regarding draft-ietf-ipfix-structured-data, I see the risk that the
>>>>> semantic of the exported structured data is not clear.
>>>>>
>>>>> How do you interpret the manifold occurrence of the same Information
>>>>> Element (basicList) or the same group of Information Elements
>>>>> (subTemplateList) in one record?
>>>>>
>>>>> What does it mean if basicList, subTemplateList, or subTemplateMultiList
>>>>> is used for a Flow Key field? Or non-key field?
>>>>>
>>>>> Some Examples:
>>>>>
>>>>> - BasicList of egress interfaces in a Flow Record
>>>>>   How should a Flow Record be interpreted which contains a list of
>>>>>   egress interfaces and a packet counter?
>>>>>   Has every counted packet been sent on every egress interface?
>>>>>     =>  multicast case, AND semantic (see example in section 8.1)
>>>>>   Has every counted packet been sent on any one of the egress
>>>>>   interfaces?
>>>>>     =>  load balancing case, OR semantic
>>>>>   Can it be used as a Flow Key or not?
>>>>>
>>>>> - BasicList of destination ports in a Flow Record
>>>>>   As every packet has only one destination port, the only reasonable
>>>>>   interpretation is that the Flow contains packets having one of
>>>>>   the reported port numbers.
>>>>>     =>  OR semantic
>>>>>   This would be a non-key field.
>>>>>
>>>>>
>>>>> I think there are two solutions:
>>>>>
>>>>> 1. We decide that the semantic of list content is out of scope of
>>>>>    draft-ietf-ipfix-structured-data. We add a note to the draft that
>>>>>    the semantic must be clear from the context or the definition of the
>>>>>    Information Elements used within the lists.
>>>>>
>>>>> 2. We define semantic lists, such as
>>>>>    - andBasicList, andSubTemplateList, andSubTemplateMultiList
>>>>>    - orBasicList, orSubTemplateList, orSubTemplateMultiList
>>>>>    describing AND and OR semantic of the contained IEs/Templates,
>>>>>    respectively.
>>>>>
>>>>>
>>>>> As I wrote in an earlier mail, I see a good use case for orBasicList. It
>>>>> could be used in the Selector Report Interpretation of Property Match
>>>>> Filtering to report a filter like "port 80 or port 443".
>>>>> http://www.ietf.org/mail-archive/web/ipfix/current/msg04856.html
>>>>> At the moment, the Selector Report Interpretation is limited to AND.
>>>>> However, if we also want to express a NOT, we still need another solution...
>>>>>
>>>>> Regards,
>>>>> Gerhard
>>>>>
>>>>> _______________________________________________
>>>>> IPFIX mailing list
>>>>> IPFIX@ietf.org
>>>>> https://www.ietf.org/mailman/listinfo/ipfix