Re: [IPFIX] draft-ietf-ipfix-anon-02.txt: WG last call review

Brian Trammell <trammell@tik.ee.ethz.ch> Thu, 08 April 2010 09:47 UTC

hi, Benoit,

thank you very much for the review! Comments inline, irrelevant parts snipped. These changes will be applied to future anon-03, which will appear after we get Lothar's WGLC comments.

Cheers,

Brian

On Apr 2, 2010, at 1:42 PM, Benoit Claise wrote:

> Dear authors,
> 
> I reviewed this draft.
> You should consider the comments in sequence.
> 
> Regards, Benoit.
>> IPFIX Working Group                                            E. Boschi
>> Internet-Draft                                               B. Trammell
>> Intended status: Experimental                             Hitachi Europe
>> Expires: August 19, 2010                               February 15, 2010
>> 
>>                      IP Flow Anonymisation Support
>>                       draft-ietf-ipfix-anon-02.txt

<snip>

>> 1.3.  Anonymisation within the IPFIX Architecture
>> 
>> 
>>    "Architecture for IP Flow Information Export" [RFC5470] defines the
>>    functions performed in sequence by the various functional blocks in
>>    an IPFIX Device as in the figure below.
>> 
>>                     Packet(s) coming into Observation Point(s)
>>                       |                                   |
>>                       v                                   v
>>      +----------------+-------------------------+   +-----+-------+
>>      |          Metering Process on an          |   |             |
>>      |             Observation Point            |   |             |
>> 
>> 
>> Boschi & Trammell        Expires August 19, 2010                [Page 5]
>> Internet-Draft        IP Flow Anonymisation Support        February 2010
>> 
>> 
>>      |                                          |   |             |
>>      |   packet header capturing                |   |             |
>>      |        |                                 |...| Metering    |
>>      |   timestamping                           |   | Process N   |
>>      |        |                                 |   |             |
>>      | +----->+                                 |   |             |
>>      | |      |                                 |   |             |
>>      | |   sampling Si (1:1 in case of no       |   |             |
>>      | |      |          sampling)              |   |             |
>>      | |   filtering Fi (select all when        |   |             |
>>      | |      |          no criteria)           |   |             |
>>      | +------+                                 |   |             |
>>      |        |                                 |   |             |
>>      |        |        Timing out Flows         |   |             |
>>      |        |    Handle resource overloads    |   |             |
>>      +--------|---------------------------------+   +-----|-------+
>>               |                                           |
>>       Flow Records (identified by Observation Domain)  Flow Records
>>               |                                           |
>>               +---------+---------------------------------+
>>                         |
>>    +--------------------|----------------------------------------------+
>>    |                    |     Exporting Process                        |
>>    |+-------------------|-------------------------------------------+  |
>>    ||                   v       IPFIX Protocol                      |  |
>>    ||+-----------------------------+  +----------------------------+|  |
>>    |||Rules for                    |  |Functions                   ||  |
>>    ||| Picking/sending Templates   |  |-Packetise selected Control ||  |
>>    ||| Picking/sending Flow Records|->|  & data Information into   ||  |
>>    ||| Encoding Template & data    |  |  IPFIX export packets.     ||  |
>>    ||| Selecting Flows to export(*)|  |-Handle export errors       ||  |
>>    ||+-----------------------------+  +----------------------------+|  |
>>    |+----------------------------+----------------------------------+  |
>>    |                             |                                     |
>>    |                    exported IPFIX Messages                        |
>>    |                             |                                     |
>>    |                +------------+-----------------+                   |
>>    |                |  Anonymise export packet(*)  |                   |
>>    |                +------------+-----------------+                   |
>>    |                             |                                     |
>>    |                +------------+-----------------+                   |
>>    |                |       Transport  Protocol    |                   |
>>    |                +------------+-----------------+                   |
>>    |                             |                                     |
>>    +-----------------------------+-------------------------------------+
>>                                  |
>>                                  v
>>                     IPFIX export packet to Collector
>> 
>>    (*) indicates that the block is optional.
>>   
>> 
> In the general architecture, but not in this spec. right?

Indeed. This is a restatement of a part of 5470, intended mainly to provide background for why anonymisation support is crucial for IPFIX. However, the diagram is, as you point out, kind of confusing here. Its primary function is to frame the work within the larger, longer work of the WG, which we do in the paragraph below anyway. So we're inclined to cut it.

>>                  Figure 1: IPFIX Device functional blocks
>>   
>> 
> I was hoping to get the reference model from the IPFIX Mediator framework.
> If the reference model was from the IPFIX Mediator framework, then we would not have the issue with optional.
>>    Note that, according to the original architecture specification,
>>   
>> 
> What is this? RFC5470?

Yes. This reference should be here (instead of "original architecture specification").

>>    IPFIX Message anonymisation is optionally performed as the final
>>    operation before handing the Message to the transport protocol for
>>    export.  While no provision is made in the architecture for
>>    anonymisation metadata as in Section 6, this arrangement does allow
>>    for the message rewriting necessary for comprehensive anonymisation
>>    of IPFIX export as in Section 7.  The development of the IPFIX
>>    Mediation [I-D.ietf-ipfix-mediators-framework] framework and the
>>    IPFIX File Format [RFC5655] expand upon this initial architectural
>>    allowance for anonymisation by adding to the list of places that
>>    anonymisation may be applied.  The former specifies IPFIX Mediators,
>>    which rewrite existing IPFIX messages, and the latter specifies a
>>   
>> 
> IPFIX Message?
> It would be nice to say in the terminology section which terms you use from other RFCs.
> For example, data set vs dataset versus Data Set

This is an unfortunate consequence of IPFIX having something called a "Data Set". We need a term within this WG that means "some data" but that does not conflict with the defined terminology. Suggestions?

> For example, flow records is lower case, but I see IPFIX File, Options Template
> Please be consistent with capitalization

Will review. Indeed, here "messages" should be "Messages"; however, this sentence may need to be rewritten anyway, because IPFIX Mediators don't rewrite Messages (in the general case) as defined.

>>    method for storage of IPFIX data in files.
>> 
>>    More detail on the applicable architectural arrangements of
>>    anonymisation can be found in Section 7.1
>> 
>> 
>> 2.  Terminology
>> 
>> 
>>    Terms used in this document that are defined in the Terminology
>>    section of the IPFIX Protocol [RFC5101] document are to be
>>    interpreted as defined there.
>> 
>>    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
>>    "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
>>    document are to be interpreted as described in RFC 2119 [RFC2119].
>> 
>> 
>> 3.  Categorisation of Anonymisation Techniques
>> 
>> 
>>    Anonymisation modifies a data set in order to protect the identity of

Here is a perfect example of the point above. This is "some data", not an IPFIX Data Set. Can we agree that "data set" is not "Data Set", or should we invent some more terminology here?

>>    the people or entities described by the data set from disclosure.
>>    With respect to network traffic data, anonymisation generally
>>    attempts to preserve some set of properties of the network traffic
>>    useful for a given application or applications, while ensuring the
>>    data cannot be traced back to the specific networks, hosts, or users
>>    generating the traffic.

<snip>

>> 4.  Anonymisation of IP Flow Data

<snip>

>>    Hardware addresses uniquely identify devices on the network; while
>>    they are not often available in traffic data collected at Layer 3,
>>    and cannot be used to locate devices within the network, some traces
>>    may contain sub-IP data including hardware address data.  Hardware
>>    addresses may be mappable to device serial numbers, and to the
>>    entities or individuals who purchased the devices, when combined with
>>    external databases.  They may also leak via IPv6 addresses in certain
>>    circumstances.  Therefore, hardware address anonymisation is also
>>    important.
>>   
>> 
> I would clearly mention MAC address

Okay... will s/hardware address/MAC address/gi then.

>> 4.1.1.  Truncation
>> 
>> 
>>    Truncation removes "n" of the least significant bits from an IP
>>    address, replacing them with zeroes.  In effect, it replaces a host
>>    address with a network address for some fixed netblock; for IPv4
>>    addresses, 8-bit truncation corresponds to replacement with a /24
>>    network address.  Truncation is a non-reversible generalisation
>>    scheme.  Note that while truncation is effective for making hosts
>>    non-identifiable, it preserves information which can be used to
>>    identify an organization, a geographic region, a country, or a
>>    continent (or RIR region of responsibility).
>>   
>> 
> what is RIR?

Regional Internet Registry [RFC1466], [RFC2050]. Shall we add these as informatives, or drop the reference to RIRs?
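
For what it's worth, the truncation described in 4.1.1 above is simple enough to sketch. This is a minimal Python illustration, not text for the draft; the function name truncate_address is made up here, and it assumes only what the quoted paragraph says (zero the "n" least significant bits).

```python
import ipaddress

def truncate_address(addr: str, n_bits: int) -> str:
    """Zero the n_bits least significant bits of an IPv4 or IPv6 address.

    Hypothetical helper for illustration; not part of any IPFIX library.
    """
    ip = ipaddress.ip_address(addr)
    if not 0 <= n_bits <= ip.max_prefixlen:
        raise ValueError("n_bits out of range for this address family")
    # All-ones mask for the address width, with the low n_bits cleared.
    mask = ((1 << ip.max_prefixlen) - 1) ^ ((1 << n_bits) - 1)
    # Rebuild with the same class so a small IPv6 value stays IPv6.
    return str(type(ip)(int(ip) & mask))
```

With n_bits=8 on an IPv4 address this yields the /24 network address, which is exactly why the organisation or region remains identifiable.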

<snip>

>> 4.3.  Timestamp Anonymisation
>> 
>> 
>>    The particular time at which a flow began or ended is not
>>    particularly identifiable information, but it can be used as part of
>>    attacks against other anonymisation techniques or for user profiling.
>>    Presice timestamps can be used in injected-traffic fingerprinting
>>   
>> 
> precise

Oops. An embarrassing word to typo. :)

> You might want to describe what fingerprinting is

Will do.

>>    attacks as well as to identify certain activity by response delay and
>>    size fingerprinting.  Therefore, timestamp information may be
>>    anonymised in order to ensure the protection of the entire dataset.
>> 
>>           +-----------------------+----------------------------+
>>           | Scheme                | Action                     |
>>           +-----------------------+----------------------------+
>>           | Precision Degradation | Generalisation             |
>>           | Enumeration           | Direct or Set Substitution |
>>           | Random Shifts         | Direct Substitution        |
>>           +-----------------------+----------------------------+
>> 
>> 
>> 4.3.1.  Precision Degradation
>> 
>>   
>> 
> it's like truncation, but specific to time and counter, right?

Truncation is for identifiers, precision degradation is for numbers, but yes. This is why they have the same anon technique number in metadata export.
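
To show the parallel with truncation, here is a minimal Python sketch of precision degradation on a timestamp. The text of 4.3.1 is snipped above, so this assumes degradation means rounding down to a coarser granularity; the function name and the millisecond unit are my own choices for illustration.

```python
def degrade_precision(timestamp_ms: int, granularity_ms: int) -> int:
    """Round a millisecond timestamp down to a multiple of granularity_ms.

    Assumption: 'precision degradation' is modelled here as flooring to a
    coarser unit (e.g. 1000 for seconds, 60000 for minutes).
    """
    if granularity_ms <= 0:
        raise ValueError("granularity_ms must be positive")
    # Discard everything finer than the chosen granularity.
    return timestamp_ms - (timestamp_ms % granularity_ms)
```

As with address truncation, the operation is a non-reversible generalisation: the discarded low-order part cannot be recovered from the result.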

<snip>

>> 5.  Parameters for the Description of Anonymisation Techniques
>> 
>> 
>>    This section details the abstract parameters used to describe the
>>    anonymisation techniques examined in the previous section, on a per-
>>    parameter basis.  These parameters and their export safety inform the
>>    design of the IPFIX anonymisation metadata export specified in the
>>    following section.
>>   
>> 
> you use different terms for metadata
> IPFIX anonymisation metadata export
> anonymisation technique metadata export
> anonymisation metadata

will normalize these

> metadata about exported flows and the flow collection infrastructure

but this one means something different (anon metadata = "what are the properties of the anonymisation applied to this data?", while this is "anything in an options record (that might be useful to deanonymisation)").

> You might want to be consistent

Perhaps, yes. ;) Will fix these, definitely.

>> 5.1.  Stability
>> 
>> 
>>    Any given anonymisation technique may be applied with a varying range
>>    of stability.  Stability is important for assessing the comparability
>>    of anonymised information in different data sets, or in the same data
>>    set over different time periods.  In general, stability ranges from
>>    completely stable to completely unstable; however, note that the
>>    completely unstable case is indistinguishable from black-marker
>>    anonymisation.  A completely stable anonymisation will always map a
>>    given value in the real space to the same value in the anonymised
>>    space.  In practice, an anonymisation may also be stable for every
>>    data set published by a particular producer to a particular
>>    consumer, stable for a stated time period within a dataset or across
>>    datasets, or stable only for a single data set.
>> 
>>    If no information about stability is available, users of anonymised
>>    data may assume that the techniques used are stable across the entire
>>    dataset, but unstable across datasets.  Note that stability presents
>>    a risk-utility tradeoff, as completely stable anonymisation can be
>>    used for longer-term trend analysis tasks but also presents more risk
>>    of attack given the stable mapping.
>>   
>> 
> In this section, you don't say if you must export the content of this section, while you do for all the other 5.x sections

Good point. Will mention this (i.e., that we SHOULD).

<snip>

>> 6.  Anonymisation Export Support in IPFIX
>> 
>> 
>>    Anonymised data exported via IPFIX SHOULD be annotated with
>>    anonymisation metadata, which details which fields described by which
>>    Templates are anonymised, and provides appropriate information on the
>>    anonymisation techniques used.  This metadata SHOULD be exported in
>>    Data Records described by the recommended Options Templates described
>>    in this section; these Options Templates use the additional
>>    Information Elements described in the following subsection.
>> 
>>    Note that fields anonymised using the black-marker (removal)
>>    technique do not require any special metadata support.  Black-marker
>>    anonymised fields SHOULD NOT be exported at all; the absence of the
>>   
>> 
> MUST NOT?

A MUST NOT here would conflict with the parallel guidance in RFC 5103 (section 4, para 6, page 7) regarding empty reverse IEs (SHOULD not), and in the configuration draft, which allows fields which can be unambiguously null in context to be exported as null (MAY). What about anonymisation in an environment with hardcoded templates? No reason to take freedom away from implementors in this situation. 

>>    field in a given Data Set is implicitly declared by not including the
>>    corresponding Information Element in the Template describing that
>>    Data Set.
>> 
>> 
>> 6.1.  Anonymisation Options Template
>> 
>> 
>>    The Anonymisation Options Template describes anonymisation records,
>>    which allow anonymisation metadata to be exported inline over IPFIX
>>    or stored in an IPFIX File, by binding information about
>>    anonymisation techniques to Information Elements within defined
>>    Templates.  IPFIX Exporting Processes SHOULD export anonymisation
>>    records for any Template describing exported anonymised Data Records;
>>   
>> 
> an anonymization record is actually a data record of the Anonymization Options Template Record, as opposed to an anonymized record.
> It would be better to define those terms, no?

Yep. So "Anonymisation Record" and "Anonymised Data Record" will go into terminology then.

> 
>>    IPFIX Collecting Processes and processes downstream from them MAY use
>>    anonymisation records to treat anonymised data differently depending
>>    on the applied technique.
>> 
>>    An Exporting Process SHOULD export anonymisation records after the
>>    Templates they describe have been exported,
>> 
> And MUST send the anonymization record before the anonymized records?
> 
> Do you want to benefit from the per-stream SCTP draft?
> That would solve most of your problems: order, reliable
> Otherwise, if the collector hasn't received the anonymization records, then it might treat the flow records wrongly.

Yeah, but only if per-stream is anyway compatible with the underlying application. Worth a mention and an informative reference, probably.

>>  and SHOULD export
>>    anonymisation records reliably.
>> 
>>    Anonymisation records, like Templates, MUST be handled by Collecting
>>    Processes as scoped to the Transport Session in which they are sent.
>>    While the Stability Class within the anonymisationFlags IE can be
>>    used to declare that a given anonymisation technique's mapping will
>>    remain stable across multiple sessions, each session MUST re-export
>>    the anonymisation Records along with the templates.
>> 
>>    +-------------------------+-----------------------------------------+
>>    | IE                      | Description                             |
>>    +-------------------------+-----------------------------------------+
>>    | templateId [scope]      | The Template ID of the Template         |
>>    |                         | containing the Information Element      |
>>    |                         | described by this anonymisation record. |
>>    |                         | This Information Element MUST be        |
>>    |                         | defined as a Scope Field.               |
>>    | informationElementId    | The Information Element identifier of   |
>>    | [scope]                 | the Information Element described by    |
>>    |                         | this anonymisation record.  This        |
>>    |                         | Information Element MUST be defined as  |
>>    |                         | a Scope Field.                          |
>>    | informationElementId    | The Private Enterprise Number of the    |
>>    | [scope] [optional]      | enterprise-specific Information Element |
>>    |                         | described by this anonymisation record. |
>>    |                         | This Information Element MUST be        |
>>    |                         | defined as a Scope Field if present.    |
>>   
>> 
> Isn't it privateEnterpriseNumber instead of informationElementId?

Yep, typo. Will fix.

> Actually, the example and the description mention privateEnterpriseNumber
> However, I'm wondering why you need this privateEnterpriseNumber at all,  because the "E" bit is already part of informationElementId 
>  0                   1                   2                   3
>    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
>    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>    |E|  Information Element ident. |        Field Length           |
>    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>    |                      Enterprise Number                        |
>    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> 
> I'm confused.

The enterprise bit only says that it's got a PEN, not what the PEN is, so we still need it.
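
To illustrate with the Field Specifier layout you quoted: the E bit only tells the parser that a 32-bit Enterprise Number follows. A minimal Python sketch of reading one Field Specifier (the function name and return shape are mine, purely for illustration):

```python
import struct

def parse_field_specifier(buf: bytes, offset: int = 0):
    """Parse one Template Field Specifier laid out as in the diagram above.

    Returns (ie_id, field_length, pen_or_None, next_offset).
    Hypothetical helper; not taken from any existing IPFIX implementation.
    """
    ie_word, field_length = struct.unpack_from("!HH", buf, offset)
    enterprise_bit = bool(ie_word & 0x8000)  # high bit of the first word
    ie_id = ie_word & 0x7FFF                 # remaining 15 bits
    offset += 4
    pen = None
    if enterprise_bit:
        # The PEN itself is a separate 32-bit field after the length.
        (pen,) = struct.unpack_from("!I", buf, offset)
        offset += 4
    return ie_id, field_length, pen, offset
```

The 16-bit informationElementId in an anonymisation record has no such trailing field, so the PEN has to travel in its own Information Element.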

>>    | informationElementIndex | The Information Element index of the    |
>>    | [scope] [optional]      | instance of the Information Element     |
>>    |                         | described by this anonymisation record  |
>>    |                         | identified by the informationElementId  |
>>    |                         | within the Template.  Optional; need    |
>>    |                         | only be present when describing         |
>>    |                         | Templates that have multiple instances  |
>>    |                         | of the same Information Element.  This  |
>>    |                         | Information Element MUST be defined as  |
>>    |                         | a Scope Field if present.  This         |
>>    |                         | Information Element is defined in       |
>>    |                         | Section 6.2, below.                     |
>>    | anonymisationFlags      | Flags describing the mapping stability  |
>>    |                         | and specialized modifications to the    |
>>    |                         | Anonymisation Technique in use.  SHOULD |
>>    |                         | be present.  This Information Element   |
>>    |                         | is defined in Section 6.2, below.       |
>>    | anonymisationTechnique  | The technique used to anonymise the     |
>>    |                         | data.  MUST be present.  This           |
>>    |                         | Information Element is defined in       |
>>    |                         | Section 6.2, below.                     |
>>    +-------------------------+-----------------------------------------+

<snip>

>> 6.2.2.  anonymisationFlags
>> 
>> 
>>    Description:   A flag word describing specialized modifications to
>>       the anonymisation policy in effect for the anonymisation technique
>>       applied to a referenced Information Element within a referenced
>>       Template.  When flags are clear (0), the normal policy (as
>>       described by anonymisationTechnique) applies without modification.
>> 
>>       MSB   14  13  12  11  10   9   8   7   6   5   4   3   2   1  LSB
>>       +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
>>       |                Reserved                       |LOR|PmA|   SC  |
>>       +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
>> 
>>                             anonymisationFlags IE
>> 
>>    +--------+----------+-----------------------------------------------+
>>    | bit(s) | name     | description                                   |
>>    | (LSB = |          |                                               |
>>    | 0)     |          |                                               |
>>    +--------+----------+-----------------------------------------------+
>>    | 0-1    | SC       | Stability Class: see the Stability Class      |
>>    |        |          | table below, and Section 5.1.                 |
>>    | 2      | PmA      | Perimeter Anonymisation: when set (1), source |
>>   
>> 
> source based on what? I remember the lengthy discussion in biflow.

See section 7.2.2. If we still need to clarify here, we can.

>>    |        |          | address Information Elements are interpreted  |
>>    |        |          | as external addresses, and destination        |
>>    |        |          | address Information Elements are interpreted  |
>>    |        |          | as internal addresses, for the purposes of    |
>>    |        |          | associating anonymisationTechnique to         |
>>    |        |          | Information Elements.  MUST NOT be set when   |
>>    |        |          | associated with a non-endpoint (i.e., source- |
>>    |        |          | or destination-) Information Element.  SHOULD |
>>    |        |          | be consistent within a record (i.e., if a     |
>>    |        |          | source- Information Element has this flag     |
>>    |        |          | set, the corresponding destination- element   |
>>    |        |          | SHOULD have this flag set, and vice-versa.)   |
>>    | 3      | LOR      | Low-Order Unchanged: when set (1), the        |
>>    |        |          | low-order bits of the anonymised Information  |
>>    |        |          | Element contain real data.  This modification |
>>    |        |          | is intended for the anonymisation of          |
>>    |        |          | network-level addresses while leaving         |
>>    |        |          | host-level addresses intact in order to       |
>>    |        |          | preserve host level-structure, which could    |
>>    |        |          | otherwise be used to reverse anonymisation.   |
>>    |        |          | MUST NOT be set when associated with a        |
>>    |        |          | truncation-based anonymisationTechnique.      |
>>    | 4-15   | Reserved | Reserved for future use: SHOULD be cleared    |
>>    |        |          | (0) by the Exporting Process and MUST be      |
>>    |        |          | ignored by the Collecting Process.            |
>>    +--------+----------+-----------------------------------------------+
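
For reference, a Collecting Process could decode the flag word in the table above roughly like this. It is a minimal Python sketch: the names are hypothetical, and since the Stability Class value table is snipped here, SC is returned as a raw integer rather than interpreted.

```python
from typing import NamedTuple

# Bit positions taken from the anonymisationFlags table above (LSB = 0).
SC_MASK = 0x0003   # bits 0-1: Stability Class
PMA_BIT = 0x0004   # bit 2: Perimeter Anonymisation
LOR_BIT = 0x0008   # bit 3: Low-Order Unchanged

class AnonFlags(NamedTuple):
    stability_class: int
    perimeter_anonymisation: bool
    low_order_unchanged: bool

def decode_anonymisation_flags(word: int) -> AnonFlags:
    """Decode a 16-bit anonymisationFlags value; reserved bits are ignored."""
    return AnonFlags(
        stability_class=word & SC_MASK,
        perimeter_anonymisation=bool(word & PMA_BIT),
        low_order_unchanged=bool(word & LOR_BIT),
    )
```

Ignoring bits 4-15 rather than rejecting them matches the "MUST be ignored by the Collecting Process" wording for the reserved field.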

<snip>

>> 6.2.3.  anonymisationTechnique
>> 
>>   
>> 
> While reading, it looks more logical to have the anonymisationTechnique before the anonymizationFlags IE

For IEs, the sections in the document (as in 5610, 5655) are alphabetized by IE name. However, you're right, it makes more sense to do them in "logical" order when there aren't so many of them.

>>    Description:   A description of the anonymisation technique applied
>>       to a referenced Information Element within a referenced Template.
>>   
>> 
> And options template as well? For example, in my company, we export an options template record with the sampling rates.
> This is something I missed completely, and is valid for all IEs

Absolutely. Will add.

>>       Each technique may be applicable only to certain Information
>>       Elements and recommended only for certain Information Elements;
>>       these restrictions are noted in the table below.

<snip>

>> 7.1.  Arrangement of Processes in IPFIX Anonymisation

<snip>

>>    When data is to be published as an anonymised data set in an IPFIX
>>    File [RFC5655], the anonymisation may be done at the final Collecting
>>    Process before storage and dissemination, as well. 
>> 
> For me, it's a different problem. IPFIX File could be done on routers.
> Basically, this paragraph is about the relationship with IPFIX File.

Mainly because this section is about the arrangement of processes, and a File Writer is something 

> Btw, we're also missing the relationship with reducing redundancy

I'm not sure I get what's special about reducing redundancy with respect to anon.

> with IPFIX structured data
> For the latter, does it apply to all instances of a template? So, some kind of recursion is implied

This is probably something that should be addressed in Structured Data (the only hard question here is how meta-template information is handled with basicList) as opposed to the other way around, because of the chronology, and because anon is not the only meta-template we may have to address. In general we should avoid normative references that go forward in time...

Since this is metatemplate information, it should be stored by the CP as if it belongs to the template. So, yes, it applies to any instance of a template active at the time. This implies that we should point out somewhere that this information expires when the template is withdrawn or expires (ouch!).
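
A rough Python sketch of what "stored by the CP as if it belongs to the template" could look like. All names here (AnonMetadataStore, session_id, and so on) are hypothetical; the only assumptions are that anonymisation records are scoped to the Transport Session, as the draft text above requires, and that they are dropped together with the template they describe.

```python
from collections import defaultdict

class AnonMetadataStore:
    """Collector-side store of anonymisation records, keyed by template.

    Illustrative only: records live exactly as long as the
    (session, template) pair they annotate.
    """

    def __init__(self):
        self._records = defaultdict(list)

    def add_record(self, session_id, template_id, record):
        # Bind the metadata to the template within this session.
        self._records[(session_id, template_id)].append(record)

    def records_for(self, session_id, template_id):
        # Return a copy so callers cannot mutate the stored list.
        return list(self._records.get((session_id, template_id), []))

    def withdraw_template(self, session_id, template_id):
        # Metadata expires when the template is withdrawn or times out.
        self._records.pop((session_id, template_id), None)

    def close_session(self, session_id):
        # Nothing survives the Transport Session; it must be re-exported.
        for key in [k for k in self._records if k[0] == session_id]:
            del self._records[key]
```

The nice property of keying on the template is that expiry needs no extra bookkeeping: whatever removes the template removes the metadata too.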

>>  In this case, the
>>    Collector should follow the guidelines in Section 7.2, especially as
>>    regards File-specific Options in Section 7.2.4
>> 
>>    In each of these data flows, the anonymisation of records is
>>    undertaken by an Intermediate Anonymisation Process (IAP); the data
>>    flows into and out of this IAP are shown in Figure 3 below.
>> 
>>    packets --+                     +- IPFIX Messages -+
>>              |                     |                  |
>>              V                     V                  V
>>    +==================+ +====================+ +=============+
>>    | Metering Process | | Collecting Process | | File Reader |
>>    +==================+ +====================+ +=============+
>>              |      Non-anonymised | Records          |
>>              V                     V                  V
>>    +=========================================================+
>>    |          Intermediate Anonymisation Process (IAP)       |
>>    +=========================================================+
>>              | Anonymised     ^            Anonymised |
>>              | Records        |               Records |
>>              V                |                       V
>>    +===================+    Anonymisation      +=============+
>>    | Exporting Process |<--- Parameters ------>| File Writer |
>>    +===================+                       +=============+
>>              |                                        |
>>              +------------> IPFIX Messages <----------+
>> 
>>           Figure 3: Data flows through the anonymisation process
>>   
>> 
> This is exactly what I was hoping to see in figure 1, i.e. this draft applying to the Intermediate Process in [IPFIX-MD-FMWK]
> ... where the IP can be an independent box, or inside the router.
> I completely missed that second part in the draft.

Another reason simply to drop figure 1.

>>    Anonymisation parameters must also be available to the Exporting
>>    Process and/or File Writer in order to ensure header data is also
>>    appropriately anonymised as in Section 7.2.3.
>> 
>>    Following each of the data flows through the IAP, we describe five
>>    basic types of anonymisation arrangements within this framework in
>>    Figure 4.  In addition to the three arrangements described in detail
>>    above, anonymisation can also be done at a collocated Metering
>>    Process and File Writer (see section 7.3.2 of [RFC5655]), or at a
>>    file manipulator (see section 7.3.7 of [RFC5655]).

<snip>

>> 7.2.3.  Anonymisation of Header Data
>> 
>> 
>>    Each IPFIX Message contains a Message Header; within this Message
>>    Header are contained two fields which may be used to break certain
>>    anonymisation techniques: the Export Time, and the Observation Domain
>>    ID
>> 
>>    Export of IPFIX Messages containing anonymised timestamp data where
>>    the original Export Time Message header has some relationship to the
>>    anonymised timestamps SHOULD anonymise the Export Time header field
>>    using an equivalent technique, if possible.  Otherwise, relationships
>>    between export and flow time could be used to partially or totally
>>    reverse timestamp anonymisation.
>>   
>> 
> What if the time is in the future, the collector would probably discard the information.

Why? This seems highly implementation dependent.

> You might want to say which methods of 4.3.* apply.

Good point.
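
As a rough illustration of what "an equivalent technique" means in the simplest case, here is a sketch assuming a constant timestamp offset (my choice for the example; the quoted text doesn't single out a method). The `Flow` and `Message` types and `offset_message` are hypothetical stand-ins, not real IPFIX structures.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Flow:
    start: int  # flow start, epoch seconds
    end: int    # flow end, epoch seconds


@dataclass(frozen=True)
class Message:
    export_time: int          # Export Time from the Message Header
    flows: Tuple[Flow, ...]


def offset_message(msg: Message, offset: int) -> Message:
    """Shift the flow timestamps and the Export Time by the same offset."""
    return Message(
        export_time=msg.export_time + offset,
        flows=tuple(Flow(f.start + offset, f.end + offset) for f in msg.flows),
    )


original = Message(export_time=1000, flows=(Flow(start=900, end=990),))
shifted = offset_message(original, -500)
```

Because both the header and the records move together, the gap between Export Time and flow end is the same before and after. Shifting only the flows would leave that gap off by exactly the offset, which is the relationship the draft text warns about.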

<snip>
>> 
>> 7.2.4.  Anonymisation of Options Data

<snip>

>>    Since the Time Window Options Template specified for the IPFIX File
>>    Format [RFC5655] refers to the timestamps within the flow data to
>>    provide partial table of contents information for an IPFIX File, care
>>    must be taken to ensure that Options described by this template are
>>    written using the anonymised timestamps instead of the original ones.
>>   
>> 
> Is there something in the IPFIX-MIB or IPFIX-CONF that can be used to deduce something about the anonymization technique?

Not that would be exported, no. If the device supports configuration extensions for anonymisation, clearly you shouldn't export that, but it's also out of band, so not really applicable here.

<snip>

>> 8.  Examples
>> 
>>   
>> 
> A nice to have would be the flow records before and after the anonymization.

Yep. A fair amount of extra work, but, yep, completely agree. 

<snip>

>>                             1                   2                   3
>>         0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
>>        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>>        |          Set ID = 257         |          Length =  68         |
>>        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>>        |          Template 256         | flowStartSeconds       IE 150 |
>>        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>>        | no flags               0x0000 | Not Anonymised              1 |
>>        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>>   
>> 
> Maybe I missed it, but is there a specification in this document that expresses: MUST have a record for each IE of a template, even if no anonymization is applied.
> Is there a specification that expresses: MUST have a record for each IE of each template, even if no IE of a particular template is anonymized.

Nope. If an IE in a template isn't specified, it's "default" which is "maybe not anonymized, but we don't really know." We'll clarify this above.
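
To make the distinction concrete, a small sketch of how a collector might tell "explicitly not anonymised" apart from "nothing was said". The value 1 for Not Anonymised comes from the example record quoted above; the `UNSPECIFIED` sentinel and both function names are hypothetical.

```python
NOT_ANONYMISED = 1   # technique value shown in the quoted example record
UNSPECIFIED = None   # hypothetical sentinel: no record exported for this IE


def anonymisation_state(records, template_id, ie):
    """records maps (template_id, ie) -> technique value."""
    return records.get((template_id, ie), UNSPECIFIED)


def describe(state):
    if state is UNSPECIFIED:
        return "default: maybe not anonymised, but unknown"
    if state == NOT_ANONYMISED:
        return "explicitly not anonymised"
    return f"anonymised (technique {state})"


records = {(256, 150): NOT_ANONYMISED}
known = describe(anonymisation_state(records, 256, 150))
unknown = describe(anonymisation_state(records, 256, 8))
```

An IE with no record falls through to the default rather than being treated as an error, so an exporter is not forced to emit a record for every IE of every template.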