Re: [ippm] Is loopback/direct export amplification worse than linear?

Hi Martin,
I thought of them as "concerns" ;)
I share your concerns regarding the impact of the Loopback flag. I think
that returning the packet out-of-band, not as an IOAM-esque packet, might
mitigate some of the risks associated with Loopback. And although I don't
think the DEX draft explicitly states that the IOAM data collected on a
node is likewise exported to a collector out-of-band, i.e., not as an IOAM
packet, I'd like the WG to consider this approach.
The rest of my notes reflect my earlier comments.

Regards,
Greg

On Fri, Dec 4, 2020 at 2:53 PM Martin Duke <martin.h.duke@gmail.com> wrote:

> thanks Greg.
>
> I hope I'm understanding you correctly: these are critiques of the drafts,
> largely unrelated to mine?
>
> On Wed, Dec 2, 2020 at 3:06 PM Greg Mirsky <gregimirsky@gmail.com> wrote:
>
>> Hi Martin,
>> thank you for highlighting some troublesome characteristics of using the
>> Loopback flag in IOAM. I agree that draft-ietf-ippm-ioam-flags could
>> tighten up the description of scenarios when the Loopback flag is used.
>> Also, it seems that the behavior of Direct Export
>> <https://datatracker.ietf.org/doc/draft-ietf-ippm-ioam-direct-export/?include_text=1> could
>> benefit from more clarification. I've now read both drafts and taken notes,
>> which I present below:
>> *draft-ietf-ippm-ioam-flags:*
>>
>>    - draft-ietf-ippm-ioam-data defined IOAM as a method to record
>>    "operational and telemetry information in the packet while the packet
>>    traverses a path between two points in the network". It is an example of a
>>    hybrid measurement method, per RFC 7799 classification. The introduction of
>>    the Loopback and Active flags clearly makes IOAM into an active method and
>>    thus it should be discussed as a protocol that originates test packets,
>>    including potential security threats it may be used to create.
>>    - If the purpose of the Loopback flag is producing a series of
>>    responses from traversed nodes, then it might be much easier to use not a
>>    copy of the original packet but a different packet to respond to the sender
>>    (similar to CFM's LTM/LTR). That will avoid any risk of reverse-path
>>    amplification.
>>    - It is not clear what kind of path is expected in "a return path
>>    from transit nodes and destination nodes towards the source (encapsulating
>>    node) exists". For example, if IOAM is used in an SFC environment, would
>>    Loopback require the use of an SFP to the sender?
>>    - It is not clear what establishes "the identity of the [IOAM]
>>    encapsulating node". Some examples would be helpful.
>>    - There seems to be a disconnect in the handling of an unexpected
>>    looped-back IOAM packet:
>>
>> In the second paragraph of Section 4 it is stated:
>> "If an encapsulating node receives a looped back packet that was not
>> originated from the current encapsulating node, the packet is dropped."
>>
>> But in the second-to-last paragraph of the same section it is noted
>> that:
>> "If there is no match in the Node ID, the packet is processed like a
>> conventional IOAM-encapsulated packet."
>>
>>
>>    - The following could probably be expressed using the definitions from
>>    the IANA registry:
>>
>>    An IOAM trace option that has the loopback bit set MUST have the
>>    value '1' in the most significant bit of IOAM-Trace-Type, and '0' in
>>    the rest of the bits of IOAM-Trace-Type.
>>
>> like this:
>>    An IOAM Trace Option that has the Loopback bit set MUST have the
>>    "hop_Lim and node_id in short format" bit set in IOAM-Trace-Type, and
>>    MUST have all other bits of IOAM-Trace-Type cleared.
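
A rough Python sketch of the proposed wording, assuming IOAM-Trace-Type is
the 24-bit field defined in draft-ietf-ippm-ioam-data and that the "hop_Lim
and node_id in short format" bit is its most significant bit. The constant
and function names are made up for illustration and appear in neither draft.

```python
# Assumption: IOAM-Trace-Type is 24 bits wide and its most significant bit
# means "hop_Lim and node_id in short format" (per draft-ietf-ippm-ioam-data).
TRACE_TYPE_WIDTH = 24
HOP_LIM_NODE_ID_SHORT = 1 << (TRACE_TYPE_WIDTH - 1)  # 0x800000


def loopback_trace_type_is_valid(trace_type: int) -> bool:
    """Return True if trace_type is allowed together with the Loopback bit.

    Hypothetical helper: exactly the short-format bit set, all others cleared.
    """
    if not 0 <= trace_type < (1 << TRACE_TYPE_WIDTH):
        raise ValueError("IOAM-Trace-Type must fit in 24 bits")
    return trace_type == HOP_LIM_NODE_ID_SHORT
```

Under these assumptions 0x800000 passes the check, while 0xC00000 (a second
bit set) does not.
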
>>
>>
>>    - What is the value of collecting "hop_Lim and node_id in short
>>    format" on the return path of a packet looped back by an intermediate node?
>>    As I understand it, only that information can be collected if the Loopback
>>    flag is set. Please correct me if that is not the case.
>>    - Can you point to where it is specified that when using the Loopback
>>    "the packet does not have any payload"? Earlier in Section 4, the action of
>>    an intermediate node is explained:
>>
>>    A loopback bit that is set indicates to the transit nodes processing
>>    this option that they are to create a copy of the received packet and
>>    send the copy back to the source of the packet.
>>
>> One can understand that the looped back packet is a carbon copy of the
>> original packet, except that the Loopback flag is cleared. And the actions of
>> the encapsulating node are described as:
>>    The encapsulating node either generates synthetic packets with an
>>    IOAM trace option that has the loopback flag set, or sets the loopback
>>    flag in a subset of the in-transit data packets.
>>
>> Which suggests that, at least in the latter case, the packet with the
>> Loopback flag set does have a payload.
>>
>>
>>    - One of the essential aspects of defining an active measurement
>>    protocol is the specification of the expected rates at which test packets
>>    are generated for the intended purpose of the protocol. Some protocols provide
>>    a mechanism to negotiate that rate either through its control-plane
>>    component or as part of bringing the test session to Up state. I couldn't
>>    find any discussion of the expected purpose of using the Active flag, nor
>>    sufficient information that would help to set reasonable limits on
>>    processing packets with the Active flag set. Also, even though the document
>>    is on the Standards Track, I find normative language in relation to the
>>    security impact of the Loopback and Active flags to be underused.
>>    - Applying rate limits throughout the IOAM domain might be
>>    cumbersome. I think that stronger security methods must be defined to make
>>    use of the Loopback and Active options more secure. For example, using
>>    authentication to ensure the identity of the encapsulating IOAM node that
>>    imposed the Loopback or Active option.
>>
>> *draft-ietf-ippm-ioam-direct-export*
>>
>>    - Section 3.1 includes a very important assumption:
>>
>>   The option [DEX] can be read but not modified by transit nodes.
>>
>> That raises the question: what protects the DEX option in particular and
>> IOAM-Option-Types in general?
>>
>>
>>    - The description of the processing of the DEX option by
>>    encapsulating, transit, and decapsulating nodes in an IOAM domain in all
>>    cases states "MAY export the requested IOAM data". What are other possible
>>    behaviors of nodes processing the DEX option?
>>    - In the last paragraph of Section 3.1 it is stated:
>>
>>   A transit IOAM node that does not support the DEX option SHOULD ignore it.
>>
>> What are other allowed behaviors of a node, whether transit or
>> decapsulating? Report an error? Discard the packet with the DEX option?
>>
>>
>>    - Though consideration of the performance impact at the collector(s) is
>>    sensible, it seems that the impact on the node exporting telemetry
>>    information triggered by the DEX should also be considered.
>>    - Applying rate limits throughout the IOAM domain might be
>>    cumbersome. I think that there could be other methods to improve the
>>    security of the DEX option.
>>    - I can understand that the identity of a collector(s) of exported
>>    information can be provisioned via the management or control plane. But I
>>    also believe that there must be a discussion of why the Loopback option is
>>    not a special case of the DEX option.
>>
>> Regards,
>> Greg
>>
>> On Sun, Nov 15, 2020 at 9:45 PM Martin Duke <martin.h.duke@gmail.com>
>> wrote:
>>
>>> I recognize that the loopback and direct export drafts discuss potential
>>> amplification results, as a path of length N will generate N packets for
>>> each IOAM packet -- a linear relationship. As these are discussed in the
>>> draft with proposed mitigation, in an editorial sense the job is done.
>>> Whether giving people this kind of foot gun is a good idea is a separate
>>> question, but reasonable people can disagree.
>>>
>>> However, if I understand correctly, certain pathological combinations of
>>> IOAM namespaces could result in amplification far worse than linear.
>>>
>>> 1) Loopback loops
>>> Imagine there is an IOAM namespace that covers the path from Hop A to
>>> Hop D. There is a separate namespace entirely contained in A-D, from hop B
>>> to hop C. Both namespaces enable loopback.
>>>
>>> A--------------------------------------------------------D
>>>
>>>        B-------------------------------C
>>>
>>> So a user packet is traveling from A to D. There will be (D-A) loopback
>>> packets headed towards A. The (D - C) loopback packets that are generated
>>> between C and D will travel through the encapsulating node at C and then
>>> trigger further loopbacks to C. Thus the total number of packets generated
>>> by the single user packet is
>>> (D - A) + (D - C)(C - B)
>>> and obviously there could be several of these smaller namespaces, or
>>> nested namespaces, that aggravate the amplification further.
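
A minimal Python sketch of the count above, assuming hops are numbered with
integers so that (D - A) is simply the hop distance, and that every hop in a
loopback-enabled namespace loops back exactly one copy. The function and
parameter names are illustrative and come from neither draft.

```python
def loopback_packets(a: int, b: int, c: int, d: int) -> int:
    """Count loopback packets triggered by one user packet sent from A to D.

    a, b, c, d are hop positions with a <= b <= c <= d; the inner namespace
    spans b..c and the outer namespace spans a..d (assumed numbering).
    """
    if not a <= b <= c <= d:
        raise ValueError("expected a <= b <= c <= d")
    total = 0
    # Outer namespace: every hop after A, up to and including D, loops one
    # copy back toward A.
    for hop in range(a + 1, d + 1):
        total += 1
        # A copy generated beyond C crosses the inner namespace on its way
        # back, and each of the (c - b) hops there loops a further copy.
        if hop > c:
            total += c - b
    return total
```

With a=0, b=2, c=6, d=10 this gives (10 - 0) + (10 - 6)(6 - 2) = 26 packets,
against 10 for the purely linear case without the inner namespace.
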
>>>
>>> 2) Direct Export
>>> Direct Export potentially has even worse properties. Suppose namespace
>>> A, N_A hops across, direct exports to a node that is reachable across
>>> namespace B. Namespace B, N_B hops across, direct exports to a node
>>> reachable across namespace A.
>>>
>>> Thus a single user packet sent over namespace A will generate N_A
>>> packets that traverse Namespace B. This will, in turn, generate N_A * N_B
>>> packets that traverse namespace A, which in turn triggers N_A ^ 2 * N_B
>>> packets, and so on.
>>>
>>> In fact, if the namespaces direct export with probability exactly 1/N_A
>>> and 1/N_B, the same level of traffic will ping-pong infinitely. Any more
>>> than that, and the traffic will steadily increase to infinity. If the
>>> probability is 1, the growth is exponential.
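
The ping-pong growth described above can be sketched in Python as follows.
This is only an expected-value model, assuming each hop exports
independently with a fixed per-namespace probability and that every exported
packet crosses the whole of the other namespace; the function name and
parameters are hypothetical.

```python
from fractions import Fraction


def expected_packets(n_a, n_b, p_a, p_b, rounds):
    """Expected number of packets entering the next namespace, per round.

    Round 0 is the original user packet crossing namespace A; traversals
    then alternate between namespace B and namespace A.
    """
    counts = []
    in_flight = Fraction(1)  # the single user packet
    for i in range(rounds):
        if i % 2 == 0:
            # Packets cross namespace A: each of N_A hops exports w.p. p_a.
            in_flight *= n_a * p_a
        else:
            # Packets cross namespace B: each of N_B hops exports w.p. p_b.
            in_flight *= n_b * p_b
        counts.append(in_flight)
    return counts
```

For N_A = 4 and N_B = 3 with probability 1 the first three rounds are 4, 12
and 48, i.e. N_A, N_A * N_B and N_A^2 * N_B. With probabilities 1/4 and 1/3
every round stays at exactly 1.
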
>>>
>>> ****************
>>>
>>> This analysis, if correct, raises three questions that I can think of:
>>> - Are these corner cases plausible?
>>> - If so, and they occurred, would the namespace administrators
>>> necessarily be aware of each other? If not, I'm concerned that this is
>>> unsafe to deploy.
>>> - Do phenomena like this indicate that the design is brittle and putting
>>> in some "considerations" doesn't really mitigate the dangers here?
>>>
>>> Martin