Re: [OPSAWG] Major NAT MIB Issue: Notifications and processor requirements at the agent

Simon Perreault <sperreault@jive.com> Mon, 27 October 2014 13:35 UTC

In-Reply-To: <544D772F.6060009@gmail.com>
References: <544D772F.6060009@gmail.com>
Date: Mon, 27 Oct 2014 09:35:33 -0400
Message-ID: <CANO7kWCNnXN-+nzaygANpiXFcHxCepUiG3V9NLVaPDRoPjkHzw@mail.gmail.com>
From: Simon Perreault <sperreault@jive.com>
To: Tom Taylor <tom.taylor.stds@gmail.com>
Archived-At: http://mailarchive.ietf.org/arch/msg/opsawg/FeTxy4xtKAOTKYJdMjUun_S7p78
Cc: "opsawg@ietf.org" <opsawg@ietf.org>, "behave@ietf.org" <behave@ietf.org>
Subject: Re: [OPSAWG] Major NAT MIB Issue: Notifications and processor requirements at the agent

On Sun, Oct 26, 2014 at 6:35 PM, Tom Taylor <tom.taylor.stds@gmail.com>
wrote:

> A little terminology to start with:
>
> 1) A realm defines an addressing space. Simple NATs may support just two
> realms: internal and external. This is all that RFC 4008 supports. In more
> complex situations the NAT will support several or many realms. Individual
> realms in that case can be internal from the point of view of some hosts
> served by the NAT and external from the point of view of others.
>
> 2) An address mapping is an assignment of an address in a selected
> external realm to the address of a host in its internal realm.
>
> 3) An address and port mapping is a mapping between the address and port
> of a host in its internal realm for a given protocol to an address and port
> in a selected external realm. For mappings triggered by packets outgoing
> from the internal host, the external realm is selected based on the packet
> destination.
>

While this is often true in practice, in theory the NAT could pick the
external realm based on any arbitrary rule it is configured with. For
example, software NATs (e.g., Linux's Netfilter) allow a lot of flexibility
here.
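
To illustrate, here is a minimal sketch of rule-driven realm selection,
assuming a first-match-wins rule table. The realm names, packet fields, and
rules are all hypothetical; they are not taken from Netfilter or from the
NAT MIB.

```python
import ipaddress

# Hypothetical rule table: (predicate over packet fields, external realm).
# First match wins; nothing here reflects a real NAT's configuration syntax.
RULES = [
    (lambda p: p["proto"] == "udp" and p["dport"] == 53, "realm-dns"),
    (lambda p: ipaddress.ip_address(p["src"])
               in ipaddress.ip_network("10.1.0.0/16"), "realm-b"),
]


def select_external_realm(packet, default="realm-a"):
    """Return the external realm for a packet leaving the internal realm."""
    for matches, realm in RULES:
        if matches(packet):
            return realm
    # No rule matched: fall back to a default realm (an assumption; a real
    # NAT might instead select by destination, as described above).
    return default


packet = {"src": "10.1.2.3", "proto": "tcp", "dport": 443}
print(select_external_realm(packet))
```

The point is only that the selection input need not be the destination: the
second rule above keys on the source address.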


> For mappings triggered by packets incoming from an external realm, the
> external realm is the one from which the packet originated.
>
> 4) An address pool is a separately-managed set of addresses and ports/ICMP
> identifiers in a particular realm, available for assignment to the
> 'external' portion of a mapping. Pools can be used to ration the available
> external addresses to different realms and classes of subscribers. Where
> more than one pool has been assigned to the realm, policy determines which
> subscribers and/or services are mapped to which pool.
>
>
> The NAT MIB (draft-ietf-behave-nat-mib-11) currently has five
> notifications, which can be enabled and disabled through their respective
> threshold settings:
>
> - Address pool low threshold undershot
>   -- reports under-utilization of an address pool
>
> - Address pool high threshold exceeded
>   -- reports potential over-utilization of an address pool
>
> - Address mapping threshold exceeded
>   -- reports potential exhaustion of addresses
>      for the NAT instance as a whole
>
> - Address and port mapping threshold exceeded
>   -- reports potential exhaustion of address plus port
>      combinations for the NAT instance as a whole
>
> - Per-subscriber address and port mapping threshold exceeded
>   -- can be used for troubleshooting (enabled for specific
>      subscriber) or provisioning (enabled for all subscribers)
>
> Just as a preliminary, my personal view is that the notification relating
> to low pool utilization should not be there, and if the information is
> needed, it should be handled by the management application. Hence this one
> should be omitted from the discussion which follows.
>

Yes, I agree that that can work in practice. However, I don't see the
difference in reasoning between this notification and the others. Couldn't
the others also be handled by the management application?

> David Harrington in his review was concerned that these notifications
> involved calculations at the agent that were excessively demanding of
> processor capacity. This is especially true of the MIB's current algorithm
> for address pool utilization. More on that in a moment.
>
> Just for background, the MIB supports the following read-write hard
> limits, beyond which incoming packets are dropped rather than translated:
>
> - Maximum number of active address mappings
> - Maximum number of active address and port mappings
> - Maximum number of fragments awaiting reassembly
> - Maximum number of subscribers with active address and port mappings
>
> The resources that need monitoring in the NAT are NAT memory, address pool
> utilization, and processor capacity. The only item in the above list of
> notifications and limits that at least partially addresses processor
> capacity is the last limit, on number of active subscribers. Possibly, a
> notification and limit could be introduced based on the rate of change of
> the number of active mappings.
>

Agreed. And such rate-limiting is recommended for CGNs by REQ-4-c and
REQ-5-b of RFC 6888:

      c.  Additionally, it is RECOMMENDED that the CGN include
          administrator-adjustable thresholds to prevent a single
          subscriber from consuming excessive CPU resources from the CGN
          (e.g., rate-limit the subscriber's creation of new mappings).


[...]


      b.  Additionally, it SHOULD be possible to limit the rate at which
          memory-consuming state elements are allocated.
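
As a rough sketch of the kind of limit these requirements describe, here is
a per-subscriber token bucket on mapping creation. The token-bucket scheme,
the class name, and the parameters are assumptions for illustration; RFC
6888 does not prescribe a mechanism, and the MIB does not define one.

```python
class MappingRateLimiter:
    """Token bucket limiting how fast one subscriber may create mappings.

    Hypothetical helper: `rate` is new mappings allowed per second and
    `burst` is the largest number that may be created back to back.
    """

    def __init__(self, rate, burst, now=0.0):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)  # start with a full bucket
        self.last = now

    def allow(self, now):
        # Refill in proportion to elapsed time, capped at the burst size.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True   # mapping may be created
        return False      # over the limit: refuse, and possibly count it


limiter = MappingRateLimiter(rate=2.0, burst=3)
print([limiter.allow(0.0) for _ in range(4)])
```

The per-event cost is a handful of arithmetic operations, and the refusals
could be counted as input to a notification.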


> The problem with NAT pool monitoring is that the pool actually contains two
> resources, addresses and ports, and how they get used up depends on RFC
> 4787 pooling behaviour.
>

Well, yes, but I don't see how this is a problem.

Keep in mind that some NATs don't bother with ports and allocate full
addresses. That is supported with the current pool and mapping models.

> For the RECOMMENDED pooling behaviour of 'paired', port exhaustion occurs
> when all the ports for the desired protocol available to the subscriber on
> a mapped external address are used up, the NAT cannot (for reasons of
> policy or availability) assign additional ports to that subscriber on that
> same address, and the NAT cannot assign an additional address containing
> free ports to that subscriber. (This last condition is subject to
> discussion: is it port exhaustion or address exhaustion?)
>

It depends on what you mean by "cannot". A "paired" NAT may decide to
allocate a mapping from any other available address and not even consider
this an exhaustion condition. It may decide to replace the current address
mapping with this new address. Or it may consider this a "temporary address
mapping" to be reverted eventually. Or it may not record this new mapping
at all, and just allocate mappings from any random address as long as the
primary address is exhausted. Or it may refuse to create a new mapping, as
you seem to imply.

We should not "think" too much about NAT implementation... What matters is
observable behaviour, and in particular observable behaviour that is
general enough to be usefully modelled in a MIB. Exactly what triggers an
exhaustion event is not that important. We just need to be able to report
it. It will be up to the NAT administrator to determine, based on the
particular NAT implementation, what exactly exhaustion means. Some NATs
are very configurable in this matter.

> For a pooling behaviour of 'arbitrary', port exhaustion occurs when there
> is no available port for the desired protocol at any of the addresses in
> the pool. Clearly port exhaustion will be a rarer event in the 'arbitrary'
> case.
>

Again, it depends on the implementation.

> draft-ietf-behave-nat-mib-11 triggers notifications based on the ratio of
> active address and port mappings using a given pool to the total number of
> address-port combinations made available by the pool, expressed as a
> percentage. As David Harrington pointed out, this makes a lot of demands on
> the agent processor. We have some alternatives:
>
> 1) No notification: leave any processing to the management application.
>

The problem with this is that the management application would effectively
have to bulk-fetch the full mapping table to be able to compute pool usage.
That could be much more taxing on the agent than having the agent compute
pool usage itself.
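
For comparison, here is a minimal sketch of agent-side pool usage kept as a
running counter, so that each mapping event costs constant work. The class
and its names are hypothetical, and two things are assumed: that "exceeded"
means strictly above the threshold percentage, and that capacity is simply
addresses times ports for one protocol.

```python
class PoolUsage:
    """Running address-port usage for one pool (hypothetical sketch)."""

    def __init__(self, addresses, ports_per_address, high_pct):
        self.capacity = addresses * ports_per_address
        self.in_use = 0
        self.high_pct = high_pct
        self.above_high = False

    def _crossed_high(self):
        # in_use / capacity * 100 > high_pct, rearranged to avoid division.
        above = self.in_use * 100 > self.high_pct * self.capacity
        crossed_up = above and not self.above_high
        self.above_high = above
        return crossed_up

    def mapping_created(self):
        self.in_use += 1
        # True only on the upward crossing, so one notification is sent
        # per excursion rather than one per mapping.
        return self._crossed_high()

    def mapping_removed(self):
        self.in_use -= 1
        self._crossed_high()


pool = PoolUsage(addresses=2, ports_per_address=5, high_pct=80)
print([pool.mapping_created() for _ in range(10)])
```

With a counter like this the manager never needs the mapping table to learn
pool usage; it could also poll the counter directly.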

> 2) Trigger the high usage notification based on the rate of occurrence of
> out-of-port events, with the same threshold value for all pools. This is
> reasonably valid statistically for a low threshold (i.e., taking the view
> that any port exhaustion at all is actionable), but not terribly reliable
> in the 'paired' case, in that a single subscriber can dominate the results.
>
> 3) Trigger the high usage notification based on the rate of occurrence of
> out-of-port events, with a higher threshold value based on pool size
> (addresses x ports) and sharing ratio. This requires some mathematical
> analysis and per-pool configuration.
>

I think these two options are non-starters because they rely on heuristics
rather than hard metrics. Plus, maintaining a "rate of occurrence" is not
much less involved than computing pool usage in the first place.
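
To make that comparison concrete, here is a sketch of what maintaining a
rate of out-of-port events could involve, assuming a sliding time window.
The window approach and all names are assumptions; neither the draft nor
options 2) and 3) specify how the rate would be measured.

```python
from collections import deque


class EventRate:
    """Out-of-port events per second over a sliding window (hypothetical)."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.events = deque()  # timestamps of events still in the window

    def _expire(self, now):
        # Drop events that have aged out of the window.
        while self.events and self.events[0] <= now - self.window:
            self.events.popleft()

    def record(self, now):
        self.events.append(now)
        self._expire(now)

    def rate(self, now):
        self._expire(now)
        return len(self.events) / self.window


r = EventRate(window_seconds=10.0)
for t in (0.0, 1.0, 2.0):
    r.record(t)
print(r.rate(2.0))
```

This needs per-pool state, timestamps, and expiry on top of the event
counting, whereas a usage counter needs one increment or decrement per
mapping.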

Suggestion: We need to see what current NATs report, especially CGNs. For
many vendors the documentation is public...

Simon