Re: [Anima-signaling] comments on GRASP-07 draft

Brian E Carpenter <brian.e.carpenter@gmail.com> Wed, 05 October 2016 19:48 UTC

From: Brian E Carpenter <brian.e.carpenter@gmail.com>
To: anima-signaling@ietf.org
References: <27023.1475255753@obiwan.sandelman.ca>
Organization: University of Auckland
Message-ID: <95619e19-6b7b-b8e5-16d4-f215b375bde0@gmail.com>
Date: Thu, 06 Oct 2016 08:48:56 +1300
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101 Thunderbird/45.3.0
MIME-Version: 1.0
In-Reply-To: <27023.1475255753@obiwan.sandelman.ca>
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/anima-signaling/tEXCvOI99qNyvcF-I-H7lb5czW4>
Subject: Re: [Anima-signaling] comments on GRASP-07 draft
Precedence: list

Responses in line:

On 01/10/2016 06:15, Michael Richardson wrote:
> 
> I read draft-ietf-anima-grasp-07 from beginning to end.
> This is probably my first read through in many months.
> I will be happy to turn these into issues in the tracker if told to do so.

At the moment I've been embedding the issues into the draft itself, but
if there's a good reason to use a tracker we can do so of course.

> I am overall pleased with the document, it read well.

Thanks

> I did feel like there was a section missing between 2 and 3, that gives a
> detailed architecture use of GRASP, without resorting to bits on the wire.
> 
> I think that section 3.3 is trying to do this, but it is failing because it gets
> into security before it has explained anything about how things work.
> Maybe if 3.3.1/3.3.2 were moved to a new section 3.3b entitled "Protocol Security"
> or some such.

Yes, this is a good point. I think it would flow better if rearranged somehow.
I'm not certain it needs a new section but the logical flow is wrong. We can
take this on board.

> I had to go learn about CDDL from  draft-greevenbosch-appsawg-cbor-cddl-08
> (now -09), and then how it maps to bytes on the wire.
> 
> Can we please have some worked through examples with bytes on the wire?
> I will attempt to contribute some.

As per the discussion with Carsten, this is easy to generate but could
confuse the reader too. I really think it's Appendix material. Probably
enough to take one sample message and reply such as a discover/response?

> Some specific text that I didn't like:
> 
> 3.3.4,  Discovery Procedures section says:
> 
>          An exponential backoff SHOULD be
>          used for subsequent repetitions, in order to mitigate possible
>          denial of service attacks.
> 
> I agree that an exponential backoff SHOULD be used by senders in order to
> deal with overloads due to unintentional correlations of senders.
> But, telling well-behaved senders what to do does nothing to mitigate denial
> of service attacks.  The point is that the malicious attackers are not
> well-behaved.

Right.

> If the point is to tell responders that they should backoff in their replies,
> or rate limit their replies, then that would make sense, but since the reply
> will be by TCP (as I understand it), then the opportunity to do forged source
> address attacks is rather low.

True. I could argue that rapid repetition of discovery from a single source
might be a signature of a DoS attack, but that doesn't help with DDoS.
Also, we would expect a burst of discoveries after a major event; that's
exactly when autonomic functions will get busy.

Will fix.
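
As an illustration only, the sender-side backoff under discussion can be sketched as below. The base interval, growth factor, cap, and jitter are hypothetical values chosen for the example, not taken from the draft. As noted above, this only spreads out well-behaved senders; it does nothing against an attacker.

```python
import random

def backoff_intervals(base_ms=1000, factor=2, cap_ms=60000,
                      attempts=6, jitter=0.0, rng=random):
    """Return the wait (in ms) before each repeated discovery attempt.

    All defaults are illustrative assumptions, not values from the draft.
    """
    intervals = []
    delay = base_ms
    for _ in range(attempts):
        # Optional jitter de-correlates senders that started together,
        # e.g. after a major event that triggers a burst of discoveries.
        spread = delay * jitter
        intervals.append(min(cap_ms, delay + rng.uniform(-spread, spread)))
        # Grow the nominal delay exponentially, up to the cap.
        delay = min(cap_ms, delay * factor)
    return intervals
```

With jitter left at zero the schedule is simply base_ms, 2*base_ms, 4*base_ms and so on until the cap is reached.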

> Later in that section, it says:
>          A GRASP device with multiple link-layer interfaces (typically a
>          router) MUST support discovery on all interfaces.  If it
>          receives a Discovery message on a given interface for a
>          specific objective that it does not support and for which it
>          has not previously cached a Discovery Responder, it MUST relay
>          the query by re-issuing a Discovery message as a link-local
>          multicast on its other interfaces.  The relayed discovery
>          message MUST have the same Session ID as the incoming discovery
>          message and MUST be tagged with the IP address of its original
>          initiator.  Since the relay device is unaware of the timeout
>          set by the original initiator it SHOULD set a timeout at least
>          equal to GRASP_DEF_TIMEOUT milliseconds.
> 
> 
> Could we rewrite this into positive language, and also split it up, and maybe
> even number some of these things.   I suggest:
> 
> 3.3.4.1     GRASP discovery relaying by routers
> 
>          A GRASP device with multiple link-layer interfaces (typically a
>          router) MUST support discovery on all interfaces.
> 
>          Different interfaces may be at different security levels: each group
>          of interfaces with the same security level SHOULD be serviced by the
>          same GRASP process, except for Limited Security Instances which are
>          always single-interface instances.
> 
>          When a router receives a Discovery message on any interface,
>          for an objective that it supports, then acting like any other
>          GRASP device, it replies to it.
> 
>          When a router receives a Discovery message for an objective that
>          it does not support, but for which it has previously cached
>          a response, then it replies to that request with the cached
>          information.
> 
>          When a router receives a Discovery message for an objective that
>          it does not support, and for which it has no cached response, then
>          it MUST relay the query by re-issuing a Discovery message as a link-local
>          multicast on ALL of its other interfaces which are at the same
>          security level as the incoming interface.
> 
>          The relayed discovery message formed MUST have the same Session ID
>          as the incoming discovery message and MUST be tagged (XXX HOW?)
>          with the IP address of its original initiator.  Since the relay
>          device is unaware of the timeout set by the original initiator it
>          SHOULD set a timeout at least equal to GRASP_DEF_TIMEOUT milliseconds.

Point taken. The (XXX HOW?) should be (see Section 3.7.3, Discovery Message).
The idea was to describe the procedure here but not bother with message
details until later.
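
To make the three cases in the suggested text concrete, here is a minimal sketch of the decision a multi-interface node would make. The function name, the data structures, and the way security levels are represented are all hypothetical; the draft does not define them.

```python
def handle_discovery(objective, in_iface, supported, cache, ifaces):
    """Decide what to do with a Discovery message (illustrative only).

    supported: set of objective names this node handles itself
    cache:     dict mapping objective name -> cached responder locator
    ifaces:    dict mapping interface name -> security level (assumed model)
    """
    if objective in supported:
        # Case 1: act like any other GRASP device and reply.
        return ("reply", None)
    if objective in cache:
        # Case 2: not supported here, but a responder is cached.
        return ("reply_cached", cache[objective])
    # Case 3: relay on all other interfaces at the same security level.
    level = ifaces[in_iface]
    relay_on = sorted(name for name, lvl in ifaces.items()
                      if name != in_iface and lvl == level)
    return ("relay", relay_on)
```

The relayed message itself (same Session ID, tagged with the original initiator's address) is not modelled here.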

> section 3.7.3 says:
>    The discovery initiator sends the Discovery messages via UDP to port
>    GRASP_LISTEN_PORT at the link-local ALL_GRASP_NEIGHBOR multicast
>    address.  It then listens for unicast TCP responses on the same port,
>    and stores the discovery results (including responding discovery
>    objectives and corresponding unicast locators).
> 
> 
> I have a problem with the mixing of UDP and TCP port numbers in this way.

Let me stop you right here. Good catch. That's obviously wrong. Proof: here
is what my code says when I start it up with a couple of IPv6 interfaces:

_MainThread 4068 Starting a discovery TCP listener for interface ('::', 53525, 0, 0) 11
_drlisten 5332 Discovery response listener for interface 11 is up
_MainThread 4068 Starting a discovery TCP listener for interface ('::', 53526, 0, 0) 12
_drlisten 5036 Discovery response listener for interface 12 is up

Two different port numbers, neither of which is GRASP_LISTEN_PORT.

[Note to self: check why Windows isn't doing port randomization.]

So when I look at my discovery responder code, of course it logs
the port from which the discovery multicast came, which would be
53525 or 53526 in the above instance. And that's the port the response
goes back to. So what I do is find out which port is used to source
the outgoing multicast, and bind the TCP socket to that port.
So far, that code has never crashed.

a) is that safe? There is theoretically a race condition where
somebody else could bind to that port just before I do.

b) the draft is completely wrong on this.

> First of all, this text is ambiguous because the binding of "same" is
> unclear to me.  Let me number things so that the ambiguity is clear:
>           sender:       UDP from: X  -> to: GRASP_LISTEN_PORT
>           responder:    TCP from: Z  -> to: Y
> 
>   "then listens for unicast TCP responses on the same port"
> 
> could mean that         Y = GRASP_LISTEN_PORT,
> or it could mean that   Y = X
> 
> Could we just *say* in the UDP which port the initiator is going to listen on?

If the race condition is real, we should. But if we can avoid the race
condition, we don't need to. Read on...

> Assuming Y=GRASP_LISTEN_PORT, may I suggest that if we say port '0' that it
> means grasp_listen_port, which is often GRASP_LISTEN_PORT, except if one
> might be debugging, etc., when one might set it to another port.   That is, we
> normally say "0" to mean GRASP_LISTEN_PORT, whatever it might be at that
> moment.  When we see a port != "0", then we mean that port.
> 
> Y=X can be made to work if one picks X well, but may be unworkable on some
> platforms.

The sequence that already works on Windoze and Linux is roughly
(please excuse Pythonism):

import socket

# Send a dummy packet to prime the multicast socket
mc_socket.sendto(_msg_bytes, 0, (str(ALL_GRASP_NEIGHBOR_6), GRASP_LISTEN_PORT))

# Find out the source port the OS chose for it
port = mc_socket.getsockname()[1]

# Make a TCP socket and bind it to that same port
tcp_socket = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
tcp_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
tcp_socket.bind(('', port))

If the bind succeeds, we've got the same port number. If it fails,
we've got a problem.

There are two choices to fix this:
1) Implementation: repeat the above process until the bind() succeeds.
Almost always, it succeeds first time.
2) Protocol: add the port number to the Discovery message. That's a bit
annoying but could be done.
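
For illustration, choice 1 might look like the sketch below. It is not the code from my prototype: the function name, the retry limit, and the parameters are hypothetical, and it obtains the UDP port by binding to port 0 (which makes the OS assign one) instead of sending a dummy packet. SO_REUSEADDR is left out to keep the sketch short.

```python
import socket

def open_matching_sockets(family=socket.AF_INET6, host="", max_tries=5):
    """Return (udp_sock, tcp_sock, port) with both sockets on the same port.

    Retries with a fresh UDP socket if the TCP bind loses the race.
    """
    for _ in range(max_tries):
        udp_sock = socket.socket(family, socket.SOCK_DGRAM)
        # Binding to port 0 lets the OS pick the source port up front.
        udp_sock.bind((host, 0))
        port = udp_sock.getsockname()[1]
        tcp_sock = socket.socket(family, socket.SOCK_STREAM)
        try:
            tcp_sock.bind((host, port))
        except OSError:
            # Someone else holds that TCP port: drop both and try again.
            tcp_sock.close()
            udp_sock.close()
            continue
        tcp_sock.listen()
        return udp_sock, tcp_sock, port
    raise OSError("no port free for both UDP and TCP after %d tries" % max_tries)
```

The UDP socket returned would then be used to send the Discovery multicast, so that responders see the port on which the TCP socket is listening.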

Anyway - there's a bug in the text.

> (I'm saying this mostly from experience with IKE, which originally said to
> ignore the source port of the UDP, and always use 500, which was a problem
> when it came to NAT traversal, but more to the point, it made it sometimes
> painful to test in-vitro).
> 
> I'm understanding that ONLY Discovery requests go out via UDP, but that all
> Discovery responses occur via TCP.  I'm a bit concerned about this process
> during bootstrap.  The Discovery initiator will have to be prepared for many
> TCP connections...

Why? If the pledge wants to discover a proxy, it's at most going to get one
response per router on the LAN, so not many connections. And actually, it
doesn't matter if it loses some responses, as long as it gets at least
one working proxy. If we use the alternative model where the proxy floods out
its coordinates, the problem doesn't even arise.

> I guess we had assumed that the reply would be UDP as
> well.

Yes, it could be but that seems to lead to significant coding complications
(ones that are well known by DNS server implementers). TCP accept() really
does all that for you.
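
A rough sketch of what accept() buys us, for illustration only: the function name, timeout, and response limit are hypothetical, and it assumes each responder sends one message and then closes the connection, which is a simplification for this example and not something the draft says.

```python
import socket

def collect_responses(listener, timeout_s=1.0, max_responses=8):
    """Accept TCP connections on `listener` until it times out.

    Returns a list of (peer_address, payload_bytes) tuples.
    """
    responses = []
    listener.settimeout(timeout_s)
    while len(responses) < max_responses:
        try:
            conn, peer = listener.accept()
        except socket.timeout:
            break  # no more responders within the timeout
        with conn:
            conn.settimeout(timeout_s)
            chunks = []
            try:
                while True:
                    data = conn.recv(4096)
                    if not data:
                        break  # peer closed: message complete (assumed)
                    chunks.append(data)
            except socket.timeout:
                pass  # keep whatever arrived before the timeout
        responses.append((peer[0], b"".join(chunks)))
    return responses
```

Each responder gets its own connection, so there is no need to demultiplex replies by hand as one would with UDP, and losing a response just means one fewer entry in the list.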

Regards,

    Brian
