Re: sockets APIs extensions for Host Identity Protocol

Keith Moore <moore@cs.utk.edu> Fri, 11 May 2007 20:53 UTC

Message-ID: <4644D7D7.50109@cs.utk.edu>
Date: Fri, 11 May 2007 16:53:43 -0400
From: Keith Moore <moore@cs.utk.edu>
To: der Mouse <mouse@Rodents.Montreal.QC.CA>
Subject: Re: sockets APIs extensions for Host Identity Protocol
References: <Pine.SOL.4.64.0705041801060.14418@kekkonen.cs.hut.fi> <20070507082737.GB21759@nic.fr> <46413DD7.8020702@cs.utk.edu> <20070509121703.GA21070@nic.fr> <4641CA52.70504@cs.utk.edu> <Pine.LNX.4.64.0705091449360.26169@hermes-1.csi.cam.ac.uk> <4641D94C.9070304@cs.utk.edu> <Pine.SOL.4.64.0705102013550.10049@kekkonen.cs.hut.fi> <46436B10.5090706@cs.utk.edu> <Pine.SOL.4.64.0705102159020.10049@kekkonen.cs.hut.fi> <4643F873.3000501@cs.utk.edu> <Pine.SOL.4.64.0705110851440.24038@kekkonen.cs.hut.fi> <46442588.7020405@cs.utk.edu> <200705111314.JAA17866@Sparkle.Rodents.Montreal.QC.CA> <4644830D.7050302@cs.utk.edu> <200705111642.MAA19058@Sparkle.Rodents.Montreal.QC.CA>
In-Reply-To: <200705111642.MAA19058@Sparkle.Rodents.Montreal.QC.CA>
Cc: discuss@apps.ietf.org

der Mouse wrote:
>>> What do you see as wrong with [getaddrinfo()]?
>>>       
>> the nice thing about it is that it does two things for you.  one is
>> that it completely initializes sockaddr structures for you.   the
>> other is that (at least in recent versions?) it can take either a DNS
>> name or an address literal and do the right thing with it.
>>     
>
> The major thing *I* use it for is to insulate the application from
> knowledge of what address families are available.  
that's just fine for some corner cases, and if it works for your app,
great - but it's not the general case.  I like that getaddrinfo() can do
this, but I've found a surprising number of cases where I need to treat
IPv4 and IPv6 addresses differently.   there are several reasons for this:

- one is that the default address selection rules are just broken - or
at least, hopelessly naive.   for example it is nearly always better to
use a native IPv4 (non-private) address in preference to a 6to4 or
Teredo IPv6 address that would require use of a relay router - but
whether it's actually better or not depends on the application and how
it uses addresses (e.g. whether it does referrals). 

- another reason is that an increasing number of ISPs are supplying DNS
servers that lie about the contents of their zone and return bogus A
records for any zone that doesn't have A records - even if the zone does
have other valid records like AAAA records.  those bogus A records point
to http servers that will "helpfully" respond to a request for a web
page with that (supposedly misspelled) domain and return pointers to
other web sites that might be the right one - along with advertisements
that make money for the ISP.  I've experimented with various kinds of
hacks in order to keep my apps from getting screwed by such ISPs, but
all of them require my app to be aware of address families.

- NATs, private addresses, link-local addresses, LLMNR all complicate
the picture for an app that's trying to work in a variety of
environments.  again, the app needs knowledge of address types in order
to try to sort things out. 
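
A minimal sketch of the kind of family-aware ordering the first bullet calls for, written in Python for brevity. The names `classify`, `order_candidates` and `PREFERENCE` are hypothetical, and the preference order itself is an assumption: as noted above, whether native IPv4 really beats a relayed IPv6 address depends on the application.

```python
import ipaddress
import socket

# Most-preferred class first. This order is an assumption for illustration;
# the right order depends on the application and how it uses addresses.
PREFERENCE = ["native-v6", "public-v4", "tunneled-v6", "private", "link-local"]


def classify(address: str) -> str:
    """Put one textual IP address into a coarse reachability class."""
    ip = ipaddress.ip_address(address.split("%")[0])  # drop any IPv6 zone id
    if ip.version == 6 and (ip.sixtofour is not None or ip.teredo is not None):
        return "tunneled-v6"  # 2002::/16 (6to4) or 2001::/32 (Teredo)
    if ip.is_link_local:
        return "link-local"
    if ip.is_private:
        return "private"
    return "native-v6" if ip.version == 6 else "public-v4"


def order_candidates(addrinfos):
    """Stable-sort getaddrinfo()-style tuples by the PREFERENCE table."""
    return sorted(addrinfos, key=lambda ai: PREFERENCE.index(classify(ai[4][0])))


# Hand-built tuples in the shape getaddrinfo() returns, so no lookup is needed.
candidates = [
    (socket.AF_INET6, socket.SOCK_STREAM, 6, "", ("2002:c000:204::1", 25, 0, 0)),
    (socket.AF_INET, socket.SOCK_STREAM, 6, "", ("8.8.8.8", 25)),
]
ordered = order_candidates(candidates)
```

The point of the sketch is only that the ordering step has to inspect the address family and prefix, which is exactly the knowledge a fully AF-agnostic application does not have.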
> If correctly written
> (getaddrinfo() can be used wrong, as most things can), an application
> can be totally new-AF-ready.  If I write an SMTP client with
> getaddrinfo() and then, a year later, someone defines SMTP-over-DECnet
> (or SMTP-over-IPv7, or whatever), my code will start trying DECnet
> without any changes whatsoever to the application.  (If, of course,
> it's running on a DECnet-aware - or IPv7-aware - system.)
>   
and the chances that this will actually work correctly are nil, because
every new kind of address family will be subtly different than the old
ones.  for instance, SMTP over DECnet (which does exist, or at least did
exist) imposed a maximum line length of 512 bytes.  to take a more
recent example, SCTP isn't quite a drop-in substitute for TCP even if
both ends support it, because it lacks TCP's half-close and urgent data,
and there are apps that rely on each of these.

again, it's nice that getaddrinfo() gives you a way to pretend that IPv4
and IPv6 are interchangeable for the benefit of the apps that can use
that feature, but it's not as if all apps can or should work that way.
>> - it's actually more difficult to set the parameters for
>>    getaddrinfo() to get it to do what you want, than it is to
>>    initialize the fields in a sockaddr_in*
>>     
>
> Yes, if all you want is AF_INET - or AF_INET6 - it is.  But this is
> *not* true if "what you want" is for the app code to be AF-agnostic, as
> I sketched above.  I'm not even sure it's true if you want AF_INET
> *and* AF_INET6; I find it about evenly balanced whether getaddrinfo()
> or doing both INET* AFs is messier, more difficult, whatever.
>   
there are more quirks than that.  for one, you have to know in advance
whether you have a numeric address and port, so that you can set the
AI_NUMERICHOST and AI_NUMERICSERV flags accordingly.
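
As a rough illustration of that quirk, here is a Python sketch using the standard `socket` module, whose `getaddrinfo()` wraps the C function. The helper names `resolve_literal` and `looks_numeric` are hypothetical. The caller has to decide up front that the input is numeric and pass the matching flags; otherwise the same call may quietly go off and do a name or service lookup.

```python
import socket


def resolve_literal(address: str, port: int):
    """Build socket addresses from a numeric address and port, with no lookups.

    AI_NUMERICHOST forbids name resolution for the host; AI_NUMERICSERV
    forbids a services-database lookup for the port.
    """
    flags = socket.AI_NUMERICHOST | socket.AI_NUMERICSERV
    return socket.getaddrinfo(address, str(port), family=socket.AF_UNSPEC,
                              type=socket.SOCK_STREAM, flags=flags)


def looks_numeric(host: str) -> bool:
    """Report whether `host` parses as an address literal."""
    try:
        resolve_literal(host, 0)
    except socket.gaierror:
        return False
    return True
```

With these flags a host name is rejected with an error instead of being resolved, so `looks_numeric()` can serve as the "do I have a literal?" test before choosing which flags to use for the real call.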
>> - getaddrinfo encourages use of string constants to specify ports,
>>    rather than numeric constants, which is wrong -
>>     
>
> No, this is very right. 
sorry, I've spent way too much time debugging brain-damaged code that
used getservbyname() and which failed when the NIS server went down or
the /etc/services file got corrupted EVEN THOUGH THE PORT TO BE USED WAS
A DEFINED CONSTANT IN THE PROTOCOL SPECIFICATION.   anytime you invoke a
function call that should always, always, always return a constant and
that function can possibly do anything else but return that constant,
you are basically asking for the sack.  and you deserve it.
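
To make the contrast concrete, a small Python sketch of the two patterns. The function names are hypothetical, and SMTP's port 25 is used only because SMTP is the running example in this thread.

```python
import socket

SMTP_PORT = 25  # assigned to SMTP; never needs a lookup


def port_via_services(name: str, default: int) -> int:
    """Fragile pattern: ask the services database, which can be missing or wrong."""
    try:
        return socket.getservbyname(name, "tcp")
    except OSError:
        # Without this fallback, a broken services database breaks the app.
        return default


def port_from_spec() -> int:
    """Robust pattern: the specification fixes the port, so use the constant."""
    return SMTP_PORT
```

The first function can fail, or return something unexpected, for reasons that have nothing to do with the protocol; the second cannot.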
>  The notion that "ports" are small integers is
> a *horribly* IP-centric notion.
and the vast majority of our protocols are, by definition, IP centric. 
the socket() interface can be and has been used for non-IP networking, but
trying to make a name lookup function support the lookup semantics of
every possible future networking technology is just naive.  that kind of
generality almost never ends up being used.   what we need to make IP
applications be robust is a name lookup function that works very well
with DNS and DNS records.
>>    it adds an extra layer of indirection that can (and does) fail,
>>    and it tempts people to try to use that layer of indirection to do
>>    things that violate protocol specifications (like do a SRV lookup
>>    for protocols that aren't specified to use SRV)
>>     
>
> We must be talking about different getaddrinfo()s.  The one I'm used to
> doesn't do anything with SRV records, ever, as far as I can see (I
> checked both the manpage and the code).
>   
it's been proposed, more than once, because it seems "obvious" to those
who misunderstand SRV.  part of the problem is, the definition of the
function doesn't actually say what the function does with respect to DNS
because it's too vague - probably as a result of trying to take future
hypothetical networking stacks into account.
>> - it has no way to be used asynchronously other than to put each call
>>    to getaddrinfo in a separate thread,
>>     
>
> Neither does gethostbyname().  And, without a lot of callback or AIO
> scaffolding - or threading - it *can't*.
>   
I don't think it's quite as bad as you make it out to be.  but I need to
either write one or find an existing example to be sure.
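
For reference, one existing example of the scaffolding in question is Python's asyncio, sketched below. The caveat is that the default event loop implements `loop.getaddrinfo()` by running the blocking call in a thread pool, so this shows the threading workaround hidden behind an asynchronous interface and not a resolver that is asynchronous all the way down. `resolve_all` is a hypothetical helper name.

```python
import asyncio
import socket


async def resolve_all(hosts, port, flags=0):
    """Start one lookup per host and wait for all of them concurrently.

    Failures come back as exception objects in the result list instead of
    aborting the other lookups.
    """
    loop = asyncio.get_running_loop()
    lookups = [
        loop.getaddrinfo(host, port, type=socket.SOCK_STREAM, flags=flags)
        for host in hosts
    ]
    return await asyncio.gather(*lookups, return_exceptions=True)
```

A caller would run it with something like `asyncio.run(resolve_all(["127.0.0.1"], 25, flags=socket.AI_NUMERICHOST))`.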
>> - it's not specified to use DNS.
>>     
>
> It's not really specified clearly at all, it appears.  Again, a valid
> criticism, but a fixable one.
>
> But it can't really be specified to use DNS, since it is not restricted
> to address families that the DNS supports
that's a bug, not a feature.
> , and it also has to work on
> systems not connected to the Internet (/etc/hosts and its ilk).  

that's possibly also a bug, not a feature.  too many apps break when
address lookups on different hosts produce inconsistent results.
> If I
> had to set up a private DNS root just to use host names on an isolated
> subnet, I'd call that cripplingly broken.
>   
no, we'd just have servers that were purpose-built to do that.  or these
days we'd use LLMNR to do that.

unfortunately the details of how to write an app that works well on all
of: the public internet, private networks with some interconnection to
other private networks, and isolated subnets, don't seem to have ever
been worked out - not for name lookup, address selection, automatic host
configuration, or anything else.  so we have a hodgepodge of
half-solutions and no guidance for either software authors or network
operators.
>>    so if a protocol is specified to use DNS in some particular way,
>>    and the implementation of the protocol uses getaddrinfo(), the
>>    implementation may fail to strictly follow the protocol spec.
>>     
>
> This is not a problem with getaddrinfo(); this is a problem with
> implementors using inappropriate calls.  getaddrinfo() is not
> appropriate for implementing protocols defined must-use-DNS, because of
> the above issues.  
well, then getaddrinfo() is not appropriate for a large number of IETF
protocols, because these either explicitly or implicitly expect DNS
(e.g. if DNS is the means by which the protocol engines locate one
another in the absence of explicit agreement, as is the case for SMTP).
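
To illustrate with SMTP: the protocol tells the client to look up MX records, try the exchanges in preference order, and fall back to the domain's own address records only when there is no MX. None of that can be expressed through getaddrinfo(). A minimal Python sketch of the ordering step follows, assuming the MX records were already fetched by some DNS library (the fetch is not shown). `smtp_targets` is a hypothetical name, and the sketch orders equal-preference hosts by name for determinism where a real client would randomize among them.

```python
def smtp_targets(domain, mx_records):
    """Order the hosts an SMTP client should try for `domain`.

    `mx_records` is a list of (preference, exchange) pairs that the caller
    has already fetched from DNS; fetching them is outside this sketch.
    """
    if not mx_records:
        # No MX records: treat the domain itself as an implicit MX.
        return [domain]
    # Lower preference values are tried first. Sorting the pairs also orders
    # equal-preference hosts by name, which keeps the sketch deterministic.
    return [exchange for _pref, exchange in sorted(mx_records)]
```

Only after this step does ordinary address lookup come into play, once per exchange host.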
> (I'd actually argue that such specs are broken,
> because they mean the protocol is unreasonably difficult to use
> according to spec except on the public Internet.)
>   
IETF tends to define how protocols work on the public Internet, and to
not concern itself quite so much with what happens elsewhere between
consenting adults.   traditionally that's been the right answer, but
nowadays isolated/ad-hoc/private networks and private interconnections
are increasingly common - and we need for apps to be able to run in all
of those environments without changes or special configuration.
>> - the handling of v4 vs v6 vs mapped v4 addresses is confusing.
>>     
>
> No more than v4 vs v6 vs mapped v4 are to begin with.
>
> The rest of your points - including the pieces I cut from the ones I
> did address specifically - seem to fall into two categories: "no clear
> agreed-upon spec" and "buggy implementations are common".  While these
> are fair answers to my question, neither one seems to me like a reason
> to ditch the interface entirely when considering standardizing
> something for (say) HIP/HIT use

I've seen so much variation between implementations of getaddrinfo that
I've become convinced that it needs to be replaced for that reason
alone.  But the variation seems to have resulted from two things: the
vagueness of the spec (in turn the result of overgenerality) and a
demand to get the interface nailed down before the requirements were
understood.

Keith