Re: What is the right way to do Web Services discovery?

Mark Andrews <marka@isc.org> Tue, 22 November 2016 22:38 UTC

To: Phillip Hallam-Baker <phill@hallambaker.com>
From: Mark Andrews <marka@isc.org>
References: <CAMm+LwgtJuLdL_RKJNSVNGODGj8D25nfj0jkhnBLFS=aaXG+rA@mail.gmail.com>
Subject: Re: What is the right way to do Web Services discovery?
In-reply-to: Your message of "Tue, 22 Nov 2016 10:04:34 -0500." <CAMm+LwgtJuLdL_RKJNSVNGODGj8D25nfj0jkhnBLFS=aaXG+rA@mail.gmail.com>
Date: Wed, 23 Nov 2016 09:38:31 +1100
Message-Id: <20161122223831.8763E5AC9016@rock.dv.isc.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf/hpLGh3NoEe6HANwdvYFu_RyqkpQ>
Cc: IETF Discussion Mailing List <ietf@ietf.org>

In message <CAMm+LwgtJuLdL_RKJNSVNGODGj8D25nfj0jkhnBLFS=aaXG+rA@mail.gmail.com>
, Phillip Hallam-Baker writes:
> 
> I am asking here as there seems to be a disagreement in HTTP land and DNS
> land.
> 
> Here are the constraints as I see them:
> 
> 0) For any discovery mechanism to be viable, it must work in 100% of
> cases. That includes IPv4, IPv6 and either with NAT.
> 
> 1) Attempting to introduce new DNS records is a slow process. For practical
> purposes, any discovery mechanism that requires more than SRV + TXT is not
> going to be widely used.

Absolute total garbage.

Introducing a new DNS record isn't slow.  It takes a couple of weeks.
Really.  That's how long it takes to allocate a code point.

RFC 1034 compliant recursive servers and resolver libraries should
handle it the moment you start to use it.  RFC 1034 bans compression
pointers in non well known types and records have a length field so
that they can be treated as opaque objects by recursive servers and
resolver libraries.  If your vendor does not ship an RFC 1034 compliant
recursive server or resolver library, file a bug report and/or move
to a platform that is compliant and/or find an alternate resolver
library and use it.  There are plenty of open source resolver
libraries out there.  Some of the ones that do this right are 2+
decades old now.
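
As a rough illustration of the "opaque object" point, here is a minimal
Python sketch of reading one resource record from the RFC 1035 wire format
without knowing anything about its type.  The function names and the
private-use type number 65280 are assumptions made for the example, not
part of any particular resolver library.

```python
import struct

def skip_name(msg: bytes, offset: int) -> int:
    """Return the offset just past the owner name that starts at `offset`."""
    while True:
        length = msg[offset]
        if length & 0xC0 == 0xC0:   # compression pointer: two bytes, ends the name
            return offset + 2
        if length == 0:             # root label ends the name
            return offset + 1
        offset += 1 + length        # ordinary label: length byte plus label bytes

def read_rr(msg: bytes, offset: int):
    """Parse one record; return (type, class, ttl, rdata, next_offset)."""
    offset = skip_name(msg, offset)
    # Fixed part after the name: TYPE, CLASS, TTL, RDLENGTH (10 bytes).
    rrtype, rrclass, ttl, rdlength = struct.unpack_from("!HHIH", msg, offset)
    offset += 10
    rdata = msg[offset:offset + rdlength]   # opaque bytes, never interpreted here
    return rrtype, rrclass, ttl, rdata, offset + rdlength

# Hypothetical record: owner example.com, type 65280 (private use), class IN.
record = (b"\x07example\x03com\x00"
          + struct.pack("!HHIH", 65280, 1, 3600, 4)
          + b"\x0a\x00\x00\x01")
print(read_rr(record, 0))
```

The parser only needs RDLENGTH to step over the record, which is why a
type it has never heard of does not have to break it.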

Authoritative servers can serve the new record immediately if they
support unknown record types (RFC 3597), a mechanism that is now over
a decade old (2003).
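
For what it is worth, the unknown-type handling specified in RFC 3597
includes a generic text form, "TYPEnnn \# length hexdata", so a record
can go into a zone file before the server knows its presentation format.
A small Python sketch of producing that form follows; the helper name,
the owner name and the type number 65280 are assumptions for illustration.

```python
def generic_rr_text(owner: str, ttl: int, rrtype: int, rdata: bytes,
                    rrclass: int = 1) -> str:
    """Render a record in the RFC 3597 generic (unknown type) text form."""
    # Class 1 is IN; any other class is written as CLASSnnn.
    class_text = "IN" if rrclass == 1 else f"CLASS{rrclass}"
    if rdata:
        rdata_text = f"\\# {len(rdata)} {rdata.hex()}"
    else:
        rdata_text = "\\# 0"        # zero-length RDATA carries no hex part
    return f"{owner} {ttl} {class_text} TYPE{rrtype} {rdata_text}"

# Hypothetical private-use type carrying four bytes of data.
print(generic_rr_text("example.com.", 3600, 65280, bytes([10, 0, 0, 1])))
```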

If you want to be able to use the presentation format there are
authoritative servers that are designed to make it easy to add new
record types.  This was the only step that was slow originally once
the code point was allocated.

If your DNS hosting service doesn't support unknown record types,
file a bug report and find one that does or host the DNS service
yourself.

> 2) Apps area seems to have settled on a combination of SRV+TXT as the basis
> for discovery. But right now the way these are used is left to individual
> protocol designers to decide. Which is another way of saying 'we don't have
> a standard'.
> 
> 3) The DNS query architecture as deployed works best if the server can
> anticipate the further requests. So a system that uses only SRV+TXT allows
> for a lot more optimization than one using a large number of records.
> 
> 4) There are not enough TCP ports to support all the services one would
> want. Further keeping ports open incurs costs. Pretty much the only
> functionality from HTTP that Web Services make use of is the use of the URL
> stem to effectively create more ports. A hundred different Web services can
> all share port 80.
> 
> 5) The SRV record does not specify the URL stem though. Which means that
> either it has to be specified in some other DNS record (URI or TXT path) or
> it has to follow a convention (i.e. .well-known).
> 
> 6) Sometimes SRV records don't get through and so any robust service has to
> have a strategy for dealing with that situation.
> 
> 7) If we are going to get to a robust defense against traffic analysis, it
> has to be possible to secure the initial TLS handshake, i.e. before SNI is
> performed. This in turn means that it must be possible to pull information
> out of that exchange and into the DNS. Right now we don't know what that
> information is but this was not a use case considered by DANE.
> 
> 8) We are probably going to want to transition Web Services to 'something
> like QUIC' in the near future. Web Services really don't need a lot more
> than a TCP stream. Most of HTTP just gets in the way. But the multiplexing
> features in QUIC could be very useful.
> 
> 
> 
> 
> Right now we have different ideas on how this should work in the HTTP space
> and DNS space. And this appears to be fine with the two groups as they
> don't need to talk to each other. But it really isn't possible to build
> real systems unless you offend the purists in at least one camp. I think we
> should do better and offend both.
> 
> So here is my proposal for discovery of a service with IANA protocol label
> 'fred'
> 
> 
> First the service description records. This is a TXT record setting policy
> for all instances of the fred service and a set of SRV service
> advertisements:
> 
> _fred._tcp.example.com TXT "minv=1.2 maxv=3"
> _fred._tcp.example.com SRV 0 100 80 host1.example.com
> _fred._tcp.example.com SRV 0 100 80 host2.example.com
> 
> There is also a set of round robin A records for systems behind legacy NAT.
> You could do AAAA as well but these probably aren't needed as it is
> unlikely that a router blocking SRV will pass AAAA
> 
> fred.example.com A 10.0.0.1
> fred.example.com A 10.0.0.2
> 
> And finally, we have the host description entries
> 
> host1.example.com A 10.0.0.1
> _fred._tcp.host1.example.com TXT "minv=1.2 maxv=2 tls=1.2 path=/fred12"
> host2.example.com A 10.0.0.2
> _fred._tcp.host2.example.com TXT "tls=1.3"
> 
> So here we have some host level service description tags which obviously
> override the ones specified at the service level. With the proviso that a
> client might well abort if the service level description suggests there is
> no acceptable host. The path descriptor allows the use of the well known
> service to be avoided on host1. It defaults on host2
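
The override rule described above could be sketched in Python roughly as
follows.  The proposal does not define the tag syntax beyond the examples,
so the whitespace-separated "key=value" parsing and the function names
here are assumptions.

```python
def parse_tags(txt: str) -> dict:
    """Split a TXT string such as "minv=1.2 maxv=3" into a tag dictionary."""
    tags = {}
    for token in txt.split():
        key, sep, value = token.partition("=")
        if sep:                     # ignore tokens that are not key=value
            tags[key] = value
    return tags

def effective_tags(service_txt: str, host_txt: str) -> dict:
    """Service-level tags first, then host-level tags override them."""
    merged = parse_tags(service_txt)
    merged.update(parse_tags(host_txt))
    return merged

service = "minv=1.2 maxv=3"
print(effective_tags(service, "minv=1.2 maxv=2 tls=1.2 path=/fred12"))  # host1
print(effective_tags(service, "tls=1.3"))                               # host2
```

With the example records, host1 ends up with maxv=2 and an explicit path,
while host2 inherits minv and maxv from the service level and has no path.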
> 
> In the normal run of things, a DNS server would recognize that a request
> for _fred._tcp.example.com SRV was likely the start of a request chain and
> send all the records describing the service in a single bundle. This should
> usually fit in a single UDP response.
> 
> This approach gives us two levers allowing us to set policy for the
> service. We can define policy for all service instances or granular per
> host information.
> 
> 
> The bit that I have not got nailed down is what the HTTP URL should be
> after the service discovery is performed. My view is that they should be
> these:
> 
> http://host1.example.com/fred12
> http://host2.example.com/.well-known/fred
> 
> Which works nicely with the existing code but not for TLS operations.
> We will either need certs for host1.example.com and host2.example.com or
> have to override the TLS stack to accept certs for example.com.
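
A minimal Python sketch of the first approach, building the URL from the
SRV target plus the merged TXT tags, might look like this.  The function
name and its parameters are assumptions; the "/.well-known/<service>"
default follows the convention in the example URLs above.

```python
def service_url(target: str, port: int, tags: dict,
                service: str = "fred", scheme: str = "http") -> str:
    """Build the URL for one SRV target using the first (per-host) approach."""
    # An explicit path tag wins; otherwise fall back to the well-known path.
    path = tags.get("path", f"/.well-known/{service}")
    host = target.rstrip(".")       # SRV targets are often written with a trailing dot
    default_port = 443 if scheme == "https" else 80
    authority = host if port == default_port else f"{host}:{port}"
    return f"{scheme}://{authority}{path}"

print(service_url("host1.example.com", 80, {"path": "/fred12"}))
print(service_url("host2.example.com", 80, {"tls": "1.3"}))
```

The host name in the URL is the SRV target, which is exactly why the
certificate question arises: the TLS stack will expect a certificate for
that name unless it is told otherwise.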
> 
> The problem becomes even more apparent if the redirects are to
> host1.cloudly.com and host2.cloudly.com where cloudly is a cloud service
> provider. So the alternative is to do this:
> 
> 
> http://example.com/fred12
> http://example.com/.well-known/fred
> 
> The problem is that it does not work well when trying to use this strategy
> with existing http clients built into scripting languages. Instead of just
> writing a module that does the SRV lookup and spits out the URLs and
> attributes, now we need to rewrite our client so it will hit the right DNS
> address.
> 
> 
> Given that most libraries seem to have hooks to allow a client to make its
> own TLS certificate path math choices, I am very strongly in favor of the
> first approach. But I am willing to be persuaded otherwise.
> 
> Comments?
> 
-- 
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742                 INTERNET: marka@isc.org