Re: [dnssd] Working group last call for draft-ietf-dnssd-push

Ted Lemon <mellon@fugue.com> Fri, 26 October 2018 21:29 UTC

From: Ted Lemon <mellon@fugue.com>
Date: Fri, 26 Oct 2018 17:28:29 -0400
Message-ID: <CAPt1N1m1d5Vj1ueC17ksfP7j9+23s0ATtxUwrTnCwmjqEQgtUQ@mail.gmail.com>
To: "Jan Komissar (jkomissa)" <jkomissa@cisco.com>
Cc: Tom Pusateri <pusateri@bangj.com>, David Schinazi <dschinazi@apple.com>, Stuart Cheshire <cheshire@apple.com>, Tim Wicinski <tjw.ietf@gmail.com>, dnssd <dnssd@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnssd/6dHVQn4aBQbAuQPCCS7ZHZb7lmc>

I'm in favor of advancing the document.   I have a few editorial comments:

On Page 6:

   For example, if a user presses the "Print"
   button on their smartphone, and then leaves the phone showing the
   printer discovery screen until the phone goes to sleep, then the
   printer discovery screen should be automatically dismissed as the
   device goes to sleep.  If the user does still intend to print, this
   will require them to press the "Print" button again when they wake
   their phone up.


I don't think this is the right advice to give—it's not necessary to
dismiss the UI.   It always surprises me when a context switch results
in something in the UI changing without me changing it.    The less
surprising behavior would be to simply stop doing these queries while
the dialog isn't showing.   The dialog isn't showing when the phone is
asleep.   So maybe this text would accomplish the same purpose without
recommending a particular UI flow?

   For example, if a user presses the "Print" button on their
   smartphone, and then leaves the phone showing the printer discovery
   screen until the phone goes to sleep, or switches to a different
   app, then the push subscription should (SHOULD?) be discontinued.
   When the phone wakes up, or the user switches back to the
   application that is showing the print dialog, the subscription
   could be reinstated, perhaps after a brief wait to allow the user
   to dismiss the dialog if they no longer intend to print.
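
As a rough illustration of that behavior (not proposed text for the
draft), here is a minimal Python sketch.  The class name, the
subscribe/unsubscribe callbacks, and the event-handler names are all
hypothetical stand-ins for whatever the platform and DNS Push client
library actually provide, and the optional brief wait before
resubscribing is left out for brevity.

```python
class PrinterDiscovery:
    """Keeps a push subscription active only while the dialog is showing.

    Hypothetical sketch: `subscribe` and `unsubscribe` are caller-supplied
    callbacks, not part of any real DNS Push API.
    """

    def __init__(self, subscribe, unsubscribe):
        self._subscribe = subscribe
        self._unsubscribe = unsubscribe
        self.dialog_open = False   # has the user dismissed the dialog?
        self.visible = False       # is the dialog actually on screen?
        self.subscribed = False

    def _update(self):
        # The subscription is wanted only while the dialog is open AND visible.
        want = self.dialog_open and self.visible
        if want and not self.subscribed:
            self._subscribe()
            self.subscribed = True
        elif not want and self.subscribed:
            self._unsubscribe()
            self.subscribed = False

    def open_dialog(self):
        self.dialog_open = True
        self.visible = True
        self._update()

    def dismiss_dialog(self):
        self.dialog_open = False
        self._update()

    def on_hidden(self):
        # Phone went to sleep or the user switched to another app.
        self.visible = False
        self._update()

    def on_visible(self):
        # Phone woke up or the user switched back; the dialog is still open.
        self.visible = True
        self._update()
```

The point of the design is that the dialog state and the subscription
state are tracked separately, so going to sleep discontinues the
subscription without dismissing the UI.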


Section 3 says:

   Standard DNS Queries MAY be sent over a DNS Push Notification
   connection, provided that these are queries for names falling within
   the server's zone (the <zone> in the "_dns-push-tls._tcp.<zone>" SRV
   record).  The RD (Recursion Desired) bit MUST be zero.  If a query is
   received with the RD bit set, matching records for names falling
   within the server's zones should be returned with the RA (Recursion
   Available) bit clear.  If the query is for a name not in the server's
   zone, an error with RCODE NOTAUTH (Not Authoritative) should be
   returned.


Why is this?   What if this is a hybrid authoritative/caching resolver?
Also, later on we do actually specify that this can work with the local
resolver.   ISTM you could say this instead and capture what is necessary:

   Standard DNS Queries MAY be sent over a DNS Push Notification
   connection.  For any zone for which the server is authoritative, it
   MUST respond authoritatively for queries on names falling within
   that zone (e.g., the <zone> in the "_dns-push-tls._tcp.<zone>" SRV
   record) both for DNS Push Notification queries and for normal DNS
   queries.  For names for which the server is acting as a caching
   resolver, e.g. when the server is the local resolver, for any query
   for which it supports DNS Push Notifications, it MUST also support
   standard queries.


Section 4 says:

   In keeping with the more recent precedent, DNS Push Notification is
   defined only for TCP.  DNS Push Notification clients MUST use DNS
   Stateful Operations (DSO) [DSO
   <https://tools.ietf.org/html/draft-ietf-dnssd-push-15#ref-DSO>]
   running over TLS over TCP [RFC7858
   <https://tools.ietf.org/html/rfc7858>].


But section 7 says:


   The Strict Privacy Usage Profile for DNS over TLS is strongly
   recommended for DNS Push Notifications as defined in "Usage Profiles
   for DNS over TLS and DNS over DTLS" [RFC8310
   <https://tools.ietf.org/html/rfc8310>].


Which is it?   :)

Section 5 says:

   Token bucket rate limiting schemes are also effective
   in providing fairness by a server across numerous client requests.


[Citation Needed]

Is there any reason to say this?


   DNS Push Notification clients and servers MUST support DSO, but (as
   stated in the DSO specification [DSO
   <https://tools.ietf.org/html/draft-ietf-dnssd-push-15#ref-DSO>]) the
   server SHOULD NOT issue any DSO messages until after the client has
   first initiated an acknowledged DSO message of its own.  A single
   server can support DNS Queries, DNS Updates, and DNS Push
   Notifications (using DSO) on the same TCP port, and until the client
   has sent at least one DSO message, the server does not know what kind
   of client has connected to it.  Once the client has indicated
   willingness to use DSO by sending one of its own, either side of the
   session may then initiate further DSO messages at any time.


It's just reiterating what the DNS Stateful Operations document says, and
what was said previously about updates and so on.   Less text better?

Section 6.1 says:

   The client begins by opening a DSO Session to its normal configured
   DNS recursive resolver and requesting a Push Notification
   subscription.  If this is successful, then the recursive resolver
   will make appropriate Push Notification subscriptions on the client's
   behalf, and the client will receive appropriate results.  If the
   recursive resolver does not support Push Notification subscriptions,
   then it will return an error code, and the client should proceed to
   discover the appropriate server for direct communication.  The client
   MUST also determine which TCP port on the server is listening for
   connections, which need not be (and often is not) the typical TCP
   port 53 used for conventional DNS, or TCP port 853 used for DNS over
   TLS [RFC7858 <https://tools.ietf.org/html/rfc7858>].


This is inconsistent with the earlier assertion that the server we are
talking to is an authoritative server.   How do we know what zone, if any,
the default resolver is authoritative for?   What if it can support some
push notifications we want, but for others we need to talk to the
authoritative server?   What about TLS?

I think this text was added based on a comment I made a while back; I think
that the behavior defined here may be okay, but there should be some
additional text:

   The client begins by opening a DSO Session to its normal configured
   DNS recursive resolver and requesting a Push Notification
   subscription.  This connection is made to the default DNS-over-TLS
   port.  If this connection is successful, then the recursive resolver
   will make appropriate Push Notification subscriptions on the client's
   behalf, and the client will receive appropriate results.

   In many contexts, the local recursive resolver will be able to handle
   push notifications for all zones that the client may need to follow.
   In other cases, the client may require Push Notifications from more
   than one zone, and those zones may be served by different servers.
   It is assumed here, therefore, that the client may need to maintain
   connections to more than one DNS Push server.

   In some cases, the recursive resolver may not be able to get answers
   for a particular zone.  In this case, rather than returning SERVFAIL,
   the resolver returns NOTAUTH.  This signals the client that queries
   for this zone can't be handled by the local caching resolver.  For
   that zone, the client SHOULD contact the zone's DNS Push server
   itself, even if all other DNS Push queries can be handled by the
   local resolver.  This may be necessary in cases where the client is
   connected to a VPN, for example, or where the client has a
   pre-established trust relationship with the owner of the zone that
   allows the client, but not the local resolver, to successfully get
   answers for queries in that zone.

   If the recursive resolver does not support Push Notification
   subscriptions, then it will return an error code, DSONOTIMPL.  This
   occurs when the local resolver follows the procedure below and does
   not find an SRV record indicating support for DNS Push Notifications.

   In case of either failure, the client should proceed to discover the
   appropriate server for direct communication.  The client MUST also
   determine which TCP port on the server is listening for connections,
   which need not be (and often is not) the typical TCP port 53 used for
   conventional DNS, or TCP port 853 used for DNS over TLS [RFC7858
   <https://tools.ietf.org/html/rfc7858>].
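
The per-zone decision described in the suggested text above could be
sketched roughly as follows, in Python.  This is only an illustration
under stated assumptions: the function and enum names are hypothetical,
result codes are passed as the mnemonic strings used in this message
rather than numeric RCODEs, and treating any other error as "retry
later" is my assumption, not something the suggested text specifies.

```python
from enum import Enum


class Action(Enum):
    USE_LOCAL_RESOLVER = "use local resolver"
    CONTACT_ZONE_SERVER = "contact the zone's DNS Push server directly"
    RETRY_LATER = "retry later"  # assumption: not covered by the text above


def next_step(rcode_name):
    """Map the local resolver's answer to a SUBSCRIBE onto a client action."""
    if rcode_name == "NOERROR":
        # The resolver subscribes on the client's behalf.
        return Action.USE_LOCAL_RESOLVER
    if rcode_name in ("NOTAUTH", "DSONOTIMPL"):
        # Either the resolver can't get answers for this zone, or it does
        # not support push subscriptions: discover the zone's own server.
        return Action.CONTACT_ZONE_SERVER
    return Action.RETRY_LATER


def plan_subscriptions(results):
    """Decide per zone, since one client may need more than one server."""
    return {zone: next_step(rcode) for zone, rcode in results.items()}
```

Keeping the decision per zone is what lets a client use the local
resolver for most zones while still going direct for, say, a zone that
is only reachable over a VPN.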


Later in 6.1:

   3.  If the requested SOA record does not exist, the client will get
       back a NOERROR/NODATA response or an NXDOMAIN/Name Error
       response.  In either case, the local resolver SHOULD include the
       SOA record for the zone of the requested name in the Authority
       Section.


That SHOULD is updating RFC1035, I think, although I realize that there's
text later claiming it doesn't.   How about "would normally"?   Given that
we specify how clients handle the exceptional case, there's no reason to
get fussy about this here.   BTW, we really ought to have a document that
describes this stuff, so that we can reference it.   It's a fairly common
operation.

At the bottom of Page 16 (the end of section 6.2.1) it might be good to add
some text acknowledging the recent deprecation of ANY queries, and
explicitly saying why they are not deprecated here.   This avoids the risk
of DNSOP experts tripping on this, although they will probably see the
utility.

The table in 6.2.2 is introduced by saying "Supported RCODEs are as
follows:" but then lists NXDOMAIN, for the purpose of explicitly
repeating that it is not supported.   Subsequent text also refers to
the table as if all the RCODEs are permitted.   This needs to be
fixed.   I would take NXDOMAIN out of the table and just be really
explicit about how it's not allowed.   6.5.2 has the same issue, with
the same cure.


Also in 6.2.2:


      For RCODE = 2 (SERVFAIL) the delay should be chosen according to
      the level of server overload and the anticipated duration of that
      overload.  By default, a value of one minute is RECOMMENDED.  If a
      more serious server failure occurs, the delay may be longer in
      accordance with the specific problem encountered.


In this case ideally there is more than one server, and the client can try
the next one in the list, rather than repeatedly connecting to the same
overloaded server.   Or are we assuming the client will look the server up
again, and that round-robining will take care of this?

Also in 6.2.2:

      This is a misconfiguration, since this server is listed in a
      "_dns-push-tls._tcp.<zone>" SRV record, but the server itself is
      not currently configured to support DNS Push Notifications for
      that zone.  Since it is possible that the misconfiguration may be
      repaired at any time, the retry delay should not be set too high.
      By default, a value of 5 minutes is RECOMMENDED.


Probably ought to say this:

      If the server being queried is not the local resolver, this is a
      misconfiguration, since this server is listed in a
      "_dns-push-tls._tcp.<zone>" SRV record, but the server itself is
      not currently configured to support DNS Push Notifications for
      that zone.  Since it is possible that the misconfiguration may be
      repaired at any time, the retry delay should not be set too high.
      By default, a value of 5 minutes is RECOMMENDED.


At the end of 6.3.1:

   The TTL of an added record is stored by the client and decremented as
   time passes, with the caveat that for as long as a relevant
   subscription is active, the TTL does not decrement below 1 second.
   For as long as a relevant subscription remains active, the client
   SHOULD assume that when a record goes away the server will notify it
   of that fact.  Consequently, a client does not have to poll to verify
   that the record is still there.  Once a subscription is cancelled
   (individually, or as a result of the DSO session being closed) record
   aging resumes and records are removed from the local cache when their
   TTL reaches zero.

There's a slight problem with this: if the caching resolver is doing these
queries on behalf of a client, then it shouldn't be decrementing the TTL.
Also, if we haven't received an update from the server saying that the TTL
has a new value, then the TTL doesn't have a new value; it's still whatever
the server sent last time.   So it would actually be correct to never
decrement the TTL while a subscription is active.   So I would suggest:

   The TTL of an added record is stored by the client.  While the
   subscription is active, the TTL is not decremented, because a change
   to the TTL would produce a new update.  For as long as a relevant
   subscription remains active, the client SHOULD assume that when a
   record goes away the server will notify it of that fact.
   Consequently, a client does not have to poll to verify that the
   record is still there.  Once a subscription is cancelled
   (individually, or as a result of the DSO session being closed) record
   aging for records covered by the subscription resumes and records are
   removed from the local cache when their TTL reaches zero.
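
To show what I mean, here is a minimal Python sketch of a cache entry
whose TTL is frozen while the subscription is active and only starts
aging once the subscription is cancelled.  The class and method names
are hypothetical, and time is passed in explicitly as seconds so the
example stays self-contained.

```python
class CachedRecord:
    """A cached record learned through a push subscription (hypothetical)."""

    def __init__(self, ttl):
        self.ttl = ttl              # TTL as last sent by the server
        self.subscribed = True      # record arrived via an active subscription
        self._aging_started = None  # set when the subscription is cancelled

    def cancel_subscription(self, now):
        # Aging resumes from the moment the subscription goes away.
        self.subscribed = False
        self._aging_started = now

    def remaining_ttl(self, now):
        if self.subscribed:
            # No update from the server means the TTL has not changed.
            return self.ttl
        return max(0, self.ttl - (now - self._aging_started))

    def expired(self, now):
        # A record can only expire once its subscription is gone.
        return not self.subscribed and self.remaining_ttl(now) == 0
```

A caching resolver holding such a record on behalf of a client would
hand out the stored TTL unchanged for as long as the subscription
lasts.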


Section 7.4 talks about TLS session resumption and that subscriptions have
to be reinstantiated, but implies without stating explicitly that closing a
TLS session closes the DSO session.   It might be worth saying that
explicitly.   Something like:

   TLS Session Resumption is permissible on DNS Push Notification
   servers.  The server may keep TLS state with Session IDs [RFC5246
   <https://tools.ietf.org/html/rfc5246>] or operate in stateless mode
   by sending a Session Ticket [RFC5077
   <https://tools.ietf.org/html/rfc5077>] to the client for it to store.
   However, closing the TLS connection terminates the DSO session.  When
   the TLS session is resumed, the DNS Push Notification server will not
   have any subscription state and will proceed as with any other new
   DSO session.  Use of TLS Session Resumption allows a new TLS
   connection to be set up more quickly, but the client will still have
   to recreate any desired subscriptions.


In section 8, you probably ought to use two tables rather than one.   Also,
I don't think you've given enough information for the service name
registration.