[Tzdist] AD review of draft-ietf-tzdist-service-07

Barry Leiba <barryleiba@computer.org> Thu, 07 May 2015 08:57 UTC

Date: Thu, 07 May 2015 09:57:05 +0100
Message-ID: <CALaySJKUcgkMNsFPk0X6ur-Fw0LrB0-miQvAKYJD2rMCEFpBSQ@mail.gmail.com>
From: Barry Leiba <barryleiba@computer.org>
To: draft-ietf-tzdist-service@ietf.org
Content-Type: text/plain; charset="ISO-8859-1"
Archived-At: <http://mailarchive.ietf.org/arch/msg/tzdist/pYrfCyDEXh3bfpFD-oPBWloKLos>
Cc: tzdist@ietf.org
Subject: [Tzdist] AD review of draft-ietf-tzdist-service-07

Here are my review comments.  There are a lot of them; I've marked the
most important ones with "***".  I'd like to try to work through many
of these before I request last call, but (1) we don't need to resolve
everything, and (2) you should push back at me if you disagree with
something I say here.  I'd like to actively discuss some of this.

-- Section 3.3 --
Nit: Change "including" to "included".

-- Section 3.5 --
Is this trying to say that each entire period that uses the same UTC
offset is "an observance", and that the union of all observances
defines the range of validity?  If that's right, you should improve on
"Such periods of time are call observances," by saying it more like
what I say above.

-- Section 3.9 --

   two years in the past on into the future, as users typically create
   only new events for the present and future.

I think the "only" is misplaced: of course we create only new events,
because it doesn't make sense to create existing events.  I think you
mean "users typically create new events only for the present and
future."  Yes?

   might be concerned only with a smaller range into the future, and
   data past that point might be redundant.

Nit: I don't think "redundant" is the right word; I think
"unnecessary" or "unused" is.

   The truncation points at the start and end of a range are always a
   UTC date-time value, with the start point being "inclusive" to the
   overall range, and the end point being "exclusive" to the overall
   range (i.e., the end value is just past the end of the last valid
   value in the range).

This is as good a choice as any, but I'm curious: why did you not make
it inclusive on both ends?
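
As a minimal sketch of the convention quoted above (the function name and
sample dates are hypothetical, not taken from the draft), a half-open
range puts a shared boundary instant in exactly one of two adjacent
ranges:

```python
from datetime import datetime, timezone


def in_range(instant, start, end):
    """Return True if instant lies in the half-open range [start, end)."""
    return start <= instant < end


# Hypothetical truncation points for two consecutive one-year ranges.
y2008 = datetime(2008, 1, 1, tzinfo=timezone.utc)
y2009 = datetime(2009, 1, 1, tzinfo=timezone.utc)
y2010 = datetime(2010, 1, 1, tzinfo=timezone.utc)

# The shared boundary belongs to the second range only.
print(in_range(y2009, y2008, y2009))  # False: the end point is exclusive
print(in_range(y2009, y2009, y2010))  # True: the start point is inclusive
```

With both ends inclusive, the boundary instant would belong to both
ranges.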

-- Section 4.1 --

   Most security considerations are already handled adequately by HTTP.
   However, given the nature of the data being transferred and the
   requirement it be correct, all interactions between client and server
   SHOULD use an HTTP connection protected with TLS [RFC5246] as defined
   in [RFC2818].

When one reads the definitions above about providers (which refer to
"servers"), this appears to be recommending TLS only between providers
and clients, and not between providers and providers (even though the
provider making the request has the client role in that transaction).
I *hope* you're meaning to recommend use of TLS always (if not, please
explain why not), and I see no value in qualifying it as you do.  So I
suggest removing "between client and server".

-- Section 4.1.4 --
I find the lengthy discussion confusing, as it doesn't seem to get to
the point(s) it's making directly.  As I understand it, the key
points are these:

1. When the time zone list is sent, it has (a) an opaque token that
can be used to subset the list, and (b) an ETag value for each time
zone, which can be used to determine whether some data about that time
zone has actually changed.

2. The most efficient way to update one's list is to (a) get a new
list, using the old opaque token, (b) save the new opaque token for
next time, (c) check each time zone in the returned list to see if its
ETag value has changed, and (d) only request new time zone data for
each time zone whose ETag value has changed.

3. Note that ETag values can sometimes change because of metadata
changes that don't matter to the requester.  That will cause extra
fetching, but can't be avoided.  (An example, with a monolithic
publisher, is given.)

It would help, I think, if those key points were presented more that
way, and less as a kind of rambling discussion that doesn't seem to
have focus.  What do you think?
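
The update procedure in point 2 can be sketched roughly as follows.  This
is only an illustration under stated assumptions: "fetch_list" and
"fetch_zone" are hypothetical callables standing in for the list and get
actions, and the shape of "state" is invented for the example.

```python
def sync(state, fetch_list, fetch_zone):
    """Run one synchronization pass and return the tzids that were re-fetched.

    state      -- {"token": str or None, "etags": {tzid: etag}, "data": {tzid: data}}
    fetch_list -- hypothetical: fetch_list(token) -> (new_token, [(tzid, etag), ...])
    fetch_zone -- hypothetical: fetch_zone(tzid) -> time zone data
    """
    # (a) Get a new list, using the old opaque token.
    new_token, listed = fetch_list(state["token"])

    refetched = []
    for tzid, etag in listed:
        # (c) Check whether this zone's ETag differs from the cached one.
        if state["etags"].get(tzid) != etag:
            # (d) Request new data only for zones whose ETag changed.
            state["data"][tzid] = fetch_zone(tzid)
            state["etags"][tzid] = etag
            refetched.append(tzid)

    # (b) Save the new opaque token for next time.
    state["token"] = new_token
    return refetched
```

A metadata-only change (point 3) would still alter the ETag here, so the
zone would be fetched again even though its rules are unchanged.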

   Clients SHOULD poll for changes, using an appropriate conditional
   request, at least once a day.  A server acting as a secondary
   provider, caching time zone data from another server, SHOULD poll for
   changes once per hour.

*** This really sounds excessive, so please talk with me about it (I'm
sure it was discussed in the working group, and I probably just need
to understand better).

For the first sentence, it would seem to me that a client that's
keeping the time zone information only for itself would (1) only need
to check for changes for those time zones it cares about, and (2)
almost never actually need to retrieve changes -- certainly not
anywhere close to daily.  My calendar, which is quite a complicated
one, has entries rooted in maybe six time zones (besides UTC, which
should never change).  And in any case, I can't see how this is a 2119
SHOULD, as it doesn't affect anything except user experience.

For the second sentence, "SHOULD" is appropriate... but every HOUR,
really?  I understand the issue that some time zone information might
change with only a few hours' notice, but I can't imagine that that
happens very often, and does that really make it worth having all
secondary providers poll hourly?  Wouldn't it have been better to
have a means where secondaries can register with primaries and have
them push changes, perhaps with far-less-frequent polling as a backup?

-- Section 4.2.1.3 --

   Servers SHOULD set an appropriate Cache-Control header field value
   (as per Section 5.2 of [RFC7234]) in the redirect response to ensure
   caching occurs as needed, or as required by the type of response
   generated.  For example, if it is anticipated that the location of
   the redirect might change over time, then a "no-cache" value would be
   used.

*** For that last sentence: You really want to recommend "no-cache",
rather than "an appropriate max-age value"?  Do you really want
clients going back to discovery for *every request*, just to leave
open the possibility of changing the location?

   To facilitate "context path's" that might differ from user to user,
   the server MAY require authentication when a client tries to access
   the ".well-known" URI

Given that this possibility exists, it might be worth mentioning it in
4.2.1.2, saying that if the server's context paths (no apostrophe,
please) can differ from user to user, TXT RR is not an appropriate
mechanism, and the server needs to use .well-known.

It appears that .well-known always has to work, even when there is a
TXT RR that, as 4.2.1.2 says, MUST be used.  It would be good to say
that explicitly.

*** I do find the combination of mechanisms interesting, though: I
MUST do an extra query for a TXT RR, which might or might not be
there.  If it's not there, I've wasted a query, and then I use an HTTP
request to .well-known, which I *know* will always work.  Clearly, the
query for TXT RR is lighter weight.  But if RTT is my concern...

-- Section 4.2.1.3.1 --

   the server would issue an HTTP 301 redirect
   response with a Location response header field using the path
   "/servlet/timezone".  The client would then "follow" this redirect to
   the new resource and continue making HTTP requests there.

I would add a sentence to the end of this, such as this:

ADD
The client would also cache the redirect information -- subject to any
Cache-Control directive -- for use in subsequent requests.
END
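
To make "subject to any Cache-Control directive" concrete, here is a
minimal sketch of how a client might decide how long to reuse a
redirect.  The function name and the return convention are assumptions
for illustration; only the "no-cache", "no-store" and "max-age"
directives are considered.

```python
def redirect_cache_lifetime(cache_control):
    """Return how long a redirect target may be reused.

    Returns the number of seconds, 0 if the client has to go back to
    discovery every time, or None if no explicit lifetime was given.
    """
    directives = {}
    for part in cache_control.split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value.strip('"')

    if "no-cache" in directives or "no-store" in directives:
        return 0
    if "max-age" in directives:
        try:
            return max(0, int(directives["max-age"]))
        except ValueError:
            return 0  # treat a malformed value as "do not reuse"
    return None


print(redirect_cache_lifetime("no-cache"))       # 0
print(redirect_cache_lifetime("max-age=86400"))  # 86400
```

The contrast between the two printed values is the one raised for
Section 4.2.1.3: "no-cache" sends the client back to discovery for every
request, while a max-age value bounds how stale the location can become.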

-- Section 4.2.2.1 --

   The client would preserve the returned
   opaque token for subsequent use.

Quite.  But it isn't until Section 6.2 that you finally tell me how
the token is returned (and you have an example in 5.2.1).  I suggest
saying that here -- something like this:

NEW
   The client would preserve the returned
   opaque token (see "synctoken" in the example in Section 5.2.1)
   for subsequent use.
END

-- Section 5 subsections --
*** Some of the URI templates (I'm thinking of 5.4 in particular) look
confusing being split across lines.  I think you can get rid of that
effect by starting the template on a new line from the heading.  Like
this:

OLD (5.1)
   Request-URI Template:  {/service-prefix}/capabilities
NEW
   Request-URI Template:
     {/service-prefix}/capabilities
END

OLD (5.2)
   Request-URI Template:  {/service-prefix,data-
      prefix}/zones{?changedsince}
NEW
   Request-URI Template:
     {/service-prefix,data-prefix}/zones{?changedsince}
END

OLD (5.3)
   Request-URI Template:  {/service-prefix,data-
      prefix}/zones{/tzid}{?start,end}
NEW
   Request-URI Template:
     {/service-prefix,data-prefix}/zones{/tzid}{?start,end}
END

OLD (5.4)
   Request-URI Template:  {/service-prefix,data-prefix}/zones{/tzid}

      /observances{?start,end}
NEW
   Request-URI Template:
     {/service-prefix,data-prefix}/zones{/tzid}/observances{?start,end}
END

OLD (5.5)
   Request-URI Template:  {/service-prefix,data-prefix}/zones{?pattern}
NEW
   Request-URI Template:
     {/service-prefix,data-prefix}/zones{?pattern}
END

OLD (5.6)
   Request-URI Template:  {/service-prefix,data-prefix}/leapseconds
NEW
   Request-URI Template:
     {/service-prefix,data-prefix}/leapseconds
END

(I think it's best to be consistent with that, even for the ones that
will fit on one line.)

*** The >> Request << examples in Sections 5.3.4 and 5.4.1 are a bit
more problematic.  The way you've broken it across multiple lines
isn't consistent.  Take 5.4.1:

   >> Request <<

   GET /zones/America%2FNew_York/observances
                      ?start=2008-01-01T00:00:00Z
                      &end=2009-01-01T00:00:00Z
                      HTTP/1.1

The problem is that in the real protocol, there's no space before the
"?" or the "&", but there is one before "HTTP/1.1"... and the way it's
written gives no idea of that.  I'm not *too* worried about that,
because it's an example, and because readers really do need to already
know how HTTP GET requests work.  But it might be nice to try to think
of a way to be clearer, without getting all wound up.

Maybe it just works to merge the last three lines and reduce the
indentation, and then add an explanation that the resulting two bits
are meant to be one line, with no space between them.  As there are
only two occurrences of this situation, it's not too cumbersome.  Like
this:

OLD
   In this example the client requests a time zone in the expanded form.

   >> Request <<

   GET /zones/America%2FNew_York/observances
                      ?start=2008-01-01T00:00:00Z
                      &end=2009-01-01T00:00:00Z
                      HTTP/1.1
   Host: tz.example.com

NEW
   In this example the client requests a time zone in the expanded form.
   (In the actual protocol, the "?start" follows "observances" on the
   same line, with no intervening space.)

   >> Request <<

   GET /zones/America%2FNew_York/observances
     ?start=2008-01-01T00:00:00Z&end=2009-01-01T00:00:00Z HTTP/1.1
   Host: tz.example.com

END

What do you think?
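
For reference, the request line as it would actually be sent can be
assembled like this.  It is a sketch only: the function name is
hypothetical, and leaving ":" unencoded in the query values is an
assumption made to match the example as printed in the draft.

```python
from urllib.parse import quote


def observances_request_line(tzid, start, end, prefix=""):
    """Build the single-line HTTP/1.1 request line for the 5.4.1 example."""
    # The "/" inside the time zone identifier is percent-encoded.
    path = f"{prefix}/zones/{quote(tzid, safe='')}/observances"
    # No space before "?" or "&".
    query = f"?start={quote(start, safe=':')}&end={quote(end, safe=':')}"
    # Exactly one space before the HTTP version.
    return f"GET {path}{query} HTTP/1.1"


print(observances_request_line(
    "America/New_York", "2008-01-01T00:00:00Z", "2009-01-01T00:00:00Z"))
```

This prints the whole request line on one line, which is what the
wrapped example is meant to represent.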

-- Section 5.1.1 --
Is it really likely that servers will redirect "/.well-known/timezone"
to "/" ?  Might it not be better to say in Section 5 something like,
<< The examples in the following subsections presume that the timezone
context path has been discovered to be "/servlet/timezone" (as in the
example in Section 4.2.1.3.1). >>, and then to change the examples
accordingly:

OLD
   GET /capabilities HTTP/1.1
   Host: tz.example.com
NEW
   GET /servlet/timezone/capabilities HTTP/1.1
   Host: tz.example.com
END

...and so on?

-- Section 5.2 --

   Parameters:
      changedsince  OPTIONAL, but MUST occur only once.

Oops; no.  It is not true that it MUST occur:

NEW
   Parameters:
      changedsince  OPTIONAL, and MUST NOT occur more than once.
END

Similarly in Section 5.3.
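
A server-side check matching the corrected wording might look like the
sketch below.  The function name is hypothetical, and raising ValueError
is a placeholder: which error the server returns for a repeated
parameter is not something this sketch tries to settle.

```python
from urllib.parse import parse_qs, urlsplit


def changedsince_from(request_target):
    """Return the changedsince value, or None if the parameter is absent."""
    query = urlsplit(request_target).query
    values = parse_qs(query, keep_blank_values=True).get("changedsince", [])
    if len(values) > 1:
        # OPTIONAL, and MUST NOT occur more than once.
        raise ValueError("changedsince occurs more than once")
    return values[0] if values else None


print(changedsince_from("/zones"))                   # None
print(changedsince_from("/zones?changedsince=abc"))  # abc
```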

-- Section 5.2.1 --
Entirely optional, but I think it would help the readability of the
example to put blank lines before and after the ellipsis, and perhaps
to change the ellipsis to "...other time zones...".  As it is, it's
easy to miss the "..." by eye (at least I found it so, with my aging
eyes).  (FWIW, the ellipses in 5.3.1 work fine for me, probably
because they're not surrounded by punctuation and indentation.)

-- Section 5.3 --

      The "tzid" variable value is REQUIRED to distinguish this action
      from the "list" action.

It's easy to read this as saying that the value has to do something.
Maybe make it "is REQUIRED, in order to".

-- Section 5.3.5 --
A nit, but the title doesn't make much sense:

OLD
5.3.5.  Example: Get a non-existent time zone data

   In this example the client requests the time zone with a specific
   time zone identifier to be returned.
NEW
5.3.5.  Example: Request data for a non-existent time zone

   In this example the client requests the time zone with a specific
   time zone identifier to be returned.  As it turns out, no time
   zone exists with that identifier.
END

-- Section 5.5 --
The explanation of pattern matching with "*" does not explain what
happens if you put the wildcard character in the middle of the string.
If I use "x*z", is that an error (with
urn:ietf:params:tzdist:error:invalid-pattern
)?  If so, you should say that explicitly.  If not, what does it do?

      In addition, when matching, underscore characters (0x5F) SHOULD be
      mapped to a single space character (0x20) prior to string
      comparison.  This allows time zone identifiers such as "America/
      New_York" to match a query for "*New York*".  ASCII characters in
      the range 0x41 ("A") through 0x5A ("Z") SHOULD be mapped to their
      lowercase equivalents.

*** Why are these "SHOULD"s instead of "MUST"s?  That seems to be an
interop problem, because "*new york*" can return various things,
depending upon whether "_" mapping is or isn't used and whether case
mapping is or isn't used.  The same query that worked for years could
stop working because we switched to a new server or because the server
software was changed.  Please explain and discuss.
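
The interop concern can be shown with a small sketch.  Assumptions: the
function names are hypothetical, only patterns of the form "*text*" are
handled, escaping is ignored, and the same mappings are applied to both
the pattern and the identifier.

```python
def normalize(text, map_underscore, fold_case):
    """Apply the two optional mappings described in Section 5.5."""
    if map_underscore:
        text = text.replace("_", " ")  # 0x5F -> 0x20
    if fold_case:
        # Map only ASCII "A".."Z" to lowercase, as the draft describes.
        text = "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in text)
    return text


def contains_match(pattern, tzid, map_underscore, fold_case):
    """Substring match for a "*text*" pattern under the chosen mappings."""
    needle = normalize(pattern.strip("*"), map_underscore, fold_case)
    return needle in normalize(tzid, map_underscore, fold_case)


for underscore, case in [(True, True), (True, False), (False, True)]:
    print(underscore, case,
          contains_match("*new york*", "America/New_York", underscore, case))
```

Only the first combination matches; a server that skips either mapping
returns nothing for the same query.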

      To match characters 0x2A ("*") and 0x5C
      ("\") in the pattern, a single 0x5C ("\") is prepended to act as
      an "escaping" mechanism. i.e., a pattern "Test\*" implies an exact
      match test against the string "Test*".

*** How is a query for "*New York*" (in the text above) done?  You
can't send the space character in the GET command without making it
"%20", and the mechanism for sending patterns doesn't appear to allow
%-encoding.
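
To illustrate the encoding question, this is what a generic HTTP client
would have to put on the wire for that pattern.  Whether the draft
intends the server to %-decode the value before matching is exactly the
open point, so the decoding step here is an assumption.

```python
from urllib.parse import quote, unquote

pattern = "*New York*"

# A space cannot appear in the request target, so it becomes "%20".
# Leaving "*" unencoded is a choice made for this sketch.
encoded = quote(pattern, safe="*")
print("/zones?pattern=" + encoded)  # /zones?pattern=*New%20York*

# The server would need to decode before applying the matching rules.
print(unquote(encoded))             # *New York*
```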

Also, the "i.e." needs to be "e.g.", please.

-- Section 8 --

   Clients that support transport layer security as defined by [RFC2818]
   SHOULD use the "_timezones" service, but MAY use "_timezone" service.

SHOULD/MAY error #1.  The MAY makes << use "_timezone" service >>
entirely optional either way... which contradicts the SHOULD.  I think
that what you mean here is that you must use one of these, and there's
a SHOULD-level preference for a particular one, right?  So:

NEW
   Clients that support transport layer security as defined by [RFC2818]
   SHOULD use the "_timezones" service.  It is permissible to use the
   "_timezone" service instead, but "_timezones" is strongly preferred.
END

But a question:

*** What does this really mean?  I don't think you're really
separating the _timezones SRV record from the use of TLS, and
similarly for _timezone.  I think a client should be using _timezones
if it's going to use https, and _timezone if it's going to use http
(non-"s"), right?  You're basically saying that the client SHOULD use
TLS, and, therefore, the _timezones service.  And the protocol advice
doesn't really have to do with whether the client *supports* TLS, does
it?  The client SHOULD use TLS, and one reason not to is that the code
doesn't support it.  Also, I wonder whether you want to allow for DANE
here (and perhaps something else that might come later?).

I think something like this is more what you mean here, yes?  (But, of
course, correct me if this is wrong.):

NEWER (with the subsequent text)
   Clients SHOULD use transport layer security as defined by [RFC2818],
   unless they are specifically configured otherwise.  Clients that have
   been configured to use the TLS-based service MUST NOT fall back to
   using the non-TLS service if the TLS-based service is not available.
   In addition, clients MUST NOT follow HTTP redirect requests from a
   TLS service to a non-TLS service.  When using TLS, clients MUST verify
   the identity of the server, using a standard, secure mechanism such as
   the certificate verification process specified in [RFC6125] or DANE
   [RFC6698].
END
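
The redirect rule in the proposed text could be enforced along these
lines.  This is a sketch with a hypothetical function name; it covers
only the "no redirect from a TLS service to a non-TLS service" rule, not
certificate verification.

```python
from urllib.parse import urljoin, urlsplit


def resolve_redirect(current_url, location):
    """Resolve a Location value, refusing an https-to-non-https redirect."""
    target = urljoin(current_url, location)
    if (urlsplit(current_url).scheme == "https"
            and urlsplit(target).scheme != "https"):
        raise ValueError("refusing redirect from TLS to non-TLS service")
    return target


print(resolve_redirect("https://tz.example.com/.well-known/timezone",
                       "/servlet/timezone"))
# https://tz.example.com/servlet/timezone
```

A relative Location keeps the https scheme of the original request, so
it passes; an absolute "http://" Location raises.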

   Time zone data servers SHOULD protect themselves against errant or
   malicious clients by throttling high request rates

I may have lost this battle, but "errant", in its traditional use,
means "itinerant" or "wandering", not "erroneous".  I think what you
really mean here is "buggy" (or maybe "badly written", but that's
harder to put into a document), and maybe you should say that instead.
But I'm not dying on that hill.  (Similarly for the second use of
"errant" later in the paragraph.)

   As such, servers MAY require HTTP-
   based authentication as per [RFC7235].

What's the point of this sentence?  You've already said that they "MAY
require some form of authentication", and, well, this is HTTP.  I'd
drop that sentence, because I think it's unnecessary and a
distraction.

-- Section 9 --

   3.  Always fetch and synchronize the entire set of time zone data to
       avoid leaking information about which time zones are actually in
       use by the client.

*** Really?  Wow.  That's sorta like saying that web clients should
fetch hundreds of randomly selected web sites along with the one
they're actually intending to retrieve, in order to hide what the user
is asking for.  It sounds like really bad advice to me.  If you're
using TLS, don't you have this taken care of that way?  Is it *really*
good advice for the client to fetch a lot of time zone data that it
doesn't and probably never will need?

Similarly for other advice: if the client follows (1), does it really
need to worry about (4)?  Or (5) or (6)?

I don't understand (7) at all: won't the client use authenticated HTTP
requests if and only if the server requires them?

What threat is (9) really trying to address, which isn't already
exposed by the fact that the server has the client's IP address and
might even be requiring the client to log in?

-- Section 10.1 --
*** This appears to be creating a registry, but you haven't told IANA
that, nor given it a name, nor told them where to put it.  I think you
should say something more like this:

NEW
IANA is asked to create a new top-level category called "Time Zone
Distribution Service (TZDIST) Parameters", and to put all the
registries created herein into that category.

IANA is asked to create a new registry called "TZDIST Service
Actions", as defined below.
END

-- Section 10.1.1 --
Nit: "Standards Track", not "Standard Track".

*** You're specifying a Standards Track RFC *and* review by a
designated expert.  Why do you need both?  Presumably, time zone
experts in the IETF (such as the participant whom the IESG might
designate) will be participating in the consensus process anyway.  Why
is it good to set up a situation wherein one individual is empowered
to override (rather than to participate in and influence) IETF
consensus?

Small thing: If you want a Standards Track RFC to be required, you
should say that the registration policy is "Standards Action", and
reference RFC 5226.  If you do *also* want a designated expert, you
should use "Standards Action and Expert Review", and you should
specify instructions for the designated expert.

*** The instructions to the DE should explain what the DE should be
considering, and when it's appropriate to decline a registration
request ("reject with cause", in your terminology).  It's not
acceptable to give a designated expert license to reject a request
without any guidance about when that's appropriate.

-- Section 10.2 --
I'm confused about what this is: isn't this giving more information
about the registry that's in 10.1 and its subsections?  Why is this
separate from that?  I think 10.2 should be deleted, and 10.2.1 should
become 10.1.3.

-- Section 10.3 --
It's always helpful to make things clear to IANA in the text, and not
require them to guess from the section titles:

NEW
IANA is asked to make the following registration in the "Well-Known
URIs" registry:
END

-- Section 10.4 --
Similarly here:

OLD
   This document registers two new service names as per [RFC6335].  Both
   are defined within this document.
NEW
   IANA is asked to add two new service names to the "Service Name
   and Transport Protocol Port Number Registry" [RFC6335], as
   defined below.
END

-- Section 10.6 --
And here:

OLD
   This document defines the following new iCalendar properties to be
   added to the registry defined in Section 8.2.3 of [RFC5545]:
NEW
   This document defines the following new iCalendar properties to be
   added to the "Properties" registry under "iCalendar Element
   Registries" [RFC5545]:
END

--
Barry, Applications AD