Re: [Tzdist] AD review of draft-ietf-tzdist-service-07 - Sections 3 - 4

Cyrus Daboo <cyrus@daboo.name> Fri, 08 May 2015 17:43 UTC

Date: Fri, 08 May 2015 13:43:26 -0400
From: Cyrus Daboo <cyrus@daboo.name>
To: Barry Leiba <barryleiba@computer.org>, draft-ietf-tzdist-service@ietf.org
Message-ID: <261532677658A4DDDF1A0BAA@cyrus.local>
In-Reply-To: <CALaySJKUcgkMNsFPk0X6ur-Fw0LrB0-miQvAKYJD2rMCEFpBSQ@mail.gmail.com>
References: <CALaySJKUcgkMNsFPk0X6ur-Fw0LrB0-miQvAKYJD2rMCEFpBSQ@mail.gmail.com>
Archived-At: <http://mailarchive.ietf.org/arch/msg/tzdist/yFB_fu7FPEg5DbrjiGs3IYa4Gb8>
Cc: tzdist@ietf.org
Subject: Re: [Tzdist] AD review of draft-ietf-tzdist-service-07 - Sections 3 - 4

Hi Barry,
Replying to comments on Sections 3 & 4 only.

Note: one area that needs more WG discussion here is the suggestion that 
there be a push mechanism for primary servers to notify secondary servers 
of changes, rather than require secondaries to poll once an hour.

--On May 7, 2015 at 9:57:05 AM +0100 Barry Leiba <barryleiba@computer.org> 
wrote:

> -- Section 3.3 --
> Nit: Change "including" to "included".

Fixed.

> -- Section 3.5 --
> Is this trying to say that each entire period that uses the same UTC
> offset is "an observance", and that the union of all observances
> defines the range of validity?  If that's right, you should improve on
> "Such periods of time are call observances," by saying it more like
> what I say above.

Re-worded.

> -- Section 3.9 --
>
>    two years in the past on into the future, as users typically create
>    only new events for the present and future.
>
> I think the "only" is misplaced: of course we create only new events,
> because it doesn't make sense to create existing events.  I think you
> mean "users typically create new events only for the present and
> future."  Yes?

Fixed.

>    might be concerned only with a smaller range into the future, and
>    data past that point might be redundant.
>
> Nit: I don't think "redundant" is the right word; I think
> "unnecessary" or "unused" is.

Fixed.

>    The truncation points at the start and end of a range are always a
>    UTC date-time value, with the start point being "inclusive" to the
>    overall range, and the end point being "exclusive" to the overall
>    range (i.e., the end value is just past the end of the last valid
>    value in the range).
>
> This is as good a choice as any, but I'm curious: why did you not make
> it inclusive on both ends?

Inclusive start/exclusive end is consistent with the DTSTART/DTEND 
property behavior used in iCalendar VEVENTs, and I felt it was better to 
maintain that consistency.
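
As a rough illustration (not text from the draft), the half-open convention can be sketched in Python as below. The function name and the sample dates are made up for the example; the point is only that the start instant belongs to the range and the end instant does not, so two adjacent ranges never overlap.

```python
from datetime import datetime, timezone


def in_range(instant, start, end):
    """Return True if instant lies in the half-open range [start, end)."""
    return start <= instant < end


# Hypothetical truncation points, both expressed in UTC.
start = datetime(2015, 1, 1, tzinfo=timezone.utc)
end = datetime(2016, 1, 1, tzinfo=timezone.utc)

# The start point is inside the range; the end point is just past it.
start_is_inside = in_range(start, start, end)
end_is_inside = in_range(end, start, end)
```

With this convention a following range that starts at the same instant as `end` picks up exactly where this one stops.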

> -- Section 4.1 --
>
>    Most security considerations are already handled adequately by HTTP.
>    However, given the nature of the data being transferred and the
>    requirement it be correct, all interactions between client and server
>    SHOULD use an HTTP connection protected with TLS [RFC5246] as defined
>    in [RFC2818].
>
> When one reads the definitions above about providers (which refer to
> "servers"), this appears to be recommending TLS only between providers
> and clients, and not between providers and providers (even though the
> provider making the request has the client role in that transaction).
> I *hope* you're meaning to recommend use of TLS always (if not, please
> explain why not), and I see no value in qualifying it as you do.  So I
> suggest removing "between client and server".

Actually I think I am going to remove that last sentence and instead just 
point to both the Security and Privacy sections which cover all the details 
(including the point that secondary servers MUST use TLS when fetching from 
primary servers).

> -- Section 4.1.4 --
> I find the lengthy discussion confusing, as it doesn't seem to get to
> the point(s) it's making directly.  As I understand it, the key
> points are these:
>
> 1. When the time zone list is sent, it has (a) an opaque token that
> can be used to subset the list, and (b) an ETag value for each time
> zone, which can be used to determine whether some data about that time
> zone has actually changed.
>
> 2. The most efficient way to update one's list is to (a) get a new
> list, using the old opaque token, (b) save the new opaque token for
> next time, (c) check each time zone in the returned list to see if its
> ETag value has changed, and (d) only request new time zone data for
> each time zone whose ETag value has changed.
>
> 3. Note that ETag values can sometimes change because of metadata
> changes that don't matter to the requester.  That will cause extra
> fetching, but can't be avoided.  (An example, with a monolithic
> publisher, is given.)
>
> It would help, I think, if those key points are presented more that
> way, and less as a kind of rambling discussion that doesn't seem to
> have focus.  What do you think?

A fair bit of work did go into that section, but a more "procedural" 
description might be clearer. I'll see what I can do and others can review 
it.
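
A minimal sketch of the procedure in Barry's three points might look like the following. It assumes the list response has already been parsed into a dictionary with a "synctoken" member and a "timezones" array whose entries carry "tzid" and "etag"; those member names and the function name are assumptions for illustration, not a statement of what the draft requires. The optional `wanted` set models a client that only tracks a few time zones.

```python
def plan_update(local_etags, list_response, wanted=None):
    """Return (new_token, tzids_to_fetch) for one synchronization pass.

    local_etags   -- dict mapping tzid to the ETag stored from the last fetch
    list_response -- parsed list response obtained with the old opaque token
    wanted        -- optional set of tzids the client cares about
    """
    to_fetch = []
    for tz in list_response["timezones"]:
        tzid = tz["tzid"]
        if wanted is not None and tzid not in wanted:
            continue  # a changed zone we do not track needs no follow-up
        if local_etags.get(tzid) != tz["etag"]:
            to_fetch.append(tzid)  # new zone, or its ETag has changed
    # The new token is saved and sent with the next list request.
    return list_response["synctoken"], to_fetch
```

Note that an ETag change caused only by metadata still lands in the fetch list; as point 3 says, that extra fetch can't be avoided.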

>    Clients SHOULD poll for changes, using an appropriate conditional
>    request, at least once a day.  A server acting as a secondary
>    provider, caching time zone data from another server, SHOULD poll for
>    changes once per hour.
>
> *** This really sounds excessive, so please talk with me about it (I'm
> sure it was discussed in the working group, and I probably just need
> to understand better).

In some places, the time zone definitions (specifically when daylight saving 
time transitions occur) change with very little notice at the whim of 
politicians (e.g., a change to a rule can occur just a few days before the 
changed transition comes into effect). That gives very little time for 
clients and users to correct problems caused by the unexpected transition 
(e.g., deal with events that may now be double-booked). So we want clients 
to quickly recognize any change that impacts them. Similarly, we want 
secondary servers to update even more quickly than that to reduce the 
overall latency for a client making use of a secondary server.

It is true that in some parts of the world, time zone definitions are 
relatively stable, and thus polling once a day would seem over the top, but 
there is no easy way to tell which definitions fall into that category. I 
guess a server could look at the history of each time zone and assign a 
"half-life" to it (e.g., the probability that it might change within the 
next month, based on historical trends) and clients could then use that to 
adjust to longer/shorter polling for those time zones they care most about, 
but that is quite tricky.

> For the first sentence, it would seem to me that a client that's
> keeping the time zone information only for itself would (1) only need
> to check for changes for those time zones it cares about, and (2)
> almost never actually need to retrieve changes -- certainly not
> anywhere close to daily.  My calendar, which is quite a complicated
> one, has entries rooted in maybe six time zones (besides UTC, which
> should never change).  And in any case, I can't see how this is a 2119
> SHOULD, as it doesn't affect anything except user experience.

First off, the IANA data does not change every day, so most of the time the 
client request (with ?changedsince) will result in nothing being returned. 
When there is a change, and it is to time zones a client does not care 
about, then there will be no follow-up requests to download data either. So 
for you, your client would be polling daily, and, say, once a month would 
see an actual change, and in all likelihood would have nothing to update.

> For the second sentence, "SHOULD" is appropriate... but every HOUR,
> really?  I understand the issue that some time zone information might
> change with only a few hours' notice, but I can't imagine that that
> happens very often, and does that really make it worth having all
> secondary providers poll hourly?  Wouldn't it have been better to
> have a means where secondaries can register with primaries and have
> them push changes, perhaps with far-less-frequent polling as a backup?

Well, a push mechanism might be nice - I don't think we ever considered that 
for secondaries. It might be worth having the WG revisit that. What do 
others think?

> -- Section 4.2.1.3 --
>
>    Servers SHOULD set an appropriate Cache-Control header field value
>    (as per Section 5.2 of [RFC7234]) in the redirect response to ensure
>    caching occurs as needed, or as required by the type of response
>    generated.  For example, if it is anticipated that the location of
>    the redirect might change over time, then a "no-cache" value would be
>    used.
>
> *** For that last sentence: You really want to recommend "no-cache",
> rather than "an appropriate max-age value"?  Do you really want
> clients going back to discovery for *every request*, just to leave
> open the possibility of changing the location?

Agreed, "max-age" seems a better choice. I will make that change.
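
To make the difference concrete, here is a hedged sketch of how a client might decide how long to reuse a cached redirect from its Cache-Control value. The function name is hypothetical and the parsing is deliberately simplified (it ignores quoted values and the other directives in RFC 7234); treating a missing directive as "do not reuse" is a conservative assumption of this example, not something the draft specifies.

```python
import re


def redirect_lifetime(cache_control):
    """Return the number of seconds a cached redirect may be reused.

    A result of 0 means the client goes back to discovery before reusing it.
    """
    directives = [d.strip().lower() for d in cache_control.split(",")]
    if "no-cache" in directives or "no-store" in directives:
        return 0
    for directive in directives:
        match = re.fullmatch(r"max-age=(\d+)", directive)
        if match:
            return int(match.group(1))
    return 0  # assumption: with no explicit lifetime, do not reuse
```

With "no-cache" every request would start again at discovery, whereas a value such as "max-age=86400" lets the client keep using the redirect target for a day and still pick up a relocation afterwards.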

>    To facilitate "context path's" that might differ from user to user,
>    the server MAY require authentication when a client tries to access
>    the ".well-known" URI
>
> Given that this possibility exists, it might be worth mentioning it in
> 4.2.1.2, saying that if the server's context paths (no apostrophe,
> please) can differ from user to user, TXT RR is not an appropriate
> mechanism, and the server needs to use .well-known.

Actually TXT RR is just as viable - the server will simply require 
authentication on the path the TXT RR points to. So I think the fix here is 
to clone the text you quoted for .well-known for use in the TXT RR 
section. I will make that change.

> It appears that .well-known always has to work, even when there is a
> TXT RR that, as 4.2.1.2 says, MUST be used.  It would be good to say
> that explicitly.

I will add that.

> *** I do find the combination of mechanisms interesting, though: I
> MUST do an extra query for a TXT RR, which might or might not be
> there.  If it's not there, I've wasted a query, and then I use an HTTP
> request to .well-known, which I *know* will always work.  Clearly, the
> query for TXT RR is lighter weight.  But if RTT is my concern...

TXT RR was added in RFC6764 after quite some debate with various IETF DNS 
experts who were quite insistent that it was not appropriate to rely on 
.well-known alone. I think one argument was that it would be possible to 
have multiple TXT RRs for the same SRV to indicate multiple possible paths 
to use on the server. I never really got what the benefits of that were, but 
deferred to the experts.

As regards RTTs - typically the DNS server will send the matching TXT as 
additional data in the response to the SRV query, so the TXT will be in the 
client's cache, avoiding the need for another DNS query.
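
As a sketch of how the two mechanisms combine on the client side, the function below builds the initial URI from an already-resolved SRV target and port plus whatever TXT strings came back with it. No DNS lookup is performed here. The "path" key follows the RFC 6764 convention; the ".well-known" path of "/.well-known/timezone", the function name, and the host names in the usage note are assumptions for illustration only.

```python
def initial_uri(host, port, txt_strings, secure=True):
    """Build the first URI a client would try after SRV/TXT lookup.

    host, port  -- target and port taken from the SRV record
    txt_strings -- list of "key=value" strings from the TXT record (may be empty)
    """
    scheme = "https" if secure else "http"
    path = "/.well-known/timezone"  # assumed fallback when no TXT path exists
    for entry in txt_strings:
        key, sep, value = entry.partition("=")
        if sep and key == "path" and value:
            path = value  # a TXT-supplied context path takes precedence
            break
    default_port = 443 if secure else 80
    netloc = host if port == default_port else f"{host}:{port}"
    return f"{scheme}://{netloc}{path}"
```

For example, `initial_uri("tz.example.com", 443, [])` falls back to the ".well-known" URI, while passing `["path=/servlet/timezone"]` goes straight to that context path. Either way the server can still demand authentication on the resulting resource.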

> -- Section 4.2.1.3.1 --
>
>    the server would issue an HTTP 301 redirect
>    response with a Location response header field using the path
>    "/servlet/timezone".  The client would then "follow" this redirect to
>    the new resource and continue making HTTP requests there.
>
> I would add a sentence to the end of this, such as this:
>
> ADD
> The client would also cache the redirect information -- subject to any
> Cache-Control directive -- for use in subsequent requests.
> END

Fixed.

> -- Section 4.2.2.1 --
>
>    The client would preserve the returned
>    opaque token for subsequent use.
>
> Quite.  But it isn't until section 6.2 that you finally tell me how
> the token is returned (and you have an example in 5.2.1).  I suggest
> saying that here -- something like this:
>
> NEW
>    The client would preserve the returned
>    opaque token (see "synctoken" in the example in Section 5.2.1)
>    for subsequent use.
> END

Fixed.

-- 
Cyrus Daboo