[rtcweb] Data on travel times

Eric Rescorla <ekr@rtfm.com> Mon, 09 April 2012 13:58 UTC

From: Eric Rescorla <ekr@rtfm.com>
Date: Mon, 09 Apr 2012 06:57:39 -0700
Message-ID: <CABcZeBPDpguge1zT5JyDk+tohMn1_av4jgdgDhNLnXMFKNzcbg@mail.gmail.com>
To: rtcweb@ietf.org, public-webrtc@w3.org

Hi folks,

Since it seems like we're going to be having a large number of
interims, I thought it might be instructive to try to analyze a bunch
of different locations to figure out the best strategy. My first cut
analysis is below.

Note that I'm not trying to make any claims about what the best set of
venues is. It's obviously easy to figure out any statistic we want
about each proposed venue, but how you map that data to "best" is up
to you. In particular, there's some tradeoff between minimal total
travel time and a "fair" distribution of travel times (not that I
claim to know what that means).


METHODOLOGY
The data below is derived by treating both people and venues as
airport locations and using travel time as our primary instrument.

1. For each respondent to the current Doodle poll, assign a home
   airport based on their draft publication history.  We're missing a
   few people, but the list should be close to complete. Since
   these people responded before the venue was known, the sample is
   at least somewhat unbiased.

2. Find the shortest flight between each home airport and each venue
   airport by looking at the shortest advertised Kayak flights around
   one of the proposed interim dates (6/10 - 6/13), ignoring price
   but excluding "Hacker fares".
   [Thanks to Martin Thomson for helping me gather these.]

This lets us compute statistics for any venue and/or combination
of venues, based on the candidate attendee list.

The three proposed venues:

- San Francisco (SFO)
- Boston (BOS)
- Stockholm (ARN)

Three hubs not too distant from the proposed venues:

- London (LHR)
- Frankfurt (FRA)
- New York (NYC) [0]

Also, Calgary (YYC), since the other two chair locations (BOS and SFO)
were already proposed as venues, and I didn't want Cullen to feel
left out.


RESULTS
Here are the results for each of the above venues, measured in total
hours of travel (i.e., round trip).

Venue         Mean         Median           SD
----------------------------------------------
SFO           13.5             11         12.2
BOS           12.3             11          7.5
ARN           17.0             21         10.7
FRA           14.8             17          7.3
LHR           13.3             14          7.5
NYC           11.5             11          5.8
YYC           14.9             13         10.2
SFO/BOS/ARN   14.3             13          3.6
SFO/NYC/LHR   12.7             11.3        3.7

XXX/YYY/ZZZ denotes a three-way rotation of XXX, YYY, and ZZZ. Obviously, mean
and median are intended to be some sort of aggregate measure of travel
time. I don't have any way to measure "fairness", but SD is intended
as some metric of the variation in travel time between attendees.
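One plausible reading of the rotation rows, sketched below, is that
each attendee's travel time is averaged over the venues in the
rotation, and the mean, median, and SD are then taken across
attendees. That is an assumption rather than something stated here,
although the rotation means in the table do match the average of the
three single-venue means to within rounding. The attendee names and
hours are hypothetical.

```python
from statistics import mean, median, pstdev

# Hypothetical round-trip hours per attendee and venue (illustrative only).
round_trip = {
    "alice": {"SFO": 0.0, "NYC": 12.0, "LHR": 21.0},
    "bob": {"SFO": 12.0, "NYC": 3.0, "LHR": 15.0},
    "carol": {"SFO": 24.0, "NYC": 18.0, "LHR": 6.0},
}


def rotation_stats(venues, table):
    """Average each attendee over the rotation, then summarize attendees."""
    per_attendee = [
        mean(times[v] for v in venues) for times in table.values()
    ]
    return {
        "mean": mean(per_attendee),
        "median": median(per_attendee),
        "sd": pstdev(per_attendee),
    }


# A single venue is just a rotation of length one.
nyc_only = rotation_stats(["NYC"], round_trip)
rotation = rotation_stats(["SFO", "NYC", "LHR"], round_trip)
print(nyc_only)
print(rotation)
```

With these made-up numbers the rotation has a lower SD than the single
venue, because each attendee's long trips are offset by short ones.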

The raw data and software are attached. The files are:

  home-airports     -- the list of people's home airports
  durations.txt     -- the list of airport-airport durations
  doodle.txt        -- the attendees list
  pairings.py       -- the software to compute travel times
  doodle-out.txt    -- the computed travel times for each attendee

Obviously, there could be an error in the raw data or the software.
Please feel free to send corrections, especially if you find
something material.


OBSERVATIONS
Obviously, it's hard to know what the optimal solution is without
some model for optimality, but we can still make some observations
based on this data:

1. If we're just concerned with minimizing total travel time, then we
would always meet in New York, since it has both the shortest mean
travel time and the shortest median travel time. As I said above,
though, this arguably isn't fair to people who live in either Europe
or California, since they always have to travel.

2. Combining West Coast, East Coast, and European venues yields
mean/median values comparable to (or at least not too much worse than)
NYC's, with much lower SDs. So, arguably that kind of mix is more fair.

3. There's a pretty substantial difference between hub and non-hub
venues. In particular, LHR has a median travel time 7 hours less than
ARN, and the SFO/NYC/LHR combination has a median/mean travel time
about 2 hours less than SFO/BOS/ARN (primarily accounted for by the
LHR/ARN difference). [Full disclosure, I've favored Star Alliance hubs
here, but you'd probably get similar results if, for instance, you
used AMS instead of LHR.]
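The differences quoted in item 3 can be checked directly against the
RESULTS table. A small sketch, using only values copied from that
table (the dictionary layout and the saving() helper are mine):

```python
# Round-trip hours copied from the RESULTS table.
results = {
    "ARN": {"mean": 17.0, "median": 21.0},
    "LHR": {"mean": 13.3, "median": 14.0},
    "SFO/BOS/ARN": {"mean": 14.3, "median": 13.0},
    "SFO/NYC/LHR": {"mean": 12.7, "median": 11.3},
}


def saving(slower, faster, stat):
    """Hours saved on the given statistic, rounded to one decimal place."""
    return round(results[slower][stat] - results[faster][stat], 1)


hub_median_saving = saving("ARN", "LHR", "median")
rotation_mean_saving = saving("SFO/BOS/ARN", "SFO/NYC/LHR", "mean")
rotation_median_saving = saving("SFO/BOS/ARN", "SFO/NYC/LHR", "median")

print(hub_median_saving)       # 7.0
print(rotation_mean_saving)    # 1.6
print(rotation_median_saving)  # 1.7
```

So "about 2 hours" is 1.6 hours on the mean and 1.7 hours on the
median.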


Obviously, your mileage may vary based on your location and feelings
about what's fair, but based on this data, it looks to me like a
three-way rotation between West Coast, East Coast, and European hubs
offers a good compromise between minimum cost and a flat distribution
of travel times.

Personally, whatever we decide to do, I'd ask that the WG settle now on
a pattern going forward so that we can predictably budget our travel
time and dollars.


[0] Treating all three NYC airports as a single location.