Re: [rtcweb] Data on travel times
Eric Rescorla <ekr@rtfm.com> Mon, 09 April 2012 16:03 UTC
To: Marshall Eubanks <marshall.eubanks@gmail.com>
Cc: rtcweb@ietf.org, public-webrtc@w3.org

On Mon, Apr 9, 2012 at 8:35 AM, Marshall Eubanks <marshall.eubanks@gmail.com> wrote:
> I really like this analysis. Some questions.
>
> 2012/4/9 Eric Rescorla <ekr@rtfm.com>:
>> Hi folks,
>>
>> Since it seems like we're going to be having a large number of
>> interims, I thought it might be instructive to try to analyze a bunch
>> of different locations to figure out the best strategy. My first cut
>> analysis is below.
>>
>> Note that I'm not trying to make any claims about what the best set of
>> venues is. It's obviously easy to figure out any statistic we want
>> about each proposed venue, but how you map that data to "best" is up
>> to you. In particular, there's some tradeoff between minimal total
>> travel time and a "fair" distribution of travel times (not that I
>> claim to know what that means).
>>
>>
>> METHODOLOGY
>> The data below is derived by treating both people and venues as
>> airport locations and using travel time as our primary instrument.
>>
>> 1. For each responder for the current Doodle poll, assign a home
>>    airport based on their draft publication history. We're missing a
>>    few people but basically it should be pretty complete. Since
>>    these people responded before the venue is known, it's at
>>    least somewhat unbiased.
>>
>> 2. Compute the shortest advertised flight between each home airport
>>    and the locations for each venue by looking at the shortest
>>    advertised Kayak flights around one of the proposed interim
>>    dates (6/10 - 6/13), ignoring price, but excluding "Hacker fares".
>>    [Thanks to Martin Thomson for helping me gather these.]
>>
>
> 1.) Why are some fields doubled? I.e.,
>
> ARN SFO 14 13
>
> Are these counted twice? That would, of course, give more weight to
> those records.

Laziness. When I started recording flight times, I used the total time.
Later I realized that what I wanted was to break them out into outbound
and return legs, but I was too lazy to go back and fix the earlier ones.

> 2.) At any rate, I couldn't quite match your numbers. For SFO, for
> example, I got
>
> # SFO
>
> Records      29 |
> Mean      12.52 |
> RMS       15.34 |
> Std Dev    8.55 |
> Minimum    1.00 |
> Maximum   34.00 |
>
> This assumes that each doubled entry counts as 2 separate entries. If
> the second entries are ignored, I get

I'm not sure what procedure you are following here, but if it's taking
the SD of the data in durations.txt, that's not what I did. That's just
the input data. The summary data that I am showing is produced by
weighting by the number of participants from each home airport. The
script that generates it is pairings.py and the results are in
doodle-out.txt.

Of course, it could still all be wrong.

FWIW, I'm using R's sd(), which uses n-1.

-Ekr

> # SFO
>
> Records      21 |
> Mean      14.05 |
> RMS       17.05 |
> Std Dev    9.14 |
> Minimum    1.00 |
> Maximum   34.00 |
>
> If two entries are averaged together (when present)
>
> # SFO
> Records      21 |
> Mean      13.93 |
> RMS       16.97 |
> Std Dev    9.18 |
> Minimum    1.00 |
> Maximum   34.00 |
>
> None of these 3 options match your
>
> Venue          Mean   Median   SD
> ----------------------------------------------
> SFO            13.5   11       12.2
>
> In particular, your SD value seems high.
>
> (Note, I use the SD = root mean square /(n-1) not / n convention, but
> that won't explain the difference.)
>
> Regards
> Marshall
>
>
>> This lets us compute statistics for any venue and/or combination
>> of venues, based on the candidate attendee list.
>>
>> The three proposed venues:
>>
>> - San Francisco (SFO)
>> - Boston (BOS)
>> - Stockholm (ARN)
>>
>> Three hubs not too distant from the proposed venues:
>>
>> - London (LHR)
>> - Frankfurt (FRA)
>> - New York (NYC) [0]
>>
>> Also, Calgary (YYC), since the other two chair locations (BOS and SFO)
>> were already proposed as venues, and I didn't want Cullen to feel
>> left out.
>>
>>
>> RESULTS
>> Here are the results for each of the above venues, measured in total
>> hours of travel (i.e., round trip).
>>
>> Venue          Mean   Median   SD
>> ----------------------------------------------
>> SFO            13.5   11       12.2
>> BOS            12.3   11        7.5
>> ARN            17.0   21       10.7
>> FRA            14.8   17        7.3
>> LHR            13.3   14        7.5
>> NYC            11.5   11        5.8
>> YYC            14.9   13       10.2
>> SFO/BOS/ARN    14.3   13        3.6
>> SFO/NYC/LHR    12.7   11.3      3.7
>>
>> XXX/YYY/ZZZ denotes a three-way rotation of XXX, YYY, and ZZZ.
>> Obviously, mean and median are intended to be some sort of aggregate
>> measure of travel time. I don't have any way to measure "fairness",
>> but SD is intended as some metric of the variation in travel time
>> between attendees.
>>
>> The raw data and software are attached. The files are:
>>
>> home-airports  -- the list of people's home airports
>> durations.txt  -- the list of airport-airport durations
>> doodle.txt     -- the attendees list
>> pairings.py    -- the software to compute travel times
>> doodle-out.txt -- the computed travel times for each attendee
>>
>> Obviously, there could be an error in the raw data or the software.
>> Please feel free to send corrections, especially if you find
>> something material.
>>
>>
>> OBSERVATIONS
>> Obviously, it's hard to know what the optimal solution is without
>> some model for optimality, but we can still make some observations
>> based on this data:
>>
>> 1. If we're just concerned with minimizing total travel time, then we
>> would always meet in New York, since it has both the shortest mean
>> travel time and the shortest median travel time, but as I said above,
>> this arguably isn't fair to people who live either in Europe or
>> California, since they always have to travel.
>>
>> 2. Combining West Coast, East Coast, and European venues has
>> comparable (or at least not too much worse) mean/median values than
>> NYC with much lower SDs. So, arguably that kind of mix is more fair.
>>
>> 3. There's a pretty substantial difference between hub and non-hub
>> venues. In particular, LHR has a median travel time 7 hours less than
>> ARN, and the SFO/NYC/LHR combination has a median/mean travel time
>> about 2 hours less than SFO/BOS/ARN (primarily accounted for by the
>> LHR/ARN difference). [Full disclosure, I've favored Star Alliance hubs
>> here, but you'd probably get similar results if, for instance, you
>> used AMS instead of LHR.]
>>
>>
>> Obviously, your mileage may vary based on your location and feelings
>> about what's fair, but based on this data, it looks to me like a
>> three-way rotation between West Coast, East Coast, and European hubs
>> offers a good compromise between minimum cost and a flat distribution
>> of travel times.
>>
>> Personally, whatever we decide to do I'd ask that the WG settle now on
>> a pattern going forward so that we can predictably budget our travel
>> time and dollars.
>>
>>
>> [0] Treating all three NYC airports as a single location.
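
For anyone trying to reproduce the summary numbers, here is a minimal
sketch of the weighting described above: one travel time per attendee
(so a home airport with three attendees counts three times), then mean,
median, and the n-1 standard deviation over that per-attendee list. This
is not pairings.py. The names DURATIONS, ATTENDEES, per_attendee_hours,
and summarize are hypothetical, the durations are made up rather than
taken from durations.txt, and treating a rotation as each attendee's
average over its venues is an assumption about how the rotation rows
were computed.

```python
from statistics import mean, median, stdev

# Hypothetical round-trip durations in hours, keyed by (home, venue).
# Illustrative values only; these are NOT the contents of durations.txt.
DURATIONS = {
    ("SFO", "SFO"): 1.0, ("SFO", "NYC"): 11.0, ("SFO", "LHR"): 21.0,
    ("NYC", "SFO"): 12.0, ("NYC", "NYC"): 1.0, ("NYC", "LHR"): 14.0,
    ("LHR", "SFO"): 22.0, ("LHR", "NYC"): 15.0, ("LHR", "LHR"): 1.0,
}

# Hypothetical attendee list: one home airport per attendee, so an
# airport with several attendees appears several times.
ATTENDEES = ["SFO", "SFO", "SFO", "NYC", "LHR", "LHR"]


def per_attendee_hours(venues, attendees=ATTENDEES, durations=DURATIONS):
    """Return one value per attendee: hours averaged over the venues.

    With a single venue this is just that attendee's round-trip time.
    Averaging over a rotation is an assumption, not the source's method.
    """
    return [
        mean(durations[(home, venue)] for venue in venues)
        for home in attendees
    ]


def summarize(venues):
    """Mean, median, and sample SD (n-1, like R's sd()) across attendees."""
    hours = per_attendee_hours(venues)
    return mean(hours), median(hours), stdev(hours)


if __name__ == "__main__":
    for venues in (["SFO"], ["NYC"], ["LHR"], ["SFO", "NYC", "LHR"]):
        m, med, sd = summarize(venues)
        print(f"{'/'.join(venues):12s} {m:6.1f} {med:6.1f} {sd:6.1f}")
```

Computing the SD over the distinct rows of the input table, as opposed to
over the per-attendee list, gives a different answer whenever attendees
are unevenly spread across home airports, which may account for at least
part of the mismatch discussed above.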