Re: [rtcweb] Data on travel times
Marshall Eubanks <marshall.eubanks@gmail.com> Mon, 09 April 2012 15:36 UTC
In-Reply-To: <CABcZeBPDpguge1zT5JyDk+tohMn1_av4jgdgDhNLnXMFKNzcbg@mail.gmail.com>
References: <CABcZeBPDpguge1zT5JyDk+tohMn1_av4jgdgDhNLnXMFKNzcbg@mail.gmail.com>
Date: Mon, 09 Apr 2012 11:35:59 -0400
Message-ID: <CAJNg7VLfrn_SkTXHQYmR52NP5sxpO-03swiC4RBSDpwgOOt6cg@mail.gmail.com>
From: Marshall Eubanks <marshall.eubanks@gmail.com>
To: Eric Rescorla <ekr@rtfm.com>
Cc: rtcweb@ietf.org, public-webrtc@w3.org
Subject: Re: [rtcweb] Data on travel times
I really like this analysis. Some questions.

2012/4/9 Eric Rescorla <ekr@rtfm.com>:
> Hi folks,
>
> Since it seems like we're going to be having a large number of
> interims, I thought it might be instructive to try to analyze a bunch
> of different locations to figure out the best strategy. My first cut
> analysis is below.
>
> Note that I'm not trying to make any claims about what the best set of
> venues is. It's obviously easy to figure out any statistic we want
> about each proposed venue, but how you map that data to "best" is up
> to you. In particular, there's some tradeoff between minimal total
> travel time and a "fair" distribution of travel times (not that I
> claim to know what that means).
>
>
> METHODOLOGY
> The data below is derived by treating both people and venues as
> airport locations and using travel time as our primary instrument.
>
> 1. For each responder for the current Doodle poll, assign a home
>    airport based on their draft publication history. We're missing a
>    few people but basically it should be pretty complete. Since
>    these people responded before the venue is known, it's at
>    least somewhat unbiased.
>
> 2. Compute the shortest advertised flight between each home airport
>    and the locations for each venue by looking at the shortest
>    advertised Kayak flights around one of the proposed interim
>    dates (6/10 - 6/13), ignoring price, but excluding "Hacker fares".
>    [Thanks to Martin Thomson for helping me gather these.]
>

1.) Why are some fields doubled? I.e.,

ARN SFO 14 13

Are these counted twice? That would, of course, give more weight to
those records.

2.) At any rate, I couldn't quite match your numbers. For SFO, for
example, I got

# SFO Records 29 | Mean 12.52 | RMS 15.34 | Std Dev 8.55 | Minimum 1.00 | Maximum 34.00 |

This assumes that each doubled entry counts as 2 separate entries.
If the second entries are ignored, I get

# SFO Records 21 | Mean 14.05 | RMS 17.05 | Std Dev 9.14 | Minimum 1.00 | Maximum 34.00 |

If the two entries are averaged together (when present), I get

# SFO Records 21 | Mean 13.93 | RMS 16.97 | Std Dev 9.18 | Minimum 1.00 | Maximum 34.00 |

None of these 3 options matches your

Venue            Mean     Median      SD
----------------------------------------------
SFO              13.5     11          12.2

In particular, your SD value seems high. (Note: I use the n-1 rather
than the n convention in the SD denominator, but that won't explain
the difference.)

Regards
Marshall

> This lets us compute statistics for any venue and/or combination
> of venues, based on the candidate attendee list.
>
> The three proposed venues:
>
> - San Francisco (SFO)
> - Boston (BOS)
> - Stockholm (ARN)
>
> Three hubs not too distant from the proposed venues:
>
> - London (LHR)
> - Frankfurt (FRA)
> - New York (NYC) [0]
>
> Also, Calgary (YYC), since the other two chair locations (BOS and SFO)
> were already proposed as venues, and I didn't want Cullen to feel
> left out.
>
>
> RESULTS
> Here are the results for each of the above venues, measured in total
> hours of travel (i.e., round trip).
>
> Venue            Mean     Median      SD
> ----------------------------------------------
> SFO              13.5     11          12.2
> BOS              12.3     11          7.5
> ARN              17.0     21          10.7
> FRA              14.8     17          7.3
> LHR              13.3     14          7.5
> NYC              11.5     11          5.8
> YYC              14.9     13          10.2
> SFO/BOS/ARN      14.3     13          3.6
> SFO/NYC/LHR      12.7     11.3        3.7
>
> XXX/YYY/ZZZ denotes a three-way rotation of XXX, YYY, and ZZZ.
> Obviously, mean and median are intended to be some sort of aggregate
> measure of travel time. I don't have any way to measure "fairness",
> but SD is intended as some metric of the variation in travel time
> between attendees.
>
> The raw data and software are attached.
> The files are:
>
> home-airports -- the list of people's home airports
> durations.txt -- the list of airport-airport durations
> doodle.txt -- the attendees list
> pairings.py -- the software to compute travel times
> doodle-out.txt -- the computed travel times for each attendee
>
> Obviously, there could be an error in the raw data or the software.
> Please feel free to send corrections, especially if you find
> something material.
>
>
> OBSERVATIONS
> Obviously, it's hard to know what the optimal solution is without
> some model for optimality, but we can still make some observations
> based on this data:
>
> 1. If we're just concerned with minimizing total travel time, then we
> would always meet in New York, since it has both the shortest mean
> travel time and the shortest median travel time, but as I said above,
> this arguably isn't fair to people who live either in Europe or
> California, since they always have to travel.
>
> 2. Combining West Coast, East Coast, and European venues has
> comparable (or at least not too much worse) mean/median values than
> NYC with much lower SDs. So, arguably that kind of mix is more fair.
>
> 3. There's a pretty substantial difference between hub and non-hub
> venues. In particular, LHR has a median travel time 7 hours less than
> ARN, and the SFO/NYC/LHR combination has a median/mean travel time
> about 2 hours less than SFO/BOS/ARN (primarily accounted for by the
> LHR/ARN difference). [Full disclosure, I've favored Star Alliance hubs
> here, but you'd probably get similar results if, for instance, you
> used AMS instead of LHR.]
>
>
> Obviously, your mileage may vary based on your location and feelings
> about what's fair, but based on this data, it looks to me like a
> three-way rotation between West Coast, East Coast, and European hubs
> offers a good compromise between minimum cost and a flat distribution
> of travel times.
>
> Personally, whatever we decide to do I'd ask that the WG settle now on
> a pattern going forward so that we can predictably budget our travel
> time and dollars.
>
>
> [0] Treating all three NYC airports as a single location.
>
> _______________________________________________
> rtcweb mailing list
> rtcweb@ietf.org
> https://www.ietf.org/mailman/listinfo/rtcweb
>
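
For anyone who wants to reproduce the comparison above, here is a minimal Python sketch of the three ways of handling doubled entries (count each value separately, keep only the first, or average them) together with the summary statistics reported. It is not the attached pairings.py. The record layout, the function names `flatten` and `summarize`, and the sample durations are all assumptions made up for illustration, not values from durations.txt.

```python
import math
from statistics import mean, pstdev, stdev

# Hypothetical records: (home airport, venue, advertised durations in hours).
# A "doubled" entry such as "ARN SFO 14 13" carries two durations.
# These values are invented for illustration only.
records = [
    ("ARN", "SFO", [14, 13]),
    ("BOS", "SFO", [6]),
    ("LHR", "SFO", [11, 12]),
    ("SJC", "SFO", [1]),
]


def flatten(records, mode):
    """Turn records into a flat list of durations under one of three policies."""
    values = []
    for _home, _venue, hours in records:
        if mode == "separate":    # each doubled value counts as its own record
            values.extend(hours)
        elif mode == "first":     # ignore the second value when present
            values.append(hours[0])
        elif mode == "average":   # average the two values when present
            values.append(sum(hours) / len(hours))
        else:
            raise ValueError(f"unknown mode: {mode}")
    return values


def summarize(values):
    """Return the statistics quoted in the message for one list of durations."""
    n = len(values)
    return {
        "records": n,
        "mean": mean(values),
        "rms": math.sqrt(sum(v * v for v in values) / n),
        "sd_sample": stdev(values),       # n-1 in the denominator
        "sd_population": pstdev(values),  # n in the denominator
        "min": min(values),
        "max": max(values),
    }


if __name__ == "__main__":
    for mode in ("separate", "first", "average"):
        print(mode, summarize(flatten(records, mode)))
```

One consistency check that follows from the definitions: RMS squared equals the mean squared plus the population (divide-by-n) variance, so a reported mean, RMS, and SD can be cross-checked against each other without the raw data.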
- [rtcweb] Data on travel times Eric Rescorla
- Re: [rtcweb] Data on travel times Marshall Eubanks
- Re: [rtcweb] Data on travel times Eric Rescorla
- Re: [rtcweb] Data on travel times Igor Faynberg