Re: Google Scholar, was How to pay $47 for a copy of RFC 793

John C Klensin <john-ietf@jck.com> Tue, 10 May 2011 20:08 UTC

Date: Tue, 10 May 2011 16:08:41 -0400
From: John C Klensin <john-ietf@jck.com>
To: Harald Alvestrand <harald@alvestrand.no>, Paul Hoffman <paul.hoffman@vpnc.org>
Subject: Re: Google Scholar, was How to pay $47 for a copy of RFC 793
Message-ID: <457217AD26982E40EEF2C057@PST.JCK.COM>
In-Reply-To: <4DC9824C.2070109@alvestrand.no>
References: <20110510152851.40727.qmail@joyce.lan> <4DC95CBE.60304@alvestrand.no> <1C26E7D5-1810-4B13-B51B-A1220121531F@vpnc.org> <4DC9824C.2070109@alvestrand.no>
Cc: John Levine <johnl@iecc.com>, ietf@ietf.org

--On Tuesday, May 10, 2011 20:22 +0200 Harald Alvestrand
<harald@alvestrand.no> wrote:

>> If only there was someone who worked at Google on this list
>> who could send an internal message to get this rectified....
>> :-)
>  From what I could tell from the instructions, Scholar is
> using some heuristics to figure out that "this is a paper" and
> "this is not a paper". The highest one on the list was a
> 3-slide presentation that really didn't say very much - I
> think this is one where heuristics had failed.

> I think someone at the site could help them a lot more.

Harald, 

I'm not sure what you mean by "someone at the site".  Certainly,
various of us could explain to them why the series should be
more comprehensively indexed.  But with Maps as a notable
exception, I've found that suggesting that a particular
heuristic is failing, or that something should have been indexed
that isn't, is most likely to get a response whose essence is
that the Google folks and their algorithms are ever so much
smarter than us lusers, so what could we possibly know?

Of course, my personal heuristic, and that of many folks I know
who use Scholar much more intensely than I do, is that if a
Scholar search fails or produces nonsense, I go to the
general-purpose search engine.  For RFCs, it tends to do very
well, both at finding the right stuff and at ranking the RFC
text itself near the top. 

So, other than being too lazy to do the second search,
pedantic about what Scholar should be indexing and how, or
demanding and expecting a more perfect universe, I'm not sure I
see a real problem in this.

    john