Re: [weirds] [Regops] Search Engines Indexing RDAP Server Content

"John R Levine" <johnl@taugh.com> Fri, 29 January 2016 17:30 UTC

Date: 29 Jan 2016 12:30:15 -0500
Message-ID: <alpine.OSX.2.11.1601291227520.27475@ary.lan>
From: "John R Levine" <johnl@taugh.com>
To: "=?UTF-8?Q?Luis_E=2E_Mu=C3=B1oz?=" <lem@uniregistry.link>
In-Reply-To: <3F908F99-CC9C-4D7B-83CB-B6F8A8B16EA3@uniregistry.link>
References: <831693C2CDA2E849A7D7A712B24E257F4A149BDE@BRN1WNEXMBX01.vcorp.ad.vrsn.com> <3F908F99-CC9C-4D7B-83CB-B6F8A8B16EA3@uniregistry.link>
User-Agent: Alpine 2.11 (OSX 23 2013-08-11)
MIME-Version: 1.0
Content-Type: MULTIPART/MIXED; BOUNDARY="0-1363064330-1454088615=:27475"
Archived-At: <http://mailarchive.ietf.org/arch/msg/weirds/fLLAevOMc6Bp54nahpMhRNUPMTs>
Cc: "regops@nlnetlabs.nl" <regops@nlnetlabs.nl>, "gtld-tech@icann.org" <gtld-tech@icann.org>, "weirds@ietf.org" <weirds@ietf.org>
Subject: Re: [weirds] [Regops] Search Engines Indexing RDAP Server Content

>> It also begs the question of the need for a BCP describing operational 
>> practices for server operators. There are ways for web servers to influence 
>> or restrict crawler behavior, but what's appropriate in this context?
>
> It would be good to keep in mind that respecting mechanisms such as 
> robots.txt is entirely voluntary on the crawler’s side.

Right -- the only reasonable assumption is that if casual users can find 
it over HTTP, evil search engines (of which there are plenty) will too.

The search engines that most people use, such as Google and Bing, all obey 
robots.txt, so it will keep info out of casual searches.  But I'd want to 
understand what the threat model is before inventing solutions.
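
To illustrate what "obeying robots.txt" means in practice, here is a 
minimal sketch using Python's standard urllib.robotparser.  The 
robots.txt content, the blanket "Disallow: /" policy, the host 
rdap.example.net, and the helper name crawler_may_fetch are all 
assumptions made up for the example, not anything proposed in this 
thread.  It shows only what a well-behaved crawler would conclude; a 
crawler that ignores robots.txt is not affected at all.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt an RDAP server operator might publish.
# The blanket "Disallow: /" is an assumption for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def crawler_may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return what a compliant crawler would decide for this URL."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network fetch is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical RDAP lookup URL on a made-up host.
    url = "https://rdap.example.net/domain/example.com"
    print(crawler_may_fetch(ROBOTS_TXT, "Googlebot", url))
```

The check runs entirely on the crawler's side, which is why it only 
helps against crawlers that choose to perform it.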

Regards,
John Levine, johnl@taugh.com, Taughannock Networks, Trumansburg NY
Please consider the environment before reading this e-mail.