Re: [dispatch] Proposal for scantxt

worley@ariadne.com Sun, 04 December 2022 19:24 UTC

From: worley@ariadne.com
To: Ollie IETF <ietf=40olliejc.uk@dmarc.ietf.org>
Cc: dispatch@ietf.org
In-Reply-To: <Pz04VxP2fVxjR8KuzgdQMGsk7cFWlEmb9yHyM6_DVhtPs--WQVWJ1ZlFbhzNWWtXd5M_ipGJw1LmBAE4ulr8vCd7nKcL-t8tBaBtPGyWZzY=@olliejc.uk> (ietf=40olliejc.uk@dmarc.ietf.org)
Sender: worley@ariadne.com
Date: Sun, 04 Dec 2022 14:22:19 -0500
Message-ID: <87o7sjj7o4.fsf@hobgoblin.ariadne.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dispatch/x6XAfg7q0I5b7l7uH_730Wx36Aw>
Subject: Re: [dispatch] Proposal for scantxt

It's always useful to ask "Who is going to buy this?"  That is, who has
an unsatisfied need that this satisfies and what is that need?

As others have noted, the web-wide scanning done by search engines is a
type of scanning that the targets generally find desirable (it
delivers value to them), but robots.txt seems to be adequate to support
all of those use cases.  (Actually, search engines have discovered that
web site owners attempt to lie to the search engines about what is in
their pages, and search engines have been working on disguising their
scans.)

What other wide-area scans are there that host owners would find useful?
Indeed, what wide-area scans deliver enough value that host owners would
be willing to spend some effort to configure how the scan is done
(rather than just opting out by default)?

Certainly, adoption will be quicker if the first tranche of added value
for host owners can be obtained with minimal effort.  Keep one eye on
having the system be able to encode anything that might be needed, but
keep the other eye on minimizing what needs to be understood for the
first use.

Dale