[Web-bot-auth] Re: Thank you and follow up
Sarah McKenna <sarah.mckenna@sequentum.com> Fri, 07 November 2025 15:22 UTC
References: <E3CFB858-4319-4245-A066-ECB37613E065@mnot.net> <1541B02A-AE1D-4883-A8B1-1F4E868884CC@mnot.net> <CAGJaBrCwTW7CmziHtXuw=tBJS_nRJPiMnwPWD0v+Xbbc1zqP=g@mail.gmail.com> <217591.1762467555@dyas> <CAGJaBrCBdd9C-573nKmx7+_8_df-4z=XCmNqhNy-dH4WwNw5Tg@mail.gmail.com> <CAE+sOjmbTdX40hFWB-2R8W-bcsTF=U6mzEj96Rro9vvqRHFHNQ@mail.gmail.com> <244131.1762479226@dyas> <MN2PR17MB3263C8C2B098CC8BA31D512CB5C3A@MN2PR17MB3263.namprd17.prod.outlook.com>
In-Reply-To: <MN2PR17MB3263C8C2B098CC8BA31D512CB5C3A@MN2PR17MB3263.namprd17.prod.outlook.com>
From: Sarah McKenna <sarah.mckenna@sequentum.com>
Date: Fri, 07 Nov 2025 10:22:21 -0500
Message-ID: <CAGJaBrDAv4Pj1=2eXXbUGq2utTCS3PEwVSk=A=3MPw6anw0J5w@mail.gmail.com>
To: "Maynard, Brent" <bmaynard=40akamai.com@dmarc.ietf.org>
CC: Michael Richardson <mcr@sandelman.ca>, "web-bot-auth@ietf.org" <web-bot-auth@ietf.org>
List-Id: Authentication of non-human users to human-oriented Web sites <web-bot-auth.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/web-bot-auth/ab2PdK8OvvW82Ey2KqpL_4Dg2Bg>

It might be helpful for those who missed the in-person session this week to watch the recording of my talk on our customer use cases. It shows why a bot won't self-identify in many legitimate cases.

https://youtu.be/yAwxJMxT9z4

If there is a published policy or signal (e.g., "I am not healthy now"), bots could follow it. Robots.txt only addresses crawlers; it doesn't address scrapers and AI agents that navigate and interact with a site the way a human does. A typical robots.txt will tell crawlers not to access the /search directory, for example, when that is the entry point for AI agents and scrapers.

A lot of folks are judgemental but likely just don't understand the universe of legitimate workloads and use cases for web data. Scraping is the interoperability layer of the internet, and this is what AI agents use too.

On Thu, Nov 6, 2025 at 10:00 PM Maynard, Brent <bmaynard=40akamai.com@dmarc.ietf.org> wrote:

> I agree that “misbehave” is too broad to define in any useful way. It might be more productive to describe how automated agents show consent, identity, and backoff. Those are the things operators can actually observe and build controls around.
>
> In practice, the question is simple. Does the agent respect published policy, identify itself consistently, and react appropriately when a site signals that it is overloaded or asks it to stop? Focusing on those mechanics keeps us neutral on intent while setting clear expectations.
>
> A few small standards could help, such as a well-known policy path for permissions and contact details, a lightweight signal for health or overload, and a defined way for agents to slow down when needed. Even if only the cooperative agents follow them, reducing that load often stabilizes systems enough to handle the rest.
>
> Instead of trying to define misbehavior, it may be better to share contrasting examples. Ignoring policy versus honoring it. Rotating IPs versus using a stable identity. Fixed-rate scraping versus adaptive backoff. Continuing during an outage versus pausing when signaled. That kind of framing is simple, testable, and flexible enough for both traditional crawlers and newer AI-driven agents.
>
> Best,
>
> Brent
>
> From: Michael Richardson <mcr@sandelman.ca>
> Date: Thursday, November 6, 2025 at 3:34 PM
> To: web-bot-auth@ietf.org <web-bot-auth@ietf.org>
> Subject: [Web-bot-auth] Re: Thank you and follow up
>
> Farzaneh Badiei <farzaneh@digitalmedusa.org> wrote:
> > what can be done at the IETF if anything. The term “misbehave” is ambiguous. Misbehaving implies so many things. Is that a technical term?
>
> You are right.
> There are a hundred ways to misbehave.
>
> There are only a few ways to behave.
> (One of them is to obey robots.txt)
> I don't think we should make any normative statements about good behaviour, but we probably should have informative examples.
>
> > On Thu, Nov 6, 2025 at 5:27 PM Sarah McKenna <sarah.mckenna=40sequentum.com@dmarc.ietf.org> wrote:
> >
> >> I imagine in the cat and mouse game, the bots persisting after an unhealthy signal is sent would be much easier to identify. Would be interesting to see if a little signalling dropped traffic effectively, even traffic considered to be "human".
> >>
> >> On Thu, Nov 6, 2025 at 5:19 PM Michael Richardson <mcr+ietf@sandelman.ca> wrote:
> >>
> >>> Sarah McKenna <sarah.mckenna=40sequentum.com@dmarc.ietf.org> wrote:
> >>> > 1. Simple idea -- what if we gave websites a simple way to express their health. Not a signal that privileged any one bot but which answered a yes / no question "Are you healthy?" When a site is getting hit too hard this gives them a way to alert all bots that they need them to slow down or halt. Could this be a quick win for overloaded sites while we work out all the nuances we have started discussing? I imagine this would be simple for scrapers to implement if the signal were in effect.
> >>>
> >>> It sounds simple, but it will be only the well-behaved bots that look for and listen to this signal. The site will still have to defend against poorly behaved bots...
> >>>
> >>> That's why the group wants to anoint well-behaved bots with a blessing that allows them to be excluded from the heavy defenses.
> >>>
> >>> > 2. https://www.conserver.io/deep-dives/vcon-lifecycle-management-using-scitt -- I had never heard of SCITT and it was suggested by two independent people as a good way to record datasets, their provenance and potentially as a basis for an accreditation / standards effort. At ARDC <https://responsibledatacollection.org/> we have been looking at Croissant as a dataset standard for web data and have done a bit of work to this end for a combined dataset documentation standard for both crawlers and scrapers but that's in draft still (we may need two, one for crawling and a separate one for scraping). I will also try to wrap my mind around SCITT to see how it might fit as a solution.
> >>>
> >>> Does not seem to fit within web-bot-auth itself to me. Obviously, vCon and SCITT *are* existing WGs producing specifications, which maybe some crawlers would want to leverage in some way.
> >>>
> >>> --
> >>> Michael Richardson <mcr+IETF@sandelman.ca>, Sandelman Software Works
> >>> -= IPv6 IoT consulting =- *I*LIKE*TRAINS*
> >>>
> >>> _______________________________________________
> >> Web-bot-auth mailing list -- web-bot-auth@ietf.org
> >> To unsubscribe send an email to web-bot-auth-leave@ietf.org
>
> _______________________________________________
> Web-bot-auth mailing list -- web-bot-auth@ietf.org
> To unsubscribe send an email to web-bot-auth-leave@ietf.org

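
The "adaptive backoff" and "pausing when signaled" behaviour discussed in this thread could look roughly like the sketch below on the agent side. The thread does not define a health signal, so this is only an illustration that assumes a site expresses overload with the existing HTTP 429 and 503 status codes and an optional Retry-After header. The names parse_retry_after and next_delay, and the default delay values, are hypothetical and not taken from any draft.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Assumption: overload is signalled with existing HTTP status codes,
# not with any new signal defined by the working group.
OVERLOAD_STATUSES = {429, 503}


def parse_retry_after(value, now=None):
    """Return the Retry-After hint in seconds, or None if absent or unparseable.

    Retry-After may be either delta-seconds or an HTTP-date.
    """
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Treat a date without zone information as UTC.
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def next_delay(status, retry_after, current_delay,
               base_delay=1.0, max_delay=300.0):
    """Choose the pause (in seconds) before the agent's next request.

    - Overloaded and the site gave a hint: honour the hint.
    - Overloaded without a hint: double the delay, up to max_delay.
    - Otherwise: halve the delay, never dropping below base_delay.
    """
    if status in OVERLOAD_STATUSES:
        hinted = parse_retry_after(retry_after)
        if hinted is not None:
            return max(hinted, base_delay)
        return min(max(current_delay, base_delay) * 2, max_delay)
    return max(base_delay, current_delay / 2)
```

An agent would call next_delay after every response and sleep for the returned number of seconds before its next request. As noted in the thread, only cooperative agents would do this; the site still needs its own defenses for the rest.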
- [Web-bot-auth] Side meeting: Future of the Open W… Mark Nottingham
- [Web-bot-auth] Re: Side meeting: Future of the Op… Mark Nottingham
- [Web-bot-auth] Thank you and follow up Sarah McKenna
- [Web-bot-auth] Re: Thank you and follow up Michael Richardson
- [Web-bot-auth] Re: Thank you and follow up Sarah McKenna
- [Web-bot-auth] Re: Thank you and follow up Farzaneh Badiei
- [Web-bot-auth] Re: Thank you and follow up Michael Richardson
- [Web-bot-auth] Re: Thank you and follow up Maynard, Brent
- [Web-bot-auth] Re: Thank you and follow up Sarah McKenna
- [Web-bot-auth] Re: Thank you and follow up Thibault Meunier
- [Web-bot-auth] Re: Thank you and follow up Sarah McKenna
- [Web-bot-auth] Re: Thank you and follow up Gary Illyes
- [Web-bot-auth] Re: Thank you and follow up Sarah McKenna
- [Web-bot-auth] Re: Thank you and follow up Greg Lindahl
- [Web-bot-auth] Re: Thank you and follow up Liuchunchi(Peter)
- [Web-bot-auth] Re: Thank you and follow up Greg Lindahl
- [Web-bot-auth] Re: Thank you and follow up Gaurav Shukla
- [Web-bot-auth] Re: Thank you and follow up Jo Levy