Re: [Int-dir] Intdir last call review of draft-koster-rep-08

Gary Illyes <garyillyes@google.com> Thu, 02 June 2022 21:50 UTC

From: Gary Illyes <garyillyes@google.com>
Date: Thu, 02 Jun 2022 23:49:43 +0200
Message-ID: <CADTQi=ed4+SEYxi35+zOJM1ujo1gGbwDjHxVVNLzY-CxDjMtHw@mail.gmail.com>
To: Ralf Weber <rweber@akamai.com>
Cc: int-dir@ietf.org, draft-koster-rep.all@ietf.org, last-call@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/int-dir/3N6m0FFHC4oSl9MVg3Psxd_Tcms>

Thank you for the actionable comments, Ralf!

I think I addressed them and pushed the updated draft to GitHub for your viewing pleasure:
https://github.com/google/robotstxt/blob/master/protocol-draft/draft-koster-rep-09.xml

Note on the examples in section 5: the examples on robotstxt.org are geared
more towards site owners, while the draft's examples are geared towards
parser/matcher implementers, hence the significant differences. That said,
I agree that we should have had a global user agent (*) example, so I
extended the section 5.1 example with one.
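
For implementers reading along, here is a rough sketch of how a matcher
might pick a group when a global (*) group is present. It is not taken from
the draft. It assumes case-insensitive equality on the product token and a
fallback to the * group when no named group matches; whether matching should
be exact or substring-based is the question raised in the review quoted
below, so treat that choice as an assumption. The /private/ path and the
function names parse_groups and select_rules are made up for illustration.

```python
from typing import Dict, List, Tuple

Rule = Tuple[str, str]  # ("allow" | "disallow", path prefix)

# Hypothetical robots.txt: a global group plus one named group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: foobot
Disallow: /example/disallowed.gif
"""


def parse_groups(text: str) -> Dict[str, List[Rule]]:
    """Map each lowercased user-agent value to the rules of its group."""
    groups: Dict[str, List[Rule]] = {}
    current_agents: List[str] = []
    seen_rule = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        key = key.lower()
        if key == "user-agent":
            # A user-agent line that follows a rule starts a new group.
            if seen_rule:
                current_agents = []
                seen_rule = False
            agent = value.lower()
            current_agents.append(agent)
            groups.setdefault(agent, [])
        elif key in ("allow", "disallow"):
            seen_rule = True
            for agent in current_agents:
                groups[agent].append((key, value))
    return groups


def select_rules(groups: Dict[str, List[Rule]], product_token: str) -> List[Rule]:
    """Assumption: exact, case-insensitive token match, else the * group."""
    return groups.get(product_token.lower(), groups.get("*", []))
```

With this input, select_rules(parse_groups(ROBOTS_TXT), "FooBot") yields only
the foobot group's rule, while an unlisted crawler token falls through to
the rules of the * group.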

On Sun, May 29, 2022 at 4:21 PM Ralf Weber via Datatracker <noreply@ietf.org>
wrote:

> Reviewer: Ralf Weber
> Review result: Ready with Issues
>
> Moin!
>
> I am an assigned INT directorate reviewer for draft-koster-rep.
> These comments were written primarily for the benefit of the Internet
> Area Directors. Document editors and shepherd(s) should treat these
> comments just like they would treat comments from any other IETF
> contributors and resolve them along with any other Last Call comments
> that have been received. For more details on the INT Directorate, see
> https://datatracker.ietf.org/group/intdir/about/
>
> While the document technically defines the content of the robots.txt
> files, it could do a better job of describing, with examples, the
> semantics of how robots interpret them. Especially in 2.2.1, the notion
> that "Crawlers MUST find the group that matches the product token
> exactly" should be better explained. I assume it does not mean being
> fully equal but instead a substring match in the User-Agent header, so
> the example would also match an HTTP user agent of ExampleBotnet/1.2.
> Is that understanding correct at least?
>
> Also, the examples in 5 seem a lot more arbitrary than what the
> ROBOTSTXT website has, and it should explain all the outcomes, e.g. in
> 5.1 it would allow access to all crawlers and all paths, but the foobot,
> barbot and bazbot /example/disallowed.gif. An example with a * group
> would be better and more realistic.
>
> So long
> -Ralf
>