Hi Gary,

Thanks for taking the time to do this. It's not glamorous work by any means, but this stuff is worth writing down.

Some minor comments here on the draft, which overall looks like it is worth doing.

Section 2 could benefit from a brief description of what the protocol is and how it is used. That is, there is a text file that describes what a robot is discouraged from accessing. That file comprises multiple groups, with each group applying to one or more robots (as identified by the User-Agent value it advertises), or all robots (*).

Please be doubly, extra-specific about where case-sensitivity applies. These textual formats are really bad for case-sensitivity issues, particularly where some parts ("allow") are insensitive and other parts (URLs) are sensitive. There are probably security considerations here for servers that serve case-insensitive resources, because this format might fail to properly exclude paths in that case.

   ; parser implementors: add additional lines you need (for
   ; example Sitemaps), and be lenient when reading lines that don't
   ; conform. Apply Postel's law.

Sorry, but this is a red flag to me. When someone says that, I interpret this as "sorry, but we were too lazy to write a proper spec here". I know that's not the intent, and the draft already basically says what you need it to say. Section is most of the way there, though a SHOULD leaves too much wiggle room to be an effective specification. I would suggest saying explicitly:

   Lines that a parser does not understand MUST be ignored.

And maybe: "For example, the 'sitemap' rule applies."

That probably also comes with some better definition of lines (in prose, maybe with reference to the EOL rule; btw, check that rule, I think it's a run-on) and how groups are defined. The latter is, as far as I can see, a problem with this format.

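To make the "MUST be ignored" point concrete, here is a rough Python sketch of a line-oriented parser. It is not taken from the draft: the names parse_groups, KNOWN_RULES, SAMPLE and the unknown_ends_agent_run switch are all made up for illustration. The switch exists only because the text does not say whether an ignored line still separates two runs of User-Agent lines, which is the case discussed next. Rule names are matched case-insensitively; path values are kept exactly as written.

```python
from typing import List, Tuple

Rule = Tuple[str, str]
Group = Tuple[List[str], List[Rule]]

# Assumption: only these rule names are "understood" by this toy parser.
KNOWN_RULES = {"allow", "disallow"}


def parse_groups(text: str, unknown_ends_agent_run: bool) -> List[Group]:
    """Split robots.txt text into (user-agents, rules) groups.

    Lines the parser does not understand are always ignored. The flag only
    controls whether such a line also ends a run of user-agent lines.
    """
    groups: List[Group] = []
    agents: List[str] = []
    rules: List[Rule] = []
    in_agent_run = False  # True while reading consecutive user-agent lines

    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue  # blank or malformed line: ignored
        key, value = line.split(":", 1)
        key = key.strip().lower()  # rule names are case-insensitive
        value = value.strip()      # values (paths) stay case-sensitive

        if key == "user-agent":
            if not in_agent_run:
                # A user-agent line after rules (or a break) starts a new group.
                if agents:
                    groups.append((agents, rules))
                agents, rules = [], []
                in_agent_run = True
            agents.append(value)
        elif key in KNOWN_RULES:
            in_agent_run = False
            if agents:  # rules before any user-agent line are dropped
                rules.append((key, value))
        elif unknown_ends_agent_run:
            # Unknown line: ignored, but under this reading it still
            # separates two runs of user-agent lines.
            in_agent_run = False

    if agents:
        groups.append((agents, rules))
    return groups


SAMPLE = (
    "User-Agent : foo\n"
    "garbage: garbage\n"
    "User-Agent: bar\n"
    "disallow: /off-limits\n"
)

print(parse_groups(SAMPLE, unknown_ends_agent_run=False))
print(parse_groups(SAMPLE, unknown_ends_agent_run=True))
```

With the flag off, the sample yields a single group covering both "foo" and "bar"; with it on, "foo" ends up in a group of its own with no rules. Both readings are consistent with "ignore what you don't understand", which is why the prose needs to pick one.
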
An important question you then need to consider is whether this is one group or two:

   User-Agent : foo
   garbage: garbage
   User-Agent: bar
   disallow: /off-limits

That is, when garbage is ignored, does that mean that "foo" would be required to respect the disallow rule?

Section 5 of includes advice on construction of ABNF you might want to read. That doesn't need to be an RFC for you to follow its advice.

When talking about the /robots.txt resource, it might pay to cite RFC 7320/BCP 190 and explain why this document doesn't follow the advice there.

On Mon, Jul 8, 2019, at 06:45, Gary Illyes wrote:
> Hi ART,
>
> As you may have seen, a group has gotten together with Martijn Koster
> to create an internet-draft which restates the Robots Exclusion
> Protocol using updated ABNF and prescriptions. The current draft is
> (draft-koster-rep). If you get a chance, we'd appreciate your
> confirming that it matches your understanding of current practice and
> that the specification is clear even for edge cases. You can send mail
> to me to forward on to the author team or address it to the emails
> listed in the draft.
>
> Thanks for your help,
> Gary Illyes
> Google Switzerland
