Re: [apps-discuss] [link-relations] Fwd: I-D Action: draft-ohye-canonical-link-relation-00.txt

Julian Reschke <> Sun, 03 July 2011 07:56 UTC

To: Maile Ohye <>
Cc: IETF Apps Discuss <>, Bjartur Thorlacius <>

On 2011-07-03 03:23, Maile Ohye wrote:
> 1. OPEN. F. Ellermann:
> A relative canonical URL can't be a good idea.  If there is more than one
> "content URL" (in the terminology of the draft) this would result in more
> than one canonical URL, defeat the purpose, and worse, this could make
> googlebot angry.
> --response by F. Ellermann: “... But now I see that relative can be
> perfectly fine if and only if all incarnations exist on the same server,
> e.g., http://example/xyzzy.html?any-query could get a relative canonical
> URL xyzzy.html?default or similar.  A better example in the draft
> explaining when a relative canonical URL is okay could help.”
> --response by M. Ohye: “We could add a relative URL in the Examples section:
> Then duplicate content URIs such as:
> <>
> <>
> may designate the canonical link relation in HTML as specified in
>    [RFC5988]:
> <link rel="canonical"
>            href="" />
> or <link rel="canonical" href="page.php?item=purse" />
> or alternatively, in the HTTP Header... “
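
To make the relative-URL point concrete, here is a minimal sketch of how a
consumer might resolve a relative canonical href against the URL of the page
that carries it. It uses Python's standard urljoin; the function name
resolve_canonical and the host mirror.example are assumptions made for
illustration and do not come from the draft.

```python
from urllib.parse import urljoin


def resolve_canonical(content_url: str, href: str) -> str:
    """Resolve a rel="canonical" href against the URL of the page carrying it."""
    return urljoin(content_url, href)


# Two incarnations on the same server resolve to one canonical URL.
same_a = resolve_canonical("http://example/xyzzy.html?any-query", "xyzzy.html?default")
same_b = resolve_canonical("http://example/xyzzy.html?other-query", "xyzzy.html?default")

# A copy on a different (hypothetical) host resolves to a different URL,
# which is the situation the original objection was worried about.
mirror = resolve_canonical("http://mirror.example/xyzzy.html?any-query", "xyzzy.html?default")

print(same_a)
print(same_b)
print(mirror)
```

On one server both incarnations agree on a single canonical URL; the copy on
the second host ends up with its own, so a relative href only helps when all
duplicates live on the same server.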


> 2. OPEN. F. Ellermann:
> The draft could s/SHOULD NOT/MUST NOT/, I don't see any good reason to
> violate a SHOULD NOT, and if that's correct MUST NOT is clearer.
> --seconded by M. Yevstifeyev
> --response by M. Ohye and J. Kupke: “We prefer SHOULD NOT for a few
> reasons. 1) If we outlawed multiple canonicals using MUST NOT, we would
> effectively call the HTML invalid. In reality, the HTML will still be
> processed, though it’s likely that search engines will ignore both/all
> rel=canonicals. 2) Worse, for the cases where somebody might
> rel=canonical to a 404, etc., if we use MUST NOT, it would place a huge
> (and entirely unrealistic) burden on the site owner to ensure that
> search engines recrawl pages in such an order that all rel=canonical
> sources are updated before a page may become a 404.”

I'm not sure I understand the response. Is there a use case where an 
author would legitimately add multiple instances of the link relation?
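
For reference, the consumer behaviour described in the response (the HTML is
still processed, but all rel=canonical links are likely ignored when there is
more than one) could look roughly like the sketch below. It is only an
illustration built on Python's html.parser; the names CanonicalCollector and
pick_canonical, and the second href (item=bag), are hypothetical, and no
search engine is claimed to work this way.

```python
from html.parser import HTMLParser
from typing import List, Optional


class CanonicalCollector(HTMLParser):
    """Collect the href of every <link> whose rel contains "canonical"."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel is a space-separated, case-insensitive list of relation types.
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.hrefs.append(attr["href"])


def pick_canonical(html: str) -> Optional[str]:
    """Return the canonical href if exactly one is declared, else None."""
    collector = CanonicalCollector()
    collector.feed(html)
    collector.close()
    return collector.hrefs[0] if len(collector.hrefs) == 1 else None


single = '<head><link rel="canonical" href="page.php?item=purse" /></head>'
double = (
    '<head><link rel="canonical" href="page.php?item=purse" />'
    '<link rel="canonical" href="page.php?item=bag" /></head>'
)

print(pick_canonical(single))
print(pick_canonical(double))
```

The document with two links still parses without error; the sketch simply
declines to choose between them.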

> ...
> 4. OPEN. M. Yevstifeyev:
>>  The canonical link relation specifies the preferred version of a URI
> I think some introductory text on linking, probably based on RFC 5988,
> should go here.
> --response by J. Reschke "Why? It defines a link relation as defined by
> RFC 5988, so why repeat text from over there?"
> --response by M. Yevstifeyev "It should be mentioned (1) what is link
> relation at all and (2) that RFC 5988 is a specification of that
> technology which this document depends on.  RFC 5988 is first mentioned
> in Examples."
> --response by M. Ohye. “We could modify to:
> “The canonical link relation (Link Relation Types reference <xref
> target="RFC5988"/>) specifies the preferred version of a URI...”


> 5. OPEN. M. Yevstifeyev:
>>  Presence of the canonical link relation indicates to applications,
> such as search engines, that they MAY:
> I wonder why it's MAY; in this case implementations (explicitly, those
> apps which interpret Link: headers and corresponding construction in
> HTML) will be free to ignore it.  I think normative SHOULD should be OK
> (sorry for pun).
> --response by J. Reschke "I think this link relation is purely advisory,
> so a better approach might be to replace "MAY" by "can"."
> --response by M. Yevstifeyev "Yes, advisory, which suits RFC 2119
> definition for SHOULD: 'SHOULD   This word, or the adjective
> "RECOMMENDED", mean that there may exist valid reasons in particular
> circumstances to ignore a particular item, but the full implications
> must be understood and carefully weighed before choosing a different
> course.'
> and natural meaning of should - advice/recommendation."
> --response by M. Ohye: “Thanks, in discussion with Joachim Kupke.”

No, it's really not a SHOULD.

> ...
> 7. OPEN. M. Yevstifeyev:
>    o Exist on a different protocol: http to https, or vice versa
> You probably meant URI scheme here, since https isn't a separate
> protocol.  As before these points we had "The value of the
> target/canonical URI MAY" or, if you consider my comment above, "The
> target/canonical URI MAY", this point may be reworded as "Have different
> scheme names" (which suits the second variant of a preface to this list
> better).
> --agreed by J. Reschke
> --response by M. Ohye/J. Kupke: “Good catch, Mykyta. We’re fine to
> change the draft to “scheme”:
> Have different scheme names: such as http to https, or vice versa
> Do we now need to expand the draft for ftp:// and gopher:// URIs? For
> example, ftp:// and gopher:// URIs”
> 1) Do not come with the equivalent of RFC 5988, so a non-HTML document
> available at any such URI won't be available to make use of <link
> rel="canonical">.
> 2) Have corresponding GOPHER error code (item type 3) or an FTP error
> 550, which like HTTP 404, is forbidden from being served for the target
> of a <link rel="canonical">.

a) A non-HTML document at a non-HTTP(s) URI may not be able to specify 
the link relation, but it could be the target of a link relation, 
right? (that could be an example)

b) Future protocols might have other means to specify a link relation, btw.
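
As a sketch of what such an example might look like: an HTTP response for a
non-HTML document can carry an RFC 5988 Link header whose target uses a
different scheme. The URIs below are made up for illustration, and the helper
names canonical_link_header and scheme_differs are assumptions, not anything
defined by the draft.

```python
from urllib.parse import urlsplit


def canonical_link_header(target_uri: str) -> str:
    """Format an RFC 5988 Link header naming target_uri as the canonical URI."""
    return f'Link: <{target_uri}>; rel="canonical"'


def scheme_differs(context_uri: str, target_uri: str) -> bool:
    """True when the context and target URIs use different scheme names."""
    return urlsplit(context_uri).scheme != urlsplit(target_uri).scheme


# Hypothetical URIs: a PDF served over HTTP whose preferred copy lives on FTP.
context = "http://www.example.com/report.pdf"
target = "ftp://ftp.example.com/pub/report.pdf"

header = canonical_link_header(target)
print(header)
print(scheme_differs(context, target))
```

Here the HTTP-served copy is what declares the relation; the FTP resource is
only the target and needs no way of expressing links itself.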

> 8. OPEN. M. Yevstifeyev:
> Reading sections 3 and 5 of the draft, it seems that it mandates use of
> HTTP when referring to canonical URIs.  And what is the situation when
> target URI is a 'ftp' or 'gopher' URI?  Section 3 allows different
> scheme names in context/target URIs, if I understand it correctly.
>   Therefore, unless it is deliberate, I think any mention of HTTP
> should be replaced by more generic regulations.
> --response by J. Reschke "Nope; I think the HTTP examples are very
> useful. But maybe we can have an additional statement that the link
> relation isn't specific to HTTP."
> --response by M. Yevstifeyev "Currently we have normative reference to
> RFC 2616 and normative requirements with respect to HTTP.  HTTP examples
> are OK; but it's redundant in Section 3.  I suppose in Section 3 we may
> replace HTTP-related stuff with something in the way like:
> Old:
>    o  The source URI of a "300 Multiple Choices" URI (Section 10.3.1 of
>       [RFC2616]) or a permanent redirect (Section 10.3.2 of [RFC2616]).
> New:
>    o  The source URI, which defines a resource which provides choice
>       in different representations of a given resource, identified by
>       the context URI, or is a link which has been permanently replaced
>       by another one.
> etc."
> --response by B. Thorlacius: “Your wording seems overly confusing. Which
> is the resource that "provides choice in different representations of a
> given resource?" A standard could be assigned the URI
> <>. An HTTP GET /spec might be responded with an
> HTTP/1.1 300 choice, and an entity linking to /spec.node.html,
> /spec.html, /spec.pdf, and /spec.txt. The resource (the standard, that
> is) would in no way provide this choice. The HTTP server simply offered
> multiple representations.”
> --response by M. Yevstifeyev: “First, this was an example only.  Next,
> my point was that the document makes HTTP/'http' scheme mandatory in
> context/target URIs, which I don't think is appropriate, since canonical
> URI may refer to a resource accessible via another protocol.  Even though
> HTTP is going to be the most common use case of the canonical link relation,
> we shouldn't exclude other protocols.”
> --response by B. Thorlacius: “I agree. However, I don't understand the
> need for forbidding canonical links to resources with multiple
> representations. Are there not to be canonical links from
> representations of a resource to the resource (i.e. from /spec.html and
> /spec.txt to /spec)?”
> --response by M. Yevstifeyev: “Probably such restriction is set because
> multiple representation choice may ultimately refer the user to a
> resource which is not canonical.  A _definite_ canonical resource is
> necessary and required.”
> ... general feeling here is that having specific HTTP examples is
> good, as long as the spec doesn't make the reader think it's the only
> protocol that qualifies.

Best regards, Julian