Re: Comments/Issues on P1

Amos Jeffries <squid3@treenet.co.nz> Tue, 24 April 2012 03:59 UTC

To: ietf-http-wg@w3.org
Date: Tue, 24 Apr 2012 15:56:52 +1200
From: Amos Jeffries <squid3@treenet.co.nz>
In-Reply-To: <2D4DB009-5EB1-41CB-854D-641E40C3010C@niven-jenkins.co.uk>
Message-ID: <6fd54a61c60ef8cb0ccb7a9a1ab3099e@treenet.co.nz>
Archived-At: <http://www.w3.org/mid/6fd54a61c60ef8cb0ccb7a9a1ab3099e@treenet.co.nz>

On 24.04.2012 06:11, Ben Niven-Jenkins wrote:
> Hi,
>
> As part of reviewing the HTTPBIS documents we had the following
> comments/issues on P1
>
>
> 1) On page 11 it states:
>
>    All HTTP requirements
>    applicable to an origin server also apply to the outbound
>    communication of a gateway. [...]
>
>    However, an HTTP-to-HTTP gateway that wishes to interoperate with
>    third-party HTTP servers MUST conform to HTTP user agent
>    requirements on the gateway's inbound connection
>
> This is probably a good default assumption, but it is not always true:
>
> * A User-Agent can, either automatically or under interactive
>   user direction, decide to retry non-idempotent requests. An
>   intermediate must never do this.

Why do you say "never"?
There are several cases where intermediaries can and should retry 
automatically; TCP connection handshake failures, for example. Cases 
which require user interaction can be relayed back to the 
intermediary's own client for that interaction.
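
To make the distinction concrete, here is a minimal Go sketch of that 
retry rule. It is not Squid code; the addresses, helper names and the 
simplified error handling are made up for illustration. The point it 
shows: retry freely while the failure is provably pre-request (the TCP 
handshake itself failed), never automatically once a non-idempotent 
request has been written.

package main

import (
	"fmt"
	"net"
	"time"
)

// dialWithRetry retries only connection-establishment failures, where no
// request bytes can possibly have reached the next hop yet.
func dialWithRetry(addr string, attempts int) (net.Conn, error) {
	var lastErr error
	for i := 0; i < attempts; i++ {
		conn, err := net.DialTimeout("tcp", addr, 5*time.Second)
		if err == nil {
			return conn, nil
		}
		lastErr = err // handshake failed: safe to try again
	}
	return nil, lastErr
}

func forward(addr, rawRequest string, idempotent bool) error {
	conn, err := dialWithRetry(addr, 3)
	if err != nil {
		return err // relay a 502/504 to our own client instead
	}
	defer conn.Close()

	if _, err := conn.Write([]byte(rawRequest)); err != nil {
		// Some or all of the request may have been received downstream.
		// For a non-idempotent method (e.g. POST) do NOT retry
		// automatically; report the failure so the client can decide.
		if !idempotent {
			return fmt.Errorf("send failed, not retrying: %w", err)
		}
		return err // an idempotent request could be retried by policy here
	}
	return nil
}

func main() {
	// Hypothetical next hop; replace with a reachable origin to experiment.
	req := "POST /submit HTTP/1.1\r\nHost: origin.example\r\nContent-Length: 0\r\n\r\n"
	fmt.Println("result:", forward("origin.example:80", req, false))
}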

>
> * An origin server is always authoritative, an intermediate is
>   not and so can sometimes not make certain decisions that an
>   origin could. (See If-Match below for an example.)

An intermediary can produce authoritative answers inferred from known 
state. Not in as many cases as an origin, which has access to all of 
the state, true, but some cases can still be answered.

Overall, a proxy takes on all of the relevant role requirements, with 
additional limitations on how it handles things "internally" in order 
to meet those external server/client role requirements.
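
One such case, sketched below in Go (the cache map and entry type are 
purely illustrative, not Squid internals): a cache holding a fresh 
representation with a known ETag can answer a conditional GET with 304 
Not Modified from its own state, without contacting the origin.

package main

import "net/http"

type cacheEntry struct {
	etag string
	body []byte
}

// A stand-in for the intermediary's stored state.
var cache = map[string]cacheEntry{
	"/logo.png": {etag: `"v42"`, body: []byte("...image bytes...")},
}

func handler(w http.ResponseWriter, r *http.Request) {
	entry, ok := cache[r.URL.Path]
	if !ok {
		// No known state: a real proxy would contact the origin here.
		http.Error(w, "not cached", http.StatusBadGateway)
		return
	}
	w.Header().Set("ETag", entry.etag)
	if r.Header.Get("If-None-Match") == entry.etag {
		// Answered from known state; no round trip to the origin needed.
		w.WriteHeader(http.StatusNotModified)
		return
	}
	w.Write(entry.body)
}

func main() {
	http.ListenAndServe(":8080", http.HandlerFunc(handler))
}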


>
> 2) On page 13 it states:
>
>    thereby letting the recipient know that more advanced
>    features can be used in response (by servers) or in future
>    requests (by clients).
>
> and on page 56 it states:
>
>      o  Proxies SHOULD maintain a record of the HTTP version numbers
>         received from recently-referenced next-hop servers.
>
> A server does not appear to be committed to supporting the same HTTP
> version from request to request, or from one URL to another on the
> same server. (As an example at the same address and under the same
> vhost, some URLs might be served by the "real" http server which fully
> supports HTTP/1.1, and some by CGI scripts which might only support
> HTTP/1.0.)


section 2.6 paragraph 7
   "Intermediaries that process HTTP messages (i.e., all intermediaries
    other than those acting as tunnels) MUST send their own HTTP-version
    in forwarded messages."


In the example you put forward, the vhost is a gateway intermediary for 
the CGI script origin. The CGI has its own Server: header and version, 
and the gateway vhost has its own Server: header and version. As a 
gateway intermediary for the CGI, the vhost is required to downgrade 
the request to the CGI's capabilities and to upgrade the responses with 
its own 1.1 version.

There is one exception: where the client request arrives with an older 
version, the server MAY downgrade its version to that supported by the 
client.
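
A rough Go sketch of that version handling, assuming a hypothetical 
1.0-only CGI backend at cgi-backend.internal (the hostnames and the 
stdout stand-in for the client connection are illustration only): the 
gateway speaks HTTP/1.0 toward the backend, then labels the forwarded 
response with its own HTTP-version.

package main

import (
	"bufio"
	"fmt"
	"net"
	"net/http"
	"os"
)

func main() {
	// 1. Downgrade: send a plain HTTP/1.0 request to the 1.0-only backend.
	backend, err := net.Dial("tcp", "cgi-backend.internal:80")
	if err != nil {
		fmt.Fprintln(os.Stderr, "dial:", err)
		return
	}
	defer backend.Close()
	fmt.Fprint(backend, "GET /script.cgi HTTP/1.0\r\nHost: example.com\r\n\r\n")

	// 2. Read the backend's (HTTP/1.0) response.
	resp, err := http.ReadResponse(bufio.NewReader(backend), nil)
	if err != nil {
		fmt.Fprintln(os.Stderr, "read:", err)
		return
	}
	defer resp.Body.Close()

	// 3. Upgrade the version label: the gateway forwards the message with
	//    its OWN HTTP-version in the status line, not the backend's.
	downstream := os.Stdout // stands in for the client connection
	fmt.Fprintf(downstream, "HTTP/1.1 %s\r\n", resp.Status)
	resp.Header.Write(downstream)
	fmt.Fprint(downstream, "\r\n")
	// Body copy and 1.0 -> 1.1 framing adjustments are omitted for brevity.
}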

>
> Therefore it seem unwise to rely on the upstream server supporting a
> particular version from request to request, unless there is a tighter
> coupling between them (for example, under the same organisational
> control) and the intermediate can be configured to specifically assume
> a given HTTP version for a given upstream.
>
> The above is also unclear about what constitutes a "next-hop server":
> if the hostname resolves to multiple addresses, they are not all
> necessarily running the same version of the server, or even the same
> server software at all. The same is probably even more likely for
> different ports on the same machine.

They are all potential next-hop servers, and evidently not all the same 
server.

>
> Different vhosts might be directed to different backends with
> different capabilities, and as mentioned the same is true for
> different URLs even on the same address/port/vhost. The most
> conservative interpretation would be to match on address, port, vhost
> and URL, but that seems to require remembering an excessive amount of
> history.

Whatever software is doing the directing is responsible for ensuring 
consistency in capability support and version advertisement.
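
For what it is worth, a minimal Go sketch of the kind of version record 
p1 suggests a proxy keep. The host:port key and the TTL are assumptions 
on my part; as noted above, whether to match on address, port, vhost or 
URL is exactly the granularity the spec leaves open.

package main

import (
	"fmt"
	"sync"
	"time"
)

type versionRecord struct {
	major, minor int
	seen         time.Time
}

type versionCache struct {
	mu      sync.Mutex
	entries map[string]versionRecord // keyed by "host:port" of the next hop
	ttl     time.Duration
}

func newVersionCache(ttl time.Duration) *versionCache {
	return &versionCache{entries: make(map[string]versionRecord), ttl: ttl}
}

// Record stores the HTTP-version seen in a response from a next-hop server.
func (c *versionCache) Record(hostPort string, major, minor int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[hostPort] = versionRecord{major, minor, time.Now()}
}

// Supports11 reports whether the next hop is believed to speak HTTP/1.1.
// A missing or stale record gives the conservative answer, so the proxy
// does not rely on 1.1-only behaviour it has no evidence for.
func (c *versionCache) Supports11(hostPort string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	rec, ok := c.entries[hostPort]
	if !ok || time.Since(rec.seen) > c.ttl {
		return false
	}
	return rec.major > 1 || (rec.major == 1 && rec.minor >= 1)
}

func main() {
	cache := newVersionCache(10 * time.Minute)
	cache.Record("origin.example:80", 1, 0)
	fmt.Println(cache.Supports11("origin.example:80")) // false: it spoke 1.0
}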


>
> 3) On page 18 it states:
>
>    Resources made available via the "https" scheme have no shared
>    identity with the "http" scheme even if their resource identifiers
>    indicate the same authority (the same host listening to the same 
> TCP
>    port).  They are distinct name spaces and are considered to be
>    distinct origin servers.
>
> While this is true for a separate shared cache, a common deployment
> might be gateways employed as protocol translators/accelerators. A
> small number of backend origins without enough oomph to serve HTTPS
> directly behind a larger set of more powerful HTTPS enabled gateways
> under the same organisational control, linked over an internal-only
> private network. Here the internal traffic is "http://" requests, but
> the external traffic is "https://" with the obvious direct mapping of
> pathuris from one to the other.

Stop right there. This assumed URL-rewriting directly violates the 
clause above.

The HTTP-compliant behaviour for this TLS/no-TLS gateway intermediary 
is to map HTTPS (port 443) traffic to https:// URLs sent in plain HTTP 
requests over the internal non-TLS connection.

Any client facing http://example.com/ and https://example.com/ can and 
SHOULD assume they are different, just as ftp://example.com/ and 
http://example.com/ are different. Any server-internal URL equivalence 
is exactly that, *internal* to the server, and must not be assumed by 
the client.
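
A minimal Go sketch of that mapping, with made-up internal hostnames: 
the gateway has already terminated TLS, forwards over a plain internal 
connection, and keeps the https scheme in the request target (absolute 
form) so the two namespaces stay distinct.

package main

import (
	"fmt"
	"net"
	"os"
)

func main() {
	// Internal hop: plain TCP, no TLS.
	backend, err := net.Dial("tcp", "internal-origin.example:8080")
	if err != nil {
		fmt.Fprintln(os.Stderr, "dial:", err)
		return
	}
	defer backend.Close()

	// Absolute-form request target preserves the https scheme, so the
	// backend still knows which namespace the resource belongs to.
	fmt.Fprint(backend,
		"GET https://example.com/account HTTP/1.1\r\n"+
			"Host: example.com\r\n"+
			"Connection: close\r\n\r\n")
	// The gateway would then relay the response back to the client over TLS.
}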


AYJ