Re: [hrpc] first screening of RFC7230 for Human Rights leads

Niels ten Oever <niels@article19.org> Mon, 09 February 2015 07:01 UTC

Message-ID: <54D85B3A.2050106@article19.org>
Date: Mon, 09 Feb 2015 07:01:14 +0000
From: Niels ten Oever <niels@article19.org>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Icedove/31.4.0
MIME-Version: 1.0
To: hrpc@article19.io
References: <54D0FCD7.60302@article19.org> <87siem8tv6.fsf@alice.fifthhorseman.net>
In-Reply-To: <87siem8tv6.fsf@alice.fifthhorseman.net>
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Subject: Re: [hrpc] first screening of RFC7230 for Human Rights leads
Precedence: list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi dkg,

Thanks for your elaborate response. No problem with a critical
stance, even for the sake of argument; there is no progress
without critical thinking. Please be harsh :)

Further reaction inline:


On 02/03/2015 11:20 PM, Daniel Kahn Gillmor wrote:
> Hi Niels--
> 
> Thanks for this effort!  A few thoughts from me below, spurred by
> your initial commentary.  I apologize in advance that (when trying
> to read critically at least) i adopt something of a devil's
> advocate position. My comments are supportive in that i'd like to
> see this HR analysis proceed from a strong and well-thought-through
> perspective.
> 
> On Tue 2015-02-03 11:52:39 -0500, Niels ten Oever wrote:
>> I went through RFC7230 [0] over the last days and found some
>> hooks that might be interesting.
> [...]
>> This is relevant for our research because it helps us to look at
>> both what is organized in protocols and, maybe even more, at what
>> is not regulated or standardized, which is what makes the Internet
>> an enabling environment for freedom of expression.
> 
> The idea here i think might be framed as the network being 
> "content-neutral".  This is supported in some sense by the
> satirical April Fools' Day RFC 3514, which introduces the "evil
> bit":
> 
> https://tools.ietf.org/html/rfc3514
> 
>>> Because NAT [RFC3022] boxes modify packets, they SHOULD set the
>>> evil bit on such packets.  "Transparent" http and email proxies
>>> SHOULD set the evil bit on their reply packets to the innocent
>>> client host.
> 
> There are more serious RFCs that take the idea of a content-neutral
> or content-agnostic network as a given too (for one thing, it's a
> common engineering practice to layer designs by introducing
> deliberate agnosticism about the layers above or below the system
> being specified)

This is a great lead. Do you have any pointers to where
content-agnosticism is best described in an RFC? Content
agnosticism, together with connectivity, could be the basis for
enabling freedom of expression on the network.

> 
> Unfortunately, not all proposed standards hold this line on
> content neutrality:
> 
> https://tools.ietf.org/html/draft-nottingham-safe-hint-05
> 
> (of course, the above draft isn't an official RFC yet -- should we
> be comparing drafts-that-didn't make it with drafts that ended up
> becoming RFCs?)
> 

I agree that it would be interesting to also look at drafts that did
not make it, or ones that are still drafts, but I would like to keep
that for later and first see what we can distill from the existing
body of RFCs, because that is already a lot.

Are there any RFCs that you know of that breach content agnosticism?
I imagine this could happen in congestion protocols, no? Or would
that be on the implementation level?

> With a corpus as large as the RFCs, how should this project avoid 
> confirmation bias?  If we look for the things we want to find, it
> seems like we should probably also make sure to look for things we
> *don't* want to find, to see whether they're there too. (see also:
> Biblical Exegesis ☺)
> 

I completely agree. But knowing what we are looking for will also
help us find the things we don't want to see.

> 
>> 1. In the abstract, HTTP is described as a protocol for
>> distributed and collaborative information systems. This has clear
>> technical implications, but it could also describe equality of
>> nodes, which might make this translatable into rights implications.
>> I'm especially thinking about the right to receive and impart
>> information and ideas through any media and regardless of
>> frontiers. A distributed system affirms and enables that by
>> design.
>> 
>> Perhaps a good start is to note the words that keep on coming
>> back in a rights context, and word mine all RFCs for these
>> words?
>> 
>> Some of these words could be: connectivity, distributed, 
>> collaborative, reliable, scalable, caching
>> 
>> This could be one way of selecting new and more RFCs, and
>> perhaps auto-grouping them per theme.


Adding stateless, stateful, content-neutral, transparent, robust,
user-centric and content agnostic to this list, and removing caching
(as per the discussion below).

New list:
connectivity, distributed, collaborative, reliable, scalable,
stateless, stateful, content-neutral, content agnostic, transparent,
robust, robustness, user-centric.
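
A minimal sketch of what the word mining could look like, assuming the
RFCs are already available locally as plain-text strings. The function
names (count_terms, rank_documents) are hypothetical and only meant as
illustration; hyphenated and spaced variants of a term are counted as
the same term.

```python
import re
from collections import Counter

# Search terms taken from the list above.
TERMS = [
    "connectivity", "distributed", "collaborative", "reliable", "scalable",
    "stateless", "stateful", "content-neutral", "content agnostic",
    "transparent", "robust", "robustness", "user-centric",
]


def term_pattern(term):
    """Build a case-insensitive whole-word regex for one search term."""
    # Allow a hyphen or whitespace (including a line break) between words.
    parts = [re.escape(p) for p in re.split(r"[-\s]+", term)]
    return re.compile(r"\b" + r"[-\s]+".join(parts) + r"\b", re.IGNORECASE)


PATTERNS = {term: term_pattern(term) for term in TERMS}


def count_terms(text):
    """Return a Counter mapping each term to its number of matches in text."""
    return Counter({t: len(p.findall(text)) for t, p in PATTERNS.items()})


def rank_documents(docs):
    """Rank documents (name -> text) by their total number of term matches."""
    totals = {name: sum(count_terms(text).values()) for name, text in docs.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Raw counts like these only suggest where to start reading; they say
nothing about whether a term is used in a rights-relevant sense.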

>> 
>> 2. [self-descriptive message payloads] -> This means that both
>> content and description are to be defined by the author/host
>> system, so people can categorize, frame and describe their content
>> themselves, instead of it being auto-categorized.
> 
> i find it a bit funny that in 1. above you're proposing
> auto-grouping the RFCs themselves (presumably after fetching them
> via HTTP), and in 2. here you're saying that the protocol is
> designed in opposition to "auto categorization".
> 
> I'm wary of the term "auto" in places like this, because i think
> it removes agency.  who is doing the categorization?  even if it's 
> "automatic", it's under the control of someone.

Of course. I just meant it as a way to make it easier for us to select
relevant RFCs and to overcome the confirmation bias to some degree.
> 
> I think the point you're trying to make is that the definition of
> HTTP represents a communication between peers, and peers get to
> have the conversation without having to talk to (or get approval
> from) anyone else if they don't want to.
> 
> (technically, this might not be entirely true: DNS and (for HTTPS)
> the X.509 certificate authority cartel are in some ways mandatory 
> "brokers" when negotiating the creation of an HTTP session, even
> if they don't get a say about the content of the communication once
> the session is created)
> 
> 
>> 3. [QUOTE] Likewise, servers do not need to be aware of each
>> client's purpose: an HTTP request can be considered in isolation
>> rather than being associated with a specific type of client or a
>> predetermined sequence of application steps. [/QUOTE]
>> 
>> This supports innovation, development, and flexibility. Would
>> this have any rights implications? Or is this just a very practical
>> way of defining a broadly used protocol? Could be linked to 'IP
>> disinterestedness' (as mentioned above), which creates space for
>> freedom of expression and freedom of assembly, by supplying tools
>> but not defining the way in which it needs to be done.
> 
> This property is usually called "statelessness" for the server (and
> not in a "smash the state" sense!).  Statelessness is a useful
> property from a technical perspective because it means you can have
> the server crash and not have to worry about what happens to the
> client when it comes back up (the client can just carry on as it
> was).
> 
> In practice, of course, everyone wants to introduce state because
> it makes certain kinds of workflows (e.g. multipage forms,
> logged-in accounts, widespread user surveillance) much more
> convenient.  Hence cookies and other similar mechanisms.
> 
> But the point of defining HTTP as a stateless protocol is so that 
> servers *can* be implemented statelessly, for those who have
> engineering constraints that preclude keeping state on the server
> side (e.g. a machine with no way to write internal storage).
> 
> NFS (the "network filesystem") is another protocol that has jumped 
> through many hoops to keep "statelessness" for the server.  see:
> 
> https://tools.ietf.org/html/rfc1094#section-1.3



Fascinating that the argument for statelessness is reliability and
stability, and that it introduces the concept of 'idempotence'. I would
almost like to ask the same question as before: is there a
description of engineering standards (preferably in an RFC) that
says that a protocol should be stateless unless there are other
compelling reasons not to?
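
To make 'idempotence' concrete for myself, here is a minimal sketch. It
is not NFS code; the two operations (write_at, append) are hypothetical
stand-ins for a request that names an absolute position and one that
does not.

```python
def write_at(contents: bytes, offset: int, data: bytes) -> bytes:
    """Idempotent write: the request names an absolute offset."""
    padded = contents.ljust(offset, b"\0")
    return padded[:offset] + data + padded[offset + len(data):]


def append(contents: bytes, data: bytes) -> bytes:
    """Non-idempotent write: the result depends on how often it is applied."""
    return contents + data


original = b"hello world"

# Applying the idempotent request once or twice gives the same result, so a
# client that never saw the reply (e.g. the server crashed) can simply resend.
once = write_at(original, 6, b"there")
twice = write_at(once, 6, b"there")

# A duplicated append changes the outcome, so blind retries are unsafe here.
appended_once = append(original, b"!")
appended_twice = append(appended_once, b"!")
```

With idempotent requests the server does not have to remember which
requests it has already seen, which is what lets it stay stateless.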
> 
> However, NFS as of version 4 has gradually acquired server-side
> state (that is, state shared between the client and the server),
> while retaining some mechanisms aimed at easing this requirement
> for servers that fit certain profiles:
> 
> https://tools.ietf.org/html/rfc3530#section-8.14
> 
> 


Under 8.14.1 it is mentioned that the transition needs to be
transparent to the client, but transparency doesn't necessarily imply
consent, right? It might be interesting to dig a bit deeper into what
transparency means at this level (and how it relates to consent).

> otoh, aiming for statelessness itself doesn't have to be motivated 
> purely by technical goals.  For example, if you design a protocol
> that *requires* the server to maintain state about its users (e.g.
> internet relay chat (IRC) servers retain state about who is
> connected and what channels they're connected to), you make it
> impossible for someone who *doesn't* want to track their users to
> implement the protocol in a non-tracking way.
> 
> Whether the push for statelessness in some internet protocols
> derives in part from this urge to safeguard against ubiquitous
> surveillance is pretty hard to say, of course.
> 


I can imagine there are plenty of other reasons to favour a more
stateless architecture (like the ones mentioned in the NFS RFC). They
all seem equally valid to me, so I am not sure intentionality is
crucial here.

> 
>> 4. [QUOTE] An HTTP "client" is a program that establishes a
>> connection to a server for the purpose of sending one or more
>> HTTP requests. [/QUOTE]
>> 
>> Interestingly, interaction starts with the request from a
>> client. The primacy of every action lies with the client, which
>> could point to sovereignty, autonomy and/or freedom of choice.
>> Are all services based on a request? Are all protocols initiated
>> by request? Would be interesting to have a statement about the
>> primacy of the client. How does this relate to cookies, consent,
>> etc? In other words: does all automation start with clients?
> 
> I'm not sure this is anything but a technical label.  In the 
> client/server network communications model, the server is defined
> as being the "listener".  the client is the one that initiates a 
> connection.
> 

OK, that makes my remarks a tautology, thanks for the clarification!
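
For reference, a minimal sketch of that labelling using Python's
standard socket module over the loopback interface: the "server" is
simply the side that listens, the "client" the side that connects. The
helper names (serve_once, run_exchange) and the echo behaviour are made
up for illustration.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Server role: wait passively, then answer whoever connects."""
    conn, _addr = listener.accept()
    with conn:
        chunks = []
        while True:
            data = conn.recv(1024)
            if not data:  # the client has finished sending
                break
            chunks.append(data)
        conn.sendall(b"echo: " + b"".join(chunks))


def run_exchange(message: bytes) -> bytes:
    """Run one request/response exchange over the loopback interface."""
    # The server only binds and listens; it never initiates anything.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        port = listener.getsockname()[1]

        worker = threading.Thread(target=serve_once, args=(listener,))
        worker.start()

        # The client is, by definition, the side that opens the connection.
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)  # signal end of request
            reply = b"".join(iter(lambda: client.recv(1024), b""))

        worker.join()
        return reply
```

Nothing in the roles says who holds more authority; they only describe
who waits and who initiates.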

> Not all protocols are client/server, though the stuff in the IETF
> tends to be client-server because it's simpler to describe.
> 
> peer-to-peer protocols like bittorrent aren't client/server, for 
> example.  But i don't think the IETF has ever even tried to
> standardize bittorrent. And while the protocol at a high level
> might not be client/server, each individual communication that
> happens during a bittorrent session (i'm not sure i'm using the
> right BT terms here -- i don't know much about the protocol) is
> probably using a client/server model, where one peer (the client at
> that moment) sends a message to another peer (the server at that
> moment).
> 
> TCP itself supports a "simultaneous open" mode, where neither side
> is the client or the server:
> 
> https://tools.ietf.org/html/rfc793#page-32
> 
> But there are very few attempts to use simultaneous open in the
> wild (i think that STUN or TURN might use it, but i don't recall
> the details)
> 
> Some lower-level protocols like Ethernet (also not standardized by
> the IETF) are by definition broadcast -- everyone in a given
> broadcast domain receives every message, and the recipients are
> just expected to filter out traffic that isn't aimed at them.
> 
> The IETF has some protocols like IP multicast that enable
> subscription mechanisms that might take advantage of this
> lower-level broadcast technique:
> 
> https://tools.ietf.org/html/rfc1112
> 

We have been thinking of including multicast in the research, but it
seems (but correct me if I am wrong) that multicast does not see a lot
of implementation in the wild, or am I overlooking something?
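
For what it is worth, the subscription mechanism itself is small. A
minimal sketch using Python's standard socket module, assuming IPv4;
the helper names (membership_request, join_group) are hypothetical, and
join_group only works on a network where multicast is actually
available.

```python
import ipaddress
import socket
import struct


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Build the ip_mreq structure used to join an IPv4 multicast group."""
    if not ipaddress.IPv4Address(group).is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    # struct ip_mreq: group address followed by the local interface address.
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def join_group(group: str, port: int) -> socket.socket:
    """Open a UDP socket subscribed to the given multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # The host asks to receive traffic for the group; nobody addresses it directly.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership_request(group))
    return sock
```

The receiver opts in to a group rather than being contacted
individually, which is what makes it different from the client/server
model discussed above.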

<<snip>>

>> This seems to point to deepening of this topic:
>> 
>> [QUOTE] The implementation diversity of HTTP means that not all
>> user agents can make interactive suggestions to their user or
>> provide adequate warning for security or privacy concerns. 
>> [/QUOTE]
>> 
>> Even though security and privacy concerns are valid (one does not
>> want to give away more information than necessary), this could
>> also fit within a freedom of expression context where a user is
>> free to hold opinions (and thus not hold or impart others!).
> 
> If we're reading this in respect to human rights, i'd be more
> inclined to take it from a disability rights perspective; you can't
> specify something into the protocol that assumes that the end user
> has a visual display that works for them, or is physically capable
> of selecting choices from a presented menu, etc.
> 

You're probably right, so we should set this aside for the moment to
keep the focus on Freedom of Expression and Freedom of
Assembly.

> 
>> 5. In the client request, Accept-Language is defined. Perhaps we
>> can relate this to the research into IDNs (see draft) and/or use
>> this [QUOTE] to show the ambition of the Internet community to 
>> reflect the diversity of users and to be in line with Article 2
>> of the Universal Declaration of Human Rights which clearly
>> stipulates that 'everyone is entitled to all rights and freedoms
>> [..], without distinction of any kind, such as [..] language
>> [..]'. [/QUOTE from ID]
> 
> I agree with this view.  You can also argue it from the reverse,
> which is that traditionally, Internet protocols concerned
> themselves only with characters expressible in US-ASCII (which
> limits them to languages that use the Latin alphabet), and the story of
> protocol development has been one of expansion that covers more of
> the diversity of human communications for the content that the
> protocol transmits.
> 
> Interestingly, though, most protocols retain the ASCII-only
> simplicity for protocol messages themselves.  For example, HTTP
> headers are all defined in ASCII.  HTML tags are all named in
> ASCII, even when the content of the page is entirely in ideograms.
> And the framing messages (e.g. EHLO, DATA, etc) in SMTP are still
> ASCII and will probably always be.  There's a subtle substrate of
> linguistic dominance threaded in there if you want to go looking
> for it.
> 
> For that matter, the RFCs themselves are all written in English (or
> some weird and formalized approximation thereof).
> 

Excellent points, adding this to the analysis of Article 2 issues.
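
As a reminder of how the language preference is expressed on the wire,
here is a minimal sketch of reading an Accept-Language value. It is a
simplification (no wildcard handling, only exact or primary-subtag
matching), and the function names are hypothetical. Note that the
header syntax itself stays ASCII even when the requested language does
not use the Latin alphabet.

```python
def parse_accept_language(header: str) -> list[tuple[str, float]]:
    """Parse an Accept-Language value into (tag, weight) pairs, best first."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        tag = parts[0]
        if not tag:
            continue
        weight = 1.0  # no q parameter means full preference
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    weight = float(value)
                except ValueError:
                    weight = 0.0  # sketch: treat an unparsable weight as not acceptable
        prefs.append((tag, weight))
    # sorted() is stable, so equally weighted tags keep their header order.
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def choose_language(header: str, available: list[str], default: str) -> str:
    """Pick the first preferred language the server can offer."""
    offered = {lang.lower(): lang for lang in available}
    for tag, weight in parse_accept_language(header):
        if weight <= 0:
            continue
        tag = tag.lower()
        if tag in offered:
            return offered[tag]
        primary = tag.split("-")[0]  # e.g. "fa-IR" falls back to "fa"
        if primary in offered:
            return offered[primary]
    return default
```

The user states the preference and the server decides what it can
offer, so the diversity argument depends on servers actually providing
more than one language.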

> 
>> 6. Caching is crucial for enabling better access to information
>> in areas with slow connections. Could we state that through
>> caching access to information is improved? Further research to be
>> done in RFC7234.
> 
> Caching is also a place where intermediaries are introduced into
> what would otherwise be a peering relationship, though.  Caching
> proxies can modify content, spoof content outright, or refuse to
> serve content.
> 
> the httpbis working group regularly fends off proposals for 
> machine-in-the-middle (MitM) caching proxies for https, which come 
> complete with arguments very similar to "enabling better access to 
> information":
> 
> https://tools.ietf.org/html/draft-loreto-httpbis-explicitly-auth-proxy
>
> This draft actually says "possibility to enhance delivery performance" :/
> 
> I consider these arguments to be dangerous to the idea of network 
> security, and i'm glad that the IETF has avoided standardizing this
> sort of thing so far.
> 

Good point, will probably be too hard to balance rights here, so
dropping caching.

> 
>> 7. What (social) requirements could be meant here? :
>> 
>> [QUOTE] Additional (social) requirements are placed on
>> implementations, resource owners, and protocol element
>> registrations when they apply beyond the scope of a single
>> communication. [/QUOTE]
> 
> i'm having a hard time parsing this myself, but i think they're
> saying "most of the requirements we state in this document have to
> do with what happens explicitly in a single connection.  some other
> requirements, though, have scope larger than a single
> communication, such as how to add a new element to this protocol,
> whether clients should open multiple connections to a given server,
> or whether a server should publish URIs for its own resources that
> it would fail to parse on subsequent connections."  These
> larger-scoped requirements are the "social" requirements.
> 

Interesting understanding of the phrase :), might be interesting to
quiz Mark Nottingham about this.

>> 8. Strong point for slow and/or unstable connections: it supports
>> connectivity where the connection is bad. Excellent protection
>> of the right to receive and impart info. I think we could frame
>> this under connectivity as well.
>> 
>> [QUOTE] 6.3.1.  Retrying Requests
>> 
>> Connections can be closed at any time, with or without
>> intention. Implementations ought to anticipate the need to
>> recover from asynchronous close events. [/QUOTE]
> 
> this is definitely about the ethic of trying to connect, and
> robust communications in general.  Without this baseline
> assumption, most internet standards would be even worse than the
> (admittedly not very good) experience we've come to expect.  But
> it's not necessarily about supporting connections where the
> underlying links might be bad.  I'd argue that it's more about
> responsible handling (and awareness) of error conditions.
> 
> Postel's law, which was taken as gospel for many years (and is
> named after Jon Postel, the first RFC editor, and the author of
> numerous early RFCs), emphasizes connectivity in a formulation that
> usually runs something like this : "Be liberal in what you receive,
> and conservative in what you send".
> 
> But within the tech security community, Postel's law is now under
> attack (or at least, heavy revision).  In particular, it is
> understood to often lead to buggy, non-predictable implementations
> that are likely to harbor security vulnerabilities (e.g. imagine if
> a TCP implementation accepted packets that had a "close enough"
> sequence number, instead of requiring a correct match).  Modern
> security-conscious standards are much more likely to adopt Postel's
> law in a more minimalist form, encouraging implementors to drop
> or reject ill-formed input, while dealing gracefully with the
> resulting failure conditions.
> 
> This results in less "papering over" of failures from the remote
> peer, while still providing robust communications.  Maybe the
> underlying ethics here are (a) transparency and (b) robustness?
> Both of these are user-centric notions -- the user should not be
> misled by the tools, and the tools should not disobey the user.
> 

This might be an interesting point for research as well, since it is
about balancing fault tolerance (connectivity) and security.
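
A minimal sketch of that trade-off, using a hypothetical numeric length
field (think of something like Content-Length). The liberal parser
accepts whatever Python's int() can interpret; the strict one accepts
ASCII digits only and fails with a clear error. The function names are
invented for illustration.

```python
import re


def parse_length_liberal(value: str) -> int:
    """Liberal reading: accept anything int() can make sense of."""
    return int(value)  # tolerates surrounding whitespace, a sign, and underscores


def parse_length_strict(value: str) -> int:
    """Strict reading: ASCII digits only, anything else is rejected."""
    if not re.fullmatch(r"[0-9]+", value):
        raise ValueError(f"malformed length field: {value!r}")
    return int(value)


def handle(value: str) -> str:
    """Reject ill-formed input, but fail gracefully and say why."""
    try:
        return f"ok: {parse_length_strict(value)} bytes"
    except ValueError as err:
        return f"rejected: {err}"
```

The liberal version keeps more peers connected, but two implementations
may then disagree about what an odd value means; the strict version
makes the failure visible to the user instead of papering over it.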

Am definitely adding user-centric, robustness and transparency to the
list of topics to look for. It seems that we are slowly coming up with
relevant technical concepts that impact rights online, so this is
really helpful.

Looking at RFCs through the lens of these concepts might be the
methodology we are looking for.

> --dkg

Best,

Niels
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iQEcBAEBAgAGBQJU2Fs6AAoJEAi1oPJjbWjpWIAH/RO1nRiKM+Y0JjxdUyhHXI5C
lmkP34pyp9Nz9Ug0CQ+SczArh7JbKPe0xW8EPHEilN+SiAgpDI9DvZPdXT+n2MDq
F9Fv+kHIMdkzokyjE7SMhDfrbB8T2buS2oVqUQ2zFA3qIiaV2uLM/wSK0d3OvXhT
OJnwH8L0VZ/gCZSWIfJpX9DQKPdGWNC5o2qiC1GtPqhaqDFKRAK4Y14ONkS1dlWn
qRVVcd1tZ8ASiCuz8/uQ4+QdT0sXeUKWZuFw4EcDaskDqaRZ3FA+I9y7pUOop3kI
fnjAN+58mTdjq2Y5ZygWmZnXlQXB4NyaWyWIqZxr4qrF0GCHQdoSVc1FEhY/skY=
=C35E
-----END PGP SIGNATURE-----
