Re: [Uri-review] Review request for gittorrent: URI scheme

Graham Klyne <gk@ninebynine.org> Mon, 04 April 2016 07:51 UTC

I have two comments:

1. It is not the place of a URI scheme registration to prohibit the use of
fragments.  I suggest dropping the sentence "The "fragment" URI component is
never permitted" under section "Scheme syntax".  (The protocol may not allow
fragments, but that's no different from HTTP - once a URI is dereferenced, it is
the format of the resulting representation that determines whether and how
fragments may be applied.)
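
To illustrate: a generic RFC 3986 parser separates the fragment from the rest
of the URI before any scheme-specific rules come into play.  A minimal sketch
in Python using the standard urllib.parse module; the helper name
split_fragment and the "#readme" fragment are made up for illustration and are
not part of the registration:

```python
from urllib.parse import urldefrag


def split_fragment(uri):
    """Return (uri_without_fragment, fragment) using generic URI syntax.

    Hypothetical helper: the split depends only on the generic syntax,
    not on anything specific to the gittorrent scheme.
    """
    result = urldefrag(uri)
    return result.url, result.fragment


# Only the part before "#" would be handed to scheme-specific processing;
# what the fragment means is left to the retrieved representation.
base, fragment = split_fragment(
    "gittorrent://81e24205d4bac8496d3e13282c90ead5045f09ea/recursers#readme"
)
```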

2. The registration says "gittorrent URIs represent Git repositories".  Looking
at the descriptions, it seems that the URI denotes one of three different 
things, depending on the form used:

Type 0: appears to denote the hash of the latest commit in the repository.

Type 1: appears to denote the repository itself, as accessed via a BitTorrent
stream.

Type 2: appears to denote a Bitcoin transaction element containing a reference
to the repository per type 1.
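
The three forms can be told apart roughly as follows.  This is only a sketch:
the function name, the labels, and in particular the rule "an authority
containing a dot is a domain name" are my assumptions.  The registration itself
says it is unclear how GitTorrent software distinguishes type 0 from type 2.

```python
import re
from urllib.parse import urlsplit

# 40 hexadecimal characters: the conventional text form of a SHA-1 digest.
SHA1_HEX = re.compile(r"[0-9a-fA-F]{40}")


def classify_gittorrent_uri(uri):
    """Return (form_number, what_the_URI_appears_to_denote).

    Hypothetical classifier; the dot heuristic for domain names is an
    assumption, not something the registration specifies.
    """
    parts = urlsplit(uri)
    if parts.scheme != "gittorrent":
        raise ValueError("not a gittorrent: URI")
    authority = parts.netloc
    if SHA1_HEX.fullmatch(authority):
        return 1, "repository looked up via a BitTorrent DHT key"
    if "." in authority:
        return 0, "latest commit of a repository on a named host"
    return 2, "Bitcoin-registered username pointing to a type 1 lookup"
```

For example, classify_gittorrent_uri("gittorrent://cjb/foo") yields form 2 under
these assumptions, while "gittorrent://github.com/cjb/recursers" yields form 0.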

I observe that there appears to be a tight coupling between the URIs and the 
resource representations associated with dereferencing of the URIs.  If this is 
seen as being part of Web architecture (else why use URIs?), then I'd be 
inclined to be less prescriptive about representation details. These may be a
feature of current implementations, but who knows how future implementations may 
evolve?

#g
--


On 04/04/2016 04:10, Chris Rebert wrote:
> Hello,
>
> Per the advice of RFC 7595, I hereby present the following proposed
> registration of the "gittorrent" provisional URI scheme for review.
> Any feedback is greatly appreciated. Thanks.
>
> Cheers,
> Chris
> ****
> http://chrisrebert.com
> Browser 🐛 of the day: http://bugzil.la/1259972
>
> ********
>
> Scheme name:  gittorrent
>
> Status:  Provisional
>
> Applications/protocols that use this scheme name:
>    GitTorrent ("A decentralization of GitHub using BitTorrent and
>    Bitcoin")
>
> Contact:
>    Scheme creator:
>      Chris Ball <http://printf.net/>
>    Registering party:
>      Chris Rebert <iana.url.schemes.gittorrent@chrisrebert.com>
>
> Change controller:
>    Either the scheme creator or the registering party.
>
> References:
>    Ball, C., "Announcing GitTorrent: A Decentralized GitHub", 29 May 2015,
>        <http://blog.printf.net/articles/2015/05/29/announcing-gittorrent-a-decentralized-github/>.
>    Ball, C., "GitTorrent", 2016, <http://gittorrent.org/>.
>    Ball, C., "GitTorrent", 2016, <https://github.com/cjb/GitTorrent>.
>    Bernstein, D. J., Duif, N., Lange, T., Schwabe, P., and B. Yang,
>        "Ed25519: high-speed high-security signatures", 27 September 2011,
>        <https://ed25519.cr.yp.to/>.
>    Bitcoin Project, "Bitcoin - Open source P2P money", 2016,
>        <https://bitcoin.org/en/>.
>    Bray, T., Ed., "The JavaScript Object Notation (JSON) Data Interchange
>        Format", RFC 7159, March 2014.
>    Cohen, B., "BEP 3: The BitTorrent Protocol Specification",
>        11 October 2013, <http://www.bittorrent.org/beps/bep_0003.html>.
>    Eastlake 3rd, D. and P. Jones, "US Secure Hash Algorithm 1 (SHA1)",
>        RFC 3174, September 2001.
>    "Git", 2016, <https://git-scm.com/>.
>
> Scheme syntax:
>    This scheme uses a profile of the RFC 3986 generic URI syntax.
>    The "fragment" URI component is never permitted.
>
>    A gittorrent URI may come in one of three forms:
>
>      0. Where the "authority" component is a domain name
>        Example:  gittorrent://github.com/cjb/recursers
>        The "path" and "query" components have no extra restrictions.
>
>      1. Where the "authority" component is a 40-byte hexadecimal number
>         (the conventional representation of a SHA-1 hash digest)
>        Example:
>        gittorrent://81e24205d4bac8496d3e13282c90ead5045f09ea/recursers
>        In this case, the "query" component is not permitted, and
>        the "path" component consists of exactly one segment (the Git
>        repository name).
>
>      2. Where the "authority" is a username
>        Example:  gittorrent://cjb/foo
>        In this case, the "query" component is not permitted, and
>        the "path" component consists of exactly one segment (the Git
>        repository name).
>
>    There may be further restrictions on the format of usernames and
>    repository names.
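
The restrictions stated for forms 1 and 2 above (no query, exactly one path
segment) could be checked along these lines.  A rough Python sketch; the
function name and the wording of the problem messages are hypothetical, and it
deliberately says nothing about fragments or about form 0:

```python
from urllib.parse import urlsplit


def check_single_repo_path(uri):
    """Return a list of problems for a form 1 or form 2 gittorrent URI.

    Hypothetical helper: an empty list means the stated restrictions
    (no query, exactly one path segment) are satisfied.
    """
    parts = urlsplit(uri)
    problems = []
    if parts.query:
        problems.append("query component is not permitted")
    # "/foo" splits into ["", "foo"]; drop the empty piece before the slash.
    segments = parts.path.split("/")[1:]
    if len(segments) != 1 or not segments[0]:
        problems.append("path must be exactly one segment (the repository name)")
    return problems
```

With this, "gittorrent://cjb/foo" produces no problems, while
"gittorrent://cjb/foo/bar?x=1" is reported for both the query and the path.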
>
> Scheme semantics:
>    See the GitTorrent project for details.  The following is a summary of
>    read-only usage.
>
>    gittorrent URIs represent Git repositories and specify the metadata
>    necessary to clone a repository, to read the repository's commits, and,
>    with the necessary cryptographic key, to write commits to the repository.
>
>    In URIs of type (0), the SHA-1 hash identifier of the latest commit of
>    the primary branch is fetched via the git protocol, as if this had been
>    a git: URI.  The actual data for that commit is then downloaded via
>    BitTorrent.
>
>    In URIs of type (1), the SHA-1 hash in the "authority" component is used
>    as a key for a lookup in a BitTorrent DHT (distributed hash table).  The
>    value obtained from the lookup is a JSON object representing a GitTorrent
>    user profile, which includes the names of that user's repositories, the
>    names of those repositories' git refs, and the SHA-1 hash identifiers of
>    the commits that those refs currently point to.  The "path" component is
>    the name of the repository, and is used to look up the corresponding
>    SHA-1 hash commit identifier for the repository in the user profile.  The
>    actual data for that commit is then downloaded via BitTorrent.
>
>    In URIs of type (2), the username in the "authority" component is used
>    for an OP_RETURN transaction lookup in Bitcoin's blockchain.  If
>    successful, this lookup yields a SHA-1 hash which is then used as a key
>    for a lookup in a BitTorrent DHT (distributed hash table).  The value
>    obtained from the lookup is a JSON object representing a GitTorrent user
>    profile (as described in the previous paragraph).  The "path" component
>    is the name of the repository, and is used to look up the corresponding
>    SHA-1 hash commit identifier for the repository in the user profile.  The
>    actual data for that commit is then downloaded via BitTorrent.
>
> Encoding considerations:
>    Unknown, use with care.
>
> Interoperability considerations:
>    Not fully known, use with care.
>
>    The "fragment" URI component has no known meaning or usage.  Unless it
>    becomes meaningful in the future, omitting it is strongly advised.
>
> Security considerations:
>    Not fully known, use with care.
>
>    GitTorrent normally uses public BitTorrent swarms, and thus doesn't
>    ensure confidentiality of the Git data it stores.  Therefore it's
>    normally unsuitable for Git repositories which contain unencrypted
>    private data.  The confidentiality of the data when in transit between
>    peers depends on the particular flavor of the BitTorrent protocol being
>    used by the peers.
>
>    Git and BitTorrent use SHA-1 hashes to ensure the integrity of the data.
>    The general security considerations for SHA-1 thus also apply to
>    GitTorrent.
>    GitTorrent uses Ed25519 as its digital signature scheme for ensuring the
>    integrity and ownership of GitTorrent user profiles, and thus inherits
>    the security considerations of Ed25519.
>
>    gittorrent: URIs of type (0) refer to hosts using domain names.  The
>    domain name resolution process is subject to its own set of security
>    considerations (see RFC 4033).
>    gittorrent: URIs of type (2) use GitTorrent usernames, which use the
>    Bitcoin protocol/network for their registration infrastructure, and are
>    thus subject to Bitcoin's security considerations.  Users of type (2)
>    URIs should keep in mind that GitTorrent usernames don't necessarily
>    correspond to the usernames of other Git-related systems, other source
>    code management systems, or other software project management systems in
>    general.  Users should externally verify the identities associated with
>    GitTorrent usernames before utilizing gittorrent: URIs involving those
>    usernames.
>
>    Beware of homograph attacks when dealing with gittorrent: URIs.
>    Attackers may register GitTorrent usernames which deliberately appear
>    visually similar to other GitTorrent usernames in an attempt to fool
>    unwary users.  Attackers may likewise upload Git repositories with names
>    which deliberately appear visually similar to those of other Git
>    repositories.
>
>    It's currently unclear precisely how GitTorrent software differentiates
>    between gittorrent: URIs of type (0) and type (2).  For example, without
>    further restrictions on allowed domain names, the URI
>    gittorrent://abc/xyz could potentially either reference the top-level
>    domain "abc" or the GitTorrent username "abc".  Similarly, without
>    further restrictions on allowed GitTorrent usernames, the URI
>    gittorrent://abc.xyz/qwe could potentially either reference the domain
>    "abc.xyz" or the GitTorrent username "abc.xyz".
>    The usage of gittorrent: URIs with usernames that contain periods should
>    therefore be avoided for the time being.
>    Accessing GitTorrent URIs while on an untrusted network is thus
>    potentially dangerous, since a malicious network operator might be able
>    to influence which interpretation the GitTorrent software chooses by
>    causing the "username" to unexpectedly resolve as a domain name or by
>    causing the domain name to resolve to the IP address of an
>    attacker-controlled server.
>    Git's integrity assurance mechanisms may allow these attacks to be
>    detected in certain cases, provided that the Git repository had been
>    previously cloned via a trustworthy mechanism.