[apps-discuss] the URI definition model (was: Fun with URLs and regex)

Matthew Kerwin <matthew@kerwin.net.au> Thu, 29 January 2015 21:55 UTC

From: Matthew Kerwin <matthew@kerwin.net.au>
To: "t.petch" <ietfc@btconnect.com>
Archived-At: <http://mailarchive.ietf.org/arch/msg/apps-discuss/0_Fm7M8sNEggVvdgFES6XmTMJHM>
Cc: "Roy T. Fielding" <fielding@gbiv.com>, IETF Apps Discuss <apps-discuss@ietf.org>
Subject: [apps-discuss] the URI definition model (was: Fun with URLs and regex)

On 29 January 2015 at 20:46, t.petch <ietfc@btconnect.com> wrote:

> ----- Original Message -----
> From: "Matthew Kerwin" <matthew@kerwin.net.au>
> To: "Roy T. Fielding" <fielding@gbiv.com>
> Cc: "IETF Apps Discuss" <apps-discuss@ietf.org>
> Sent: Thursday, January 29, 2015 5:14 AM
>
>
> I'm still suffering a misalignment: RFC 3986 defines the whole generic URI
> syntax as:
>
>     URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
>
> and my draft that references it essentially defines (or will soon define)
> the whole file-URI syntax as:
>
>     file-URI = subset-of-scheme ":" subset-of-hier-part
>
> By leaving off the query part, I've either said that a URI with a query
> part cannot be a 'file' URI, or that a URI that starts with "file:" and has
> a query part is invalid. Potayto potahto.
>
> <tp>
>
> When I read your I-D, I assumed that you knew what you were doing:-)  I
> can see a use for 'query' but if no one implements it, then better the
> I-D left it out.  But I did have it as a point to pursue, once the
> bigger issue, to me, of splitting what is valid according to RFC3986
> from what is not and seeing that reflected in the I-D (and of course,
> there is the mini-charter to agree:-(
>
> Tom Petch
>
> </tp>
>
Ouch ;) To be honest, I think I thought I knew more than I really did. I'd
like to think that, as I've discovered holes in my knowledge, I've also
filled them. Also, in my defense, I've not had enough sleep this past week.

Part of my problem is that I'm somewhere between a software engineer and a
code hacker, so while I appreciate and take advantage of patterns and
frameworks, I can usually get by without them. In the case of a URI scheme, as
Sean said earlier, our document model (write the ABNF) and the URI
definition model (define a subset of RFC 3986) don't line up. Because the
patterns don't work, I've gone with the approach of developing prototypes
that swing back and forth until they converge on a (local?) maximum.

Now that I have (at least in my mind, and in local incomplete drafts) resolved
the issue of "c|/", and have some way forward for UNC, the remaining fuzzy
parts are the query and fragment. Nobody uses queries with file URIs, so
there's no point adding them in. Fragments are currently out, because I was
advised early on to leave them out. The fuzziness is how to exclude both while
remaining compatible with both the document and definition models.
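
To make the query/fragment question concrete, here is a minimal sketch in
Python. It assumes the generic splitting regex given in RFC 3986, Appendix B;
the function names are hypothetical, and nothing here is taken from the draft
itself. It only illustrates the reading in which a URI that starts with
"file:" and carries a query or fragment is rejected.

```python
import re

# Generic URI splitter, copied from RFC 3986, Appendix B.
# Groups: 2 = scheme, 4 = authority, 5 = path, 7 = query, 9 = fragment.
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_uri(uri):
    """Split a URI reference into its five generic components.

    A component that is absent comes back as None; a component that is
    present but empty (e.g. a bare trailing "?") comes back as "".
    """
    m = URI_RE.match(uri)  # every part is optional, so this always matches
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


def is_plain_file_uri(uri):
    """Hypothetical check: scheme is 'file', with no query and no fragment."""
    parts = split_uri(uri)
    scheme = parts["scheme"]
    return (
        scheme is not None
        and scheme.lower() == "file"  # schemes are case-insensitive
        and parts["query"] is None
        and parts["fragment"] is None
    )
```

Under these assumptions, is_plain_file_uri("file:///etc/hosts") is true, while
the same URI with "?x=1" or "#top" appended is rejected. Whether such a URI is
"not a file URI" or "an invalid file URI" is exactly the distinction the check
cannot express.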

What I've seen, having looked at other schemes written or revised since RFC
3986, is that they all seem to dance around the relationship with RFC 3986.
'dns' keeps the subtleties implicit, 'acct' only says it borrows syntactic
elements, 'stun' goes out of its way to say RFC 3986 doesn't apply, 'geo'
doesn't even mention it, etc. Maybe that just means I'm overthinking it.


-- 
  Matthew Kerwin
  http://matthew.kerwin.net.au/