Re: [apps-discuss] Fun with URLs and regex

Matthew Kerwin <matthew@kerwin.net.au> Wed, 28 January 2015 23:59 UTC

In-Reply-To: <54C95AF7.6030703@gmx.de>
References: <C5B10293-E6F6-4348-9782-C9C00A4476CE@mnot.net> <CACweHNBVOrVMesB7HOjPNHe5FtzL1k9XDGAHUXAx5DbOSYv5jA@mail.gmail.com> <A1E5B0EC-FAD5-4178-8C7B-540BEB61DC06@mnot.net> <54AEB660.1020701@intertwingly.net> <F122ADA8-4A96-4F88-BB9F-3C5C6A544067@mnot.net> <54C84872.5040902@intertwingly.net> <EF1E36FA-6A30-4A65-9520-5A31571EE445@mnot.net> <54C95132.2060402@gmx.de> <154ABFBB-AB8C-447A-89A3-D1746EFBF1C6@gbiv.com> <54C95AF7.6030703@gmx.de>
Date: Thu, 29 Jan 2015 09:59:17 +1000
Message-ID: <CACweHNBHiEGUwLB3z6YoTexF=b9ApwsUy6-DVCf9vnBSD+L5Rw@mail.gmail.com>
From: Matthew Kerwin <matthew@kerwin.net.au>
To: Julian Reschke <julian.reschke@gmx.de>
Content-Type: text/plain; charset="UTF-8"
Archived-At: <http://mailarchive.ietf.org/arch/msg/apps-discuss/IXUJhYZPJ0UWzkb8rT3hVSiwJoE>
Cc: "Roy T. Fielding" <fielding@gbiv.com>, Mark Nottingham <mnot@mnot.net>, IETF Apps Discuss <apps-discuss@ietf.org>
Subject: Re: [apps-discuss] Fun with URLs and regex

On 29/01/2015, Julian Reschke <julian.reschke@gmx.de> wrote:
>
> I agree that the fragment is part of the URI; the question, as far as I
> understand, is whether the *scheme* definition should include the
> fragment, given the fact that you can attach a fragment to any URI anyway.
>

The answer to this affects what I write in the 'file' scheme draft. I
was advised early on to not mention fragments (which I took as
"disallow by omission") because, while it's easy to define syntax, the
scheme also has to define semantics, and fragment semantics are tied
to content type, and dereferenced 'file' URIs don't have a
well-defined content type.

Whether or not I mention it comes back to the definition and intended
use-case of RFC 3986; if it defines an 'abstract' syntax - in the OOP
sense - then there's no such thing as a universal parser (i.e. it's
impossible to parse a URI with an unknown scheme). If it defines a
low-level structure, then any URI can be parsed, but the individual
components can't be validated without deferring to scheme-specific
machinery.
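
To make the "low-level structure" reading concrete, here is a minimal
sketch in Python. It uses the regular expression given in RFC 3986,
Appendix B; the names `split_uri` and `Parts` and the sample URIs are
made up for illustration. It only splits a reference into its five
generic components and knows nothing about any scheme, so validating
the pieces would still need scheme-specific machinery.

```python
import re
from typing import NamedTuple, Optional

# Regular expression from RFC 3986, Appendix B. Every part is optional,
# so it matches any input string and never rejects one.
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


class Parts(NamedTuple):
    """The five generic components (hypothetical helper type)."""
    scheme: Optional[str]
    authority: Optional[str]
    path: str
    query: Optional[str]
    fragment: Optional[str]


def split_uri(uri: str) -> Parts:
    """Split a URI reference generically, without consulting the scheme."""
    m = URI_RE.match(uri)
    # Groups 2, 4, 5, 7 and 9 are scheme, authority, path, query, fragment.
    # A component that is absent comes back as None; one that is present
    # but empty (e.g. the authority in "file:///...") comes back as "".
    return Parts(m.group(2), m.group(4), m.group(5), m.group(7), m.group(9))


# The generic split happily separates a fragment from a 'file' URI ...
print(split_uri("file:///etc/hosts#line-3"))
# ... and from a URI whose scheme the code has never heard of.
print(split_uri("unknown-scheme:opaque?x=1#frag"))
```

Under this reading the fragment is always separable, whatever the
scheme says; under the 'abstract' reading a function like this would
have no business running on a scheme it doesn't know.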

If the former, and I don't include the fragment in 'file', then
fragments aren't allowed on 'file' URIs. If the latter, omitting it
just leaves a hole in the spec.

Cheers
-- 
  Matthew Kerwin
  http://matthew.kerwin.net.au/