Re: [apps-discuss] Fun with URLs and regex
Sam Ruby <rubys@intertwingly.net> Wed, 07 January 2015 22:45 UTC
Message-ID: <54ADB701.4000309@intertwingly.net>
Date: Wed, 07 Jan 2015 17:45:21 -0500
From: Sam Ruby <rubys@intertwingly.net>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0
MIME-Version: 1.0
To: Mark Nottingham <mnot@mnot.net>, IETF Apps Discuss <apps-discuss@ietf.org>
References: <C5B10293-E6F6-4348-9782-C9C00A4476CE@mnot.net>
In-Reply-To: <C5B10293-E6F6-4348-9782-C9C00A4476CE@mnot.net>
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Transfer-Encoding: 8bit
X-RR-Connecting-IP: 107.14.168.142:25
X-Cloudmark-Score: 0
Archived-At: http://mailarchive.ietf.org/arch/msg/apps-discuss/fy32kRi0Rn_zCt3s1g1P2i83-yo
Subject: Re: [apps-discuss] Fun with URLs and regex
On 01/07/2015 04:35 PM, Mark Nottingham wrote:
> I’ve updated my Python script that serves as a translation of ABNF for URIs into regex.
>
> https://gist.github.com/mnot/138549
>
> It now validates the following URI schemes according to their respective specifications:
> - http
> - https
> - file
> - data
> - gopher
> - ws
> - wss
> - mailto

My test data:

http://intertwingly.net/stories/2014/10/05/urltestdata.json

A program to test each input:

import uri_validate
import json
import re

# Load the test cases; each one carries the candidate string under 'input'.
with open('urltestdata.json') as f:
  tests = json.load(f)

valid = {}
for test in tests:
  instr = test['input']
  valid[instr] = False
  # First check against the generic URI-reference pattern.
  if re.match("^%s$" % uri_validate.URI_reference, instr, re.VERBOSE):
    try:
      # Look up a scheme-specific pattern such as http_URI or mailto_URI.
      scheme_validator = "%s_URI" % instr.split(":", 1)[0].lower()
      validator = getattr(uri_validate, scheme_validator)
      # Anchor at both ends; under re.VERBOSE a trailing '#' would start a comment.
      if re.match("^%s$" % validator, instr, re.VERBOSE):
        valid[instr] = True
    except AttributeError:
      # No scheme-specific pattern exists, so the generic match decides.
      valid[instr] = True

print(json.dumps(valid, indent=2))
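
For anyone who wants to try the two-step check without downloading uri_validate or the test data, here is a minimal, self-contained sketch of the same idea: match the generic pattern first, then a scheme-specific pattern if one is defined. The names URI_REFERENCE, SCHEME_PATTERNS, and is_valid are made up for illustration, and the patterns are deliberately loose stand-ins, not the ABNF-derived expressions from the gist.

```python
import re

# Hypothetical stand-ins for the patterns in uri_validate; far looser
# than the real ABNF-derived expressions and meant only for illustration.
URI_REFERENCE = r"[^\s<>]*"
SCHEME_PATTERNS = {
    "http": r"http://[^/?#\s]+(?:[/?#]\S*)?",
    "mailto": r"mailto:[^\s@]+@[^\s@]+",
}

def is_valid(candidate):
    """Apply the generic check, then a scheme-specific check if one exists."""
    if not re.fullmatch(URI_REFERENCE, candidate):
        return False
    # Everything before the first ':' is treated as the scheme.
    scheme = candidate.split(":", 1)[0].lower()
    pattern = SCHEME_PATTERNS.get(scheme)
    if pattern is None:
        # No scheme-specific rule: the generic match is all we can check.
        return True
    return re.fullmatch(pattern, candidate) is not None

for example in ["http://example.com/a", "http:///nohost", "foo:bar"]:
    print(example, is_valid(example))
```

A dictionary lookup takes the place of getattr plus AttributeError here, so an unknown scheme is handled as an ordinary case rather than as an exception. re.fullmatch does the work of the explicit ^ and $ anchors.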