[dispatch] BCP proposal: regular expressions for Internet Mail identifiers
Sean Leonard <dev+ietf@seantek.com> Tue, 22 March 2016 22:53 UTC
References: <20160321235553.10930.4801.idtracker@ietfa.amsl.com>
To: ietf-smtp@ietf.org, dispatch@ietf.org
From: Sean Leonard <dev+ietf@seantek.com>
X-Forwarded-Message-Id: <20160321235553.10930.4801.idtracker@ietfa.amsl.com>
Message-ID: <56F1CD23.2040002@seantek.com>
Date: Tue, 22 Mar 2016 15:54:27 -0700
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0
MIME-Version: 1.0
In-Reply-To: <20160321235553.10930.4801.idtracker@ietfa.amsl.com>
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Transfer-Encoding: 7bit
Archived-At: <http://mailarchive.ietf.org/arch/msg/dispatch/qQBFyXcFmO-DkuMqLmgFoq4uYEI>
Subject: [dispatch] BCP proposal: regular expressions for Internet Mail identifiers
Greetings IETF-SMTP Gods and Denizens (and dispatch):

Over the winter I worked on a new Internet-Draft that I would like to propose that the IETF adopt: Regular Expressions for Internet Mail. The draft focuses on two identifiers: email addresses and Message-IDs.

The purpose of this standard (proposed as a Best Current Practice) is to have *IETF-vetted* expressions that implementers and non-mail standards authors can plug-and-chug without futzing with trying to interpret 40 years of (occasionally conflicting and arcane) RFCs and implementation lore. There are many non-mail systems out there (read: nearly every web app, reservation system, customer database, etc. on Earth) that use or consume email addresses as identifiers, and their inability to accept the most obvious valid characters (like "+" or even "-"; I have used apps that do not even accept "-") is a great source of interoperability problems.

(This document is also relevant to some other threads about the nature of email address identifiers in security artifacts such as certificates, PGP keys, and DNS records: anyone who is vouching for an email address ought to be sure that they are recording something that actually is a valid email address in the first place.)

We should get this right now, before Unicode/EAI makes interoperability issues 50000x more expensive to correct.

The document is not meant to modify the mail standards, but merely to reflect and track them as they are updated over time.

As a first draft, the document is in rough shape and has extensive notes about issues that came up during R&D but have yet to be addressed. Significant areas that need adequate treatment include:

1. the impact of Unicode (EAI) on identifiers.
2. handling domain names, which comprise 50% of an email address, but perhaps 85% of the complexity when Unicode gets involved.
3. "deliverable email address" (complying with the modern SMTP infrastructure) vs. other kinds of email addresses (Internet Message Format, historic forms).
4. regular expression engines and grammars (i.e., which grammars to use, which are widely used and produce uniform results).
5. efficiency of the regular expressions.
6. different expressions for validation (testing), part extraction (capturing groups), decoding, encoding, and searching through text.
7. test vectors.

Hopefully the adoption of this work as an IETF item, coupled with input from those with extensive experience, will address these areas. (Thanks to John Levine, Pete Resnick, and others for taking initial questions and discussion on the topic.)

Discussion welcome.

Thanks.

Sean

-------- Forwarded Message --------
Subject: New Version Notification for draft-seantek-mail-regexen-00.txt
Date: Mon, 21 Mar 2016 16:55:53 -0700
From: internet-drafts@ietf.org

A new version of I-D, draft-seantek-mail-regexen-00.txt has been successfully submitted by Sean Leonard and posted to the IETF repository.

Name: draft-seantek-mail-regexen
Revision: 00
Title: Regular Expressions for Internet Mail
Document date: 2016-03-21
Group: Individual Submission
Pages: 24
URL: https://www.ietf.org/internet-drafts/draft-seantek-mail-regexen-00.txt
Status: https://datatracker.ietf.org/doc/draft-seantek-mail-regexen/
Htmlized: https://tools.ietf.org/html/draft-seantek-mail-regexen-00

Abstract: Internet Mail identifiers are used ubiquitously throughout computing systems as building blocks of online identity. Unfortunately, incomplete understandings of the syntaxes of these identifiers has led to interoperability problems and poor user experiences. Many users use specific characters in their addresses that are not properly accepted on various systems. This document prescribes normative regular expression (regex) patterns for all Internet-connected systems to use when validating or parsing Internet Mail identifiers, with special attention to regular expressions that work with popular languages and platforms.
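For readers who want a concrete picture of the validation and part-extraction cases mentioned above, here is a minimal sketch in Python. It is not taken from the draft and is not an IETF-vetted pattern. It assumes an ASCII-only address whose local part is a dot-atom built from the RFC 5322 "atext" characters (which include "+" and "-") and whose domain is a sequence of letter-digit-hyphen labels. Quoted local parts, address literals, EAI/Unicode, and overall length limits are deliberately left out. The names ATEXT, LOCAL_PART, LABEL, DOMAIN, ADDR_RE, and split_address are illustrative only.

```python
import re

# "atext" characters from RFC 5322 section 3.2.3: letters, digits, and
# a set of printable symbols, including "+" and "-".
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"

# Dot-atom local part: runs of atext separated by single dots
# (no leading, trailing, or doubled dots).
LOCAL_PART = rf"{ATEXT}+(?:\.{ATEXT}+)*"

# Letter-digit-hyphen label: 1 to 63 characters, no hyphen at either end.
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?"
DOMAIN = rf"{LABEL}(?:\.{LABEL})*"

# Named capturing groups give part extraction; fullmatch gives validation.
ADDR_RE = re.compile(rf"(?P<local>{LOCAL_PART})@(?P<domain>{DOMAIN})")


def split_address(addr):
    """Return (local_part, domain) if addr fits this simplified pattern, else None."""
    m = ADDR_RE.fullmatch(addr)
    return (m.group("local"), m.group("domain")) if m else None
```

A call such as split_address("dev+ietf@seantek.com") yields the local part and domain as a pair, while inputs like "a..b@example.com" or "user@-bad.example" are rejected. A real deployment would still have to settle the open questions listed in the message, such as which regex grammar to target and how to treat Unicode.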