Re: [apps-discuss] I-D Action: draft-ietf-appsawg-text-markdown-02.txt

Sean Leonard <dev+ietf@seantek.com> Wed, 24 September 2014 07:29 UTC

On 9/23/2014 11:22 PM, Murray S. Kucherawy wrote:
> On Mon, Sep 22, 2014 at 3:42 PM, <internet-drafts@ietf.org> wrote:
>
>
>     A New Internet-Draft is available from the on-line Internet-Drafts
>     directories.
>      This draft is a work item of the Applications Area Working Group
>     of the IETF.
>
>             Title           : The text/markdown Media Type
>             Author          : Sean Leonard
>             Filename        : draft-ietf-appsawg-text-markdown-02.txt
>             Pages           : 25
>             Date            : 2014-09-22
>
>     Abstract:
>        This document registers the text/markdown media type for use with
>        Markdown, a family of plain text formatting syntaxes that optionally
>        can be converted to formal markup languages such as HTML.
>
>
> This is my first time through the document,

Yay!

> so I may lack some context.  I'm still learning the history and 
> politics; hopefully some of this review has not yet been tainted by them.
>
> 1) You can drop ".txt" from the filename.
No problem.

>
> 2) The section numbering goes backwards after 1.4 somehow.  Is there 
> anything funny in the XML forcing this?  Or are you editing these by hand?

Good ol' nroff (NroffEdit, specifically). I think nroff is the preferred 
tool for folks from the security area. xml2rfc does not permit fine 
enough control over spacing.

>
> 3) There's an awful lot of context and history being established 
> throughout Section 1.  It's not clear to me that a document that's 
> supposed to be just a media type registration needs all this stuff 
> (material SM would call "marketing").  Based on a cursory review, it 
> looks like Section 1.4, paragraphs 1, 3, and 4, would be an adequate 
> and complete Section 1.  If you're keen to have all this context 
> published, you could move the rest to an appendix.

Noted.

>
> 4) What is now Section 2 should go after what is now Section 4.

Originally I put the example at the end, but for draft-02 I felt like it 
should be at the beginning. Especially given the lack of a *formal* 
specification, I wanted to be clear with an example up-front. It can go 
back, though.

>
> 5) I'm not sure that I agree the charset should be mandatory.  It 
> seems to go against what I'm reading in RFC2046 Section 4.1 to not 
> have "us-ascii" as a default since this is a subtype of the "text" 
> media type.  Why should this be different from how other text/* types 
> do it?

See RFC 6657 (whole thing) and RFC 6838 Section 4.2.1.

>
> 6) The "flavor" tag itself seems to be a debatable point.  I don't 
> have an opinion on that yet (more discussion, please), but as defined 
> the name is case-sensitive.  Is that what we want?  And does it need 
> to be able to contain spaces or special characters such that it will 
> need to be quoted?

There are a few reasons for the case-sensitivity of the name. First, the 
parameter value can be any Unicode string. The purpose was to enable 
flavors (variants) to be named in languages other than English. 
See BCP 18 Section 2, "Where to do internationalization". Right now most 
examples of Markdown-related flavors are in English, but it is perfectly 
conceivable that someone could write a Markdown variant in some other 
script. Actually, Markdown itself is starting to be iconized as M↓ by 
the community.

Over the last couple of years, I have become very suspicious about 
case-insensitivity and its interactions with Unicode. It's one thing to 
map the US-ASCII characters U+0041-U+005A (uppercase) to U+0061-U+007A; 
it's another thing to require huge tables of mappings for all sorts of 
scripts out there. See 
<http://en.wikipedia.org/wiki/Letter_case#Unicode_case_folding_and_script_identification> 
for the case folding algorithm. Note that Unicode defines three cases: 
uppercase, lowercase, and title case. Too. Much. Detail.
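
A minimal sketch of that contrast, in Python. The helper name ascii_lower 
is made up for illustration; casefold() and the unicodedata module are 
from the Python standard library. It only shows why ASCII-only mapping is 
a fixed 26-letter offset while full Unicode folding needs per-character 
tables and can even change the string length.

```python
import unicodedata


def ascii_lower(s):
    """Map only U+0041-U+005A (A-Z) to U+0061-U+007A; leave all else alone."""
    return "".join(chr(ord(c) + 0x20) if "A" <= c <= "Z" else c for c in s)


# ASCII-only mapping: one fixed offset, no tables needed.
print(ascii_lower("Markdown"))

# Full Unicode case folding: U+00DF (sharp s) folds to "ss", so the
# folded string is longer than the input.
print("Straße".casefold())
print(ascii_lower("Straße"))  # the sharp s is simply left untouched

# Title case is a third case: U+01C5 has general category "Lt", with
# distinct uppercase (U+01C4) and lowercase (U+01C6) forms.
print(unicodedata.category("\u01c5"))
print(hex(ord("\u01c5".upper())), hex(ord("\u01c5".lower())))
```

A case-sensitive comparison sidesteps all of this: two flavor names match 
only if they are the same sequence of characters.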

BCP 18 frames the problem and states specifically that:
"Names are a problem, because people feel strongly about them, many of 
them are mostly for local usage, and all of them tend to leak out of the 
local context at times. RFC 1958 
<http://tools.ietf.org/html/rfc1958> recommends US-ASCII for all globally 
visible names. This document does not mandate a policy on name 
internationalization, but requires that all protocols describe whether 
names are internationalized or US-ASCII."

The compromise position I reached was that you can use Unicode, but you 
SHOULD use US-ASCII. And since I wanted to obviate the case-folding 
issue, I said case-sensitive.

As a bonus: John Gruber is extremely sensitive to capitalization of 
"Markdown".

>
> 7) The "processor" tag makes me very nervous indeed. It seems to me 
> anything you might say as part of the processor argument should be 
> inferred from the value of the flavor argument, obviating the need for 
> this.  I would not expect security reviewers or consumers to tolerate 
> the idea that the author of a MIME header field can tell a consumer 
> what command to run and with what arguments.  If that were the case, 
> we had better be prepared to come up with a lot of text or ABNF that 
> hardens this against command injection attacks.

I think the security risk is significantly mitigated (perhaps even 
eliminated) by registration. I will write about this in a separate e-mail.
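
One way to picture mitigation by registration is sketched below in 
Python. This is an assumption for illustration, not text from the draft: 
the table, the identifier "example-processor", and the command 
"example-markdown-tool" are all hypothetical. The consumer treats the 
parameter value purely as a lookup key into a locally held table and 
never executes anything taken from the header itself.

```python
# Hypothetical local table: registered processor identifier -> fixed argv.
# Every name here is invented for illustration; none comes from the draft.
REGISTERED_PROCESSORS = {
    "example-processor": ["example-markdown-tool", "--to", "html"],
}


def resolve_processor(identifier):
    """Return a copy of the fixed argv for a registered identifier, else None.

    The identifier is used only as a dictionary key, so header content
    never becomes part of a command line.
    """
    argv = REGISTERED_PROCESSORS.get(identifier)
    return list(argv) if argv is not None else None


print(resolve_processor("example-processor"))
# An unregistered (here, hostile) value matches nothing in the table.
print(resolve_processor("example-processor; rm -rf /"))
```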

>
> 8) I'm unclear on what the "output-type" tag is for. Isn't the output 
> format a function of the context in which the MIME part is being 
> processed?  For example, if I get this in a piece of email, wouldn't 
> the markdown processor output in HTML if I'm using an HTML-enabled 
> MUA, or in text otherwise?

First, thanks for the open discussion of output-type. I think the text 
spells it out. text/html is what most people think of...but text/html is 
not the way that Markdown is going. Markdown is slowly encroaching upon 
every other format, because other formats are "too complicated".

The use case of an e-mail client showing the HTML output of Markdown 
inline is not realistic. As someone else noted earlier, if you write an 
e-mail in Markdown and want the recipient to see formatted text, the 
expectation is that the sending mail client will do the formatting, so 
the data received in the e-mail will be text/html.

Probably a better scenario (which is becoming a significant use case) is 
authors collaborating on a document. The authors, on disparate machines, 
want to see *both* the source *and* the output, preferably at the same 
time. Maybe one collaborator is the author and the other is an editor or 
reviewer.

>
> 9) Why is this a provisional registration?

Oh, just because it's an Internet-Draft. I sent a registration request 
for a provisional registration back after draft-01 was published. I 
think it's being held up by an IANA question of whether I-Ds can do 
provisional registrations, or if it requires AD action.

Anyway, that field should be changed to "No".

>
> 10) In Section 4 you talk about private use or custom parameter values 
> needing to be prefixed with "!".  How is this different from the 
> now-deprecated practice of prefixing private use header fields with 
> "X-"?  (See BCP 178.)

I was not aware of BCP 178 at the time. Also, several commenters asked 
for a way to use unregistered identifiers...I don't want to quote them 
directly but I recall that they expressed concerns that nobody would 
bother registering identifiers. Now that I am aware of BCP 178, I agree 
that the unregistered value mechanism should be removed. The right way 
to fix this is to make registration very simple. I am less sure about 
provisional registrations...that sounds more complicated (and moreover, 
it sounds like an invitation to do a provisional registration and then 
skip town).

>
> 11) For the flavor parameter, I'm not clear on why it's a mandatory 
> value that has a default.

It's optional.

The text does say "Generators MUST NOT emit empty flavor 
parameters"...but then it proceeds, "but parsers MUST treat empty flavor 
parameters the same as if omitted." The whole parameter is optional. The 
point is that there is a syntactic difference between:
  text/markdown; flavor=""
and
  text/markdown

but there is no semantic difference--they mean the same thing.
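
A minimal parser-side sketch of that rule, in Python. The function name 
markdown_flavor and the flavor value "Example" are hypothetical; 
Message.get_param is from the standard email package. An empty flavor 
and an absent flavor both come back as None.

```python
from email.message import Message


def markdown_flavor(content_type):
    """Return the flavor parameter, or None if it is absent or empty."""
    msg = Message()
    msg["Content-Type"] = content_type
    flavor = msg.get_param("flavor")  # None when the parameter is missing
    return flavor or None             # "" is treated the same as missing


print(markdown_flavor('text/markdown; flavor=""'))
print(markdown_flavor("text/markdown"))
print(markdown_flavor("text/markdown; flavor=Example"))
```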

>
> 12) The requirement to register tools that implement given flavors is 
> unusual.  What's the impetus here?
Section 5.1.1.:
"The purpose of the tool requirement is to ensure that the flavor is 
actually used in practice."

Due to the proliferation of processors, there have been very many calls 
in the Markdown community to have "one true formalized syntax". The 
result is that there is now a proliferation of syntax specifications, 
several of which are not representative of any implementation at all!

>   I'm also not sure about having a Designated Expert validate 
> every such registration.  That seems like it could be quite a lot to 
> ask of a volunteer.  Is that necessary?

I tried to constrain the work that the DE would have to do. It's 
definitely worth discussing.

>
> 13) Security Considerations refers to the "template questions in 
> Section 2", but Section 2 is an example section.  Are you using "xref" 
> tags, or setting section numbers manually?
nroff.

Incidentally: Markdown has setext and atx header syntaxes, but Gruber's 
Markdown syntax does not allow headers to be numbered. Numbered headers 
(which would be very useful for IETF documents, wink wink) are an 
extension in some Markdown variants. For example:

    pandoc --number-sections

<http://stackoverflow.com/questions/19999696/are-numbered-headings-in-markdown-rdiscount-possible>

> 14) Also in Security Considerations, I suggest at least having a 
> summary of the Section 4 issues here, if not actually moving them here.
Ok. Noted.

>
> 15) RFC1738 is obsoleted by RFCs 4248 and 4266.

RFC 1738 is the latest reference for file:/// URLs.

>
> 16) Appendix A should include a notation like "[RFC Editor: Please 
> delete this section prior to publication.]"
Ok. Noted. Thanks!

-Sean