[Rfc-markdown] Initial mmark spec

Miek Gieben <miek@miek.nl> Wed, 31 December 2014 18:16 UTC

Hello,

Mmark (https://github.com/miekg/mmark) is shaping up as a nice (and fast) markdown parser.
There is still work to be done, but I wanted to post an initial document
detailing the syntax. It borrows syntax from all over the place and I'm the only
one developing it, so feedback is welcome.

Below is the XML2RFC (v2) output for the `mmark2rfc.md` document, which you
can find in the GitHub repository.

/Miek

--
Miek Gieben






Network Working Group                                          R. Gieben
Internet-Draft                                                    Google
Intended status: Informational                         December 10, 2014
Expires: June 13, 2015


                   Using mmark to create I-Ds and RFCs
                        draft-gieben-mmark2rfc-00

Abstract

    This document describes a markdown variant called mmark [mmark] that
    can be used to create RFC documents.  The aim of mmark is to make
    writing documents as natural as possible, while providing a lot of
    power to structure and lay out the document.

    The source of this document [1] provides a good example.

Status of This Memo

    This Internet-Draft is submitted in full conformance with the
    provisions of BCP 78 and BCP 79.

    Internet-Drafts are working documents of the Internet Engineering
    Task Force (IETF).  Note that other groups may also distribute
    working documents as Internet-Drafts.  The list of current Internet-
    Drafts is at http://datatracker.ietf.org/drafts/current/.

    Internet-Drafts are draft documents valid for a maximum of six months
    and may be updated, replaced, or obsoleted by other documents at any
    time.  It is inappropriate to use Internet-Drafts as reference
    material or to cite them other than as "work in progress."

    This Internet-Draft will expire on June 13, 2015.

Copyright Notice

    Copyright (c) 2014 IETF Trust and the persons identified as the
    document authors.  All rights reserved.

    This document is subject to BCP 78 and the IETF Trust's Legal
    Provisions Relating to IETF Documents
    (http://trustee.ietf.org/license-info) in effect on the date of
    publication of this document.  Please review these documents
    carefully, as they describe your rights and restrictions with respect
    to this document.  Code Components extracted from this document must
    include Simplified BSD License text as described in Section 4.e of




Gieben                    Expires June 13, 2015                 [Page 1]

Internet-Draft                  mmark2rfc                  December 2014


    the Trust Legal Provisions and are provided without warranty as
    described in the Simplified BSD License.

Table of Contents

    1.  Introduction
    2.  Mmark Syntax
    3.  TOML header
    4.  Citations
    5.  Internal References
    6.  Document divisions
    7.  Abstract
    8.  Captions
      8.1.  Figures
      8.2.  Quotes
      8.3.  Tables
    9.  Tables
    10. Inline Attribute Lists
    11. Lists
      11.1.  Ordered Lists
      11.2.  Unordered Lists
      11.3.  Definition Lists
      11.4.  Example Lists
    12. Miscellaneous
      12.1.  HTML Comment
      12.2.  Including Files
    13. XML2RFC V3 features
      13.1.  Asides
      13.2.  Notes
      13.3.  RFC 2119 Keywords
      13.4.  Super- and Subscripts
      13.5.  Images
    14. Converting from RFC 7328 Syntax
    15. Acknowledgements
    16. References
      16.1.  Informative References
      16.2.  Normative References
      16.3.  URIs
    Appendix A.  Bugs
    Author's Address

1.  Introduction

    Mmark [mmark] is a markdown processor.  It supports the markdown
    syntax and has been extended with (syntax) features found in other
    markdown implementations like kramdown [2], PHP markdown extra [3],
    [pandoc], leanpub [4] and even asciidoc [5].  This allows mmark to be
    used to write larger, structured documents such as RFCs and I-Ds or
    even books, while not deviating too far from markdown.

    Mmark is a fork of blackfriday [blackfriday], is written in Go, and is
    very fast.  Input to mmark must be UTF-8; the output is also UTF-8.
    Mmark converts tabs to 4 spaces.

    The goals of mmark are:

    (I)     Self contained: a single file can be converted to XML2RFC v2
            (or v3) or HTML5.

    (II)    Make the markdown "source code" look as natural as possible.

    (III)   Provide a seamless upgrade path to XML2RFC v3.

    Mmark uses two scans when converting a document and does not build an
    internal AST of the document.  This means it cannot adhere 100% to
    the CommonMark [6] specification; however, the CommonMark test suite
    is used when developing mmark.  Currently mmark passes ~60% of the
    tests.

    Using Figure 1 from [RFC7328], mmark can be positioned as follows:

     +-------------------+   pandoc   +---------+
     | ALMOST PLAIN TEXT |   ------>  | DOCBOOK |
     +-------------------+            +---------+
                   |      \                 |
     non-existent  |       \_________       | xsltproc
       faster way  |        *mmark*  \      |
                   v                  v     v
           +------------+    xml2rfc  +---------+
           | PLAIN TEXT |  <--------  |   XML   |
           +------------+             +---------+

                                  Figure 1

    Note that kramdown-rfc2629 [7] fills the same niche as mmark.

2.  Mmark Syntax

    In the following sections we go over some of the differences and the
    extra syntax features of mmark.

3.  TOML header

    Mmark uses a TOML [toml] document header to specify the document's
    metadata.  Each line of this header must start with a "%".
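
    As an illustration only (this is not mmark's actual parser), the
    "%" rule can be sketched in Python.  The function name
    "split_header" and the choice to strip one space after the "%" are
    assumptions made for this example:

```python
def split_header(text):
    """Split a document into (header, body).

    Assumption: the header is the leading run of lines that start
    with "%"; everything after it is the body.
    """
    lines = text.splitlines()
    n = 0
    while n < len(lines) and lines[n].startswith("%"):
        n += 1
    header = []
    for line in lines[:n]:
        # Drop the "%" marker and, if present, one following space.
        header.append(line[2:] if line.startswith("% ") else line[1:])
    return "\n".join(header), "\n".join(lines[n:])
```

    The returned header text could then be handed to any TOML parser.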

4.  Citations

    A citation can be entered using the syntax from [pandoc]:
    "[@reference]".  Such a reference is "informative" by default.  Making
    a reference informative or normative can be done with a "?" and "!"
    respectively: "[@!reference]" is a normative reference.

    For RFCs and I-Ds the references are generated automatically, although
    for I-Ds you might need to include a draft version in the reference:
    "[@?I-D.draft-blah,#06]" creates an informative reference to the
    seventh version of draft-blah.

    Once a citation has been defined the brackets can be omitted, so once
    "[@pandoc]" is used, you can just use "@pandoc".
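
    The bracketed citation forms described above can be recognized with
    a regular expression.  The following Python sketch is illustrative
    and not taken from mmark; the names "CITATION" and
    "parse_citations" are made up for this example, and the bare
    "@pandoc" short form is deliberately not handled:

```python
import re

# "[@ref]", "[@!ref]", "[@?ref]" and an optional ",#NN" draft version.
CITATION = re.compile(r"\[@([!?]?)([^\],\s]+)(?:,#(\d+))?\]")


def parse_citations(text):
    """Return (reference, kind, draft_version) for each bracketed citation."""
    found = []
    for marker, ref, version in CITATION.findall(text):
        # Only "!" makes a reference normative; "?" or no marker
        # leaves it informative, which is the default.
        kind = "normative" if marker == "!" else "informative"
        found.append((ref, kind, version or None))
    return found
```

    For example, "[@?I-D.draft-blah,#06]" yields the reference
    "I-D.draft-blah", the kind "informative" and the version "06".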

    If the need arises (usually when citing a document that is not in the
    XML2RFC database) an XML reference fragment can be included.  Note
    that this needs to happen _before_ the back matter is started,
    because that is the point when the references are output (right
    now the implementation does not scan the entire file for citations;
    also see Appendix A).

5.  Internal References

    The cross reference syntax is "[](#id)", which allows for an optional
    title between the brackets.  Usually this is left empty; for this use
    case mmark allows the shortcut form "(#id)", which omits the brackets
    entirely.

    The external reference syntax is "[](url)".
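
    A rough Python sketch of how the three forms could be told apart
    follows.  It is not mmark's implementation; the names "LINK",
    "SHORT" and "find_references" are assumptions, and targets
    containing whitespace or parentheses are not handled:

```python
import re

# "[title](target)"; the title may be empty.
LINK = re.compile(r"\[([^\]]*)\]\(([^)\s]+)\)")
# Shortcut form "(#id)", i.e. not directly preceded by "]".
SHORT = re.compile(r"(?<!\])\(#([^)\s]+)\)")


def find_references(text):
    """Return (kind, target, title) tuples; bracketed forms come first."""
    refs = []
    for title, target in LINK.findall(text):
        if target.startswith("#"):
            refs.append(("internal", target[1:], title))
        else:
            refs.append(("external", target, title))
    for anchor in SHORT.findall(text):
        refs.append(("internal", anchor, ""))
    return refs
```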

6.  Document divisions

    Using "{mainmatter}" on a line by itself starts the main matter
    (middle) of the document, "{backmatter}" starts the appendix.  There
    is also a "{frontmatter}" that starts the front matter (front) of the
    document, but is normally not needed because the TOML header
    (Section 3) starts that by default.
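
    The division markers can be pictured as switches between three
    buckets of lines.  This Python sketch is only an illustration (the
    function name "split_divisions" is made up) and assumes that a
    document starts in the front matter:

```python
def split_divisions(text):
    """Group the lines of a document under "front", "main" and "back"."""
    markers = {
        "{frontmatter}": "front",
        "{mainmatter}": "main",
        "{backmatter}": "back",
    }
    parts = {"front": [], "main": [], "back": []}
    current = "front"  # assumption: the header implies front matter
    for line in text.splitlines():
        stripped = line.strip()
        if stripped in markers:
            # A marker on a line by itself switches division; the
            # marker line itself is dropped.
            current = markers[stripped]
        else:
            parts[current].append(line)
    return parts
```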

7.  Abstract

    Any paragraph prefixed with "A>" is an abstract.  This is similar to
    how asides and notes (Section 13.1, Section 13.2) work.  Note that an
    RFC document can only have one abstract.
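
    The prefix convention shared by abstracts, asides and notes can be
    sketched in Python as follows.  This is not how mmark itself is
    written; "classify_paragraph" is a hypothetical name, and the
    handling of continuation lines that repeat the prefix is an
    assumption:

```python
PREFIXES = (("AS>", "aside"), ("A>", "abstract"), ("N>", "note"))


def classify_paragraph(paragraph):
    """Return (kind, text); kind is "plain" when no prefix matches."""
    for prefix, kind in PREFIXES:
        if paragraph.startswith(prefix):
            parts = []
            for line in paragraph.splitlines():
                # Assumption: continuation lines may repeat the prefix.
                if line.startswith(prefix):
                    line = line[len(prefix):]
                parts.append(line.strip())
            return kind, " ".join(parts)
    return "plain", paragraph
```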

8.  Captions

    Whenever a block quote, fenced code block or image has caption text,
    the entire block is wrapped in a "<figure>" and the caption text is
    put in a "<name>" tag for v3.

    In mmark you can put a caption under a table, an indented code
    block (even a fenced code block) or even a block quote.
    Referencing these elements (and thus creating a document "id" for
    them) is done with an IAL (Section 10):

    {#identifier}
    Name    | Age
    --------|-----:
    Bob     | 27
    Alice   | 23

    An empty line between the IAL and the table or indented code block is
    allowed.

8.1.  Figures

    Any text directly after the figure starting with "Figure:" is used as
    the caption.

8.2.  Quotes

    After a quote (a paragraph prefixed with ">") you can add a caption:

    Quote: Name -- URI for attribution

    In v3 this is used in the block quote attributes, for v2 it is
    discarded.

8.3.  Tables

    A table caption is signalled by using "Table:" directly after the
    table.  The table syntax used is that of Markdown Extra [8].
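
    The three caption prefixes can be detected with a few lines of
    Python.  The sketch below is illustrative only; the names
    "parse_caption" and "split_attribution" are made up, and the
    attribution is assumed to be separated by " -- " as in the quote
    example:

```python
CAPTION_KINDS = {"Figure:": "figure", "Quote:": "quote", "Table:": "table"}


def parse_caption(line):
    """Return (kind, caption_text), or None if the line is not a caption."""
    for prefix, kind in CAPTION_KINDS.items():
        if line.startswith(prefix):
            return kind, line[len(prefix):].strip()
    return None


def split_attribution(text):
    """Split a quote caption of the form "Name -- URI" into its parts."""
    name, sep, uri = text.partition(" -- ")
    return name.strip(), (uri.strip() if sep else None)
```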

9.  Tables

    Tables can be created by drawing them in the input using a simple
    syntax:

    Name    | Age
    --------|-----:
    Bob     | 27
    Alice   | 23

    Tables can also have a footer: use equal signs instead of dashes for
    the separator, to start a table footer.  If there are multiple footer
    lines, the first one is used as a starting point for the table
    footer.

    Name    | Age
    --------|-----:
    Bob     | 27
    Alice   | 23
    ======= | ====
    Charlie | 4
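
    To make the roles of the two separator lines concrete, here is a
    small Python sketch that splits such a table into header, body and
    footer.  It is not mmark's parser: "parse_table" is a hypothetical
    name, and reading a trailing colon as right alignment follows the
    Markdown Extra convention:

```python
def parse_table(lines):
    """Parse a simple pipe table into header, alignment, body and footer."""

    def cells(line):
        return [c.strip() for c in line.strip().strip("|").split("|")]

    def is_rule(line, char):
        # A rule line consists only of `char` and optional colons.
        return all(p and set(p) <= set(char + ":") and char in p
                   for p in cells(line))

    header = cells(lines[0])
    if not is_rule(lines[1], "-"):
        raise ValueError("second line must be a dashed separator")
    align = []
    for spec in cells(lines[1]):
        left, right = spec.startswith(":"), spec.endswith(":")
        if left and right:
            align.append("center")
        elif right:
            align.append("right")
        elif left:
            align.append("left")
        else:
            align.append("default")
    body, footer, in_footer = [], [], False
    for line in lines[2:]:
        if is_rule(line, "="):
            in_footer = True  # the first "=" rule starts the footer
        elif in_footer:
            footer.append(cells(line))
        else:
            body.append(cells(line))
    return {"header": header, "align": align, "body": body, "footer": footer}
```

    Applied to the table above, the "Age" column comes out right
    aligned and "Charlie | 4" ends up as the only footer row.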

    If a table is started with a _block table header_, which starts with
    a pipe or plus sign and a minimum of three dashes, it is a *Block
    Table*. A block table may include block level elements in each (body)
    cell.  If we want to start a new cell, we use the block table header
    syntax.  In the example below we include a list in one of the cells.

    |-----------------+------------+-----------------|
    | Default aligned |Left aligned| Center aligned  |
    |-----------------|:-----------|:---------------:|
    | Bob             |Second cell | Third cell      |
    | Alice           |foo         | **strong**      |
    | Charlie         |quux        | baz             |
    |-----------------+------------+-----------------|
    | Bob             | foot       | 1. Item2        |
    | Alice           | quuz       | 2. Item2        |
    |=================+============+=================|
    | Footer row      | more footer| and more        |
    |-----------------+------------+-----------------|

    Note that the header and footer can't contain block level elements.

10.  Inline Attribute Lists

    This borrows from kramdown [9], with the difference that the colon
    is dropped and each IAL must be typeset _before_ the block element
    (see Appendix A).  Adding an anchor to a block quote can be done like
    so:

    {#quote:ref1}
    > A block quote

    You can specify classes with ".class" (although these are not used
    when converting to XML2RFC), and arbitrary key value pairs where each
    key/value becomes an attribute.
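
    The three kinds of IAL entries can be separated as in the Python
    sketch below.  It is illustrative only: "parse_ial" is a made-up
    name, and the whitespace-separated "key=value" form is an
    assumption borrowed from kramdown rather than something this
    document specifies:

```python
import re


def parse_ial(line):
    """Parse "{#id .class key=value}" into a dict, or return None."""
    m = re.fullmatch(r"\{(.*)\}", line.strip())
    if m is None:
        return None
    result = {"id": None, "classes": [], "attrs": {}}
    for token in m.group(1).split():
        if token.startswith("#"):
            result["id"] = token[1:]
        elif token.startswith("."):
            result["classes"].append(token[1:])
        elif "=" in token:
            # Assumption: values contain no spaces; quotes are optional.
            key, _, value = token.partition("=")
            result["attrs"][key] = value.strip('"')
    return result
```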

11.  Lists

11.1.  Ordered Lists

    There are several ways to start an ordered list.  You can use numbers,
    roman numerals, lowercase letters and uppercase letters.  When using
    roman numerals or letters you *MUST* use two spaces after the dot or
    the brace (the underscore signals a space here):

    a)__
    II.__
    A)__

    Note that mmark (just as [pandoc]) pays attention to the starting
    number of a list (when using decimal numbers), thus a list started
    with:

    4) Item4
    5) Item5

    will use "4" as the starting number.
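
    The marker rules of this section can be approximated in Python as
    shown below.  The sketch is not mmark's list parser; "MARKER" and
    "list_start" are hypothetical names, and it makes no attempt to
    tell letters and roman numerals apart:

```python
import re

# A label, then "." or ")", then the whitespace before the item text.
MARKER = re.compile(r"^\s*(\d+|[A-Za-z]+)([.)])(\s+)")


def list_start(line):
    """Return (style, start) for an ordered-list item, or None."""
    m = MARKER.match(line)
    if m is None:
        return None
    label, gap = m.group(1), m.group(3)
    if label.isdigit():
        # Decimal lists keep their starting number.
        return ("decimal", int(label))
    if len(gap) >= 2:
        # Letters and roman numerals need two spaces after the marker.
        return ("letter-or-roman", label)
    return None
```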

11.2.  Unordered Lists

    Unordered lists can be started with "*", "+" or "-" and follow the
    normal markdown syntax rules.

11.3.  Definition Lists

    Mmark supports the definition list syntax from PHP Markdown Extra
    [10], meaning there cannot be an empty line between the term and the
    definition.  Note that the multiple terms and definitions syntax is
    _not_ supported.

11.4.  Example Lists

    This is the example list syntax from pandoc [11].  References to
    example lists work as well.  Note that an example list always needs
    to have an identifier, "(@good)" works, "(@)" does not.

12.  Miscellaneous

12.1.  HTML Comment

    If an HTML comment contains "--", it will be rendered as a "cref"
    comment in the resulting XML file.  For example: "<!-- Miek Gieben --
    you want to include the next paragraph? -->".
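
    One way to picture this conversion is the Python sketch below.  It
    is an illustration, not mmark's renderer: the name
    "comment_to_cref" is made up, and mapping the text before the "--"
    to the "source" attribute of "cref" is an assumption about the
    output:

```python
import re
from xml.sax.saxutils import escape, quoteattr

# "<!-- source -- text -->": the inner "--" separates the two parts.
COMMENT = re.compile(r"<!--\s*(.+?)\s+--\s+(.+?)\s*-->", re.S)


def comment_to_cref(html):
    """Turn "<!-- Name -- text -->" into a cref element, else None."""
    m = COMMENT.fullmatch(html.strip())
    if m is None:
        return None
    source, text = m.groups()
    # quoteattr() adds the surrounding quotes; escape() guards "&", "<", ">".
    return "<cref source=%s>%s</cref>" % (quoteattr(source), escape(text))
```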

12.2.  Including Files

    Files can be included using "{{filename}}"; "filename" is relative to
    the current working directory if it is not absolute.
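
    The path rule can be sketched in Python as follows.  This is an
    illustration under stated assumptions, not mmark's code:
    "expand_includes" is a hypothetical name, included files are read
    as UTF-8, and nested includes are not expanded:

```python
import os
import re

INCLUDE = re.compile(r"\{\{([^{}]+)\}\}")


def expand_includes(text, cwd=None):
    """Replace each "{{filename}}" with the contents of that file."""
    base = cwd or os.getcwd()

    def load(match):
        name = match.group(1).strip()
        # Relative names are resolved against the working directory.
        path = name if os.path.isabs(name) else os.path.join(base, name)
        with open(path, encoding="utf-8") as fh:
            return fh.read()

    return INCLUDE.sub(load, text)
```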

13.  XML2RFC V3 features

    The v3 syntax adds some new features; those can already be used in
    mmark (even for documents targeting v2 -- but there they will be
    faked with the limited constructs of v2 syntax).

13.1.  Asides

    An aside is any paragraph prefixed with "AS>".  For v2 this becomes
    an indented paragraph.

13.2.  Notes

    A note is any paragraph prefixed with "N>".  For v2 this becomes an
    indented paragraph.

13.3.  RFC 2119 Keywords

    Any [RFC2119] keyword used with strong emphasis _and_ in uppercase
    will be typeset within "bcp14" tags, that is "**MUST**" becomes
    "<bcp14>MUST</bcp14>", but "**must**" will not.  For v2 they are
    stripped of the emphasis and output as-is.
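
    The rule above amounts to a case-sensitive substitution.  The
    Python sketch below illustrates it; it is not mmark's renderer, and
    the names "KEYWORDS", "PATTERN" and "render_keywords" are
    assumptions made for this example:

```python
import re

# The key words defined by RFC 2119.
KEYWORDS = [
    "MUST NOT", "MUST", "REQUIRED", "SHALL NOT", "SHALL",
    "SHOULD NOT", "SHOULD", "RECOMMENDED", "MAY", "OPTIONAL",
]
# Only uppercase keywords wrapped in "**" match; "**must**" does not.
PATTERN = re.compile(r"\*\*(%s)\*\*" % "|".join(KEYWORDS))


def render_keywords(text, v3=True):
    """Wrap keywords in <bcp14> for v3; strip the emphasis for v2."""
    if v3:
        return PATTERN.sub(r"<bcp14>\1</bcp14>", text)
    return PATTERN.sub(r"\1", text)
```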

13.4.  Super- and Subscripts

    Use "~" for subscripts and "^" for superscripts: H~2~O and 2^10^ is
    1024.  In v2 these are output as-is.
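
    As a rough illustration of the v3 case, the two markers can be
    rewritten with regular expressions.  The Python sketch below is not
    mmark's renderer; "render_scripts" is a made-up name, and emitting
    "<sub>" and "<sup>" elements is an assumption about the v3 output:

```python
import re

SUB = re.compile(r"~([^~\s]+)~")    # H~2~O
SUP = re.compile(r"\^([^^\s]+)\^")  # 2^10^


def render_scripts(text):
    """Rewrite "~x~" as a subscript and "^x^" as a superscript."""
    text = SUB.sub(r"<sub>\1</sub>", text)
    return SUP.sub(r"<sup>\1</sup>", text)
```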

13.5.  Images

    TODO

14.  Converting from RFC 7328 Syntax

    Converting from an RFC 7328 ([RFC7328]) document can be done using
    the quick and dirty Perl script [12], which uses pandoc to output
    markdown PHP extra and converts that into proper mmark (mmark is
    more like markdown PHP extra than like pandoc):

    for i in middle.mkd back.mkd; do
        pandoc --atx-headers -t markdown_phpextra < "$i" |
        ./parts.pl
    done

    Note that this script:

    o  Does not convert the abstract to a prefixed paragraph;

    o  Makes all RFC references normative;

    o  Handles all figure and table captions and adds references (if
       appropriate);

    o  Probably has other bugs, so a manual review is in order.

    There is also titleblock.pl [13] which can be given an [RFC7328]
    "template.xml" file and will output a TOML titleblock, that can be
    used as a starting point.

       Yes, this uses pandoc and Perl.  Why?  Because if mmark could
       parse the file by itself, there wouldn't be much of a problem.  Two
       things are holding this back: mmark cannot parse definition lists
       with empty lines and there isn't a renderer that can output
       markdown syntax.

    For now the mmark parser will not get any features that make it
    backwards compatible with pandoc2rfc.

15.  Acknowledgements

    This document has been modeled after the excellent kramdown syntax
    page [14].

16.  References

16.1.  Informative References

    [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
               Requirement Levels", BCP 14, RFC 2119, March 1997.

    [blackfriday]
               "Blackfriday git repository", November 2011,
               <http://github.com/russross/blackfriday>.

    [mmark]    Gieben, R., "Mmark git repository", December 2014,
               <http://github.com/miekg/mmark>.

    [pandoc]   MacFarlane, J., "Pandoc, a universal document converter",
               2006, <http://johnmacfarlane.net/pandoc/>.

16.2.  Normative References

    [RFC7328]  Gieben, R., "Writing I-Ds and RFCs Using Pandoc and a Bit
               of XML", RFC 7328, August 2014.

    [toml]     Preston-Werner, T., "TOML git repository", March 2013,
               <https://github.com/toml-lang/toml>.

16.3.  URIs

    [1] https://github.com/miekg/mmark

    [2] http://kramdown.gettalong.org/

    [3] http://michelf.com/projects/php-markdown/extra/

    [4] https://leanpub.com/help/manual

    [5] http://www.methods.co.nz/asciidoc/

    [6] http://commonmark.org/

    [7] https://github.com/cabo/kramdown-rfc2629

    [8] https://michelf.ca/projects/php-markdown/extra/#table

    [9] http://kramdown.gettalong.org/syntax.html#block-ials

    [10] https://michelf.ca/projects/php-markdown/extra/#def-list

    [11] http://johnmacfarlane.net/pandoc/README.html#extension-
         example_lists

    [12] https://raw.githubusercontent.com/miekg/mmark/master/convert/
         parts.pl

    [13] https://raw.githubusercontent.com/miekg/mmark/master/convert/
         titleblock.pl

    [14] http://kramdown.gettalong.org/syntax.html

Appendix A.  Bugs

    o  Citations must be included in the text before the "{backmatter}"
       starts; otherwise they are not available in the appendix.

    o  Inline Attribute Lists must be given _before_ the block element.

    o  Mmark cannot parse [RFC7328] markdown.

    o  Multiple terms and definitions are not supported in definition
       lists.

Author's Address

    R. (Miek) Gieben
    Google
    Buckingham Palace Road

    Email: miek@google.com
