Re: [Json] Advice on registering JSON Lines (not JSON) as IANA Media Type

Tim Bray <tbray@textuality.com> Wed, 30 December 2020 19:50 UTC

From: Tim Bray <tbray@textuality.com>
Date: Wed, 30 Dec 2020 11:49:54 -0800
Message-ID: <CAHBU6ivV1exy-3r5LcXExDHJzBTLY5+_GBEretPred=qeOrWyQ@mail.gmail.com>
To: Stefan Hagen <stefan@dilettant.eu>
Cc: Douglas Crockford <douglas@crockford.com>, Nico Williams <nico@cryptonector.com>, JSON WG <json@ietf.org>, "Hlavina, Wratko (NIH/NLM/NCBI) [E]" <whlavina@ncbi.nlm.nih.gov>
Archived-At: <https://mailarchive.ietf.org/arch/msg/json/i4mJuEo-ZFELKpLJGJG-kkm3bjw>
Subject: Re: [Json] Advice on registering JSON Lines (not JSON) as IANA Media Type

A survey of https://en.wikipedia.org/wiki/JSON_streaming is instructive.
Among other things, it revealed (to me) the existence of json-seq,
https://tools.ietf.org/html/rfc7464

Doug has a point about naming but I've lost enough arguments about what
things should be called that I'll skip that subject.

In any case, to answer the original question, to register a media type you
need to link to a stable specification.  The contents of
https://jsonlines.org probably don’t qualify, so the conventional thing
would be to write an Internet-Draft which AFAICT would be the same as
json-seq only without the leading "ASCII Record Separator (0x1E)" but
retaining the trailing \n.
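
To make that difference concrete, here is a minimal Python sketch of the two
framings. The function names are made up for illustration, and skipping empty
lines on input is a choice of this sketch, not something taken from either
specification. It only shows the happy path; RFC 7464 also describes recovery
from truncated texts, which is not attempted here.

```python
import json

RS = "\x1e"  # ASCII Record Separator, the leading byte json-seq (RFC 7464) uses


def encode_json_lines(values):
    # One JSON text per line. With default settings json.dumps emits no raw
    # newline (newlines inside strings are escaped), so "\n" is a safe terminator.
    return "".join(json.dumps(v) + "\n" for v in values)


def encode_json_seq(values):
    # Same as above, but each text is preceded by RS.
    return "".join(RS + json.dumps(v) + "\n" for v in values)


def decode_json_lines(text):
    # Split on "\n" only; str.splitlines() would also split on other characters.
    return [json.loads(line) for line in text.split("\n") if line.strip()]


def decode_json_seq(text):
    # Each chunk still carries its trailing "\n", which json.loads tolerates.
    return [json.loads(chunk) for chunk in text.split(RS) if chunk.strip()]


if __name__ == "__main__":
    values = [{"a": 1}, [1, 2], "x\ny"]
    print(repr(encode_json_lines(values)))
    print(repr(encode_json_seq(values)))
```

Stripping every RS from the json-seq output gives exactly the JSON Lines
output, which is the sense in which the two formats differ only by the leading
separator.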

Given such an Internet-Draft and evidence that the community actually
cares, it's not difficult to register a new media type.



On Wed, Dec 30, 2020 at 11:36 AM Stefan Hagen <stefan@dilettant.eu> wrote:

>
> On 30.12.2020 at 20:03, Douglas Crockford <douglas@crockford.com> wrote:
>
> 
>
> Anything that is not JSON should not be called JSON. It should have a less
> confusing name.
>
>
> On 2020-12-30 8:58 AM, Hlavina, Wratko (NIH/NLM/NCBI) [E] wrote:
>
> Hello, Mr. Crockford and Mr. Williams,
>
>
>
> I understand you are listed as the authors for the "application/json" and
> "application/json-seq" IANA Media Types, respectively.
>
> I would like to ask for your advice/help with a related file format, JSON
> Lines:
>
>
>
> https://jsonlines.org
>
>
>
> I think there is value in having this format registered as a Standards
> Tree IANA Media Type.
>
> Per the RFC6838 process, this requires Expert Review and IETF/IESG
> approval.
>
> Not being a member of those organizations, how can I encourage such
> registration?
>
>
>
> Motivation:
>
>
>
> Unfortunately, JSON Lines is not valid JSON (technically) and is different
> from JSON Text Sequences.
>
> However, JSON Lines is a frequently used file format; for example, it is
> used by many database products, including Cloud services like AWS Athena,
> Snowflake, and others.
>
>
>
> Since it is not valid JSON, using "application/json" as media type leads
> to processing failures and mishandling.
>
> Since it uses the newline as separator, without the RS (U+001E, Unicode
> Information Separator Two) record separators, "application/json-seq" is not
> a substitute Media Type, and the ecosystem of tools does not, in general,
> support the JSON Text Sequences format.
>
>
>
> In principle, good JSON programming libraries should allow streamed
> processing of JSON content, both in emitting it and in reading it, but in
> practice, libraries for JSON tend to require an entire JSON object to be
> held in memory.
>
> Since HTTP emits one response per request, this implies only a single JSON
> object per response, if using "application/json" as Media Type; this is
> problematic for large data.
>
>
>
> In my experience, JSON Lines has become a very useful and conventional
> file format, since it interoperates well with Unix text utilities while
> remaining highly interoperable with many JSON tools.
>
>
>
> Cf.:
>
>    - RFC6838
>    - https://www.iana.org/assignments/media-types/application/json
>    - https://www.iana.org/assignments/media-types/application/json-seq
>    - https://www.iana.org/assignments/media-types/application/ld+json
>    - https://stackoverflow.com/questions/51690624/json-lines-mime-type
>    - https://github.com/wardi/jsonlines/issues/9
>
>
>
> --
> *Wratko HLAVINA*
>
> Sequence Curation, Organization, Enhancements (Technical Program Manager)
>
> NCBI Building 45 Floor 4 Room AS13D-121
> Slack: whlavina / Phone: 301-402-9730 / FAX: 301-480-2484 / Calendar:
> https://bit.ly/2QU2EGB
>
>
>
>
> Well, these are JSON texts separated by newline characters.
> I think the original JSON sequences proposal started exactly like this
> (with newlines); at least that is how I remember our e-mail discussions.
> Then the not too surprising, ivory-tower-like waves of discussion injected
> the long-forgotten RS into the picture.
>
> Reading the RFC again, I suggest not reusing the json-seq media type in
> this case, as that specification assumes skipping to RS tokens between
> JSON texts, which these newline-separated JSON streams will not offer.
>
> I suggest requesting a new media type from IANA instead, and would not
> object to having it start with text/json-
>
> Please all enjoy a healthy and wonderfully non-semantic Year version 2021,
> Stefan
>
> _______________________________________________
> json mailing list
> json@ietf.org
> https://www.ietf.org/mailman/listinfo/json
>