Re: [media-types] Media subtypes containing "+"

Eric Prud'hommeaux <eric@w3.org> Fri, 05 February 2021 09:16 UTC

Date: Fri, 05 Feb 2021 10:16:04 +0100
From: Eric Prud'hommeaux <eric@w3.org>
To: Manu Sporny <msporny@digitalbazaar.com>
Cc: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>, "Murray S. Kucherawy" <superuser@gmail.com>, rhiaro <amy@rhiaro.co.uk>, media-types@ietf.org
Message-ID: <20210205091604.GW860985@w3.org>
References: <e2ee2ce0-641f-de3e-b1b6-d375b24328ad@rhiaro.co.uk> <029ad5c8-b441-3a1e-997d-af1187bc8149@rhiaro.co.uk> <CAL0qLwYAnCSi6XQ2u8d-Xpt0SezpAiVbhGyDorrDm3vN-Sk9FA@mail.gmail.com> <44a5b068-6843-b176-66ff-7926d598bb36@it.aoyama.ac.jp> <988ced9f-bfa1-5e66-2c81-07edcba541b4@digitalbazaar.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <988ced9f-bfa1-5e66-2c81-07edcba541b4@digitalbazaar.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/media-types/UYB4fvQXIv5qqbiKu6o_bBrf4FA>
Subject: Re: [media-types] Media subtypes containing "+"

On Sun, Dec 27, 2020 at 11:45:33AM -0500, Manu Sporny wrote:
> On 12/24/20 3:26 AM, Martin J. Dürst wrote:
> > If I understand correctly, you would un-gzip it first, and only then 
> > unzip it.
> 
> Yes... but remember you could have an application that's only really
> interested in un-gzip'ing and stopping there. That's a perfectly
> reasonable thing to do based on what ONE of the media subtypes is
> telling you. If all you can do as an application is ungzip... well,
> that's all you can do and the subtype makes that clear.
> 
> > In a slightly more realistic example, let's use svg+xml+gzip. Gzip is
> > more generic (because it can be applied to any kind of data) than
> > xml, and you definitely first have to un-gzip before feeding the 
> > document into an XML parser.
> > 
> > But you may be right that there are cases where the terminology 
> > 'generic'/'specific' may not be adequate. foo+gzip+zip (foo gzipped
> > and then zipped) may make as much (or as little!) sense as
> > foo+zip+gzip,
> 
> Yes, agreed that "specific" is not adequate, because there are multiple
> "specific" identifiers that are valid, especially for the following use
> case:
> 
> application/did+ld+json
> 
> These are all valid subtypes: "json", "ld+json", "did+ld+json" resulting
> in these valid media type interpretations:
> 
> application/json
> application/ld+json
> application/did+ld+json

I don't think this is a good example of "specific" being
inadequate. It's pretty obviously a DID, which can be processed as
JSON-LD, which can even fall back to JSON. A better example might be a
doc that unions a couple of RDF graphs, e.g. vc and did. Here, we'd
make an arbitrary choice between application/vc+did+ld+json and
application/did+vc+ld+json, neither of which is trivially
recognizable as matching application/did+ld+json.
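
To make the fallback reading of application/did+ld+json concrete, here
is a minimal Python sketch. The function name `fallback_chain` is
hypothetical, and dropping the leftmost component one step at a time is
my reading of the model under discussion, not something a spec defines.

```python
def fallback_chain(media_type):
    """List a media type followed by each successively more generic reading."""
    top, _, subtype = media_type.partition("/")
    parts = subtype.split("+")
    # Assumption: the leftmost component is the most specific one, so
    # dropping it yields the next, more generic interpretation.
    return [f"{top}/{'+'.join(parts[i:])}" for i in range(len(parts))]


print(fallback_chain("application/did+ld+json"))
# ['application/did+ld+json', 'application/ld+json', 'application/json']
```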

The problem is that there's no way to express that "did+vc" satisfies
any of them separately. AFAICT, every case where "specific" is not
adequate comes down to that ANY use case. We'd like a semantics like:
  All(application, All(json, All(ld, Any(did, vc))))
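
A rough Python sketch of what an Any-aware match could look like. The
names `parse` and `satisfies` are hypothetical, and treating the
"+"-separated components as an unordered set is an assumption made
purely for illustration; nothing in the current suffix registrations
defines matching this way.

```python
def parse(media_type):
    """Split 'type/a+b+c' into ('type', ['a', 'b', 'c']), lowercased."""
    top, _, subtype = media_type.partition("/")
    return top.lower(), subtype.lower().split("+")


def satisfies(offered, wanted):
    """True if `offered` carries every component that `wanted` asks for."""
    offered_top, offered_parts = parse(offered)
    wanted_top, wanted_parts = parse(wanted)
    # Assumption: component order is ignored, so did+vc and vc+did
    # behave identically.
    return offered_top == wanted_top and set(wanted_parts) <= set(offered_parts)


doc = "application/did+vc+ld+json"
print(satisfies(doc, "application/did+ld+json"))  # True
print(satisfies(doc, "application/vc+ld+json"))   # True
print(satisfies("application/ld+json", "application/did+ld+json"))  # False
```

With this reading, either ordering of did and vc satisfies a consumer
that asks for only one of them.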

One cogent approach would be to say that media types capture generic
formats and their profiles capture vocabularies/use cases. For example
`application/svg+XML` would be `application/XML; profile=svg` and my
did+vc example would be `application/ld+json; profile=did,vc`. If you
need to capture nuances beyond the embedded vocabulary, you would
invent a profile (currently addressed by inventing a media type).
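
As a sketch of how a consumer might read such a value, assuming the
comma-separated `profile=did,vc` syntax from my example above (the
function name `parse_with_profile` is hypothetical, and the parsing is
deliberately lenient rather than a full header parser):

```python
import re


def parse_with_profile(value):
    """Return (media type, list of profile tokens) from a Content-Type value."""
    # Simplification: splitting on ';' ignores the possibility of a ';'
    # inside a quoted parameter value.
    essence, *params = [piece.strip() for piece in value.split(";")]
    profiles = []
    for param in params:
        name, _, raw = param.partition("=")
        if name.strip().lower() == "profile":
            # Accept quoted or unquoted values, separated by commas or spaces.
            raw = raw.strip().strip('"')
            profiles = [p for p in re.split(r"[,\s]+", raw) if p]
    return essence.lower(), profiles


print(parse_with_profile("application/ld+json; profile=did,vc"))
# ('application/ld+json', ['did', 'vc'])
```

A consumer that only understands did would check for "did" in the
returned list and ignore the rest.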

The problem is the migration path. We get a lot done with media types
and it would be nice to not have to freak out and update every piece
of code simultaneously. However we introduced Any semantics,
there'd be an uncomfortable moment where freshly-written, Any-aware
applications were running on Any-unaware libraries, APIs and browsers,
which is typically grounds for rebellion.

A not-too-painful way to get there would be to require e.g. profiles
on vocabulary-specific media types to give HTTP, libraries, servers,
browsers and file managers time to migrate to using the profile. The
web platform moves pretty quickly and it's not like we'd be breaking
old apps with the migration. Eventually, we might get to a point where
new languages and protocols like DID wouldn't need to register a new
media type.

PROPOSAL:

1. Register vocabulary-specific media types with a required profile.

2. (Conservatively) register types for combinations of vocabs ordered
   alphabetically ("did+vc+...", not "vc+did+..."), including profiles
   that decompose to identify both vocabs.

3. Invent profiles for existing X+xml media types.

4. Update 7303 (+xml) and 6839 (+json) to encourage the use of
   accompanying profiles.

5. Years hence, evaluate whether we can quit registering
   vocab-specific media types.
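
Step 2 could be mechanized along these lines. This is only a sketch:
the function name `combined_media_type`, the default `ld+json` base and
the `profile=did,vc` serialization are assumptions carried over from
the examples earlier in this message, not registered syntax.

```python
def combined_media_type(vocabs, base="ld+json", top="application"):
    """Build the alphabetically ordered combined type plus its profile."""
    # Sorting gives one canonical spelling ("did+vc", never "vc+did").
    ordered = sorted({v.lower() for v in vocabs})
    subtype = "+".join(ordered + [base])
    # The profile lists each vocab separately so it can be matched alone.
    return f"{top}/{subtype}; profile={','.join(ordered)}"


print(combined_media_type(["vc", "did"]))
# application/did+vc+ld+json; profile=did,vc
```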

I can see this migration starting in specific communities, e.g. the
Semantic Web, and creeping out from there to the Web Platform.

If we're not willing to try something so disruptive, we may as well
publish application/did+ld+json without a profile and let them get on
with their work.



> That's what we're trying to say... does that hold together?
> 
> > The problem with this approach may be that currently existing 
> > implementations, as well as some future implementations, may just
> > look at the rightmost + and the following alphabetic characters.
> > Therefore, you need to make sure that people constructing/proposing
> > suffix combinations will choose the right order, and that means that
> > you have to have some language about suffix ordering anyway.
> 
> Ok, got it. Would adding language that explains the above (and the
> previous response to Murray) make sense?
> 
> We expect that many applications don't do this sort of thing today,
> rather either matching on the complete mime-type and doing processing,
> or not. We are just trying to be complete about the mental model of
> media subtypes and thought we were documenting something that was
> already more or less there... but perhaps not? In any case, does adding
> language about suffix ordering and how to interpret it address your
> concern, Martin?
> 
> -- manu
> 
> -- 
> Manu Sporny - https://www.linkedin.com/in/manusporny/
> Founder/CEO - Digital Bazaar, Inc.
> blog: Veres One Decentralized Identifier Blockchain Launches
> https://tinyurl.com/veres-one-launches