Re: [media-types] I-D Action: draft-ietf-mediaman-suffixes-00.txt

Manu Sporny <msporny@digitalbazaar.com> Thu, 17 February 2022 19:44 UTC

To: media-types@ietf.org
References: <163839922518.1124.7984157361303473511@ietfa.amsl.com> <CAL0qLwYCBxgZQKTx3gi=XKLvMSsuL33bvvh3+ebHysyvn-JMLg@mail.gmail.com> <b1f63558-848b-fa1c-4583-52ae50bdc18e@digitalbazaar.com>
From: Manu Sporny <msporny@digitalbazaar.com>
Message-ID: <06e74fc4-7679-0052-1e45-15d46b12715a@digitalbazaar.com>
Date: Thu, 17 Feb 2022 14:44:28 -0500
In-Reply-To: <b1f63558-848b-fa1c-4583-52ae50bdc18e@digitalbazaar.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/media-types/1qSTwuZnL4uPICY1HdeCIIowiO8>
Subject: Re: [media-types] I-D Action: draft-ietf-mediaman-suffixes-00.txt

On 2/17/22 12:51 PM, Manu Sporny wrote:
> Murray, I screwed up explaining this...

That'll teach me to try to fire off an email that requires precision right
before a scheduled call. Attempt #2, below...

Let's take Russ' recent example with image/svg+xml+gzip.

I think what we want to be able to say is that there are multiple valid ways
of processing content that is associated with that media type:

image/svg+xml+gzip - the consuming application understands application/gzip
and image/svg+xml

image/svg+xml - as long as some preliminary application can gunzip the data, a
secondary application that understands image/svg+xml can process it

application/gzip - some preliminary application understands gzip and that's
where the story ends... but perhaps that's fine because the application's job
is to just crawl the Internet and store all gzip'd data for archival purposes.

What Section 2.1 was attempting to do is to say "it's okay to process a media
type if you only understand part of the media type... media types are not
opaque"... and I believe this is where your concerns came in, Murray. The
section is clearly wrong when we apply its logic to something like
image/svg+xml+gzip, because +gzip can exist across a number of different
top-level types... it doesn't naturally map to "application/gzip".

So, the question then becomes, how do we reasonably map "+gzip" to
"application/gzip" via the Media Types[1] table?

For the "application/*" space, it's fairly easy (and Section 2.1 makes a lot
more sense). The application takes the subtype and just keeps chopping off the
leading item delimited by the "+" sign until it gets a media type it understands:

application/did+ld+json - DID Document processors
application/ld+json - JSON-LD processors
application/json - JSON processors

However, as stated above, the mental model breaks down when you go outside of
"application/*". For example, how do we say that image/svg+xml+gzip can be
processed as "application/gzip" if an application understands that media type?

Perhaps the fix to this is to say that suffixes MUST be registered and
associated with a concrete media type? For example, we might need a new class
of entries in the media types table:

Structured Suffixes Registry
----------------------------

+gzip -> application/gzip
+zip -> application/zip
+json -> application/json
+ld+json -> application/ld+json
+xml -> text/xml
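
A rough Python sketch of how a consumer might use such a registry, assuming the
five example entries above are the whole table. SUFFIX_REGISTRY and
suffix_fallbacks are hypothetical names, and matching the longest suffix first
is an assumption of this sketch, not something the draft specifies.

```python
# Hypothetical registry, copied from the example entries above.
SUFFIX_REGISTRY = {
    "+gzip": "application/gzip",
    "+zip": "application/zip",
    "+json": "application/json",
    "+ld+json": "application/ld+json",
    "+xml": "text/xml",
}


def suffix_fallbacks(media_type: str) -> list[str]:
    """Return the concrete media types registered for each trailing suffix
    of the subtype, longest suffix first."""
    _, _, subtype = media_type.partition("/")
    parts = subtype.split("+")
    fallbacks = []
    for i in range(1, len(parts)):
        suffix = "+" + "+".join(parts[i:])
        if suffix in SUFFIX_REGISTRY:
            fallbacks.append(SUFFIX_REGISTRY[suffix])
    return fallbacks


print(suffix_fallbacks("image/svg+xml+gzip"))
print(suffix_fallbacks("application/did+ld+json"))
```

With this lookup, image/svg+xml+gzip resolves to application/gzip regardless of
its top-level type, and application/did+ld+json still resolves to
application/ld+json and then application/json.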

Thoughts? What sort of horrors will doing that unleash?

-- manu

[1] https://www.iana.org/assignments/media-types/media-types.xhtml

-- 
Manu Sporny - https://www.linkedin.com/in/manusporny/
Founder/CEO - Digital Bazaar, Inc.
News: Digital Bazaar Announces New Case Studies (2021)
https://www.digitalbazaar.com/