Re: [media-types] [apps-discuss] A proposal for a new top-level media type: archive

Matthew Kerwin <matthew@kerwin.net.au> Thu, 25 September 2014 02:32 UTC

In-Reply-To: <54235269.2060002@seantek.com>
References: <54235269.2060002@seantek.com>
Date: Thu, 25 Sep 2014 12:31:42 +1000
Message-ID: <CACweHNAt_FeNSY1v3gA_HWLODJgy6RweDaeOYFPrVS-A_Mue_g@mail.gmail.com>
From: Matthew Kerwin <matthew@kerwin.net.au>
To: Sean Leonard <dev+ietf@seantek.com>
Archived-At: http://mailarchive.ietf.org/arch/msg/media-types/1zceZxZGu8LGHa5jjJKqW7yMw0A
Cc: media-types@iana.org, IETF Apps Discuss <apps-discuss@ietf.org>
Subject: Re: [media-types] [apps-discuss] A proposal for a new top-level media type: archive

On 25 September 2014 09:23, Sean Leonard <dev+ietf@seantek.com> wrote:

> Colleagues on media-types and apps-discuss:
>
> I would like to propose that the IETF create a new top-level media type:
> archive.
>
>
Colour me interested.



>
> I think it's important to register archive formats as a distinct type from
> application, because there are common semantics that apply.
>
>
+1



> Archives are ubiquitous on the Internet. Even if archives are used
> "infrequently" across the Internet architecture, they are obviously used at
> the endpoints. Improper transmission of archives has become a major source
> of labeling and security issues.
>
Archives (of the single-file variety) are used a lot, but we usually call
that Content-Encoding. However, I digress; I can see there being value in
knowing the difference between a tarball and an ISO, instead of having my
browser guess which handler to launch based on the final characters of the
URL, or the first *n* bytes of the file.
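
A minimal sketch of that "first *n* bytes" guessing, assuming Python. The
signatures are the published magic numbers for each format; the function
name and the labels it returns are made up for illustration and are not
registered media types.

```python
from typing import Optional

# (offset, signature, informal label). The labels are illustrative names only.
SIGNATURES = [
    (0, b"\x1f\x8b", "gzip"),
    (0, b"PK\x03\x04", "zip"),          # local file header of a non-empty zip
    (0, b"7z\xbc\xaf\x27\x1c", "7z"),
    (257, b"ustar", "tar"),             # POSIX ustar header magic
    (32769, b"CD001", "iso9660"),       # volume descriptor in sector 16
]


def sniff_archive(data: bytes) -> Optional[str]:
    """Return an informal format label for the leading bytes, or None."""
    for offset, magic, label in SIGNATURES:
        if data[offset:offset + len(magic)] == magic:
            return label
    return None
```

Note that *n* has to be more than 32 KiB before an ISO image can be told
apart from anything else, which is part of why a declared type is more
attractive than sniffing.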



> Remarkably, most archive formats have not been registered as media types
> (except for application/zip, which is an oldie). Therefore, it's pretty
> much a "clean field". Furthermore, there is a trend of a lot of widely
> available tools to support multiple formats, so the probability is good
> that if you pass some archive/* labeled content to an archive application,
> it will be able to do something intelligent with it.
>
Actually there's a pretty decent list, pulled from a quick scan:

   - application/zip
   - application/gzip
   - application/zlib
   - application/vnd.dece-zip
   - application/vnd.easykaraoke.cdgdownload
   - application/vnd.google-earth.kmz
   - application/vnd.ms-cab-compressed
   - application/vnd.osgi.bundle
   - application/vnd.software602.filler.form-xml-zip

... and probably countless others. Some of them are closer to
content-encodings than archives per se, and a good many are just zip files
with a different extension, but they're still archives. And yes, I agree,
if they were all "archive/*" instead of "application/*", I could probably
pass them all to 7zip or winrar without a second thought.
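
To make the difference concrete, here is a rough sketch in Python. The
"archive/" prefix is the proposed top-level type and is not registered;
is_archive and KNOWN_ARCHIVE_TYPES are hypothetical names, and the set
holds only the types from the list above.

```python
# Today a handler needs an explicit, ever-growing list of application/ types.
KNOWN_ARCHIVE_TYPES = {
    "application/zip",
    "application/gzip",
    "application/zlib",
    "application/vnd.dece-zip",
    "application/vnd.easykaraoke.cdgdownload",
    "application/vnd.google-earth.kmz",
    "application/vnd.ms-cab-compressed",
    "application/vnd.osgi.bundle",
    "application/vnd.software602.filler.form-xml-zip",
}


def is_archive(media_type: str) -> bool:
    """Decide whether a media type should go to an archive tool."""
    # Drop any parameters (";charset=..." and so on) and normalise case.
    essence = media_type.split(";", 1)[0].strip().lower()
    # With the proposed (hypothetical) top-level type, a prefix check suffices.
    if essence.startswith("archive/"):
        return True
    return essence in KNOWN_ARCHIVE_TYPES
```

With a top-level type the lookup table could eventually go away; without
one, every newly registered format means another entry.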



> Where do we start? Maybe we should talk about it? I don't think it's as
> simple as drafting an Internet-Draft. Maybe there should be a BOF or
> working group. Experts with file system and archival experience should get
> involved.
>
>
I think this conversation is already the start. The immediate questions I
have, and my instinctive reactions (without having put any thought at all
into them), are:

   1. How much value is there in distinguishing a tarball from an ISO image
   from ... (as opposed to bundling them all under application/octet-stream)?
   -- my feeling is: quite a bit
   2. How much value is there in grouping archives together in one
   registry, away from other application/* types?
   -- again, I think: probably a fair bit
   3. How does this relate to content-encoding, if at all?
   -- an HTTP resource with Content-Type:archive/tar|Content-Encoding:gzip
   is a different beast from one with Content-Type:application/gzip, so
   they can probably play together just fine.
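
A minimal sketch of how a recipient might treat those two cases, assuming
Python and its standard gzip and tarfile modules. "archive/tar" is the
hypothetical type from point 3, not a registered one, and handle and
make_tar are names invented for this example.

```python
import gzip
import io
import tarfile


def handle(headers: dict, body: bytes):
    """Return ("archive", member_names) or ("opaque", payload_bytes)."""
    # Content-Encoding is a wrapper around the representation: undo it first.
    if headers.get("Content-Encoding", "").strip().lower() == "gzip":
        body = gzip.decompress(body)
    media_type = headers.get("Content-Type", "").split(";", 1)[0].strip().lower()
    if media_type == "archive/tar":  # hypothetical, unregistered type
        with tarfile.open(fileobj=io.BytesIO(body), mode="r:") as tar:
            return ("archive", tar.getnames())
    # application/gzip and anything else: the payload is the thing itself.
    return ("opaque", body)


def make_tar(name: str, content: bytes) -> bytes:
    """Build a one-member, uncompressed tar in memory (helper for the demo)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name)
        info.size = len(content)
        tar.addfile(info, io.BytesIO(content))
    return buf.getvalue()
```

Feeding the same gzipped bytes through both header combinations gives a
member listing in the first case and the untouched gzip stream in the
second, which is the "different beast" distinction.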

The real meaty starting point, though, would be to register a type (or
types) in the new registry. No point setting it up with nothing in it. Was
there one in particular you wanted to see added?

-- 
  Matthew Kerwin
  http://matthew.kerwin.net.au/

On 25 September 2014 09:23, Sean Leonard <dev+ietf@seantek.com> wrote:

> Colleagues on media-types and apps-discuss:
>
> I would like to propose that the IETF create a new top-level media type:
> archive.
>
> Basically, archive would be a top-level type for all types of archive
> formats.
> https://en.wikipedia.org/wiki/Archive_file
> https://en.wikipedia.org/wiki/List_of_archive_formats
>
> I think it's important to register archive formats as a distinct type from
> application, because there are common semantics that apply. In fact, these
> semantics are very similar to multipart and message top-level types.
>
> The archive data types are all storage formats for *files*, as opposed to
> *content*. Each file has its own security implications, along with metadata
> that also has security implications (user and group permissions, access
> bits, executable bits, ACLs). At the highest level, an Internet-connected
> application ought to be able to identify that a particular piece of content
> is of this type (as opposed to the opaque application type), so it can make
> decisions about the content that are unique to archives, namely, dealing
> with the security issues, and presenting uniform user interfaces to
> handling such archives. Content bundling types like message (RFC 5322),
> multipart, and application/cms (CMS) are conceptually distinct. All those
> types can contain content that can get split off into files, but their
> purpose is not to replicate file system data.
>
> Archives are ubiquitous on the Internet. Even if archives are used
> "infrequently" across the Internet architecture, they are obviously used at
> the endpoints. Improper transmission of archives has become a major source
> of labeling and security issues.
>
> Remarkably, most archive formats have not been registered as media types
> (except for application/zip, which is an oldie). Therefore, it's pretty
> much a "clean field". Furthermore, there is a trend of a lot of widely
> available tools to support multiple formats, so the probability is good
> that if you pass some archive/* labeled content to an archive application,
> it will be able to do something intelligent with it.
>
> The following major sub-types of archives all belong in a common
> top-level media type: [from Wikipedia]
> * archiving only (concatenate files): tar
> * multi-function (concatenate, compress, encrypt, etc.): zip, rar, 7z,
> arc, arj, the list goes on and on...
> * software packaging: cab, msi, pup, pet, apk, rpm...
> * disk image: ISO-9660 (CD/DVD/Blu-Ray), Apple Disk Image, virtual floppy
> disks, formerly-known-as-TrueCrypt, etc.
> * backup: (a large quantity of proprietary formats)
>
> I know that the TLMT matter has been brought up before with fonts. <
> http://www6.ietf.org/mail-archive/web/apps-discuss/current/msg03447.html>
>
> Where do we start? Maybe we should talk about it? I don't think it's as
> simple as drafting an Internet-Draft. Maybe there should be a BOF or
> working group. Experts with file system and archival experience should get
> involved.
>
> Sean