Re: [apps-discuss] Review of draft-levine-application-gzip-01.txt

John C Klensin <john-ietf@jck.com> Sat, 14 April 2012 16:18 UTC


--On Saturday, April 14, 2012 14:17 +0000 John Levine
<johnl@taugh.com> wrote:

>> And, of course, there is the separate issue that caused us to
>> reject various "zip" and related types years ago -- it is
>> basically a content-transfer-encoding (even if it is also used
>> locally) and not a content/media type.
> 
> You're right, but these days you could make much the same
> argument about application/pdf.  (Text? Images? Encrypted?
> Forms to fill in? Active content?  Whatever else they've
> added to PDF lately?)
> 
> In view of Ned's comments, I'm inclined to leave it as
> application/gzip.

John,

I think there are two separate issues here.  One is what this
should be called given that we decide it is an application type.
I agree that application/gzip is a reasonable answer to that
question.   The other is whether we give up significant
functionality by abandoning the distinction between a media
(content) type and a content-[transfer-]encoding.   The latter
concerns me independent of the level of current practice, if
only because, once we "officially" start down that slope, I
don't see any model by which we stop.

In the latter regard, I don't see your analogy to PDF as
relevant.  It would be relevant had we defined text/pdf,
image/pdf, etc., but we didn't -- we defined it as an
application type from the beginning.  Independent of what goes
on inside a PDF document (even one that is an aggregate of
several kinds of things), it still meets the basic criteria for
an application/ subtype, namely "go off and find a competent PDF
processor and let it do its thing".  The problem with compressed
formats (or, worse, compressed aggregate ("archive") ones like
zip or tar.gz) is that the compressed form is just a reversible
transformation of something else -- where that something else
might be one or more objects of substantially any media type.
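
To make that concrete, here is a minimal sketch using Python's
standard email package. It is only an illustration under my own
assumptions (the sample text, subject line, and variable names are
made up), not anything taken from the draft. It labels a gzip
stream as application/gzip and shows that the headers then say
nothing about the media type of what was compressed.

```python
import gzip
from email.message import EmailMessage

# Hypothetical inner object: plain text, but it could be any media type.
inner_text = "Hello, apps-discuss.\n" * 20
payload = gzip.compress(inner_text.encode("us-ascii"))

msg = EmailMessage()
msg["Subject"] = "gzip labeled as a media type"  # illustrative only
# The compressed bytes get the media type; base64 is the actual CTE.
msg.set_content(payload, maintype="application", subtype="gzip", cte="base64")

print(msg["Content-Type"])
print(msg["Content-Transfer-Encoding"])

# The receiver can undo both layers, but nothing in the headers says
# that the recovered bytes were text/plain.
recovered = gzip.decompress(msg.get_content())
```

The round trip works, yet the receiver has to guess what to do
with the recovered bytes, which is the information a media type is
supposed to carry.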

Now I don't think the original model works entirely well for the
compressor-aggregator case either, first because one might end
up needing
   content-transfer-encoding=SomeZip
encoded with, e.g., base64, and, if the compression form is also
an aggregating one, one could have 
   SomeZip (text-file, image-file, image-file,...)

There is no clear way to express that compressed aggregate in
Media-type language either.  At the same time, the binary data
problem notwithstanding, it is hard for me to understand why a
strict compression type shouldn't, in principle, be a CTE (and,
while the discussion has mostly focused on gzip, zlib, as a pure
stream compression format, seems like even more of a candidate
for a CTE).
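
As a rough sketch of that layering, assuming nothing beyond
Python's standard zlib and base64 modules, a compression step
followed by base64 behaves like any other reversible transfer
encoding. The function names are hypothetical and no such CTE is
registered; this only shows that the transformation is independent
of the media type of the body.

```python
import base64
import zlib


def zlib_b64_encode(body: bytes) -> bytes:
    """Compress the body, then make the binary result safe for 7bit transport."""
    compressed = zlib.compress(body)
    # encodebytes wraps its output in lines of at most 76 characters.
    return base64.encodebytes(compressed)


def zlib_b64_decode(encoded: bytes) -> bytes:
    """Reverse the two layers in the opposite order."""
    return zlib.decompress(base64.decodebytes(encoded))


# Works the same way whatever the media type of the original body is.
sample = b"Any media type at all: text, image, or audio bytes.\n" * 50
encoded = zlib_b64_encode(sample)
decoded = zlib_b64_decode(encoded)
```

Under a model like this the Content-Type header would stay free to
describe the original body.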

Precisely because of the aggregation issue, "application/zip" is
a more complex case than "application/gzip" but we see a lot of
"application/x-zip" in the wild too.

The above is no surprise.  We were at least aware of the issue
when MIME was designed.  I'm just reluctant to give up on the
distinction without a more general and careful examination of
the model than one gets if one starts, as your document appears
to me to start, with "this is useful, there is a
lot of it out there (modulo details of the keyword chosen), so
let's standardize it".

Expressed a little differently, this spec, especially for zlib,
changes or ignores the distinction between content-type and
content-transfer-encoding of the base MIME specification.   As
such, it effectively updates that specification unless there is
a clear explanation of why the distinction doesn't apply to this
case.  So I believe that the document has some obligation to
either:

	(1) Explicitly update RFC 2045, especially the
	discussion in the last half of Section 6.4, to split
	whatever hairs need to be split to justify a stream
	compression method that can, in principle, be applied to
	media of any type as a media type rather than as a CTE.
	FWIW, updating 2045 probably requires a standards track
	spec, possibly a separate one.    Or
	
	(2) Explain why that discussion either isn't applicable
	or should be ignored because of the special
	circumstances of this case.

As a hypothetical question, if someone sat down now and wrote up
a spec for two new content-transfer-encodings, Zlib and ZlibB64,
explaining in those specs that, while application/x-zlib is in
use, it was a mistake and should be deprecated in favor of the
C-T-E form, what reaction would it get?  I'm not predicting that
such a spec would be written, but thinking about the possibility
may help in the discussion.

best,
   john