Re: [ietf-types] Request for registration of application/gzip and application/zlib

Bjoern Hoehrmann <derhoermi@gmx.net> Wed, 22 February 2012 22:55 UTC

From: Bjoern Hoehrmann <derhoermi@gmx.net>
To: Ned Freed <ned.freed@mrochek.com>
Date: Wed, 22 Feb 2012 23:55:24 +0100
Cc: John R Levine <johnl@taugh.com>, ietf-types@ietf.org
Subject: Re: [ietf-types] Request for registration of application/gzip and application/zlib

* Ned Freed wrote:
>> * John R Levine wrote:
>> >    Encoding considerations: needs base64 or other encoding that allows
>> >    arbitrary binary data
>
>> The value should be "8bit" (for both types).
>
>???? These are binary formats, plain and simple. There's no encoding included
>in them that would result in line-oriented output. 
>
>I don't mind changing the wording to simply say "binary" if that is clearer,
>but "8bit" is flat-out incorrect.

My apologies, I meant "binary". The nomenclature does confuse me; I
looked the definitions up again when making the comment, but it still
came out wrong. The point is that RFC 4288 asks for "one of these
keywords", not free-form text.
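
To illustrate why a fixed keyword matters to a tool, here is a minimal
sketch of a check on the "Encoding considerations" field. The keyword
set is my reading of RFC 4288, section 4.8, and the function name is
hypothetical; treat both as assumptions.

```python
# Keywords assumed to be the permitted values for "Encoding considerations"
# (my reading of RFC 4288, section 4.8).
ALLOWED_ENCODINGS = {"7bit", "8bit", "binary", "framed"}

def encoding_keyword(field_value):
    """Return the leading keyword if the field starts with one, else None."""
    words = field_value.strip().split()
    if not words:
        return None
    # Tolerate trailing punctuation and case differences on the first word.
    first = words[0].rstrip(".,;:").lower()
    return first if first in ALLOWED_ENCODINGS else None
```

With this check, "binary" is recognised, while the free-form wording
from the draft ("needs base64 or other encoding that allows arbitrary
binary data") is not.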

>> >    Additional information:
>> >
>> >       Magic number(s): first byte is usually 0x78 but can also be 0x08,
>> >       0x18, 0x28, 0x38, 0x48, 0x58, or 0x68.
>
>> This is confusing.
>
>Then suggest text. Not sure how else you can say this.

I have not checked whether the above is all that can be said, but if it
is, I would suggest something like "The first byte is one of ..., where
0x78 is the most common".
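
For what it is worth, the eight values follow from the zlib header
layout in RFC 1950: the low four bits of the first byte give the
compression method (8 for deflate) and the high four bits give the
window size exponent (0 through 7). A rough sketch, with a hypothetical
function name, that derives the list and checks the first two bytes:

```python
import zlib

# First-byte candidates: high nibble CINFO in 0..7, low nibble CM = 8.
candidates = [(cinfo << 4) | 8 for cinfo in range(8)]
# candidates is 0x08, 0x18, 0x28, 0x38, 0x48, 0x58, 0x68, 0x78

def plausible_zlib_header(data):
    """Heuristic check of the two-byte zlib header; not a full validation."""
    if len(data) < 2:
        return False
    cmf, flg = data[0], data[1]
    method = cmf & 0x0F   # CM: compression method
    cinfo = cmf >> 4      # CINFO: log2(window size) - 8
    # RFC 1950 requires (CMF * 256 + FLG) to be a multiple of 31.
    return method == 8 and cinfo <= 7 and (cmf * 256 + flg) % 31 == 0
```

A stream written with the default 32K window starts with 0x78, which
would explain why that value is the common one.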

>> >       File extension(s): none
>> >       Macintosh file type code(s): none
>> >
>> >    Person and email address to contact for further information: see
>> >    http://www.zlib.net/
>
>> The form in RFC 4288 wants you to use "&" in place of "and".
>
>No, really, it doesn't. Nothing says the form has to be followed to this
>degree.

Well, I look at it from an automated data extraction perspective, and
little details like this make it harder to develop such tools. There
being no good reason to use "&" sometimes and "and" other times, I'd
prefer consistency, but I do agree that this is not a blocker.

>> >4.  Security Considerations
>> >
>> >    Zlib and gzip compression can be used to compress arbitrary binary
>> >    data such as hostile executable code.  Also, data that purports to be
>> >    in zlib or gzip format may not be, and fields that are supposed to be
>> >    flags, lengths, or pointers, could contain anything.  Applications
>> >    should treat any data with due scepticism.
>
>> I would prefer simply referencing the two format RFCs, the types as such
>> do not introduce additional security considerations.
>
>This may need to be reworded to make it clearer, but AFAIK the point that these
>formats decode automatically in some contexts and are therefore sometimes used
>to get hostile code past overly simplistic scans is not mentioned in those
>documents and *is* a known security concern. As such, it needs to be mentioned.

That is something the format RFCs should discuss; it is not a media type
issue. With HTTP, for instance, automatic decompression is not triggered
by the media type but by a separate mechanism.
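
As a sketch of that separation, assuming the mechanism in question is
the Content-Encoding header, a receiver might decide as follows. The
function name and the plain-dict header lookup are hypothetical
simplifications (real header names are case-insensitive).

```python
import gzip
import zlib

def decode_http_body(headers, body):
    """Undo the content-coding named in Content-Encoding, if any.

    The Content-Type value is deliberately never consulted.
    """
    coding = headers.get("Content-Encoding", "identity").strip().lower()
    if coding in ("gzip", "x-gzip"):
        return gzip.decompress(body)
    if coding == "deflate":
        # HTTP's "deflate" coding is zlib-wrapped data.
        return zlib.decompress(body)
    return body
```

A body labelled application/gzip without Content-Encoding is returned
untouched; the same bytes labelled text/plain with Content-Encoding:
gzip are decompressed.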
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/