Re: Registration of media type image/svg+xml

ned+xml-mime@mrochek.com Thu, 18 November 2010 21:50 UTC

From: ned+xml-mime@mrochek.com
Cc: Chris Lilley <chris@w3.org>, ietf-types@iana.org, ietf-xml-mime@imc.org, Alexey Melnikov <alexey.melnikov@isode.com>
Message-id: <01NUEK09JNWC007FL5@mauve.mrochek.com>
Date: Thu, 18 Nov 2010 13:26:51 -0800 (PST)
Subject: Re: Registration of media type image/svg+xml
In-reply-to: "Your message dated Thu, 18 Nov 2010 21:20:04 +0100" <4CE58A74.2060503@gmx.de>
MIME-version: 1.0
Content-type: TEXT/PLAIN; format=flowed
References: <1364503167.20100617162624@w3.org> <1715145489.20101118190255@w3.org> <4CE57C38.4080307@gmx.de> <282747763.20101118210121@w3.org> <4CE58A74.2060503@gmx.de>
To: Julian Reschke <julian.reschke@gmx.de>
Sender: owner-ietf-xml-mime@mail.imc.org
Precedence: bulk
List-Archive: <http://www.imc.org/ietf-xml-mime/mail-archive/>
List-ID: <ietf-xml-mime.imc.org>
List-Unsubscribe: <mailto:ietf-xml-mime-request@imc.org?body=unsubscribe>

> On 18.11.2010 21:01, Chris Lilley wrote:
> > On Thursday, November 18, 2010, 8:19:20 PM, Julian wrote:
> >
> > JR>  On 18.11.2010 19:02, Chris Lilley wrote:
> >>> ...
> >>> Security considerations:
> >>> ...
> >>>       SVG documents may be transmitted in compressed form using gzip
> >>>       compression. For systems which employ MIME-like mechanisms, such
> >>>       as HTTP, this is indicated by the Content-Encoding or
> >>>       Transfer-Encoding header, as appropriate; for systems which do
> >>>       not, such as direct filesystem access, this is indicated by the
> >>>       filename extension and by the Macintosh File Type Codes. In
> >>>       addition, gzip compressed content is readily recognised by the
> >>>       initial byte sequence as described in [RFC1952] section 2.3.1.
> >>> ...
> >
> > JR>  1) What does this have to do with "Security Considerations"?
> >
> > Please read BCP 13, RFC 4288 section 4.6 "Security requirements" where you will find
> >
> >        A media type that employs compression may provide an opportunity
> >        for sending a small amount of data that, when received and
> >        evaluated, expands enormously to consume all of the recipient's
> >        resources.  All media types SHOULD state whether or not they
> >        employ compression, and if they do they should discuss
> >        what  steps need to be taken to avoid such attacks.

Read the section again. It is clearly talking about media types that employ
compression *internally*, not compression done at other layers.

Any media type can be, and often is, compressed at other layers. Discussion
of such compression has no business being in any particular media type
registration.

> Agreed.

> But then it would need to be clearly stated, that, you know, the content
> can be gzipped and still be image/svg+xml.

> Can it?

This is indeed the question: If I have a static object of type image/svg+xml,
is it inside a gzip container or not? If the answer to this question is
yes, then:

(a) The security consideration section is appropriate, but needs to be
    clarified to state specifically that the media type employs compression
    internally, and

(b) The +xml on the type name MUST be removed, because the type cannot simply
    be processed directly as XML, which is what +xml means.

The same actions apply if the answer is "sometimes".

If, however, the answer is never - and I'm pretty sure it is - then all mention
of compression needs to be dropped from this registration, as it is doing
nothing useful and is just making things unclear. At most you might want a note
about it in the encoding considerations section saying external compression is
often used with this type. Again, lots of media types are compressed at other
layers; this has nothing to do with the image/svg+xml media type specifically.

> Because otherwise if you're talking about compression on the transport
> layer, this doesn't need to be stated here. It confuses layers.

Exactly.

> > JR>  2) I find the whole paragraph misleading; I'd like to see a clear
> > JR>  statement about whether the stream of octets resulting from gzipping SVG
> > JR>  can be labeled as "image/svg+xml" or not
> >
> > Not by itself, no. In a MIME context, it must be labelled as Content-type:
> > image/svg+xml **AND** Transfer-Encoding: gzip. Please note the AND.

Sorry, you cannot make that a requirement of a media type. At most you can
suggest that a compressed encoding may be helpful. But general purpose media
types like this travel over all sorts of different transports all the time -
including ones that lack support for particular compression mechanisms - so the
notion that they're constrained to a particular transport is simply a fantasy.

> So why we do have the paragraph above in the first place?

Exactly.

> *Any* media type can be used with Content-Encoding: gzip over HTTP.

> > This is not the same thing as Content-type: application/octet-stream and
> > Transfer-Encoding: gzip - because that conveys the encoding, but omits the
> > content type.

> Nobody said that.

> > In other words the encoding label ADDS TO the media type; it does not
> > remove the type.

> "The Content-Encoding entity-header field is used as a modifier to the
> media-type. When present, its value indicates what additional content
> codings have been applied to the entity-body, and thus what decoding
> mechanisms must be applied in order to obtain the media-type referenced
> by the Content-Type header field. Content-Encoding is primarily used to
> allow a document to be compressed without losing the identity of its
> underlying media type." --
> <http://greenbytes.de/tech/webdav/rfc2616.html#rfc.section.14.11>

> So once you apply the Content-Encoding you have to undo it to get back
> the payload specified by the Content-Type. It's orthogonal. It doesn't
> make the payload an instance of the media type *until* you undo the
> encoding.

Correct.
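
To make the layering concrete, here is a minimal sketch in Python. The helper
names (decode_entity, is_xml) and the one-element SVG document are illustrative
assumptions, not anything taken from the registration or from RFC 2616. It only
shows that the gzipped octets are not themselves XML, and become an instance of
the labelled type once the content coding has been undone.

```python
import gzip
import xml.etree.ElementTree as ET
from typing import Optional

# Illustrative payload: a tiny but well-formed SVG document.
SVG = b'<svg xmlns="http://www.w3.org/2000/svg" width="1" height="1"/>'


def decode_entity(body: bytes, content_encoding: Optional[str]) -> bytes:
    """Undo the content coding named by the label, if any (hypothetical helper)."""
    if content_encoding is None or content_encoding.lower() == "identity":
        return body
    if content_encoding.lower() in ("gzip", "x-gzip"):
        return gzip.decompress(body)
    raise ValueError(f"unsupported content coding: {content_encoding}")


def is_xml(data: bytes) -> bool:
    """True if a generic XML parser accepts the octets as they stand."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


wire = gzip.compress(SVG)                 # what travels with Content-Encoding: gzip
payload = decode_entity(wire, "gzip")     # what Content-Type actually describes
```

Under these assumptions is_xml(wire) is false and is_xml(payload) is true, which
is the sense in which the encoding is orthogonal to the type.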

> > Indeed, this is why separate labelling of encoding was added. Back in the
> > early days people would use gzipped VRML or gzipped PostScript, and
> > attempted to register application/gzip; but since they were using the Media
> > Type to hold the encoding information they had lost important information,
> > so VRML viewers were sent PostScript and so on. Some people said this was
> > okay, unzip and then look at the filename extension. But a much better way
> > was to add the encoding headers.
> >
> > JR>  (please consider transports
> > JR>  other than HTTP, such as a file system that actually supports typing by
> > JR>  Internet media types).
> >
> > Please feel free to file a bug report for the BeOS filesystem saying that
> > it should support labelling of encodings in addition to media types.

How a particular OS elects to label files, and the restrictions it imposes
through its choice of labelling, are 100% irrelevant to the matter at hand.

> > Speaking as a former BeOS user myself, I still consider modern SVG
> > implementations (of which there are many) to be a rather more numerous and
> > relevant consideration than a promising, but obsolete and abandoned, operating
> > system from 15 years ago.

Frankly, it would not matter if you were talking about all versions of Windows
here. Many if not most operating systems have botched media type handling in
some way or other; the solution to this problem isn't to break existing media
type semantics.

> I really honestly (!) have no idea what you're referring to.

> For the media type registration what's relevant is what kind of octet
> sequences you can label with the type you register.

> So, I hear you saying: "it can be gzipped when used in a MIME context if
> and only if you label it with 'content-encoding: gzip'".

> That's true, and nobody disagrees with it. It's true for *any* media
> type. It doesn't require any additional statements.

Quite right.

> > JR>  If yes, that's a violation of "+xml" (and the last sentence points into
> > JR>  this direction). If not, please remove the paragraph above.
> >
> > JR>  3) If the intent is to say that "svgz" acts as file extension for
> > JR>  gzipped SVG, and *that* content can be served over HTTP as-is with
> >
> > JR>          Content-Type: image/svg+xml
> > JR>          Content-Encoding: gzip
> >
> > That is exactly what it says, yes
> >
> > JR>  then this is obviously ok
> >
> > I'm glad it's obviously OK.

> But the way it's stated is totally misleading.

> Please keep in mind that I only joined this discussion after other
> people complained (I stumbled into it during a conversation at the IETF
> meeting in Maastricht).

> > JR>  because it follows from RFC 2616, and has
> > JR>  *nothing* to do with the media type (except for the extension
> > JR>  recommendation).
> >
> > So you oppose reminding people how to detect such gzipped content?
> >
> > Why would you want to do that?

> Because it makes it sound like detecting gzipped content by inspecting
> the header is an acceptable way to handle this media type.

That's exactly what it does, and that along with the other confusion really
is not OK.
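
A rough Python sketch of the difference, with hypothetical helper names
(looks_gzipped, payload_by_label) of my own choosing: the two-byte check from
RFC 1952 section 2.3.1 fires for any gzipped octets whatever they contain, so
it says nothing about the media type, whereas the label-driven path only
decodes when an encoding has actually been declared.

```python
import gzip
from typing import Optional

GZIP_MAGIC = b"\x1f\x8b"  # ID1, ID2 from RFC 1952 section 2.3.1


def looks_gzipped(body: bytes) -> bool:
    """Content sniffing: true for ANY gzip stream, regardless of what is inside."""
    return body[:2] == GZIP_MAGIC


def payload_by_label(body: bytes, content_encoding: Optional[str]) -> bytes:
    """Decode only when the sender declared the coding (hypothetical helper)."""
    if content_encoding and content_encoding.lower() == "gzip":
        return gzip.decompress(body)
    return body  # no label, so the octets are taken as-is


svg = b'<svg xmlns="http://www.w3.org/2000/svg"/>'
text = b"not SVG at all"

labelled = payload_by_label(gzip.compress(svg), "gzip")   # decoded SVG
unlabelled = payload_by_label(gzip.compress(svg), None)   # still gzip octets
sniffed_text = looks_gzipped(gzip.compress(text))         # sniffing matches too
```

In this sketch the sniffing check matches gzipped plain text just as readily as
gzipped SVG, which is why it cannot stand in for proper labelling.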

				Ned