PP15: Does Applicability Matter for Applications?
Lisa Dusseault <lisa@osafoundation.org> Sun, 27 January 2008 20:21 UTC
To: Apps Discuss <discuss@apps.ietf.org>
From: Lisa Dusseault <lisa@osafoundation.org>
Subject: PP15: Does Applicability Matter for Applications?
Date: Sun, 27 Jan 2008 12:21:00 -0800
Begin forwarded message:

> From: "Peterson, Jon" <jon.peterson@neustar.biz>
>
> Does Applicability Matter for Applications?
> Jon Peterson
>
> There is a longstanding maxim of Internet protocol design that
> successful application protocols invariably devolve into generic
> transports. The canonical example would be HTTP; the parody in
> RFC3093 (the "Firewall Enhancement Protocol") is frighteningly
> close to the truth. A number of protocols which have little to do
> with the delivery of hypertext take advantage of the ubiquitous
> deployment of HTTP, piggybacking on it or masquerading as it on the
> wire.
>
> It is widely believed that this slide into the generic is
> undesirable. Most application protocols have a specific intended
> sphere of applicability, and this applicability informs the design.
> HTTP was certainly not intended to be a transport for arbitrary data.
>
> So why does this occur?
>
> One reason is that there is a certain barrier to entry for new
> protocols on the Internet. Once a protocol enjoys widespread
> implementation and deployment, the Internet has adapted to its
> presence: endpoints support it, middleboxes allow it, servers are
> optimized for it. It becomes difficult to sell devices that inhibit
> a dominant protocol. When a new application goes shopping for a
> protocol, the level of effort required to overcome this barrier to
> entry is a significant consideration, and riding on the coat-tails
> of an existing dominant protocol appears an attractive path to
> rapid deployment.
>
> This paper argues that, furthermore, a class of tools exists that
> can render application protocols generic, a class that will be
> referred to here as "genericizers". In the case of HTTP, SOAP would
> be a prime example of this sort of tool.
> The effect of a genericizer is
> to broaden the applicability of a protocol by enabling its fields
> or payloads to carry unanticipated material with different,
> potentially radically different, characteristics than the protocol
> designers intended. From a standardization perspective,
> genericizers furthermore perform this function without requiring
> any modification to the underlying protocol. These tools thus allow
> implementers, and designers of derivative specifications, to
> reinterpret the applicability of a protocol entirely.
>
> As a case study of the phenomenon of genericizers, this paper
> explores the data URL (RFC2397) and considers the manner in which
> proposed uses of the data URL impact and genericize protocols in
> the RAI Area, particularly ENUM and SIP.
>
> Broadly, the purpose of the data URL is to provide literal data
> by-value when the use of a URI is required, but a reference is
> undesirable. RFC2397 suggests that the intended use was to provide
> relatively small chunks of inline data, ranging from text strings
> (the default MIME type for data) to encoded binary representation
> of modestly-sized image files. The intended applicability of the
> data URL is more or less unlimited; RFC2397 says little more than
> "Some applications that use URLs also have a need to embed (small)
> media type data directly inline", and that "The "data:" URL scheme
> is only useful for short values."
>
> It is arguable whether or not the data URL meets the definition of
> a URI - not because it fails to yield a resource, given the term
> 'resource' is defined quite liberally in RFC2396, but because there
> is no meaningful sense in which it serves as an identifier. As
> RFC2396 says, "An identifier is an object that can act as a
> reference to something that has identity." The data URL is not a
> reference, it is a literal.
> Similarly, a URL is defined as "the
> subset of URI that identify resources via a representation of their
> primary access mechanism", and a data URL clearly does not
> constitute a representation nor does it reflect any sort of access.
> Already these deviations from the conventional purpose of URIs and
> URLs suggest that use of the data scheme might have unintended
> consequences.
>
> Consider the use of the data URI in ENUM (RFC3761). ENUM is a
> mechanism for using the DNS to discover URIs associated with
> telephone numbers. For this purpose ENUM builds off the DDDS
> framework, although ENUM benefits from only a subset of DDDS's
> capabilities (for instance, it has no pressing need for the order
> v. preference distinction, the replacement field, or a non-greedy
> LHS of the regular expression, not to mention that lookups target
> a pre-established "golden root" domain).
>
> The data URL entered the ENUM community through
> draft-ietf-enum-cnam, a document which proposes a way to look up a
> text string associated with a telephone number (the text string
> contains the name one would see when Caller ID is displayed on a
> telephone). This string is stored as a data URL within a NAPTR
> record. On some level, the motivation for this work is obvious.
> ENUM as a query-response protocol is implemented on the target
> devices for this application, and the devices need some additional
> query-response functionality; ENUM is in other words perceived as a
> dominant protocol, and enum-cnam proposes to piggyback additional
> data onto it.
>
> But once a genericizer has been introduced, ENUM no longer
> necessarily shows "how DNS can be used for identifying available
> services connected to one E.164 number", since the data URL in no
> way identifies services. Instead, it renders ENUM a generic
> database protocol whose keys are telephone numbers and values are
> arbitrary data.
> Once the capability to parse data URLs is present
> in ENUM resolvers, arbitrary data then can be served via the DNS.
> Effectively this allows domain administrators to embed TXT RRs
> within NAPTR RRs, with the added bonus of MIME typing to allow
> various binary data types. Proposals that have been informally
> discussed in this regard include embedding public keys, ringtones
> and vCards within data URLs in NAPTR RRs. It would not be much of a
> stretch to suggest that HTML documents could be served directly
> from the DNS in a similar fashion. All of these proposals have
> familiar implications on DNS response message size, caching,
> security, privacy, and so on. None of these problems arise in the
> use of typical URIs, which identify resources in the network and
> thus provide a layer of indirection. ENUM assumes the existence of
> that layer of indirection; without it, ENUM's architectural
> underpinnings look increasingly suspect.
>
> The current version of draft-ietf-enum-cnam has abandoned the data
> URL, due to pushback - it does however propose a new 'pstndata' URL
> scheme, with more or less identical properties but a slightly
> constrained applicability to PSTN-related data. However, despite
> this setback the data URL still enjoys a vogue in ENUM circles. One
> current notable draft is draft-ietf-enum-unused-02, which proposes
> the use of a text string within a data URL to indicate that a
> particular telephone number is not in service in the PSTN.
>
> The data URL has also begun to make a few appearances in the SIP
> WG, as a means of transporting large chunks of data in headers.
> The SIP (RFC3261) architecture distinguishes envelope from body in
> a manner similar to email; intermediaries inspect and modify
> headers in the process of routing requests, whereas message bodies
> are payloads that are delivered to applications at the endpoints.
> While intermediaries are not strictly forbidden from inspecting SIP
> message bodies, there is no standard routing procedure that relies
> on them doing so, and intermediaries are explicitly forbidden from
> modifying SIP message bodies. However, there are members of the SIP
> community who would like intermediaries to have control over the
> bodies of SIP messages, mostly so that SDP can be modified to
> enforce policies familiar to operators in the PSTN. So, as is the
> case with email, there has for some time been an impetus in the SIP
> WG from this contingent to end the tyranny of endpoints over
> bodies, but there has not been a standard way to permit unilateral
> modification of bodies by intermediaries.
>
> However, given that SIP message bodies could conceivably be encoded
> as data URLs, and numerous SIP header fields permit arbitrary URIs
> as their value, there is apparently an easy workaround. Ordinarily,
> location information such as PIDF-LO (RFC4119) is carried as a body
> within SIP; for the Location header field proposed by
> draft-ietf-sip-location-conveyance, it has been proposed that a
> data URI be used when an intermediary needs to insert location
> information. Similar suggestions have been made related to some
> bodies used for security properties, and for SDP itself.
>
> Leaving aside practical concerns about the length of SIP headers
> that parsers can withstand, the effect of moving information from
> bodies to headers fundamentally changes the SIP architecture. For
> example, SIP's security model is focused on providing end-to-end
> security services for bodies, but not for headers. Moreover,
> applications at the endpoints that want to consume bodies should
> have some reasonable sense of who created them. That is clear in
> the RFC3261 architecture, but much less so if bodies migrate into
> headers.
> The acceptance of more or less any proposal to encode data
> URLs in SIP headers would open the doors to numerous
>
> So what's to be done?
>
> This study sheds light on genericizers in order to make them easier
> to identify when they arise in future standardization efforts, and
> also to illustrate that they have significant architectural
> implications. Given how long a protocol like RFC2397 has been
> around, it is unlikely that it will be deprecated; nor is it
> reasonable for protocols that use URIs to single out the data URI
> as inappropriate for use in a particular field - as the enum-cnam
> "pstndata" URI proposal illustrates, it is quite easy to
> circumvent such a prohibition. However, the lessons learned from
> studying the manner in which genericizers are leveraged may assist
> in preventing similar architectural loopholes in the future.
>
> Unfortunately, the primary reason why the data URL is not more
> widely used in the IETF is, in all likelihood, that relatively few
> participants are aware of it.
>
> Finally, it is important to note that the data URL did not create
> the impetus in ENUM or SIP to genericize their architectures; the
> data URL is merely an enabler used by the advocates of those
> positions to advance more generic architectures without having to
> contest the underlying design choices of ENUM and SIP.
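
For readers less familiar with the mechanics, here is a minimal Python sketch of the two pieces the paper's ENUM discussion relies on: an RFC2397 data URL, which carries its payload by value rather than pointing at anything, and the RFC3761 mapping from a telephone number to a DNS name. The function names (make_data_url, parse_data_url, enum_domain) and the sample name and number are assumptions made up for illustration; they do not come from the paper or from any of the drafts it cites, and the parser ignores media-type parameters beyond the ";base64" marker.

```python
import base64
from typing import Tuple
from urllib.parse import quote, unquote_to_bytes

# Default media type that RFC2397 assigns when none is given.
DEFAULT_MEDIA_TYPE = "text/plain;charset=US-ASCII"


def make_data_url(payload: bytes, media_type: str = "", use_base64: bool = False) -> str:
    """Embed payload bytes directly in a data URL (hypothetical helper)."""
    if use_base64:
        encoded = base64.b64encode(payload).decode("ascii")
        return f"data:{media_type};base64,{encoded}"
    # Percent-encode everything that is not an unreserved character.
    return f"data:{media_type},{quote(payload, safe='')}"


def parse_data_url(url: str) -> Tuple[str, bytes]:
    """Return (media_type, payload). Nothing is fetched: the URL is the data."""
    if not url.startswith("data:"):
        raise ValueError("not a data URL")
    header, sep, data = url[len("data:"):].partition(",")
    if not sep:
        raise ValueError("data URL has no comma separator")
    is_b64 = header.endswith(";base64")
    if is_b64:
        header = header[: -len(";base64")]
    media_type = header or DEFAULT_MEDIA_TYPE
    payload = base64.b64decode(data) if is_b64 else unquote_to_bytes(data)
    return media_type, payload


def enum_domain(e164: str, root: str = "e164.arpa") -> str:
    """Map an E.164 number to its ENUM query name: digits reversed, dot-separated."""
    digits = [c for c in e164 if c.isdigit()]
    return ".".join(reversed(digits)) + "." + root


if __name__ == "__main__":
    # Illustrative values only: a made-up caller name and number.
    url = make_data_url(b"John Doe")
    print(enum_domain("+1-202-555-0123"), "->", url)
    print(parse_data_url(url))
```

The point of the sketch is that parse_data_url never touches the network: whatever field accepts the URI ends up holding the payload itself, which is the loss of indirection the paper is concerned about.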