RE: [Rfid] XML vs. Text vs. Binary

"David Husak" <dhusak@revasystems.com> Thu, 21 July 2005 23:56 UTC

From: David Husak <dhusak@revasystems.com>
To: Margaret Wasserman <margaret@thingmagic.com>, rfid@ietf.org

First of all, let me say that I have no axe to grind with respect to a
text (I'll lump XML into this category for now) vs. a binary protocol. I
just want a solution that's simple, unambiguous and efficient enough to
be implemented on a lightweight reader.

That said, I do think that there are important factors to consider apart
from the code-size and protocol-overhead-on-the-wire issues*:

1. Direct-access vs. TLV-binary vs. sequential-text-parsing. 

In a compact, fixed binary encoding, it's easy for an implementation to
access protocol fields directly via indexed load and store, i.e. an
implementation need only access the bytes it needs, and it knows where
those bytes are. 

In a type-length-value binary encoding, an implementation can hop from
parameter to parameter by accessing only the parameter type and length
bytes. 

A text-based encoding, where parameters are typically delimited by
special character sequences, requires accessing all bytes until you find
the parameter of interest. **

I apologize for the pedagoguery of the foregoing, but I wanted to
illustrate the fact that the amount of processing and attendant latency
and latency-variation in generating and interpreting these messages can
vary considerably (from a few instructions to a few thousand
instructions) depending on the approach chosen. I'll suggest that the
latency variation incurred by a text-based encoding, especially on a
modest processor, is a real problem in the context of a protocol, like
SLRRP, with real-time control requirements.

For SLRRP, we chose the TLV-binary approach as the best compromise
between flexibility and efficiency. 
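
To make the three access patterns above concrete, here is a minimal
Python sketch. The message layouts, field names and type codes are
invented for illustration and are not taken from the SLRRP draft. Each
lookup returns the value of a hypothetical "power" parameter together
with a rough count of the message bytes it had to examine to find it.

```python
import struct

# All layouts below are hypothetical; none of this is the SLRRP wire format.
POWER_TYPE = 0x03  # assumed TLV type code for the "power" parameter

# Fixed layout: three 16-bit fields (antenna, session, power), power at offset 4.
FIXED_MSG = struct.pack("!HHH", 1, 7, 300)

# TLV layout: 1-byte type, 1-byte length, then the value bytes.
TLV_MSG = bytes([0x01, 2, 0x00, 0x01,
                 0x02, 4, 0x00, 0x00, 0x00, 0x07,
                 0x03, 2, 0x01, 0x2C])

# Text layout: "name=value;" pairs.
TEXT_MSG = b"antenna=1;session=7;power=300;"


def fixed_lookup(msg):
    """Direct access: read only the two bytes at the known offset."""
    (power,) = struct.unpack_from("!H", msg, 4)
    return power, 2


def tlv_lookup(msg, wanted_type):
    """Hop from header to header, reading only type and length bytes."""
    offset = touched = 0
    while offset + 2 <= len(msg):
        ptype, plen = msg[offset], msg[offset + 1]
        touched += 2
        if ptype == wanted_type:
            value = msg[offset + 2:offset + 2 + plen]
            return int.from_bytes(value, "big"), touched + plen
        offset += 2 + plen  # skip the value without inspecting it
    return None, touched


def text_lookup(msg, name):
    """Sequential scan: every byte up to the match is examined."""
    start = 0
    while start < len(msg):
        end = msg.find(b";", start)
        if end == -1:
            end = len(msg)
        key, sep, value = msg[start:end].partition(b"=")
        if sep and key == name:
            return int(value), min(end + 1, len(msg))
        start = end + 1
    return None, len(msg)


if __name__ == "__main__":
    print(fixed_lookup(FIXED_MSG))           # (300, 2)
    print(tlv_lookup(TLV_MSG, POWER_TYPE))   # (300, 8)
    print(text_lookup(TEXT_MSG, b"power"))   # (300, 30)
```

In this toy case the wanted parameter is last in the message, so the
text scan examines all 30 bytes, the TLV walk examines 8, and the fixed
layout examines 2. The cost of the text scan also depends on where the
parameter happens to sit, which is the latency-variation point above.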


2. Utility of human-readability in the context of SLRRP.

I'll grant that for application protocols like FTP, there's real utility
in having an encoding that's human-readable and -usable, because it's
reasonable to expect a human or equivalent may regularly exercise the
protocol, and may need to do so with readily-available tools. (Though,
even in those cases the protocol is front-ended by a CLI that has a help
facility, etc.)

I think that the likelihood of SLRRP being successfully used in this
manner is very close to zero, due to the sub-second real-time nature of
the task-at-hand (reading tags in a dynamic environment) at the
layer-at-hand.

One must also recognize that a large fraction of the parameters that are
controlled in SLRRP are not numbers; they are selectors or enumerated
types, which makes it even less likely that an engineer typing UTF-8
SLRRP commands over a telnet connection can get it right without some
help/cheat-sheets/tools, in which case the value of the text encoding is
diminished.

One could argue that human-readability might be useful for debugging
during product development, early trials, etc. Maybe, but you'll
probably have or want a SLRRP codec for your sniffer in the lab
anyway... 

Another advantage of a text-based encoding that's generally acknowledged
is the possibility of using text-editing tools to manipulate and compare
the protocol contents, usually in the context of configuration models
that permit bulk dumps and restores of configuration info. This
advantage also does not seem to apply in the present case.

Net of all of that, in the present case, I just don't see that the
benefit of human-readability outweighs the cost.


3. Other layers

Which brings me to an important point, which is that many of the
text-based encoding advantages do indeed apply at higher layers, for
example the ALE layer in the EPCglobal framework. Text-based encoding is
almost certainly the right choice at that level of abstraction, on up.


4. Management and configuration interface

At the risk of crossing the streams, it's important to recognize that
SLRRP is not a management and configuration protocol. It is a control
and data transport protocol for a real-time data acquisition device. The
requirements and the correct choice of encoding may/will vary when we
consider the two functions.

It may well be the case that we decide that the right choice for the
management or configuration interface at the SLRRP level is text-based,
and therefore we've already got the text- or XML-parser, so, modulo the
performance impact, we should just use it for the SLRRP path as well.
Maybe.

BTW, RFC3535 is a good read and appears to be quite comprehensive and
compact. Though, I'll point out that the fact that SNMP creates the
requirement for specific client applications was rated a 'neutral' by
the task force, and was not characterized as a "significant barrier" to
its use. There are, of course, a number of well-understood SNMP
'minuses,' but this isn't one of them.

I must say that, based on the cursory description in RFC3535, COPS-PR
seems like it has some really interesting properties for managing RFID
reader networks. Unfortunately, it seems like, at least as of May 2003,
it had an uncertain future.



Dave



----------------------------


*I agree that these are non-issues. Memory is really cheap. And the only
thing cheaper than memory is bandwidth in an enterprise LAN.

**As an aside, the example of encoding the number '3' being more
efficient in text vs. as a 32-bit integer is a bit specious. Obviously,
if you've allocated 32 bits to the field, you're also probably
interested in expressing numbers with more than four digits, in which
case the binary encoding is more compact... even before accounting for
the fact that the text-encoded '3' really looks something like
startdelimiter+parameterID+'3'+enddelimiter.
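
The arithmetic behind that aside can be checked with a few lines of
Python. This is only a sketch: the "name=value;" framing and the
parameter name "power" are assumptions standing in for
startdelimiter+parameterID+value+enddelimiter, not any real encoding.

```python
import struct


def binary_size(value):
    """A 32-bit unsigned field is always 4 bytes, whatever the value."""
    return len(struct.pack("!I", value))


def bare_text_size(value):
    """Decimal digits only, with no framing at all."""
    return len(str(value))


def framed_text_size(name, value):
    """Digits plus a hypothetical 'name=value;' framing."""
    return len(f"{name}={value};".encode("ascii"))


if __name__ == "__main__":
    for v in (3, 12345, 4294967295):
        print(v, binary_size(v), bare_text_size(v),
              framed_text_size("power", v))
```

With these assumptions, the bare text '3' is 1 byte against 4 for the
binary field, but 12345 is already 5 bytes of digits, and the framed
"power=3;" is 8 bytes. A TLV binary field carries its own type and
length bytes too, so the fair comparison is framed against framed.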


------------------------
David J. Husak
Reva Systems Corporation
dhusak@revasystems.com


> -----Original Message-----
> From: rfid-bounces@lists.ietf.org [mailto:rfid-bounces@lists.ietf.org]
> On Behalf Of Margaret Wasserman
> Sent: Wednesday, July 20, 2005 10:19 PM
> To: rfid@ietf.org
> Subject: [Rfid] XML vs. Text vs. Binary
> 
> 
> We've had some discussion on this list about XML encoding vs. binary
> encoding...
> 
> I'm not sure that we are in agreement on all of the related points,
> but the concerns with XML seem to be code size on the reader and
> protocol overhead on the wire.  These concerns can be minimized by
> the use of a restricted XML subset, such as canonical XML (see RFC
> 3076).  However, there still seem to be some concerns along those
> lines.
> 
> The only advantages that I remember being discussed for a binary
> encoding were the complement to the concerns with XML:  smaller code
> size on the reader and less protocol overhead.
> 
> I think, though, that our discussions have missed a major benefit of
> any text-based encoding (whether a protocol-specific text encoding or
> canonical XML):  the ability to access the device using text
> processing tools (such as Perl scripts) and/or for a human to
> interact with the device directly.
> 
> In 2002, the IAB held a Network Management workshop that is
> documented in RFC 3535.  I would suggest that folks on this list read
> this report which can be found at:
> 
> http://www.ietf.org/rfc/rfc3535.txt?number=3535
> 
> One of the interesting findings of that workshop was that one of the
> significant barriers to the use of SNMP as a configuration or control
> protocol is that it uses a binary encoding, which means that
> SNMP-specific client software (an SNMP browser or manager) is needed
> to interact with the device via SNMP.  This prevents SNMP access
> using text processing tools or via human interaction.
> 
> I would not like to see the industry create the same problem with an
> RFID control protocol.
> 
> If there is real evidence that canonical XML would require too much
> code size on the reader and/or would result in an unacceptable level
> of protocol overhead, perhaps we could consider a protocol-specific
> text-based encoding (similar to FTP, SMTP and HTTP)?  I don't believe
> that parsing a well-defined text-based encoding would require much
> more code than processing a binary encoding.  And, in some cases a
> text based encoding would actually result in less data on the wire --
> for instance a 32 bit integer with the value of 3 would be encoded in
> one byte of text ("3"), while it would require 4 bytes in binary
> encoding ("0x00000003").
> 
> Personally I like canonical XML, because I think it strikes a good
> balance between being human readable and machine-parsable.  I also
> like the fact that the syntax is already well-defined.  However, I
> think that well-defined, protocol-specific textual encodings could
> also achieve that balance, perhaps with less impact in the areas of
> concern (code size and protocol overhead).
> 
> What do others think?  Should we at least consider a text-based
> encoding?
> 
> Margaret
> 
> 
> 
> _______________________________________________
> Rfid mailing list
> Rfid@lists.ietf.org
> https://www1.ietf.org/mailman/listinfo/rfid
