[rohc] Sorted?!

"West, Mark (ITN)" <mark.a.west@roke.co.uk> Wed, 27 February 2002 16:06 UTC

Received: from optimus.ietf.org (ietf.org [132.151.1.19] (may be forged)) by ietf.org (8.9.1a/8.9.1a) with ESMTP id LAA29392 for <rohc-archive@odin.ietf.org>; Wed, 27 Feb 2002 11:06:57 -0500 (EST)
Received: from optimus.ietf.org (localhost [127.0.0.1]) by optimus.ietf.org (8.9.1a/8.9.1) with ESMTP id LAA28398; Wed, 27 Feb 2002 11:03:53 -0500 (EST)
Received: from ietf.org (odin [132.151.1.176]) by optimus.ietf.org (8.9.1a/8.9.1) with ESMTP id LAA28369 for <rohc@optimus.ietf.org>; Wed, 27 Feb 2002 11:03:51 -0500 (EST)
Received: from rsys000a.roke.co.uk (rsys000a.roke.co.uk [193.118.201.102]) by ietf.org (8.9.1a/8.9.1a) with SMTP id LAA29239 for <rohc@ietf.org>; Wed, 27 Feb 2002 11:03:46 -0500 (EST)
Received: by rsys001a.roke.co.uk with Internet Mail Service (5.5.2653.19) id <1XV80982>; Wed, 27 Feb 2002 16:03:14 -0000
Received: from roke.co.uk (itn-pool4.roke.co.uk [193.118.194.54]) by rsys002a.roke.co.uk with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id 1S9WN4NX; Wed, 27 Feb 2002 16:03:06 -0000
From: "West, Mark (ITN)" <mark.a.west@roke.co.uk>
To: rohc <rohc@ietf.org>
Message-ID: <3C7D033A.7090203@roke.co.uk>
Date: Wed, 27 Feb 2002 16:03:06 +0000
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.4) Gecko/20011019 Netscape6/6.2
X-Accept-Language: en-us
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------InterScan_NT_MIME_Boundary"
Subject: [rohc] Sorted?!
Sender: rohc-admin@ietf.org
Errors-To: rohc-admin@ietf.org
X-Mailman-Version: 1.0
Precedence: bulk
List-Id: Robust Header Compression <rohc.ietf.org>
X-BeenThere: rohc@ietf.org

Hi all,

First of all, this is *not* about default algorithms.  Honest ;-)

However, from the recent discussion about 'deflate' there were questions 
regarding the use of dynamic Huffman trees rather than static ones.

If you look at how RFC 1951 defines the storage of the dynamic Huffman 
tree information (a compressed list of the code length for each value), 
this is at odds with the definition of the HUFFMAN instruction (which 
requires the code-lengths to be in increasing order).

To solve this, it is necessary to construct a mapping from one 
representation to the other.  This can be easily done by sorting the 
list.  Clearly a sort could be written in UDVM byte-code, although I 
suspect this to be a rather painful process, with a non-trivial overhead.

The alternative (and it has been suggested that this might make sense 
from the perspective of other compression algorithms) is to introduce a 
'SORT' instruction.

At its simplest, this would sort an 'r' row by 'c' column table of 
2-byte entries (where the start address, 'r' and 'c' are the parameters).

Of course, the biggest concern with such an instruction (I would think) 
is with the DoS security implication.
Following on from Carsten's security e-mail, if we set the 'cost' of 
SORT to be 'O(n.log(n))' we are constraining the implementation.
If we set it to 'O(n^2)' you would need to make many cycles available to 
be able to perform any kind of sort (which generally opens up a DoS 
vulnerability).
(Of course, these issues are essentially the same regardless of whether 
the sort is implemented as a single instruction or in byte-code)

So, if we feel that sorting is a useful operation (and I contend that it 
just might be, if you want to support to the full range of 'deflate' 
compressed formats), should we have a SORT instruction?

Cheers,

Mark.

-- 
Mark A. West, Consultant Engineer
Roke Manor Research Ltd., Romsey, Hants.  SO51 0ZN
Phone +44 (0)1794 833311   Fax  +44 (0)1794 833433

(Yes, I do know that my disclaimer is in an attachment.  And, no, I 
didn't ask for it to be that way)