Re: [rohc] TCP/IP EPIC profile

"West, Mark (ITN)" <mark.a.west@roke.co.uk> Wed, 06 March 2002 05:26 UTC

From: "West, Mark (ITN)" <mark.a.west@roke.co.uk>
To: Qian Zhang <qianz@microsoft.com>
Cc: Julije Ozegovic <julije@fesb.hr>, rohc <rohc@ietf.org>
Message-ID: <3C855C1F.4030502@roke.co.uk>
Date: Wed, 06 Mar 2002 00:00:31 +0000
Subject: Re: [rohc] TCP/IP EPIC profile
References: <D8B1DF394D228543B41027F0D07F5B1C01266002@bjs-msg-01.fareast.corp.microsoft.com>

Hi Qian,

Thanks for these comments, which raise some important and interesting
issues.  I've added some comments of my own!

Cheers,

Mark.


 >
 > [Julije] Detailed profiles tend to result in complex
 > applications/protocols.
 >
 > [Mark] I'd have put that the other way around (complex protocols =>
 > detailed profiles), but I think I agree...
 >
 > [Julije] The purpose of this posting is to point out the complexity of
 > whole thing, regarding the fact that almost every line of profile ends
 > up with (at least) one data structure in end user application.
 >
 > [Mark] Yes, it is quite complex.  So is RFC 3095 ;-) I would agree that
 > it is more 'dense' than RFC 3095, for example, as most of the
 > information is packed into the profile.
 >
 > Certainly every field (or meta-field) requires a data-structure.  Again,
 > I'm not sure how this is different from any other header compression
 > scheme?
 >
 > [Qian] I agree that the complex protocol such as TCP/IP will cause a
 > detailed profile. What I would like to point out is that the compressed
 > format generation should be a tradeoff between the efficiency and the
 > complexity.
 >
 > Mark, would you please comments on the complexity about generating a
 > compressed format based on the input profile. Suppose there are 30
 > fields need to be considered, each field may have two choices, how about
 > the complexity for determining the compressed format?

I assume that you are referring to the complexity in the compressor,
rather than in the off-line processing to build the Huffman prefixes
(which is obviously not so time-critical)?

Overall, I would suggest that the complexity of compression (and this is
a general statement about header compression) is broadly linear with
respect to the number of fields: double the number of fields - double
the complexity.

Taking your more specific example, let's assume that we have 30 fields
with 2 choices.  (It's worth pointing out that this is a slightly 
unusual case in that a number of fields are likely to only have a 
single, fixed encoding and others will have more than 2.  However, as an 
example to discuss the issue it works just fine, of course)

One of the important pieces of information that needs to be factored in,
though, is the relative likelihood of the various choices.  I generally
assume (though this may not always be true) that encodings are tested in
decreasing order of probability.  After all, if one encoding is likely
to be used 99% of the time, it's probably best to try that one first!

Anyway, back to the example.

In the best case, you will test 30 encodings (one per field) to end up
with the compressed packet.

In the worst case, you will test 60 encodings (2 per field).  It is also
likely in this case that you will have to send an IR-DYN packet.  With
typical probability values, the least likely format is really very, very
unlikely (think 0.01^30...)

The average case depends upon the probability bias, but in general is
going to be quite low.  (In this case, perhaps an average of 1.1
encodings per field, or around 33 overall?)
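
To make the arithmetic above concrete, here is a rough sketch of the
counting.  The 0.9/0.1 split is purely an assumption chosen to reproduce
the "1.1 per field" guess; the function and variable names are
illustrative and not taken from the draft.

```python
def expected_tests(probabilities):
    """Expected number of encodings tried for one field.

    `probabilities` lists the chance of each encoding being the one
    that matches, in the order the compressor tries them.
    """
    return sum(position * p
               for position, p in enumerate(probabilities, start=1))


# Assumed example: 30 fields, 2 encodings each, first choice right 90%
# of the time.
fields = [[0.9, 0.1] for _ in range(30)]

best_case = len(fields)                          # first encoding always matches
worst_case = sum(len(f) for f in fields)         # every encoding gets tried
average_case = sum(expected_tests(f) for f in fields)
worst_case_probability = 0.1 ** len(fields)      # all 30 fields hit the unlikely choice

print(best_case, worst_case, round(average_case, 2), worst_case_probability)
```

With a stronger bias (say 0.99/0.01) the average moves even closer to
the best case.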

Since it is clear that the number of fields is important, we mustn't 
lose sight of the fact that in addition to the defined fields in a 
protocol header, there are a bunch of additional meta-fields used for 
compression.  (For example, RTP TS scale factor, IP ID NBO flag, etc...)

Something else to bear in mind is that some of the encodings are more
complex than others.  VALUE, for example, is very simple since there is
no context history to worry about; only a simple test between two
values.  STATIC is simple, but it still requires checking the context
history.
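
As a rough illustration of the two points above (trying encodings in
decreasing order of probability, and VALUE being cheaper than STATIC),
something like the following would do.  This is only a sketch: the
function names, the tuple layout and the treatment of the context
history as a plain list are my assumptions, not the draft's definitions.

```python
def value_matches(field, constant):
    # VALUE: a single comparison against a fixed value, no context needed.
    return field == constant


def static_matches(field, history):
    # STATIC: the field must equal every value held in the context history
    # (assumed here to be a list of recently sent values).
    return bool(history) and all(field == old for old in history)


def first_matching(field, history, encodings):
    """Try encodings in decreasing order of probability.

    `encodings` is a list of (name, probability, test) tuples, where
    test(field, history) returns True if the encoding can be used.
    Returns (name, tests_performed); name is None if nothing matched,
    in which case the compressor would have to fall back (e.g. IR-DYN).
    """
    ordered = sorted(encodings, key=lambda e: e[1], reverse=True)
    for tests_done, (name, _prob, test) in enumerate(ordered, start=1):
        if test(field, history):
            return name, tests_done
    return None, len(ordered)


# Hypothetical field with two candidate encodings.
candidates = [
    ("VALUE", 0.01, lambda f, h: value_matches(f, 255)),
    ("STATIC", 0.99, lambda f, h: static_matches(f, h)),
]
print(first_matching(64, [64, 64], candidates))
print(first_matching(255, [64, 64], candidates))
```

The number of tests performed is exactly the "search depth" for that
field.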

Anyway, this all leads me to suggest that the complexity is based on:
- the number of actual fields, plus the number of meta-fields
- some factor to indicate the average search depth per field
- a factor based on the depth of the context

(I should point out that you can describe ROHC RTP in this way and
perform a similar analysis)

For anyone who hasn't had a look, we have submitted an I-D that contains
an EPIC profile corresponding to (a subset of) ROHC RTP:

http://search.ietf.org/internet-drafts/draft-surtees-rtp-epic-00.txt

You may like to see how the description in the profile maps onto the
description of the field processing in RFC 3095.  (3095 is a little more
verbose; the draft is somewhat terse, but anyway...)

 >
 > I would like to raise a discussion on the rough number of compressed
 > formats that plan to support for TCP/IP compression. Considering the
 > complex behaviors for TCP/IP protocol itself, personally, I think we may
 > need more formats than RFC 3095. However, considering that the
 > compression ratio is not the only target for TCP/IP header compression,
 > we also may not need to generate too much kinds of format.
 >

 > Can we come out a reasonable number of the max_formats for the TCP/IP
 > compressed header and provide a simpler profile to describe the behavior
 > of each field?
 >

This is an interesting question.

From our perspective, if you build the compressed packet up by
processing on a field-by-field basis (as described in the draft), then
the number of formats is not a very important issue.  (Rather, it is
possible to support a very large number of formats).

RFC 3095 has very few basic packet formats.  However, once you start 
factoring in the extensions and the variability in extension 3 (plus the 
additional flags), there are actually quite a lot of formats.

We should bear in mind that the compressed packets are generated by an
algorithmic process -- each packet does not have a separate encoding 
function.
The mapping between selected encodings and packet format (and vice
versa) should be simple and equally quick for all formats.
The quantity of data required to describe a profile (and the formats) to
a compressor/decompressor is minimal.
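
To illustrate why a large number of formats need not cost anything
extra, here is one way such a mapping could work.  Note that this is NOT
how EPIC assigns its format indicators (those come from the Huffman
prefixes); it is just a mixed-radix sketch under my own assumptions,
showing that the work depends on the number of fields and not on the
number of formats.

```python
def choices_to_index(choices, radices):
    """Map the per-field encoding choices to a single format number.

    choices[i] is the index of the encoding picked for field i and
    radices[i] is how many encodings that field has.
    """
    index = 0
    for choice, radix in zip(choices, radices):
        if not 0 <= choice < radix:
            raise ValueError("choice out of range for field")
        index = index * radix + choice
    return index


def index_to_choices(index, radices):
    """Inverse mapping: recover the per-field choices from a format number."""
    choices = []
    for radix in reversed(radices):
        index, choice = divmod(index, radix)
        choices.append(choice)
    return list(reversed(choices))


# Hypothetical profile: four fields with 2, 3, 1 and 2 encodings,
# i.e. 2 * 3 * 1 * 2 = 12 possible formats.
radices = [2, 3, 1, 2]
picked = [1, 2, 0, 1]
format_number = choices_to_index(picked, radices)
print(format_number, index_to_choices(format_number, radices))
```

Both directions take one step per field, whichever format is selected.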

(I'm at home at the moment, and don't have any numbers to hand, but I'll 
send some figures when I'm next in the office)

What is interesting to note is that the predicted benefit of adding more
formats decreases as the number of formats gets larger.

It is, however, clearly an issue that we will have to decide with regard
to the TCP profile.  I think that this requires us to start getting an
idea of the probabilities that should be assigned to encodings.  This
lets us have some indication of the effect of adding (or removing) formats.


-- 
Mark A. West, Consultant Engineer
Roke Manor Research Ltd., Romsey, Hants.  SO51 0ZN
Phone +44 (0)1794 833311   Fax  +44 (0)1794 833433

(Yes, I do know that my disclaimer is in an attachment.  And, no, I
didn't ask for it to be that way)