RE: [rohc] TCP/IP EPIC profile

"Hongbin Liao (Intl Staffing)" <i-hbliao@microsoft.com> Mon, 11 March 2002 08:48 UTC

Subject: RE: [rohc] TCP/IP EPIC profile
Date: Mon, 11 Mar 2002 16:30:37 +0800
Message-ID: <F4C77846CEE593418BE5AB7B6A83111E0465F4BC@bjs-msg-01.fareast.corp.microsoft.com>
Thread-Topic: [rohc] TCP/IP EPIC profile
Thread-Index: AcHGk4k1z1hmB8i+RAibcL2kBHEMagCQQdWg
From: "Hongbin Liao (Intl Staffing)" <i-hbliao@microsoft.com>
To: "West, Mark (ITN)" <mark.a.west@roke.co.uk>
Cc: "Julije Ozegovic" <julije@fesb.hr>, "rohc" <rohc@ietf.org>

Hi, Mark

    Thanks for your comments. However, they raise several more issues.

    1. Can the TCP/IP EPIC profile correctly describe the behavior of TCP? In other words, is EPIC powerful enough to describe the complicated behavior of TCP/IP?

    In EPIC, fields are assumed to be independent of each other: once each field's behavior is described well, the behavior of the whole protocol is taken to be described well too. In practice, however, the fields of a protocol may not behave completely independently. There may be connections (or causality) among several fields. For example, most TCP traffic is one-way (WWW browsing, FTP downloading, etc.), i.e., only SEQ changes on the forward path (from server to client) and only ACK changes on the backward path (from client to server). The ACK on the forward path and the SEQ on the backward path remain constant. However, in the TCP/IP EPIC profile, the probabilities for SEQ and ACK are specified separately:

seqno-co = LSB(8,63,5%) | LSB(14,4096,80%) | LSB(20,16384,10%) | IRREGULAR(32,5%)

ackno-co = LSB(8,0,5%) | LSB(14,0,80%) | LSB(20,0,10%) | IRREGULAR(32,5%)

That means the most frequent packet format according to the EPIC profile is SEQ(LSB-14)/ACK(LSB-14), with probability 64% (80% x 80%). In one-way traffic, however, the most frequent formats are actually SEQ(LSB-14)/ACK(LSB-8) and SEQ(LSB-8)/ACK(LSB-14). As a result, the compressor spends the shortest Huffman prefix on a not-so-frequent case and longer prefixes on the most frequent cases, and overall performance degrades.
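
To make the mismatch concrete, here is a rough sketch (Python, not part of the EPIC draft). It builds the joint format probabilities implied by treating SEQ and ACK as independent, derives Huffman code lengths from them, and then measures the average prefix length against a one-way traffic mix. The per-field percentages are taken from the seqno-co / ackno-co lines above; the `observed` distribution is purely hypothetical and only chosen to reflect "one field changes, the other stays constant". The helper names are likewise made up for the illustration.

```python
import heapq
from itertools import product

# Per-field probabilities copied from the seqno-co / ackno-co lines above.
SEQ = {"LSB-8": 0.05, "LSB-14": 0.80, "LSB-20": 0.10, "IRREGULAR": 0.05}
ACK = {"LSB-8": 0.05, "LSB-14": 0.80, "LSB-20": 0.10, "IRREGULAR": 0.05}


def huffman_lengths(probs):
    """Return {symbol: Huffman code length in bits} for a probability table."""
    # Heap entries: (probability, unique tie-breaker, symbols in the subtree).
    heap = [(p, i, [s]) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    lengths = {s: 0 for s in probs}
    counter = len(heap)
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)
        p2, _, syms2 = heapq.heappop(heap)
        # Every symbol under the merged node moves one level deeper.
        for s in syms1 + syms2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, counter, syms1 + syms2))
        counter += 1
    return lengths


def expected_bits(lengths, probs):
    """Average prefix length when formats occur with probabilities `probs`."""
    return sum(p * lengths[fmt] for fmt, p in probs.items())


# Joint probabilities implied by the profile if the fields are independent.
independent = {(s, a): SEQ[s] * ACK[a] for s, a in product(SEQ, ACK)}

# HYPOTHETICAL one-way traffic mix (illustrative numbers, not measurements):
# most headers carry one changing field and one constant field.
observed = {fmt: 0.01 for fmt in independent}
observed[("LSB-14", "LSB-8")] = 0.45   # forward path: SEQ moves, ACK constant
observed[("LSB-8", "LSB-14")] = 0.40   # backward path: ACK moves, SEQ constant
observed[("LSB-14", "LSB-14")] = 0.02  # two-way data is rare in this mix

profile_code = huffman_lengths(independent)  # what the profile would build
matched_code = huffman_lengths(observed)     # code fitted to the traffic

print("prefix bits for SEQ(LSB-14)/ACK(LSB-14):",
      profile_code[("LSB-14", "LSB-14")])
print("prefix bits for SEQ(LSB-14)/ACK(LSB-8): ",
      profile_code[("LSB-14", "LSB-8")])
print("avg prefix bits, profile code:", expected_bits(profile_code, observed))
print("avg prefix bits, matched code:", expected_bits(matched_code, observed))
```

Under these assumed numbers the profile-derived code gives its 1-bit prefix to a format that is rare in the traffic, which is the effect described above; how large the penalty is in practice depends entirely on the real traffic mix.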

    SEQ and ACK are not the only fields in TCP/IP with this kind of issue; TCP options, WINDOW, etc. have it as well. The EPIC TCP/IP profile gives wrong, or at least inappropriate, probabilities for the combinations of encodings of all fields. To solve this, we would have to give the probability of each combination of encodings of all fields instead of the probabilities of each field individually. However, it seems that EPIC has no method for doing that. Also, it is impractical to write a profile listing every combination of encodings of all fields.

    2. MAX_FORMATS

    To reduce the memory requirements of the compressor and decompressor, EPIC uses MAX_FORMATS to restrict the number of packet formats generated from a profile. However, if a list of encoding methods does not fall within the supported formats, how does EPIC encode it? How does it convey exactly which encoding method is used for each field in such a list? Or is the worst-case encoding method used for such lists? Whichever is used, there will be extra overhead for the combinations that do not fall within MAX_FORMATS. The issue is how efficiently EPIC handles this situation, which may depend on how MAX_FORMATS is set. Is there a value of MAX_FORMATS that both reduces the memory requirements and keeps the extra overhead low? How would it be determined? Do you have a rough number for it?
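
One way to get a feel for the trade-off is to measure how much probability mass the MAX_FORMATS most likely formats cover; whatever is left over is the share of headers that would need some fallback encoding. The sketch below (Python) does only that. It does not model how EPIC actually encodes a header outside the kept set, which is the open question above. The function name `top_formats`, the ten two-choice fields and the 90%/10% split are all assumptions made for the illustration.

```python
import heapq
from itertools import product
from math import prod


def top_formats(field_probs, max_formats):
    """Return (kept, covered) for the `max_formats` most likely formats.

    field_probs: one dict per field, mapping encoding name -> probability.
    Fields are treated as independent, as in the per-field profile notation.
    kept:    list of (probability, (encoding per field, ...)) tuples.
    covered: total probability of the kept formats.
    """
    names = [list(field) for field in field_probs]
    scored = (
        (prod(field[enc] for field, enc in zip(field_probs, combo)), combo)
        for combo in product(*names)
    )
    kept = heapq.nlargest(max_formats, scored)
    covered = sum(p for p, _ in kept)
    return kept, covered


# HYPOTHETICAL profile: 10 fields, each with a likely and an unlikely encoding.
fields = [{"short": 0.9, "long": 0.1}] * 10  # 2^10 = 1024 possible formats

for limit in (1, 11, 56, 1024):
    _, covered = top_formats(fields, limit)
    print(f"MAX_FORMATS={limit:5d}  covered={covered:.4f}  "
          f"needs fallback={1 - covered:.4f}")
```

Note that full enumeration is only feasible for small examples like this one; with 30 two-choice fields the 2^30 combinations would have to be explored more selectively.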


Hongbin L.
11/03/2002




> -----Original Message-----
> From: West, Mark (ITN) [mailto:mark.a.west@roke.co.uk]
> Sent: Friday, March 08, 2002 7:21 PM
> To: Hongbin Liao (Intl Staffing)
> Cc: Qian Zhang; Julije Ozegovic; rohc
> Subject: Re: [rohc] TCP/IP EPIC profile
>
>
>
> >
> > It's not so clear to me. Suppose there is the same situation as
> > previously described: there are 30 fields and each field has 2
> > choices. If all prefixes must be stored at the
> > compressor/decompressor, that will need at least 2^30 bytes (1GB!).
> > Clearly, it's impractical. However, I am not clear how hash-tables
> > can alleviate such a situation. Would you give me some examples and
> > the time & space requirements for such a situation?
>
>
> ok.  Let's take this example of 30 fields with 2 choices.  I can
> store a '0' or a '1' bit for each choice as I move from field to
> field.  This clearly gives me a space with 2^30 possible outcomes.
>
> Since it is obviously impractical to make use of all possible
> formats, we have to restrict the number of formats that are actually
> used.
>
> The point that I was trying to make, though, was that we can see a
> theoretically constant-time mapping: if I assume a table that
> contains all 2^30 possible outcomes, each can have a flag indicating
> whether the combination is present in the reduced list of formats.
> If the combination is one of the MAX_FORMATS possibilities, then the
> table also contains the Huffman prefix.
>
> Of course, in this case, even if the number of formats used is
> small, we still have this huge (2^30 entry) table.
>
> If I set MAX_FORMATS to 400, say, then I am only keeping the 400 most
> likely headers.  This is less than 1 millionth of the possible
> combinations, so it's fair to say that the table to do the mapping is
> quite sparse!
>
> What I am suggesting is that we create a table with a minimum of 400
> entries (but more realistically something like 500 - 600) and hash
> the 30-bit format indicator down to a 9 or 10-bit hash table index.
>
> Provided that the hash function is
> - efficient (in time)
> - relatively collision resistant
> then you should retain very close to constant-time mapping, without a
> large cost in memory.
>
> Does that clarify things?
>
> Cheers,
>
>
> Mark.
>
>
> --
> Mark A. West, Consultant Engineer
> Roke Manor Research Ltd., Romsey, Hants.  SO51 0ZN
> Phone +44 (0)1794 833311   Fax  +44 (0)1794 833433
>
> (Yes, I do know that my disclaimer is in an attachment.  And, no, I
> didn't ask for it to be that way)
>
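
The sparse mapping described in the quoted message (a 30-bit format indicator hashed down to a 10-bit table index) could look roughly like the Python sketch below. This is only an illustration under stated assumptions: the class name `FormatTable`, the multiplicative hash constant and the linear-probing collision strategy are choices made here, not anything specified by EPIC or by the quoted text, and the stored prefixes are placeholder bit strings rather than real Huffman prefixes.

```python
import random


class FormatTable:
    """Open-addressing hash table: 30-bit format indicator -> Huffman prefix."""

    def __init__(self, index_bits=10):
        self.bits = index_bits
        self.size = 1 << index_bits          # 1024 slots for 10 bits
        self.mask = self.size - 1
        self.slots = [None] * self.size      # each slot: (indicator, prefix)

    def _index(self, indicator):
        # Multiplicative hash: keep the top `bits` bits of a 32-bit product.
        # The constant is an assumption; any well-mixing hash would do.
        return ((indicator * 2654435761) & 0xFFFFFFFF) >> (32 - self.bits)

    def insert(self, indicator, prefix):
        i = self._index(indicator)
        for _ in range(self.size):
            slot = self.slots[i]
            if slot is None or slot[0] == indicator:
                self.slots[i] = (indicator, prefix)
                return
            i = (i + 1) & self.mask          # linear probing on collision
        raise RuntimeError("format table is full")

    def lookup(self, indicator):
        """Return the stored prefix, or None if the format is not supported."""
        i = self._index(indicator)
        for _ in range(self.size):
            slot = self.slots[i]
            if slot is None:
                return None
            if slot[0] == indicator:
                return slot[1]
            i = (i + 1) & self.mask
        return None


# Usage: 400 supported formats out of 2^30 possible indicators.
random.seed(0)
supported = random.sample(range(1 << 30), 400)
table = FormatTable(index_bits=10)
for rank, indicator in enumerate(supported):
    table.insert(indicator, format(rank, "b"))  # placeholder prefix

print("slots used:", sum(s is not None for s in table.slots), "of", table.size)
print("lookup of a supported format:", table.lookup(supported[0]))
```

With 400 entries in 1024 slots the table is well under half full, so probe sequences stay short and lookups remain close to constant time, while memory use is a few kilobytes instead of a 2^30-entry array.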