[rohc] Re: NBO- TCP/IP EPIC profile

"West, Mark (ITN)" <mark.a.west@roke.co.uk> Mon, 04 March 2002 15:17 UTC

From: "West, Mark (ITN)" <mark.a.west@roke.co.uk>
To: Julije Ozegovic <julije@fesb.hr>
Cc: rohc <rohc@ietf.org>
Date: Mon, 04 Mar 2002 15:12:55 +0000

Hi Julije,

I don't know what it's going to take to convince you of my point of 
view, but I should warn you that I don't give up easily...

;-)

Mark.


Julije Ozegovic wrote:

> Hi Mark,
> 
> nice to hear from you ;)
> 
> As you can realize, the difference in our approaches arises because you
> are working top-down, while I am concerned with bottom-up design. This
> is why we give the same problem different flavors.


I'm not sure that I do realise why we have this difference.  Anyway, 
without worrying too much about the why, let's try and resolve it.

> 
> I can agree with all your writing, but also I have a strong feeling that 
> the point is missed. What is of concern follows:
> 


I'm inclined to agree that a point is being missed.  Which point and by 
whom is less clear!  Now - if you can just agree with my conclusions...

;-)


> 1. ALL fields are sent to the network in NBO

> 2. Only IP_ID can have ARBITRARY value


<pedantry>
Many fields can have arbitrary values.  The IP ID is interesting 
because, although it can have an arbitrary value, it is commonly 
implemented in a more structured way.
</pedantry>


> 3. Only "some" hosts (not to mention which ones)
>    reverse byte order for IP_ID
> 
> ==> therefore, there is no possibility
>     for NBO confusion except for IP_ID


If you are talking about a universe consisting only of TCP/IP header 
compression, maybe.

For RTP the encoding is different.  Who knows for other protocols?

(As a hypothetical example, if the Timestamp mechanism from RFC 1323 
were only used for RTTM and not for PAWS, then the same could apply to 
the TS values.  Clearly this is not the case, but there may be similar 
issues with other protocols.  We just can't say...)

> 
> 4. IP_ID is sometimes generated randomly,
>    and sometimes just incremented


Yes.


> 5. Subsequent IP_ID numbers are assigned to packets
>    which can belong to different data flows
> 
> ==> We can expect IP_ID to be periodic to some extent,
>     depending on the number of data flows
>     and their respective rates at the sender
> 


Both these points are covered in RFC 3095 and the TCP behavior model 
draft <draft-west-tcpip-field-behavior-00.txt> [-01 has been submitted], 
where 3 classes of IP ID behaviour are described.

However, what this boils down to is that we either see the IP ID as 
random or more-or-less sequential.


> 6. LSB coding is therefore optimal for IP_ID,
>    RTP profile with INFERRED-OFFSET is possibly
>    a misuse, however it is official.


A *misuse*!?  Have I missed something here?  Did I just hallucinate the 
entire history of RFC 3095?  (I hope not as this would make me question 
my already limited grasp on sanity)

Treating the IP ID as an offset from the RTP SN is the only way of 
achieving the maximum compression ratio for ROHC RTP.  In the case where 
the IP ID moves in step with the RTP SN, it is clearly the most 
efficient way of dealing with the IP ID as it can be elided altogether.
When the IP ID jumps, it is no less efficient to handle it this way.
(Ultimately, of course, LSB encoding is used here, but on the offset 
rather than the absolute value).
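
As a rough illustration of why the offset is the cheaper thing to encode,
here is a minimal Python sketch.  It is not the RFC 3095 definition: the
function names, the forward-only interpretation window and the sample
packet values are all assumptions made for the example.

```python
MOD = 1 << 16  # IP ID and RTP SN are both 16-bit fields


def ip_id_offset(ip_id: int, sn: int) -> int:
    """Offset of the IP ID from the RTP sequence number, modulo 2^16."""
    return (ip_id - sn) % MOD


def bits_needed(value: int, ref: int) -> int:
    """Smallest k such that value lies in the window [ref, ref + 2^k - 1].

    Assumption: a forward-only window, which is simpler than the
    interpretation interval that RFC 3095 actually defines.
    """
    return ((value - ref) % MOD).bit_length()


# Hypothetical (SN, IP ID) pairs: three packets in step, then a jump.
packets = [(100, 5000), (101, 5001), (102, 5002), (103, 5010)]

# LSB bits needed per packet when encoding the offset from the SN.
offsets = [ip_id_offset(ip_id, sn) for sn, ip_id in packets]
offset_bits = [bits_needed(cur, prev) for prev, cur in zip(offsets, offsets[1:])]

# LSB bits needed per packet when encoding the absolute IP ID.
ip_ids = [ip_id for _, ip_id in packets]
absolute_bits = [bits_needed(cur, prev) for prev, cur in zip(ip_ids, ip_ids[1:])]

print(offset_bits, absolute_bits)
```

For these sample values the offset needs 0 bits while the IP ID moves in
step with the SN (against 1 bit for the absolute value), and the jump
costs 3 bits on the offset against 4 bits on the absolute IP ID.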


>    What I have proposed can be modified to
> 
>    IP_ID = LSB(70%,x,y) | SWAP-LSB(20%,x,y) | IRREGULAR(10%)
> 


Hang on - haven't we been here before?

Let me try and state, as clearly and unambiguously as possible what the 
above line actually says:

"70% of the time, standard LSB encoding will be used.  20% of the time, 
the same LSB encoding will be used, but with byte-swapped input.  10% of 
the time, the full 16-bit IP ID value will be sent."

Or:

"70% of my packets will have IP ID in NBO and 20% will have byte-swapped 
IP IDs."

I really need to talk to your supplier of tea-leaves, crystal-balls or 
however else you can determine the percentages of byte-swapped values 
that your compressor/decompressor will have to process ;-)

More importantly, you are introducing a 'hidden' NBO flag into the 
Huffman prefixes.  You are creating two encodings which are equivalent 
except for the byte order and which are separate nodes in the tree. 
Whether the value is byte-swapped or not will be encoded in *every* 
packet.  How can this possibly be efficient?
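
A back-of-the-envelope way to see the cost, as a hedged Python sketch.
It assumes that the average indicator cost is bounded below by the
Shannon entropy of the choice, and that with byte order held as context
state the 70% and 20% cases collapse into a single 90% LSB case (the
occasional cost of signalling a change of byte order is ignored).  The
percentages are the ones from the quoted line; the function name is
made up for the example.

```python
from math import log2


def entropy_bits(probabilities):
    """Shannon entropy in bits: a lower bound on the average indicator cost."""
    return -sum(p * log2(p) for p in probabilities if p > 0)


# Three-way choice: LSB / SWAP-LSB / IRREGULAR, re-signalled in every packet.
swap_lsb_cost = entropy_bits([0.7, 0.2, 0.1])

# Byte order held as context state: only LSB vs IRREGULAR is signalled.
nbo_cost = entropy_bits([0.9, 0.1])

print(f"SWAP-LSB indicator: {swap_lsb_cost:.2f} bits/packet")
print(f"NBO as state:       {nbo_cost:.2f} bits/packet")
```

Under those assumptions the three-way choice costs roughly 1.16 bits per
packet against roughly 0.47 bits, so the hidden flag accounts for about
0.69 bits in every packet.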


> 7. NBO() is ambiguous, the question is:
>    how can the application recognize a swapped field?


Hold on a second!  NBO is a *component* of SWAP-LSB.  It has to be!  So 
NBO is *exactly* as ambiguous (or unambiguous) as SWAP-LSB.


> 8. of course, it is possible
>    - to compare new value with the old one from the context
>    - decide whether LSB coding is appropriate
>    - if not, swap bytes
>    - test LSB on swapped value
>    - if not, it should be random
>    - set NBO flag
>    - communicate NBO flag through IR/DYN
>    - send subsequent packets with swapped LSB


Well, basically, yes.

I could pick nits, but it's really not worth it.
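
For what it's worth, the decision steps in point 8 can be sketched as
follows.  This is a minimal Python illustration, not text from any
profile: the function names, the single-reference comparison and the
`max_delta` threshold of 255 are assumptions.  A real compressor would
look at more history than one previous value, since a single comparison
can be ambiguous.

```python
def swap16(value: int) -> int:
    """Swap the two octets of a 16-bit value."""
    return ((value & 0xFF) << 8) | (value >> 8)


def fits_lsb(new: int, ref: int, max_delta: int) -> bool:
    """True if new is a small forward step from ref (modulo 2^16)."""
    return (new - ref) % 65536 <= max_delta


def classify_ip_id(new: int, ref: int, max_delta: int = 255) -> str:
    """Return 'nbo', 'swapped' or 'random' for a newly observed IP ID."""
    # Compare the new value with the old one from the context.
    if fits_lsb(new, ref, max_delta):
        return "nbo"
    # If LSB is not appropriate, swap bytes and test LSB again.
    if fits_lsb(swap16(new), swap16(ref), max_delta):
        return "swapped"
    # Neither byte order looks sequential: treat the field as random.
    return "random"


print(classify_ip_id(0x1235, 0x1234))  # sequential in network byte order
print(classify_ip_id(0x0001, 0xFF00))  # sequential once the bytes are swapped
print(classify_ip_id(0x9ABC, 0x1234))  # sequential in neither order
```

The result would then drive the NBO flag: set it on 'swapped', send it
in IR-DYN, and apply the same LSB encoding to the swapped value in the
packets that follow.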


> 9. What if IP_ID is random?
>    there is a possibility that LSB and NBO-LSB
>    alternately succeed on a random IP_ID;
>    in the worst case a random IP_ID will cause
>    a sequence of IR/DYN packets!
> 


Random IP ID behaviour would have to be managed regardless of the 
byte-swap issue.  This is the case where FORMAT is necessary.  It is 
clearly useful here since the entire basis of the encoding changes:  the 
full 2 octets of data are sent in every packet instead of using a choice 
of more efficient encodings.

And, yes, a dumb implementation of FORMAT will cause the low-probability 
IRREGULAR encoding from the sequential behaviour version to be used 
repeatedly.  Probably causing a lot of IR-DYN packets to be sent.
Dumb implementations of FORMAT cannot hurt interoperability, but they 
can affect efficiency.  Which is another reason to use FORMAT only where 
it's necessary.


> ==> SWAP-LSB approach is optimal
>     - it uses basic ID_bits mechanism,
>     - it does not need separate NBO indicators
> 


It's not optimal.  (Even for very lax definitions of optimal...)
It *does* need a separate NBO indicator but it hides it in the indicator 
flags prefix, rather than admitting to it.

<challenge>
It's trivial to implement SWAP-LSB, anyway, so why don't we compare the 
two approaches and see who gets the lower average compressed packet sizes?
</challenge>


> NBO-flag:
> 
> From the proposed profile, as far as I can see, the NBO-flag is passed as
> IRREGULAR through IR/DYN and just tested as STATIC in CO; therefore it has
> the same mechanism and importance as the format selector itself.
> 


There are 2 crucial differences between NBO and FORMAT.
- You *can* change NBO in a compressed packet.  You *can't* change FORMAT.
- FORMAT (at least) doubles the number of formats (or makes your 
format-set vs. max-formats trade-off harder).  NBO doesn't...


> Now about format:
> 
> I did not propose SWAP-LSB to be used inside FORMAT, just said that it 
> CAN be used, and that it is more format-friendly.
> 


See above.  But if you *were* to use SWAP-LSB, then you would *have* to 
use it in FORMAT.  I still regard this as using a sledgehammer to crack 
a nut...


> Format is mentioned because its main purpose is to generate separate
> format sets for differently behaving TCP/IP flows, and IP_ID behavior is
> an example of that. Why separate formats into sets? Just to make sure that
> some important formats of one type of flow will not be discarded
> because of low overall probability. That's why we can expect
> NBO/SWAP-LSB inside FORMAT.
> 


No.  FORMAT is, as I think I've mentioned before, a way of choosing 
between different encoding sets.  Random vs sequential IP ID is an 
excellent example of this.

However, byte-swapping can be considered orthogonally to the encoding 
issue.  That is, with NBO we test the byte order and change it if 
necessary.  Then we apply the *same* LSB encodings.


> What is of concern here is that NBO inside FORMAT is a double indication
> (format selector + NBO selector). It seems too complex to me, and somehow
> makes me nervous (from the application point of view).
>


I don't see the problem here, frankly.
The FORMAT selects between random and non-random behaviour.  NBO lets it 
account for byte-swapping in the non-random case.  Again, it is obvious 
that SWAP-LSB has *exactly* the same issues.

 
> Finally, about IRREGULAR-PADDED, VALUE and SCALE:
> 
> I agree that they are NBO sensitive, but does that have any
> practical consequence?
> 


I'm obviously missing something here.  If you introduce NBO, then you 
can use any other method with it, such as those above.  Ok, it may be 
relatively unlikely that this will happen.  But, if we go down the 
SWAP-LSB route, then to achieve the same level of flexibility, you 
should have SWAP-IRREGULAR-PADDED, SWAP-VALUE and SWAP-SCALE, and...

And, don't forget that terribly misused encoding INFERRED-OFFSET.  If 
you accept that we need SWAP-INFERRED-OFFSET, then you're trying to 
replace my one flexible encoding component with two inflexible encodings.
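
The flexibility argument can be sketched in a few lines of Python.
Everything here is hypothetical (the `Encoder` type, `lsb`, `irregular`
and `with_nbo` are names invented for the illustration, not EPIC
notation).  The point is only that one byte-order wrapper composes with
any encoding, where the SWAP- approach needs a separate variant of each.

```python
from typing import Callable

# An encoder maps (value, reference) to the bits that would be sent.
Encoder = Callable[[int, int], int]


def swap16(value: int) -> int:
    """Swap the two octets of a 16-bit value."""
    return ((value & 0xFF) << 8) | (value >> 8)


def lsb(k: int) -> Encoder:
    """Hypothetical LSB encoder: send the k low-order bits of the value."""
    return lambda value, ref: value & ((1 << k) - 1)


def irregular(value: int, ref: int) -> int:
    """Hypothetical IRREGULAR encoder: send the full 16-bit value."""
    return value & 0xFFFF


def with_nbo(encoder: Encoder, swapped: bool) -> Encoder:
    """Normalise the byte order first, then apply the unchanged encoder."""
    if not swapped:
        return encoder
    return lambda value, ref: encoder(swap16(value), swap16(ref))


# The same wrapper serves both encoders; no SWAP-LSB or SWAP-IRREGULAR needed.
swapped_lsb = with_nbo(lsb(4), swapped=True)
swapped_irregular = with_nbo(irregular, swapped=True)

print(hex(swapped_lsb(0x0501, 0x0401)))        # low 4 bits of 0x0105
print(hex(swapped_irregular(0x0501, 0x0401)))  # full value, bytes swapped
```

Any further encoding added to the toolbox picks up byte-swap handling
for free by being passed through `with_nbo`.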


> Best regards,
> Julije
> 
> 
> 


I'm going to try something dangerous and summarise:

- IP ID is one concrete example of a field that does not have a defined 
representation and so, because of implementation issues, may have a 
byte-swapping problem
- It is necessary to account for this in an EPIC profile
- The encodings used for the normal and byte-swapped representations are 
(ignoring the byte-swapping) the same
- We cannot know in advance what proportion of packets will contain 
byte-swapped values
- The byte-swapping state will not change frequently
   (and should only be sent when it changes)
- There are at least two candidate encodings (LSB and INFERRED-OFFSET)
- Any solution must look at the historical values of the field and 
decide whether or not it is considered byte-swapped

Regarding the solutions (my opinion, of course ;-)

- SWAP-LSB lacks flexibility and forces the use of FORMAT, which is 
undesirable.
- NBO is a flexible approach (equivalent to the NBO flag in RFC 3095, 
but in a more generic way)

If you want to change my mind, you will need to convince me of:

- How SWAP-LSB can be used *without* FORMAT and without reducing the 
encoding efficiency
- How SWAP-LSB's decision on byte-swapping is different from (more 
importantly, better than) the NBO approach
- How this approach can be extended to cover the other 4 methods that 
are affected by byte ordering.


ok?

-- 
Mark A. West, Consultant Engineer
Roke Manor Research Ltd., Romsey, Hants.  SO51 0ZN
Phone +44 (0)1794 833311   Fax  +44 (0)1794 833433

(Yes, I do know that my disclaimer is in an attachment.  And, no, I 
didn't ask for it to be that way)