[Forces-protocol] Re: Data encoding -- first part

Alan DeKok <alan.dekok@idt.com> Mon, 18 October 2004 20:22 UTC

Message-ID: <41741D78.4070205@idt.com>
Date: Mon, 18 Oct 2004 15:46:00 -0400
From: Alan DeKok <alan.dekok@idt.com>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.3) Gecko/20040910
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Zsolt Haraszti <zsolt@modularnet.com>
References: <468F3FDA28AA87429AD807992E22D07E025791E5@orsmsx408> <002d01c4b50b$1ecc9c10$020aa8c0@wwm1> <1098102734.1042.134.camel@jzny.localdomain> <1098113089.2364.12.camel@localhost.localdomain> <1098115003.2884.67.camel@localhost.localdomain> <4173FB88.1000008@idt.com> <1098126011.2884.162.camel@localhost.localdomain>
In-Reply-To: <1098126011.2884.162.camel@localhost.localdomain>
Content-Type: text/plain; charset="us-ascii"; format="flowed"
Content-Transfer-Encoding: 7bit
Cc: "Khosravi, Hormuzd M" <hormuzd.m.khosravi@intel.com>, ram.gopal@nokia.com, "Yang, Lily L" <lily.l.yang@intel.com>, "Joel M. Halpern" <jhalpern@megisto.com>, forces-protocol@ietf.org, Jamal Hadi Salim <hadi@znyx.com>, "Steven Blake (petri-meat)" <slblake@petri-meat.com>, Ellen M Deleganes <ellen.m.deleganes@intel.com>, Weiming Wang <wmwang@mail.hzic.edu.cn>
Subject: [Forces-protocol] Re: Data encoding -- first part

Zsolt Haraszti wrote:

>>   I think the reference is "ANSI/IEEE Standard 754-1985"
>
> I will check (or let me know when you know for sure :).

http://grouper.ieee.org/groups/754/

http://stevehollasch.com/cgindex/coding/ieeefloat.html

   Yup.

> There may be some misunderstanding here, so let me rewind a bit.
> 
> N in STRING[N] defines the max size of the string _in the model_.
> This is analogous to the
> 
> 	char 	if_name[16]
> 
> C construct (as opposed to the char *if_name).

   Ah, OK.  Having the protocol avoid the zeros is nice, but it means 
that many structs are now packed to variable-sized data.

> Let me try to clarify it a bit:  Suppose you define a STRUCT A data type
> per ForCES data model.  Suppose furthermore that you implement that
> struct in the FE/CE software by means of a C struct: struct A.  The C
> struct will have a certain memory layout in the host memory.  That 
> layout is what I referred to as "ANSI C struct representation" (I may
> have used the wrong reference here; I shall look it up to be sure).

   The problem is that ANSI C struct representation is 
platform-specific.  Some platforms can pack multiple 'uint8_t' into a 
32-bit word.  Others can't.

   As an extreme example (not applicable here), big/little-endian 
systems pack bit arrays differently.  So I wouldn't depend on anything 
other than order, and maybe 32-bit alignment, for platform-independent C 
structs.  This makes mapping ForCES protocol packed data into C structs 
somewhat problematic.
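
   As a minimal sketch of what "platform-specific" means in practice, 
the following C fragment measures how much padding a compiler inserts 
into a small struct.  The struct "example_a" and both helper functions 
are hypothetical names made up for illustration; they are not part of 
any ForCES document.  Only declaration order (and offset 0 for the first 
member) is guaranteed by the language.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model struct: two 8-bit fields, then a 32-bit field. */
struct example_a {
    uint8_t  flags;
    uint8_t  kind;
    uint32_t value;
};

/* The C standard guarantees the first member sits at offset 0. */
static_assert(offsetof(struct example_a, flags) == 0,
              "first member must be at offset 0");

/* Number of padding bytes the compiler added anywhere in the struct.
 * The result is platform-specific; many ABIs give 2 here. */
static size_t example_a_padding(void)
{
    size_t fields = sizeof(uint8_t) + sizeof(uint8_t) + sizeof(uint32_t);
    return sizeof(struct example_a) - fields;
}

/* Returns 1 if members appear in declaration order, which C guarantees. */
static int example_a_order_ok(void)
{
    return offsetof(struct example_a, flags) < offsetof(struct example_a, kind)
        && offsetof(struct example_a, kind)  < offsetof(struct example_a, value);
}
```

   Calling example_a_padding() on each target platform shows whether 
the in-memory layout can match a given wire layout at all.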

> The statement above is that for structs which do not have variable
> size fields, the ForCES encoded STRUCT will have the same layout as
> the C struct in memory (minus endianness).  The obvious advantage of
> this is that you can memcpy() between the message buffer and your
> C struct.

   I would suggest adding a note that implementors are STRONGLY 
CAUTIONED to validate that the sizes of the structs are the same, and 
that the padding is the same between ForCES and the C compiler.  This 
kind of "memcpy" of complex structures has historically been open to 
implementation flaws and security attacks.
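
   One way to perform that validation is at compile time, so that a 
layout mismatch breaks the build instead of corrupting memory.  The 
sketch below assumes C11 and a hypothetical fixed-size struct 
"route_entry"; the wire size and offsets are invented for illustration 
and are not taken from any ForCES draft.  Byte-order conversion is 
deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size STRUCT made only of 32-bit fields. */
struct route_entry {
    uint32_t prefix;
    uint32_t mask;
    uint32_t next_hop;
};

/* Assumed encoded layout (illustrative values, not from a spec). */
enum {
    ROUTE_WIRE_SIZE    = 12,
    ROUTE_OFF_PREFIX   = 0,
    ROUTE_OFF_MASK     = 4,
    ROUTE_OFF_NEXT_HOP = 8
};

/* Fail the build if the compiler's layout differs from the wire layout. */
static_assert(sizeof(struct route_entry) == ROUTE_WIRE_SIZE,
              "struct size differs from encoded size");
static_assert(offsetof(struct route_entry, prefix) == ROUTE_OFF_PREFIX,
              "prefix offset mismatch");
static_assert(offsetof(struct route_entry, mask) == ROUTE_OFF_MASK,
              "mask offset mismatch");
static_assert(offsetof(struct route_entry, next_hop) == ROUTE_OFF_NEXT_HOP,
              "next_hop offset mismatch");

/* Copy one encoded entry out of buf.  Returns 0 on success, -1 if an
 * argument is NULL or the buffer is too short.  Fields are left in
 * wire byte order; the caller must convert them. */
static int route_entry_read(struct route_entry *out,
                            const uint8_t *buf, size_t len)
{
    if (out == NULL || buf == NULL || len < ROUTE_WIRE_SIZE)
        return -1;
    memcpy(out, buf, ROUTE_WIRE_SIZE);
    return 0;
}
```

   The run-time length check matters as much as the static assertions: 
the memcpy() is only safe once the received buffer is known to hold a 
complete entry.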

> Almost, but not quite.  Since we want to make this layout compatible
> with C implementations, we need to include padding between some of the
> elements.  The stated alignment rules serve this purpose.

   OK.  It's just that if you have variable rules for padding, the PACK 
function can't really be recursive, as its behavior varies depending on 
whether or not it's packing inside a struct.

> I believe the provided example illustrates the rules well.

   Yes.

>>   If that's true, then it will be possible to write a function which 
>>takes a ForCES STRUCT definition and a pointer to a C structure 
>>containing the relevant data, and return a PACKed DATA field.  While 
>>this process often won't be necessary, it may be useful to give a sample 
>>function in pseudo-code, or C, to do this packing.  That way people 
>>implementing the protocol have a "known good" place to start from.
> 
> Are you suggesting putting pseudo-code into the description
> here or into the final standard (or both)?

   Yes.  Code will clarify a LOT of issues.

>>   It would be great to have some sample code which read in simple 
>>STRUCT definitions, and produced the packed DATA.  Having a function to 
>>produce ASCII art would be even better...
> 
> 
> You mean "STRUCT definition and actual value, and produced the packed
> DATA"?                      ^^^^^^^^^^^^^^^^

   Sure.

> We can have a contest on who could post that program first.
> Input: a) the XML data specification per current ForCES model
>           document
>        b) value assignment for the fields
> Output: an ASCII art showing the packet data.

   Hmm... I don't know if I'll have time.  But it would be extremely 
useful, even for other RFCs containing ASCII art.

>>   Variable-sized entries are severely problematic.
> 
> I agree that they are more troublesome.  But I find it too restrictive
> to prohibit them.  Arrays of arrays will have this.  Or, if your array
> is based on a struct that has string field(s), it will have this problem.

   How about having a packed length for structures?  That is, reserve a 
32-bit word at the start of the struct for its length, PACK the struct, 
and then update the length field with the resulting value.

   For structs of variable size, this should make it easy to quickly 
find the N'th element in an array of those structs.  The clearest 
benefit is that you don't have to parse or even know the entries of the 
struct, in order to find the N'th element.
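
   A minimal sketch of that lookup follows.  It assumes the 32-bit 
length is in network byte order and counts only the bytes after the 
prefix, with no padding between elements; none of these details has been 
decided, and the names read_be32() and nth_element() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit big-endian value (network byte order is an assumption). */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Find the n'th (0-based) length-prefixed element in buf.
 * On success returns a pointer to the element body and stores its
 * length in *body_len.  Returns NULL if the buffer ends first or a
 * length prefix points past the end of the buffer. */
static const uint8_t *nth_element(const uint8_t *buf, size_t len,
                                  size_t n, uint32_t *body_len)
{
    size_t pos = 0;

    assert(buf != NULL && body_len != NULL);
    for (;;) {
        uint32_t elem_len;

        if (len - pos < 4)
            return NULL;               /* no room for a length prefix */
        elem_len = read_be32(buf + pos);
        pos += 4;
        if (elem_len > len - pos)
            return NULL;               /* element body is truncated */
        if (n == 0) {
            *body_len = elem_len;
            return buf + pos;
        }
        pos += elem_len;               /* skip the body without parsing it */
        n--;
    }
}
```

   Note that the loop never looks inside an element body, which is the 
point of the length prefix: the reader needs no knowledge of the struct 
definition to step over it.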

> This is one of those places where I would think good designs will avoid
> such ARRAYs as much as possible, but in some cases it cannot be avoided
> so we must support it in the protocol.

   I agree.  I would, however, like to see the difference between 
fixed-size structs and variable-size structs highlighted.

   The problem is that if a data structure has a sub-structure 
*anywhere* in it which contains a STRING[N], then the entire data 
structure is "tainted", and its size is variable instead of fixed.

   If we expect that most STRUCTs will contain STRING[N], then it may be 
reasonable to assume that ALL STRUCT packing includes a 32-bit prefix of 
the size of the struct, as this will decrease the complexity of the parser.


   If STRING[N] is the only variable-sized data type in the protocol, 
then it may be reasonable instead to separate the string packing from 
the struct packing.  i.e. Rather than packing the string in place, pack 
all of the strings together, and pack a "pointer" in the struct to the 
string.

   I'm not sure I like that idea, though, as packing will require 
multiple passes to get all of the cross-references correct.
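
   To make the multiple-pass cost concrete, here is a rough sketch of 
the "pointer" approach.  Everything in it is an assumption made for 
illustration: the record type "named_item", the 8-byte fixed entry (id 
plus offset), offsets measured from the start of the buffer, and 
NUL-terminated strings in a trailing string area.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory record with one string field. */
struct named_item {
    uint32_t    id;
    const char *name;
};

/* Write a 32-bit value in big-endian order (assumed wire order). */
static void write_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Pack count items as fixed-size (id, offset) pairs followed by a
 * string area.  Returns the number of bytes written, or 0 if the
 * buffer is too small (also 0 when count is 0). */
static size_t pack_items(uint8_t *buf, size_t cap,
                         const struct named_item *items, size_t count)
{
    size_t fixed = count * 8;
    size_t total = fixed;
    size_t str_pos, i;

    assert(buf != NULL && items != NULL);

    /* Pass 1: compute the total size so string offsets are known. */
    for (i = 0; i < count; i++)
        total += strlen(items[i].name) + 1;
    if (total > cap)
        return 0;

    /* Pass 2: write the fixed entries and the strings they point to. */
    str_pos = fixed;
    for (i = 0; i < count; i++) {
        size_t n = strlen(items[i].name) + 1;

        write_be32(buf + i * 8, items[i].id);
        write_be32(buf + i * 8 + 4, (uint32_t)str_pos);
        memcpy(buf + str_pos, items[i].name, n);
        str_pos += n;
    }
    return total;
}
```

   The fixed part keeps a constant stride, so indexing is trivial, but 
the packer has to walk the data twice, and nested structs would need the 
same bookkeeping at every level.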

   Alan DeKok.


