From: Cullen Jennings <fluffy@iii.ca>
In-Reply-To: <8309DD6A-FED6-4E3D-86E8-FDF842BC9458@ericsson.com>
Date: Tue, 1 Dec 2015 11:59:37 -0700
To: Ari Keränen <ari.keranen@ericsson.com>
Archived-At: <http://mailarchive.ietf.org/arch/msg/core/5BgjB_BiJq4MOZYxIRIEygRaPpg>
Cc: Christian Amsüss <c.amsuess@energyharvesting.at>,
 "draft-jennings-core-senml@tools.ietf.org"
 <draft-jennings-core-senml@tools.ietf.org>, core <core@ietf.org>
Subject: [core] SenML Re:  SemML time series data representation?


There's a further problem that has come up with this syntax. The JSON
ends up having an array of variant types: the outer array has entries
that can be either an object or an array. Given that many languages
don't support variant arrays, it turns out that many JSON libraries
don't seem to support this either.
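
To make the problem concrete, here is a minimal Python sketch of what a
consumer has to do with the example quoted further down. It is an
illustration only, not text from the draft: the function name `resolve`
and the output field names are made up, and the rule that a later base
object only overrides the fields it carries is an assumption.

```python
import json

# Shortened copy of the example quoted below: base objects are
# interleaved with arrays of entries in one outer array.
PAYLOAD = """
[{"bn": "urn:dev:mac:0024befffe804ff1/voltage",
  "bt": 1276020076, "bu": "A", "ver": 1},
 [{"u": "V", "v": 120.1}],
 {"bn": "urn:dev:mac:0024befffe804ff1/current"},
 [{"t": -5, "v": 1.2}, {"t": 0, "v": 1.7}]]
"""


def resolve(doc):
    """Flatten the mixed outer array into self-contained records.

    Assumption (not from the draft): a later base object only overrides
    the fields it carries; earlier base fields stay in effect.
    """
    base = {}
    records = []
    for element in doc:
        if isinstance(element, dict):      # a base object
            base.update(element)
        elif isinstance(element, list):    # an array of entries
            for entry in element:
                records.append({
                    "name": base.get("bn", "") + entry.get("n", ""),
                    "time": base.get("bt", 0) + entry.get("t", 0),
                    "unit": entry.get("u", base.get("bu")),
                    "value": entry.get("v"),
                })
        else:
            raise ValueError("unexpected element type: %r" % type(element))
    return records


records = resolve(json.loads(PAYLOAD))
```

Even in a dynamically typed language the loop has to branch on the
runtime type of every element. In a statically typed language the outer
array needs a tagged union or a custom deserializer, which is where JSON
data-binding libraries tend to struggle.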


> On Nov 19, 2015, at 12:43 PM, Ari Keränen <ari.keranen@ericsson.com> wrote:
>
> Hi Markus,
>
> The current syntax allows you to have multiple bases and hence you
> could drop the n elements if needed. Something like:
>
>  [{"bn": "urn:dev:mac:0024befffe804ff1/voltage",
>    "bt": 1276020076,
>    "bu": "A",
>    "ver": 1},
>   [ { "u": "V", "v": 120.1 } ],
>   {"bn": "urn:dev:mac:0024befffe804ff1/current"},
>   [
>     { "t": -5, "v": 1.2 },
>     { "t": -4, "v": 1.30 },
>     { "t": -3, "v": 0.14e1 },
>     { "t": -2, "v": 1.5 },
>     { "t": -1, "v": 1.6 },
>     { "t": 0,  "v": 1.7 } ]
>  ]
>
> But for the further compression you suggested (just values) there is
> no mechanism.
>
>
> Cheers,
> Ari
>
>> On 18 Nov 2015, at 18:37, Isomaki Markus (Nokia-TECH/Espoo) <markus.isomaki@nokia.com> wrote:
>>
>> Hi,
>>
>> I've not followed CORE or SenML discussions for a while, so apologies
>> if this is a FAQ. I noticed there is a discussion about SenML
>> streaming, and that triggered a question related to a project I'm
>> working on. Basically, I would like to send a large number of sensor
>> readings that have been measured at a constant sample rate. This could
>> be even tens of thousands of samples at a time. In SenML, is there any
>> reasonable way to represent this type of time series in a compact
>> manner? In the draft I see this kind of example:
>>
>>  [{"bn": "urn:dev:mac:0024befffe804ff1/",
>>    "bt": 1276020076,
>>    "bu": "A",
>>    "ver": 1},
>>   [ { "n": "voltage", "u": "V", "v": 120.1 },
>>     { "n": "current", "t": -5, "v": 1.2 },
>>     { "n": "current", "t": -4, "v": 1.30 },
>>     { "n": "current", "t": -3, "v": 0.14e1 },
>>     { "n": "current", "t": -2, "v": 1.5 },
>>     { "n": "current", "t": -1, "v": 1.6 },
>>     { "n": "current", "t": 0,  "v": 1.7 } ]
>>  ]
>>
>> This kind of works, but it would be quite redundant to literally send
>> "n":"current","t":N,"v": ten thousand times. The format we are
>> currently using has additional metadata such as the sample rate (or
>> sample interval), and the measurement type and the measurement unit
>> can be given only once. This means we can just send an array of the
>> actual measurement results of the same type and sample interval, e.g.
>> [1.2, 1.30, 0.14e1, 1.5, 1.6, 1.7], in a compact manner.
>>
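
A minimal Python sketch of the compact form described above and of how
a receiver could expand it back into per-sample records. The format
Markus uses is not shown in this thread, so the field names ("name",
"unit", "start", "interval", "values") and the helper `expand_series`
are hypothetical.

```python
import json


def expand_series(name, unit, start, interval, values):
    """Expand a compact batch into one record per sample.

    Assumption: sample i was taken at start + i * interval.
    """
    return [
        {"n": name, "u": unit, "t": start + i * interval, "v": value}
        for i, value in enumerate(values)
    ]


# Hypothetical compact batch: type, unit and timing are given once.
compact = {
    "name": "current",
    "unit": "A",
    "start": -5,
    "interval": 1,
    "values": [1.2, 1.30, 0.14e1, 1.5, 1.6, 1.7],
}

records = expand_series(
    compact["name"], compact["unit"],
    compact["start"], compact["interval"], compact["values"],
)

# Serialized sizes, to compare the two representations.
compact_size = len(json.dumps(compact))
expanded_size = len(json.dumps(records))
```

The per-sample overhead of the expanded form is constant, so the gap
between the two sizes grows linearly with the number of samples.
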
>> Is this possible in SenML? It would seem like a useful feature for
>> many purposes where sensors report data in batches.
>>
>> Markus
>>
>>> -----Original Message-----
>>> From: core [mailto:core-bounces@ietf.org] On Behalf Of EXT Cullen Jennings
>>> Sent: Wednesday, November 18, 2015 4:15 AM
>>> To: Christian Amsüss <c.amsuess@energyharvesting.at>
>>> Cc: draft-jennings-core-senml@tools.ietf.org; core <core@ietf.org>
>>> Subject: Re: [core] SenML JSON syntax and collection+senml+json
>>>
>>>
>>> Random thoughts on a few subjects:
>>>
>>> I feel like SenML is getting too complex and we should ask if we can
>>> put it on a diet. Perhaps this streaming is just too much to put into
>>> it. An alternative is to not have SenML do streaming but allow a
>>> protocol using it to support streaming by sending many SenML objects,
>>> with the convention that if any given object did not have a base
>>> value, then the base values from the previous SenML object apply. I'm
>>> not sure if this is a good idea or not, but I'm just saying that if
>>> things start to get too complicated to do streaming inside SenML, we
>>> can punt it up a layer.
>>>
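
The convention described above (base values carry over from the previous
object) could look roughly like this on the receiving side. This is a
sketch under assumptions: it uses the older object layout with an "e"
array of entries, and `resolve_stream` and the output field names are
made up for illustration.

```python
def resolve_stream(objects):
    """Yield resolved entries from a sequence of SenML-like objects.

    Assumption: each object is a dict with optional "bn"/"bt"/"bu"
    members and an "e" list; a base value missing from an object is
    taken from the most recent object that carried it.
    """
    base = {}
    for obj in objects:
        for key in ("bn", "bt", "bu"):
            if key in obj:
                base[key] = obj[key]
        for entry in obj.get("e", []):
            yield {
                "name": base.get("bn", "") + entry.get("n", ""),
                "time": base.get("bt", 0) + entry.get("t", 0),
                "unit": entry.get("u", base.get("bu")),
                "value": entry.get("v"),
            }


stream = [
    {"bn": "urn:dev:mac:0024befffe804ff1/", "bt": 1276020076,
     "e": [{"n": "voltage", "u": "V", "v": 120.1}]},
    # No base values here: "bn" and "bt" are inherited from above.
    {"e": [{"n": "current", "u": "A", "t": 1, "v": 1.2}]},
]

resolved = list(resolve_stream(stream))
```

The receiver only keeps the current base values between objects, so each
object can be parsed and discarded on its own.
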
>>>
>>> Complexity:
>>>
>>> I'm sure someone will think I am nuts for suggesting that SenML is
>>> looking too complicated, but as another example, take InfluxDB, which
>>> is pretty good for stuff like this. I've been using it as a cloud DB
>>> for streaming RT measurements. It deprecated JSON and replaced it with
>>> "Line Protocol", which is effectively the sensor name, followed by
>>> space-separated attributes, followed by the value, followed by CRLF.
>>> That produced noticeable improvements in real deployments over general
>>> JSON. A big part of SenML was to *not* be general JSON but a very
>>> restricted subset of JSON, such that it could achieve the performance
>>> of something like "Line Protocol" or protobufs and still have some
>>> extensibility story.
>>>
>>> So Line Protocol would send the example from later in this email as a
>>> single line with
>>>
>>> urn:dev:mac:0024befffe804ff1/voltage u=V 120.1
>>>
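
A small Python sketch of producing and reading a line of the shape shown
above. It mirrors only the simplified line in this mail, not the actual
InfluxDB line protocol, which has its own rules for tags, fields and
escaping. The helpers `to_line` and `from_line` are hypothetical, and no
escaping of spaces or "=" is attempted.

```python
def to_line(name, value, attrs=None):
    """Format one measurement as: name, key=value attributes, value, CRLF."""
    parts = [name]
    parts.extend(f"{key}={val}" for key, val in (attrs or {}).items())
    parts.append(str(value))
    return " ".join(parts) + "\r\n"


def from_line(line):
    """Parse a line produced by to_line back into (name, value, attrs)."""
    name, *attr_tokens, value = line.rstrip("\r\n").split(" ")
    attrs = dict(token.split("=", 1) for token in attr_tokens)
    return name, float(value), attrs


line = to_line("urn:dev:mac:0024befffe804ff1/voltage", 120.1, {"u": "V"})
parsed = from_line(line)
```

Parsing is a single split on spaces with no nesting to track, which is
the kind of simplicity the comparison with general JSON is pointing at.
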
>>>
>>>
>>> MetaData:
>>>
>>> The more I think about metadata and data, the less I know which is
>>> which. Consider
>>>
>>> [ {"bn": "urn:dev:mac:0024befffe804ff1/"},
>>>   [ { "n": "voltage", "t": 0, "u": "V", "v": 120.1 } ]  ]
>>>
>>> You could argue the only thing that is not metadata is 120.1.
>>>
>>> I think the goal of SenML is to have a record that has a minimal set
>>> of info that is often needed to interpret the data in one record. The
>>> base names were added merely as a compression scheme to reduce
>>> duplication of the same bits several times. I'm not real wound up
>>> about whether some of it is metadata or not.
>>>
>>>
>>>
>>> Streaming:
>>>
>>> When I first read the line that said the latest SenML draft "requires
>>> support of streaming" I thought that was wrong, but the more I thought
>>> about it, yes, I think this is a very serious problem with the current
>>> proposal. I was thinking about sensor data being sent from a small
>>> device to a big cloud device, and this might work OK, but in the case
>>> of data going to another small device, this is a problem. It does
>>> highlight the problem of the maximum size of SenML data.
>>>
>>> Perhaps we need two different formats: a SenML object and a SenML
>>> stream. That would allow protocols that use this to be clear about
>>> whether they use one or the other or both, and with HTTP or CoAP the
>>> normal approaches could be used to negotiate them.
>>>
>>>
>>>
>>>> On Nov 17, 2015, at 3:44 PM, Christian Amsüss <c.amsuess@energyharvesting.at> wrote:
>>>>
>>>> Hello Michael,
>>>> hello SenML and core-interfaces people,
>>>>
>>>> I'd like to pick up the topic of streamable SenML from the context of
>>>> the `SenML JSON syntax` thread from before IETF94.
>>>>
>>>> To summarize what I know of the state of things:
>>>>
>>>> * JSON SenML can't enforce that the base {name, time} entries precede
>>>> the entries list while still being JSON. To parse a generic SenML
>>>> message, it is thus required to keep the whole message in memory.
>>>>
>>>> An alternative syntax is proposed [{base dict}, [entries]]; that can
>>>> be extended to allow repetitions thereof (with incremental base
>>>> values), or the distinction between base and entry data could be
>>>> lifted further.
>>>>
>>>> This assumes that the "e" record list takes a special role in SenML
>>>> by being the workhorse list of data, which conflicts with:
>>>>
>>>> * CoRE interfaces serves collections as both data and metadata in a
>>>> unified SenML structure, where resource states are given in the
>>>> classical "e" array, and the metadata next to it in an "l" array as
>>>> in application/link-format+json.
>>>>
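
The first point above can be illustrated with a short Python sketch (an
illustration only; the helper name `key_order` and the two sample
messages are made up, and the messages assume the object layout with an
"e" array). JSON gives no meaning to member order, so both messages are
the same object, yet a one-pass parser of the second one meets the
entries before the base values they depend on.

```python
import json


def key_order(text):
    """Return the top-level member names in the order they appear."""
    # object_pairs_hook=list keeps every object as a list of
    # (name, value) pairs, so the on-the-wire order is preserved.
    pairs = json.loads(text, object_pairs_hook=list)
    return [name for name, _ in pairs]


base_first = (
    '{"bn": "urn:dev:mac:0024befffe804ff1/", "bt": 1276020076,'
    ' "e": [{"n": "voltage", "v": 120.1}]}'
)
entries_first = (
    '{"e": [{"n": "voltage", "v": 120.1}],'
    ' "bt": 1276020076, "bn": "urn:dev:mac:0024befffe804ff1/"}'
)

# Same JSON value, different arrival order of the members.
same_value = json.loads(base_first) == json.loads(entries_first)
order_a = key_order(base_first)
order_b = key_order(entries_first)
```

With the [{base dict}, [entries]] form the order is carried by an array,
which JSON does preserve, so the base values are guaranteed to arrive
first.
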
>>>> A notation for treating the "l" array as an "e" element was proposed,
>>>> but did not resonate well with Michael (from the CoRE interface side);
>>>> I'd like to take up the line of discussion from there:
>>>>
>>>> On Tue, Oct 20, 2015 at 12:52:19PM -0700, Michael Koster wrote:
>>>>> It’s more than a simple visual relationship. I’m used to JSON tools
>>>>> that create an in-memory data structure that conforms to the JSON
>>>>> serialization. With the “old” SenML model, the elements of the object
>>>>> identified by “bn” are rendered as an array within the element
>>>>> identified by “bn” and tagged by “e”.
>>>>>
>>>>> The new construct more than just enables streaming, it forces serial
>>>>> interpretation, i.e. it *requires* streaming.
>>>>
>>>> Yes, and that's the very point. If I'm to parse SenML on a constrained
>>>> device, especially given that the sender can use its extensibility to
>>>> send along data that is not expected by the receiver, that means that
>>>> I need to be prepared to store the complete message, whatever its
>>>> length.
>>>>
>>>> For an example of a situation where this can be an issue, take an
>>>> update to a DMX (RGB spots or other light installations) controller. A
>>>> PUT to atomically update the complete scene of connected devices in
>>>> JSON serialization can easily take up 10k plus network overhead in
>>>> network buffer space, even without any additional metadata from SenML
>>>> extensions, but (if read in a serializable way) implementations could
>>>> get away with a single-MTU-buffer network implementation plus 1k for
>>>> double-buffered state.
>>>>
>>>> Another example (from my everyday CoAP communication, but not
>>>> involving embedded parsing) is history readouts of sensor values,
>>>> which can exceed 100kB for devices with intermittent network
>>>> connectivity.
>>>>
>>>>> Would it make sense to create a new content-format that optimizes for
>>>>> streaming processing?
>>>>
>>>> This is not about streaming Big Data around to the point where big
>>>> devices need to go into "streaming mode" (though it's useful there
>>>> too); this is about (not the most common, but still relatively) normal
>>>> situations and not returning 4.13 from small devices any time someone
>>>> doesn't chunk up his request into small multiples of the MTU.
>>>>
>>>> I don't like to exaggerate, so please take this with a grain of salt
>>>> and be aware that this is written in the heat of the argument: If we
>>>> don't find an agreeable serialization that can be processed in a
>>>> streaming fashion, we might just as well put a hard limit on the
>>>> maximum size of a SenML representation, which would be the required
>>>> minimum for SenML implementors to support. What would that be, 4k? 16k?
>>>>
>>>>>> In my opinion, it raises the question of how generic SenML should
>>>>>> attempt to be. My personal view of it is that SenML is a way of
>>>>>> encapsulating several resource representations (be they of different
>>>>>> points in time or different resources) in a single message. With that
>>>>>> in mind, maybe the following would work for you (rephrasing your
>>>>>> example into senml-02 syntax, with comments):
>>>>>
>>>>> SenML is already being used to represent simple collections in CoRE
>>>>> Interfaces, OMA LWM2M, and OIC. Whether to have it be extensible and
>>>>> evolvable or not is certainly a tradeoff against complexity and
>>>>> stream processing ability. I would lean toward evolvability.
>>>>
>>>> Concerning evolvability:
>>>>
>>>> That shouldn't be a show stopper: extensions can still go both in the
>>>> base dictionary and in the events; it's just they wouldn't profit from
>>>> the guaranteed sequence.
>>>>
>>>> An approach I don't like in its current form but that could point the
>>>> direction for something more elegant is to indicate the "key" of
>>>> subsequent lists in the base dictionary; with your "l" example, that
>>>> could be
>>>>
>>>>  [ {"bn": "/collection1/", "next-object": "e"},
>>>>    [{"n": "item1", "sv": "value1"}, ...],
>>>>    {"next-object": "l"},
>>>>    [{"href": "item1", ...}, ...]
>>>>  ]
>>>>
>>>> As said, it's not pretty, nor what I'd endorse as-is, but
>>>> extensibility and easy-to-parse sequence don't necessarily conflict.
>>>>
>>>> Concerning focus of SenML:
>>>>
>>>> Simple collections seems to be a good outline; would you also agree to
>>>> "simple collections of resource representations and their metadata"?
>>>>
>>>>>> What do you think of the above arrangement?
>>>>>
>>>>> I think it’s a substantial compromise in the ability to represent
>>>>> data structure to get streaming processing ability. But I do like the
>>>>> idea of a “ov” element for object values.
>>>>
>>>> Does that refer to the new serialization format in general or to
>>>> packing the link list into an entity response in particular? In the
>>>> latter case, please elaborate -- the latter "happened" with the
>>>> infrastructure I've been using (under certain conditions, my batch
>>>> resources contain their application/link-format as "s": entries), I've
>>>> found it practical, and it would come in much more handy with
>>>> "ov":link-format+json.
>>>>
>>>> Best regards
>>>> Christian
>>>>=20
>>>> --
>>>> Christian Amsüss                      | Energy Harvesting Solutions GmbH
>>>> founder, system architect             | headquarter:
>>>> mailto:c.amsuess@energyharvesting.at  | Arbeitergasse 15, A-4400 Steyr
>>>> tel:+43-664-97-90-6-39                | http://www.energyharvesting.at/
>>>>                                       | ATU68476614
>>>
>>> _______________________________________________
>>> core mailing list
>>> core@ietf.org
>>> https://www.ietf.org/mailman/listinfo/core
>

