[core] Review of CoRAL

Christian Amsüss <christian@amsuess.com> Wed, 31 October 2018 16:45 UTC

Date: Wed, 31 Oct 2018 17:45:36 +0100
From: Christian Amsüss <christian@amsuess.com>
To: T2TRG@irtf.org, draft-hartke-t2trg-coral@ietf.org
Cc: core@ietf.org
Message-ID: <20181031164534.GA4995@hephaistos.amsuess.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/core/1f-2P7wvuiTVmRkN670S1gMVe8o>

Hello CoRAL observers,

I've read and partially implemented draft-hartke-t2trg-coral-05, and
would like to offer a review of it.

My two largest individual issues with this are that I'm not convinced of
the necessity or suitability of the textual format, and that I don't see
yet how forms would be used in practice (that's out of scope for the
document, but limits the depth of my review).

In general, the described format would be useful to me, both for new
applications and to fill in for link-format where RFC6690 is too
limited.

The following is a rather unsorted list of things that struck me as
unclear or odd; none of them should stand in the way of a research or
working group adoption or be otherwise insurmountable; many are also
just editorial comments.

* "The CoRAL data and interaction model is a superset of the Web Linking
  model" / "can describe simple resource metadata in a way similar to
  the Resource Description Framework (RDF)":

  From the examples in 2.1 (in particular, because the rel becomes a
  full URI), it appears the other way round: That the data model is a
  superset of the RDF model, and that the document describes an
  equivalence of a subset of RDF statements to typed links (as used in
  link headers, link format and HTML).
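
To make the equivalence concrete, here is a minimal Python sketch that maps a typed link onto an RDF-style triple. The context URI and the helper name link_to_triple are assumptions for illustration; only the IANA prefix is taken from the examples in 2.1. Treating any rel that contains a colon as an already-full URI is a simplification.

```python
from urllib.parse import urljoin

# Prefix used in the draft's section 2.1 examples.
IANA_REL = "http://www.iana.org/assignments/relation/"

def link_to_triple(context, rel, target):
    """Map a typed link (context, rel, target) to a (subject, predicate, object) triple."""
    # Simplification: a rel containing ":" is taken to be a full URI already;
    # anything else is treated as a registered relation type.
    predicate = rel if ":" in rel else IANA_REL + rel
    # The link target may be a relative reference; resolve it against the context.
    return (context, predicate, urljoin(context, target))

# Hypothetical context URI, not taken from the draft.
CONTEXT = "http://example.com/book/chapter3"

triples = [
    link_to_triple(CONTEXT, "next", "./chapter4"),
    link_to_triple(CONTEXT, "icon", "/favicon.png"),
    link_to_triple(CONTEXT, "license", "http://creativecommons.org/licenses/by/4.0/"),
]
```

Read in this direction, each typed link is just one RDF statement whose predicate happens to come from the IANA registry.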

* Browsing context: This growing history appears hard to do in a
  constrained context. Can an agent still work with any given maximum
  history length, in particular 1 or even 0?

* Forms: I haven't followed all the W3C & surroundings discussions on
  forms, so I can't say much there as I don't yet understand how they
  were to actually work. Examples that go beyond the intuitive "POST to
  create" and "DELETE to delete" would be helpful. -- But I assume that
  as this matures, any formative GitHub discussions will evolve into
  referenceable documents.

* If the first use of the http://www.iana.org/assignments/relation/
  prefix for rel values pointed to RFC4287, it would be clearer to the
  reader that this is already established practice.

* Representations: I think it would be helpful to state something in the
  area of the second paragraph along the lines of "The producer of the
  CoRAL document claims that the embedded representation is fresh at
  least for as long as the CoRAL document is fresh", if so intended.

  It's obvious that the "full, partial or inconsistent" part is
  intentionally giving much leeway to the CoRAL producer. Could this
  possibly be narrowed down for resources under the same authority? A
  statement like "(under some condition), the presence of a
  representation states that there exists a request to the target that
  would result in a response with the given payload" could go into a
  direction where an agent could populate its own cache with it and work
  on from there.

* Binary data format: When relations expressed as uint are first
  mentioned (and possibly also with numeric form-field-name), a forward
  reference to the topic of profiles would be helpful.

* Textual format: I find this more confusing than helpful. The format is
  too far from turtle to allow intuition to be taken from there
  (especially as there are simple and qualified names, where turtle has
  only qualified names but the qualifier may be empty), too far from the
  binary representation to help understanding the binary format (e.g.
  someone only editing in text format would not see that naming a
  resource linked as rel="index" <index> would be beneficial, but could
  be misled to believe that application/octet-stream can be a compact
  default or that representations can have more attributes -- plus
  people were led to think that CoRAL is large on the wire).

  My impression is that CoRAL would not be written by humans, let alone
  stored or transported as that. I think the document would be better
  off if no textual format were defined, but full turtle (or a slim
  superset thereof that's still within N3) were used to express the
  semantics of a document (which would need an RDF serialization of
  forms and representations, but that's more of a benefit than a
  downside IMO).

  Concrete example from 2.1:

    @prefix : <http://www.iana.org/assignments/relation/> .

    <> :next <./chapter4>;
       :icon </favicon.png>;
       :license <http://creativecommons.org/licenses/by/4.0/>.

  or from 2.2:

    @prefix : <http://example.org/vocabulary#> .
    @prefix coral: <urn:ietf:rfc:XXXX#> .

    <> :task [ = </tasks/1>;
         :description "Pick up the kids"
      ];
      :task [ = </tasks/2>;
         :description "Return the books to the library";
         coral:delete [ form:method coap:DELETE;
                        form:iri </tasks/2>
                      ]
      ]

* Why is <http://TBD/> used for attrs, but <urn:ietf:rfc:XXXX#...> used
  for the core form relations? They could at least share structural
  similarity.

* attrs: title and title* are described as target attributes in RFC8288
  and can thus easily occur in CoRAL documents.

* ibid.: Then, it can say something like "MUST NOT occur in a CoRAL
  document as attributes, but are expressed as link relations, nested
  links and link relations, respectively" to indicate that their
  information is not lost in the transition.

* Any test vectors or examples to get started on the binary format would
  be helpful, and could be augmented by (CDDL automatically?) annotated
  extended diagnostic notation, like (as in the above, assuming a
  navigation profile where someone forgot that favicon is a typical rel)

  [[/link/ 2, /rel:next/ 4,
    /<./chapter4>/ [6, "chapter4"]],
   [/link/ 2, "http://www.iana.org/assignments/relation/icon",
    /</favicon.png>/ [5, 0, 6, "favicon.png"]],
   [/link/ 2, /rel:license/ 10,
    /<http://creativecommons.org/licenses/by/4.0/>/ [1, "http", 2,
    "creativecommons.org", 6, "licenses", 6, "by", 6, "4.0", 6, ""]]
  ]

* FoaF example: Just for my understanding, would anything be wrong with
  some application using <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
  as the rel to foaf:Person here? (I figure that few RDF applications
  treat rdf:type and iana:type as equivalent properties).

  Later that's clarified a bit in A.1 where it says "same purpose", but
  is rel:type in actual use anywhere? Otherwise, I suggest going for
  rdf:type right away, as that is in use. (It's more verbose when
  expressed in non-CoRAL link formats, though.)

* Only occurred to me when writing the above
  extended-diagnostic-notation example: Might it make sense to allow
  profile-defined URIs as well, like common licenses or (when looking at
  the FoaF example) types? They'd need to be told apart from numeric
  literals that could just as well show up in a link target, but the iri
  rule could get an additional `?(profiled: 9, uint)` entry that could
  only exist on its own and expands like a numeric rel.

* What to do with language-tagged attributes (like title*)? The takeaway
  from links-json is that they can be decoded easily but need to keep
  their language information. That's easily expressed in RDF (turtle:
  `<> :title "Überschrift"@de`). links-json went for
  `{"title":{"de":"Überschrift"}}`, which seems a bit crude to me but
  could work just as well for CoRAL, as could a tagged CBOR array or a
  plain array if we started using arrays with discriminators that are not
  part of the IRI discriminators as something different than IRIs (a la
  `[/link/ 2, /attr:title/ 42, [/language-tagged/ -1, "de",
  "Überschrift"]]`).
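
As a minimal sketch of the "decoded easily but keep the language" point, the following Python decodes a title* value into the links-json shape. It assumes the RFC 8187 ext-value syntax (charset'language'percent-encoded value); the helper name decode_ext_value is hypothetical.

```python
from urllib.parse import unquote

def decode_ext_value(ext_value):
    """Split an ext-value of the form charset'language'pct-encoded into (language, text)."""
    charset, language, encoded = ext_value.split("'", 2)
    # Percent-decode using the charset named in the value itself.
    return language, unquote(encoded, encoding=charset)

language, text = decode_ext_value("UTF-8'de'%C3%9Cberschrift")

# The links-json style keeps the language tag as a dictionary key.
links_json_style = {"title": {language: text}}
```

Whatever CoRAL ends up using, the decoded pair (language, text) is the information that has to survive the conversion.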

* Appendix C: Do I understand correctly that there are some CBOR encoded
  relative references that can not (even when knowing the rel in
  append-relation) be expressed in a relative-ref? In particular, those
  would be the append-* types. If so, it would be prudent to point out
  right next to where it says that "CBOR-encoded IRI References are not
  capable of expressing all IRI references".

* Most of the URI resolution gets away without string comparisons. I'd
  like to suggest making "../" an explicit URI option (say, a `*(parent:
  5.5)` that'd go between path.type and path). References with
  dot-segments inside them are probably safe to put in the "not capable
  of expressing all" category.

  As I understand URI resolution, we don't need a "./" because that only
  needs to be explicit when starting with something that has a colon in
  it that's not the scheme's colon, otherwise "./x/y" is always
  equivalent to "x/y".
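
The "./" claim can be checked with Python's standard urljoin; this is only a quick sketch, and the base URI is an assumption chosen for illustration.

```python
from urllib.parse import urljoin

BASE = "http://example.com/a/b"  # assumed base URI

# Without a colon in the first segment, "./x/y" and "x/y" resolve identically.
plain = urljoin(BASE, "x/y")
dotted = urljoin(BASE, "./x/y")

# With a colon in the first segment, the bare form is read as scheme "a",
# so the "./" prefix is needed to keep it a relative path.
colon_plain = urljoin(BASE, "a:b")
colon_dotted = urljoin(BASE, "./a:b")
```

Here plain and dotted are the same URI, while colon_plain stays "a:b" and only colon_dotted lands under the base path.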

* Speaking of expressiveness of the encoded URIs: (Especially as urn: is
  used in the document,) would it hurt to grow the encoded references
  such that they can encode urns? It might be sufficient to allow scheme
  without a host/port but a path.

  Apropos port: Why must this be present after a host name? This
  complicates round-tripping to URIs (because the algorithm has to know
  to fill in default ports of all protocols), limits applicability to
  portless protocols and adds some bytes -- and in the embedded
  implementation, adding a default port can be as easy as just setting
  it at initialization time before parsing.
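
To show what the mandatory port costs when round-tripping, here is a minimal Python sketch. The function names and the contents of the table are assumptions for illustration (the port numbers themselves are the registered defaults for these four schemes); nothing here is taken from the draft.

```python
# Default ports the conversion code has to know about.
DEFAULT_PORTS = {"http": 80, "https": 443, "coap": 5683, "coaps": 5684}

def authority_for_uri(scheme, host, port):
    """Render host[:port] for a URI, omitting the port when it is the scheme default."""
    if DEFAULT_PORTS.get(scheme) == port:
        return host
    return f"{host}:{port}"

def fill_port(scheme, host, port=None):
    """Supply the port that a format with a mandatory port option would require."""
    if port is None:
        # Raises KeyError for any scheme that is missing from the table,
        # including schemes that have no notion of a port at all.
        port = DEFAULT_PORTS[scheme]
    return host, port
```

Both directions depend on the same table, so every scheme missing from it is a scheme that cannot be converted faithfully.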

* Appendix C: A state transition diagram like the ones on
  http://json.org/ could be as expressive as the list in C.3 while being
  easier to grasp.

* How are compressed URIs expressed in CBOR? Flat [6, "foo", 6, "bar"]
  or nested [[6, "foo"], [6, "bar"]]? My first reading (and current
  implementation) resulted in "nested", supported by the `(option,
  value) = href[0]` line in C.4, but reading C.1 with the CDDL spec next
  to it (and running the cddl tool) rather indicates "flat" to me.

  Either way, some examples would be great.
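
For comparison, here is a minimal Python sketch of both readings. The encoder is a deliberately tiny hand-written subset of CBOR (small unsigned integers, short text strings, short arrays) rather than a real library, and the helper names are hypothetical.

```python
def encode(item):
    """Encode a tiny subset of CBOR: uints below 24, short text strings, short arrays."""
    if isinstance(item, int):
        assert 0 <= item < 24
        return bytes([item])                      # major type 0, value in the initial byte
    if isinstance(item, str):
        raw = item.encode("utf-8")
        assert len(raw) < 24
        return bytes([0x60 | len(raw)]) + raw     # major type 3 (text string)
    if isinstance(item, list):
        assert len(item) < 24
        return bytes([0x80 | len(item)]) + b"".join(encode(x) for x in item)  # major type 4
    raise TypeError(f"unsupported item: {item!r}")

def flat_pairs(href):
    """Read a flat option sequence as (option, value) pairs."""
    return list(zip(href[0::2], href[1::2]))

flat = [6, "foo", 6, "bar"]
nested = [[6, "foo"], [6, "bar"]]

flat_bytes = encode(flat)      # 84 06 63 66 6f 6f 06 63 62 61 72
nested_bytes = encode(nested)  # 82 82 06 63 66 6f 6f 82 06 63 62 61 72
```

In this example the flat form takes 11 bytes and the nested form 13, one extra byte per option for the inner array header.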

* Nit: calling a sequence of options _absolute_ brings up the
  association of the "absolute URI", which is not the same as "URI"
  (which is by definition not a reference, but sometimes it helps to say
  "full URI"). An absolute URI would not have a fragment identifier (and
  is defined to ease defining protocols where the fragment is never
  sent), whereas the non-relative or full option sequence described here
  can easily (and should be able to) contain a fragment identifier.
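
The distinction can be sketched in a few lines of Python, assuming the RFC 3986 sense of "absolute URI" (a scheme and no fragment). The function name is hypothetical and the check is intentionally rough.

```python
from urllib.parse import urlsplit

def is_absolute_uri(uri):
    """Rough check for an absolute URI: it has a scheme and no fragment part."""
    return bool(urlsplit(uri).scheme) and "#" not in uri

full_with_fragment = "http://example.com/a#frag"  # a full URI, but not an absolute URI
full_without_fragment = "http://example.com/a"    # both a full URI and an absolute URI
relative_reference = "/a"                         # neither
```

A full option sequence as described in the draft corresponds to the first two cases, which is why "absolute" is a misleading name for it.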

* Is there a difference between a CoRAL registry and a CoRAL profile?
  Both terms are used.

Best regards
Christian

-- 
To use raw power is to make yourself infinitely vulnerable to greater powers.
  -- Bene Gesserit axiom
