LC comments on draft-laurie-pki-sunlight-05

=JeffH <> Tue, 01 January 2013 21:50 UTC

Date: Tue, 01 Jan 2013 13:50:27 -0800
From: =JeffH <>
To:, IETF Discussion List <>
Subject: LC comments on draft-laurie-pki-sunlight-05
Cc: Emilia Kasper <>, Ben Laurie <>, Adam Langley <>


Here are some last call comments on draft-laurie-pki-sunlight-05.

Overall the spec is in reasonable shape, but I do have some substantive 
comments that, if I'm not totally misunderstanding things (which could be the 
case), ought to be discussed and addressed in some fashion.

The plain overall comments are to some degree "take 'em or leave 'em" depending 
upon folks' sense of urgency to get the spec through the IETF pipeline, but the 
degree likely depends upon the observer.

I hope this is helpful,


comments on draft-laurie-pki-sunlight-05

substantive comments (in somewhat arbitrary order)

1. The client messages in S4 don't explicitly lay out the syntax for request 
messages or responses. E.g., for S4.1 "Add Chain to Log", is the input a 
stand-alone JSON text array, or a JSON text object containing a JSON text array?

The term "JSON object" as used in the first paragraph is ambiguous and perhaps 
what is meant is simply "JSON texts" or "JSON text objects or JSON text arrays". 
RFC4627 clearly defines "JSON text", and should be cited. But RFC4627 is a 
little ambiguous itself regarding "JSON object" and so I suggest these definitions:

     JSON text object:   A JSON text matching the "object" ABNF production
        in Section 2.2 of [RFC4627].

     JSON text array:   A JSON text matching the "array" ABNF production
        in Section 2.3 of [RFC4627].

Also, the syntax for GETs isn't fully specified. Are the URL parameters to be 
encoded as (order-independent) key-value pairs, or just as order-dependent 
values?  Which separator character is to be used between parameters? RFC3986 
should be cited.

Examples for both JSON text inputs and outputs, as well as URL parameters would 
be helpful.
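
To make the two open questions concrete, here is a minimal sketch of one 
possible answer: the POST body is a JSON text object whose "chain" member is a 
JSON text array, and GET parameters are order-independent key=value pairs 
joined by "&". The host name log.example, the operation name 
"get-sth-consistency", and the parameter names "first" and "second" are 
hypothetical placeholders, not taken from the spec; only the "chain" input 
name comes from S4.1.

```python
import base64
import json
from urllib.parse import urlencode

LOG_URL = "https://log.example/ct/v1"  # hypothetical <log server> prefix


def add_chain_body(chain_der):
    """Encode a chain as a JSON text object containing a JSON text array."""
    encoded = [base64.b64encode(cert).decode("ascii") for cert in chain_der]
    return json.dumps({"chain": encoded})


def get_url(operation, params):
    """Build a GET URL from key=value pairs; sorting makes the output stable."""
    return f"{LOG_URL}/{operation}?{urlencode(sorted(params.items()))}"


# Placeholder byte strings stand in for DER-encoded certificates.
body = add_chain_body([b"leaf-cert-der", b"intermediate-cert-der"])
url = get_url("get-sth-consistency", {"second": 7, "first": 4})
print(body)
print(url)
```

If the spec instead intends a stand-alone JSON text array, the body would be 
json.dumps(encoded) with no enclosing object, which is exactly the ambiguity 
raised above.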

2. "4. Client Messages" doesn't define error handling, i.e., responses to inputs 
the log service doesn't understand and/or is unable to parse, and/or have other 
errors. If the log service is to simply return a 4xx or 5xx error code, this 
should at least be mentioned.

3. There appear to be three defined methods for TLS servers to provide TLS 
clients with CT data, in S3.2.  For this experiment, which approach is mandatory 
to implement for servers and clients?  Or, is it the case that participating TLS 
clients (ie web browsers etc) implement all three methods, and TLS servers can 
choose any of them?

Also, S3.2 probably doesn't belong in S3 and perhaps should be a separate 
top-level section on its own, and have three subsections, one for each method.

4. "Leaf Hash" as used in S4.5 appears to be formally undefined. It apparently 
would be:

       SHA-256(0x00 || MerkleTreeLeaf)

..and this should also be noted in S3.3.
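
A minimal sketch of that apparent definition, assuming the input is the 
already-serialized MerkleTreeLeaf structure as a byte string (the 
serialization itself is not shown):

```python
import hashlib


def leaf_hash(merkle_tree_leaf: bytes) -> bytes:
    """Return SHA-256(0x00 || MerkleTreeLeaf) as 32 raw bytes."""
    return hashlib.sha256(b"\x00" + merkle_tree_leaf).digest()


# Placeholder bytes, not a real serialized MerkleTreeLeaf.
print(leaf_hash(b"example-leaf").hex())
```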

5. The recursive equations in S2.1 describe how to calculate a Merkle Tree Hash 
(MTH) (aka "root hash"), and thus as a side effect generate a Merkle Tree, for a 
given set of input data. However, there doesn't seem to be a defined algorithm 
(or even hints, really) for adding further inputs to an existing tree. Even 
though this may be reasonably left as an exercise for implementers, it should 
probably be discussed to some degree in the spec. E.g., note that leaf hashes 
are "frozen" and various interior tree node hashes become "frozen" as the tree 
grows. Is it not sub-optimal to employ the obvious default brute-force mechanism 
of rebuilding a tree entirely from scratch when new inputs are available?  Would 
not a recursive algorithm for adding new inputs to an existing tree be 
straightforward to provide?
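
As an illustration that such an algorithm is short, here is a sketch (not 
taken from the spec) that keeps only the hashes of the "frozen" complete 
subtrees and merges them as entries arrive. The class name IncrementalTree and 
its methods are hypothetical. The brute-force recursive mth() is included only 
to check that both approaches yield the same root hash.

```python
import hashlib


def leaf_hash(d):
    return hashlib.sha256(b"\x00" + d).digest()


def node_hash(left, right):
    return hashlib.sha256(b"\x01" + left + right).digest()


class IncrementalTree:
    def __init__(self):
        # (hash, leaf_count) of frozen complete subtrees, largest first.
        self.frozen = []

    def append(self, entry):
        self.frozen.append((leaf_hash(entry), 1))
        # Two adjacent complete subtrees of equal size freeze into their parent.
        while len(self.frozen) >= 2 and self.frozen[-1][1] == self.frozen[-2][1]:
            right, size = self.frozen.pop()
            left, _ = self.frozen.pop()
            self.frozen.append((node_hash(left, right), 2 * size))

    def root(self):
        if not self.frozen:
            return hashlib.sha256(b"").digest()
        # Fold right to left: smaller subtrees hang off the right-hand side.
        acc = self.frozen[-1][0]
        for subtree_hash, _ in reversed(self.frozen[:-1]):
            acc = node_hash(subtree_hash, acc)
        return acc


def mth(entries):
    """Brute-force recursive Merkle Tree Hash, for comparison only."""
    n = len(entries)
    if n == 0:
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(entries[0])
    k = 1 << ((n - 1).bit_length() - 1)  # largest power of two smaller than n
    return node_hash(mth(entries[:k]), mth(entries[k:]))


tree = IncrementalTree()
entries = [bytes([i]) for i in range(7)]
for e in entries:
    tree.append(e)
print(tree.root() == mth(entries))
```

Each append does O(log n) hashing and the tree keeps O(log n) hashes, versus 
rehashing every entry when rebuilding from scratch.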

6. Signed tree heads (STHs) are denoted in terms of "tree size" (number of 
entries), but SCTs are denoted in terms of a timestamp.  Should there be a log 
client message supporting the return of the nearest STH (and thus tree size) to 
a given timestamp?
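
A sketch of what a log could do to answer such a query, under one possible 
reading of "nearest": return the earliest STH issued at or after the given 
timestamp. The function name and the (timestamp, tree_size) tuple layout are 
hypothetical; nothing like this is defined in the spec.

```python
import bisect


def first_sth_at_or_after(sths, timestamp):
    """sths: list of (timestamp, tree_size) tuples sorted by timestamp.

    Returns the earliest STH whose timestamp is >= the given one, or None.
    """
    timestamps = [ts for ts, _ in sths]
    i = bisect.bisect_left(timestamps, timestamp)
    return sths[i] if i < len(sths) else None


published = [(100, 3), (200, 5), (300, 9)]  # made-up values for illustration
print(first_sth_at_or_after(published, 150))
```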

7. S3 paragraph 2 states that "TLS clients MUST reject certificates that do not 
have a valid SCT for the end-entity certificate" (i.e., hard-fail).  Presumably 
this requirement is only for TLS clients participating in the CT experiment and 
that understand this protocol. This, or whatever the requirement actually is, 
should be further explained.

For example, does the simple presence of SCT(s) in the TLS handshake serve to 
signal to participating TLS clients that hard-fail is expected if there are any 
issues with CT validation?

8. The spec implies, but doesn't clearly describe, especially in S3.1, that the 
hashes are "labels" for tree entries, and that given a leaf hash, the log 
implementation should be able to look up and present the LogEntry data defined 
in that section.

9. Validating an SCT presumably requires having the Log Service's public key, 
yes?  This isn't clearly discussed, and the statement that how one obtains a 
log service's public key is out of scope is buried in the 2nd para of S4 -- it 
should be discussed in a separate, clearly titled subsection.

10. Unless I'm totally missing it, there isn't an explicit description of how 
one (eg a TLS client) goes about validating/verifying an SCT.

Various overall comments:

O-1. The phrase "this experiment" is used in S2.1 -- should describing this as 
an experiment be more explicitly done in the abstract and introduction sections? 
  What about the document title?

O-2. Should explicitly say in abstract and introduction that operationally, the 
logs are to be materialized as (experimental?) network services having the 
protocol operations for submissions and queries that are defined in this spec.

O-3. The bare term "client" is used in various places where either the term "log 
client" or "TLS client" is being implied -- these should be made explicit. Also 
the roles of log clients and TLS clients should be more thoroughly 
presented/explored, in part because they can intersect.

O-4. These things seem to be duplicate names:

      "root hash" and "Merkle Tree Hash (MTH)"

      "Tree Head Signature" and "Signed Tree Head (STH)"

..which makes parsing the spec more difficult than if one name is used 
consistently for each.

O-5. The terms "leaf certificate" and "final certificate" appear to be used where..

   End Entity certificate

   final End Entity certificate

..would be clearer and more consistent with TLS and PKIX terminology, and 
perhaps less confusing with the terms "leaf" and "leaf node" which are used when 
discussing the Merkle trees and their components.

O-6. I found the "history tree" paper (aka "[1]", cited here as [CrosbyWallach]) 
helpful in understanding how such trees are constructed; perhaps it should be 
more prominently mentioned. Plus the differences between the two algorithms 
should perhaps be more explicitly mentioned. E.g., in [CrosbyWallach] a version-n 
tree stores n+1 inputs, while in CT a version-n tree (D[n]) stores n inputs.

[CrosbyWallach]  <>

O-7. The note mentioning "dummy leaves" in [CrosbyWallach] seems misleading. The 
difference is AFAICT that in [CrosbyWallach] all nodes at layer 1 and above 
(leaf entries are at layer 0), are "interior nodes", and have hashes created 
using 0x01. Thus in a tree with an odd number of entries (ie leaf nodes at layer 
0), there will be one leaf node under an interior node having only that one 
child. It's not that there is a "dummy leaf", it's that such an interior node's 
hash is constructed from just one child rather than two.

While in CT, if the input set is an odd number of entries, then the hash of the 
final single leaf is at layer 1, and is calculated as a leaf hash using 0x00. 
Thus CT "interior nodes" always have two children, but if the tree has an odd 
number of entries, the rightmost hash at layer 1 ("j" in the "binary Merkle tree 
with 7 leaves" figure) is a leaf node hash rather than an interior node hash.
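
The CT side of this can be checked with a small sketch that builds the 7-leaf 
tree by hand and compares it with the recursive definition. Only the label "j" 
comes from the figure as cited above; the other variable names are 
assumptions made for illustration.

```python
import hashlib


def leaf_hash(d):
    return hashlib.sha256(b"\x00" + d).digest()


def node_hash(left, right):
    return hashlib.sha256(b"\x01" + left + right).digest()


def mth(entries):
    n = len(entries)
    if n == 0:
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(entries[0])
    k = 1 << ((n - 1).bit_length() - 1)  # largest power of two smaller than n
    return node_hash(mth(entries[:k]), mth(entries[k:]))


d = [bytes([i]) for i in range(7)]        # seven placeholder entries
lh = [leaf_hash(x) for x in d[:6]]        # leaf hashes of the first six entries
j = leaf_hash(d[6])                       # lone last entry: still a 0x00 leaf hash
pairs = [node_hash(lh[0], lh[1]), node_hash(lh[2], lh[3]), node_hash(lh[4], lh[5])]
left_half = node_hash(pairs[0], pairs[1])  # covers d(0)..d(3)
right_half = node_hash(pairs[2], j)        # two children: an interior hash and j
root = node_hash(left_half, right_half)
print(root == mth(d))
```

Every 0x01 hash above takes exactly two children, and j sits beside an 
interior node hash rather than under a one-child interior node.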

O-8. [CrosbyWallach] discusses auditing and gossiping and could be cited as a 
source for further discussion on those topics.

O-9. The notion of "commitments" isn't well defined, and where the spec says 
"add a commitment to D[k:n]", couldn't "add an interior/intermediate node to 
D[k:n]" be used?

Is not the term "commitment" used in [CrosbyWallach] equivalent to the 
sha256_root_hash (an STH component) in the spec?

[CrosbyWallach] uses the term "interior node(s)" while the spec uses 
"intermediate nodes" (in one place).

O-10. The recursive algorithms in S2 are dense and take effort to work through. 
Perhaps adding simple example code (in an appendix) which implements the 
algorithms, and/or actually working through them to arrive at some of the audit 
paths and consistency proofs in S2.1.3, would be helpful.
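
As a sketch of the kind of example code meant here, the following computes a 
Merkle audit path and recomputes the root hash from it. It assumes the 
recursive definition in S2.1.1 reads as: PATH(m, {d(0)}) is empty, and for 
n > 1 the path recurses into the half containing entry m and then appends the 
MTH of the other half. The function names are hypothetical.

```python
import hashlib


def leaf_hash(d):
    return hashlib.sha256(b"\x00" + d).digest()


def node_hash(left, right):
    return hashlib.sha256(b"\x01" + left + right).digest()


def split(n):
    return 1 << ((n - 1).bit_length() - 1)  # largest power of two smaller than n


def mth(entries):
    n = len(entries)
    if n == 0:
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(entries[0])
    k = split(n)
    return node_hash(mth(entries[:k]), mth(entries[k:]))


def audit_path(m, entries):
    """Sibling hashes for entry m, ordered from the leaf up to the root."""
    n = len(entries)
    if n == 1:
        return []
    k = split(n)
    if m < k:
        return audit_path(m, entries[:k]) + [mth(entries[k:])]
    return audit_path(m - k, entries[k:]) + [mth(entries[:k])]


def root_from_path(m, n, leaf, path):
    """Recompute the root hash of an n-entry tree from one leaf hash and its path."""
    if n == 1:
        if path:
            raise ValueError("audit path too long")
        return leaf
    if not path:
        raise ValueError("audit path too short")
    k = split(n)
    if m < k:
        return node_hash(root_from_path(m, k, leaf, path[:-1]), path[-1])
    return node_hash(path[-1], root_from_path(m - k, n - k, leaf, path[:-1]))


d = [bytes([i]) for i in range(7)]  # seven placeholder entries
path = audit_path(3, d)
print(root_from_path(3, len(d), leaf_hash(d[3]), path) == mth(d))
```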

I desk-checked S2.1, and it seems correct, but didn't do S2.1.1 or S2.1.2.  The 
examples in S2.1.3 appear nominally correct but I didn't desk check them.

Should there be a reference to 
<> ?   And/or a note 
regarding available code and to contact the authors for more information? (as is 
done in RFC 2154)

O-11. S3.3 should mention the Maximum Merge Delay (MMD) where it says 
"periodically append". Also, in S3.3, should "Signed Merkle Tree Update" be 
"Tree Head Signature" aka "signed tree head (STH)"?

O-12. S3.3, S3.4, S4.4, S4.5, and S4.7 mention the notion of logs "publishing" 
STHs, but no mechanism is described for explicitly "publishing". Is this meant 
to mean only that a "published" STH is available for retrieval by clients using 
the "Retrieve Latest Signed Tree Head" log client message?

Or, would there be a use case, eg introducing an existing log service to a log 
monitor, for requesting (or being able to enumerate) all published STHs from the 

O-13. signed tree heads (STHs) are denoted in terms of "tree size" (number of 
entries), but SCTs are denoted in terms of a timestamp.  Should there be a log 
client message supporting the return of the nearest STH (and thus tree size) to 
a given timestamp?

O-14. Detailed comments on S2...

> 2. Cryptographic components
> 2.1. Merkle Hash Trees
>    Logs use a binary Merkle hash tree for efficient auditing.  The
>    hashing algorithm is SHA-256 (note that this is fixed for this
>    experiment but it is anticipated that each log would be able to
>    specify a hash algorithm).  The input to the Merkle tree hash is a
>    list of data entries; these entries will be hashed to form the leaves
>    of the Merkle hash tree.  The output is a single 32-byte root hash.
>    Given an ordered list of n inputs, D[n] = {d(0), d(1), ..., d(n-1)},
>    the Merkle Tree Hash (MTH) is thus defined as follows:
>    The hash of an empty list is the hash of an empty string:
>    MTH({}) = SHA-256().

This MTH({}) construct doesn't appear to be used anywhere else in the spec 
(yes?), and so does it really need mentioning?

>    The hash of a list with one entry is:
>    MTH({d(0)}) = SHA-256(0x00 || d(0)).

The immediately above equation is for leaf entries (yes?), where in this 
notation n = 1, perhaps it should be stated explicitly:

     When n = 1, a leaf entry is denoted, and D[1] = {d(0)}. The leaf hash
     (LH) for a leaf entry is calculated as:

     MTH(D[1]) = LH(D[1]) = SHA-256( 0x00 || d(0) )

>    For n > 1, let k be the largest power of two smaller than n.

The unqualified "power of two" phrase is arguably ambiguous.
Suggested rephrase for this where it occurs throughout section 2..

     For n > 1, let k be a number which is the largest power of two
     such that k = 2^i, 0 <= i < n, and k < n.
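
For what it's worth, the intended value of k is easy to pin down in code. A 
minimal sketch (the function name is hypothetical):

```python
def largest_power_of_two_below(n: int) -> int:
    """Return the largest k = 2^i with k < n; only defined for n > 1."""
    if n <= 1:
        raise ValueError("n must be greater than 1")
    return 1 << ((n - 1).bit_length() - 1)


for n in (2, 3, 4, 5, 8, 9):
    print(n, largest_power_of_two_below(n))
```

Note that k is strictly smaller than n, so for n = 8 the split is at k = 4, 
not 8.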

>    The Merkle Tree Hash of an n-element list D[n] is then defined
>    recursively as

The above statement applies to the combination of the n = 1 equation above and 
the equation below, and so should perhaps be moved up above the n = 1 equation.

>    MTH(D[n]) = SHA-256(0x01 || MTH(D[0:k]) || MTH(D[k:n])),
>    where || is concatenation and D[k1:k2] denotes the length (k2 - k1)
>    list {d(k1), d(k1+1),..., d(k2-1)}.

The above phrase doesn't parse well and is somewhat ambiguous, here it is 
extracted for clarity:

  "D[k1:k2] denotes the length (k2 - k1) list {d(k1), d(k1+1),..., d(k2-1)}"

How about rephrasing it along the lines of this:

     D[k1:k2] denotes a sublist {d(k1), d(k1+1),..., d(k2-1)}, having
     (k2 - k1) elements, of the original input list D[n]. When (k2 - k1)
     is 1, a leaf hash is calculated.
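
The sublist notation maps directly onto Python slices, so the three equations 
quoted above can be sketched as follows. This is an illustration of the 
definitions as quoted, not code from the spec.

```python
import hashlib


def mth(entries):
    """Merkle Tree Hash of a list of byte strings D[n]."""
    n = len(entries)
    if n == 0:
        return hashlib.sha256(b"").digest()                    # MTH({})
    if n == 1:
        return hashlib.sha256(b"\x00" + entries[0]).digest()   # leaf hash
    k = 1 << ((n - 1).bit_length() - 1)  # largest power of two smaller than n
    # entries[:k] is D[0:k] and entries[k:] is D[k:n].
    return hashlib.sha256(b"\x01" + mth(entries[:k]) + mth(entries[k:])).digest()


print(mth([b"a", b"b", b"c"]).hex())
```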

>                                         (Note that the hash calculation
>    for leaves and nodes differ.  This domain separation is required to
>    give second preimage resistance.)
>    Note that we do not require the length of the input list to be a
>    power of two.  The resulting Merkle tree may thus not be balanced,
>    however, its shape is uniquely determined by the number of leaves.
>    [This Merkle tree is essentially the same as the history tree [1]
>    proposal, except our definition omits dummy leaves.]

I suggest re-writing the first above Note along with the next paragraph in light 
of all above comments on S2 and [CrosbyWallach].

O-15.  Some comments on S3:

> 3. Log Format

this section isn't just about "format" of log - it's also about log 

>    Anyone can submit certificates to certificate logs for public
>    auditing, however, since certificates will not be accepted by clients
>    unless logged, it is expected that certificate owners or their CAs
>    will usually submit them.  A log is a single, ever-growing, append-
>    only Merkle Tree of such certificates.
>    When a valid certificate is submitted to a log, the log MUST
>    immediately return a Signed Certificate Timestamp (SCT).  The SCT is
>    the log's promise to incorporate the certificate in the Merkle Tree
>    within a fixed amount of time known as the Maximum Merge Delay (MMD).
>    If the log has previously seen the certificate, it MAY return the
>    same SCT as it returned before.

What if the submitted end entity cert is the same, but the certificate chain is 
different (yet valid)?

>                                     TLS servers MUST present an SCT from
>    one or more logs to the client together with the certificate.  TLS
>    clients MUST reject certificates that do not have a valid SCT for the
>    end-entity certificate.

[ see comment (7) above ]

>    Periodically, each log appends all its new entries to the Merkle
>    Tree, and signs the root of the tree.  Clients and auditors can thus

Should "Clients and auditors" actually be "TLS Clients, log monitors, and log 
auditors" ?

>    verify that each certificate for which an SCT has been issued indeed
>    appears in the log.

Add forward reference here to S4 and S5 ?

>                         The log MUST incorporate a certificate in its
>    Merkle Tree within the Maximum Merge Delay period after the issuance
>    of the SCT.
>    Logs MUST NOT impose any conditions on copying data retrieved from
>    the log.

s/copying data retrieved/retrieving or sharing data/

> 3.1. Log Entries
>    Anyone can submit a certificate to any log.  In order to enable
>    attribution of each logged certificate to its issuer, the log SHALL
>    publish a list of acceptable root certificates (this list might
>    usefully be the union of root certificates trusted by major browser
>    vendors).  Each submitted certificate MUST be accompanied by all
>    additional certificates required to verify the certificate chain up
>    to an accepted root certificate.  The root certificate itself MAY be
>    omitted from this list.
>    Alternatively, (root as well as intermediate) Certificate Authorities

Additionally?  which manner is the experiment going to operate, or is it TBD ?

>    may submit a certificate to logs prior to issuance.  To do so, a
>    Certificate Authority constructs a Precertificate by adding a special
>    critical poison extension (OID, whose
>    extnValue OCTET STRING contains ASN.1 NULL data (0x05 0x00)) to the
>    leaf TBSCertificate (this extension is to ensure that the

leaf == end entity ?

s/leaf certificate/end entity certificate/g   ?

>    Precertificate cannot be validated by a standard X.509v3 client), and
>    signing the resulting TBSCertificate [RFC5280] with either

>    o  a special-purpose (Extended Key Usage: Certificate Transparency,
>       OID Precertificate Signing Certificate.
>       The Precertificate Signing Certificate MUST be certified by the CA
>       certificate that will ultimately sign the leaf TBSCertificate

"sign the leaf TBSCertificate"  means to say "sign the actual 
issued-to-the-customer TBSCertificate component of the End Entity certificate" ?

>       (note that the log may relax standard validation rules to allow
>       this, so long as the final signed certificate will be valid),
>    o  or, the CA certificate that will sign the final certificate.

"final certificate" is the "issued-to-the-customer End Entity certificate" ?

>    Structure of the Signed Certificate Timestamp:

The SCT discussion here should probably be its own subsection.

>        enum { certificate_timestamp(0), tree_hash(1), 255 }
>          SignatureType;
>        enum { v1(0), 255 }
>          Version;
>          struct {
>              opaque key_id[32];
>          } LogID;
>          opaque CtExtensions<0..2^16-1>;
>    "key_id" is the SHA-256 hash of the log's public key, calculated over
>    the DER encoding of the key represented as SubjectPublicKeyInfo.

I'd place the above paragraph regarding "key_id" down below the 
SignedCertificateTimestamp definition.
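
For reference, the key_id computation described in the quoted paragraph can be 
sketched as below, assuming the caller already holds the log's public key as 
DER-encoded SubjectPublicKeyInfo bytes. Obtaining or parsing the key is not 
shown, and the function names are hypothetical.

```python
import base64
import hashlib


def log_key_id(spki_der: bytes) -> bytes:
    """SHA-256 over the DER-encoded SubjectPublicKeyInfo: the 32-byte key_id."""
    return hashlib.sha256(spki_der).digest()


def log_key_id_b64(spki_der: bytes) -> str:
    """The same key_id, base64 encoded as the "id" output of S4.1 would carry it."""
    return base64.b64encode(log_key_id(spki_der)).decode("ascii")


# Placeholder bytes only; this is not a real SubjectPublicKeyInfo.
print(log_key_id_b64(b"placeholder-spki-der"))
```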

>        struct {
>            Version sct_version;
>            LogID id;
>            uint64 timestamp;
>            CtExtensions extensions;
>            digitally-signed struct {
>                Version sct_version;
>                SignatureType signature_type = certificate_timestamp;
>                uint64 timestamp;
>                LogEntryType entry_type;
>                select(entry_type) {
>                    case x509_entry: ASN.1Cert;
>                    case precert_entry: ASN.1Cert;
>                } signed_entry;
>               CtExtensions extensions;
>            };
>        } SignedCertificateTimestamp;
>    The encoding of the digitally-signed element is defined in [RFC5246].

I would add a few words here summarizing that the digitally-signed struct is 
replaced in the actual serialized binary structure by a struct DigitallySigned, 
with a cross-ref to S4.7 of RFC5246.
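
A sketch of what that replacement looks like on the wire, following the 
DigitallySigned layout in S4.7 of RFC5246: a one-byte hash algorithm, a 
one-byte signature algorithm, and a signature with a two-byte length prefix. 
The function names are hypothetical, and the signature bytes in the example 
are placeholders.

```python
import struct


def encode_digitally_signed(hash_alg: int, sig_alg: int, signature: bytes) -> bytes:
    """Serialize: hash(1 byte) || signature alg(1 byte) || length(2 bytes) || signature."""
    if len(signature) > 0xFFFF:
        raise ValueError("signature exceeds opaque<0..2^16-1>")
    return struct.pack("!BBH", hash_alg, sig_alg, len(signature)) + signature


def decode_digitally_signed(data: bytes):
    """Parse the same layout; returns (hash_alg, sig_alg, signature)."""
    hash_alg, sig_alg, length = struct.unpack_from("!BBH", data)
    signature = data[4:4 + length]
    if len(signature) != length:
        raise ValueError("truncated signature")
    return hash_alg, sig_alg, signature


# 4 = sha256 and 3 = ecdsa in the RFC5246 registries.
blob = encode_digitally_signed(4, 3, b"\x01\x02")
print(blob.hex(), decode_digitally_signed(blob))
```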

> 3.2. Including the Signed Certificate Timestamp in the TLS Handshake

This should be its own top-level section as mentioned in comment (3).

>    The SCT data from at least one log must be included in the TLS
>    handshake, either by using an Authorization Extension [RFC5878] with
>    type 182, or by using OCSP Stapling (section 8 of [RFC6066]),

add to above sentence:

   or by embedding the SCT(s) in the presented End Entity cert,

>                                                                  where
>    the response includes an OCSP extension with OID
> (see [RFC2560]) and body:
>        SignedCertificateTimestampList ::= OCTET STRING
>    At least one SCT MUST be included.  Server operators MAY include more
>    than one SCT.
>    Similarly, a Certificate Authority MAY submit the precertificate to

s/the precertificate/a precertificate/

>    more than one log, and all obtained SCTs can be directly embedded in
>    the final certificate, by encoding the SignedCertificateTimestampList

s/final certificate/actual End Entity certificate/   ?

>    structure as an ASN.1 OCTET STRING and inserting the resulting data
>    in the TBSCertificate as an X.509v3 certificate extension (OID
>  Upon receiving the certificate, clients
>    can reconstruct the original TBSCertificate to verify the SCT
>    signature.

This last step of "clients can reconstruct the original TBSCertificate" probably 
should be more thoroughly explained.

O-15.  Some comments on S4:

> 4. Client Messages

title should be "Log Client Messages" ?

>    Messages are sent as HTTPS GET or POST requests.  Parameters for
>    POSTs and all responses are encoded as JSON objects.  Parameters for

s/JSON objects/JSON texts/

see <>  (it should be cited)

>    GETs are encoded as URL parameters.  Binary data is base64 encoded as
>    specified in the individual messages.
>    The <log server> prefix can include a path as well as a server name
>    and a port.  It must map one-to-one to a known public key (how this
>    mapping is distributed is out of scope for this document).

s/distributed/constructed and distributed/  ?

>    In general, where needed, the "version" is v1 and the "id" is the log
>    id for the log server queried.

> 4.1. Add Chain to Log
>    POST https://<log server>/ct/v1/add-chain
>    Inputs
>    chain  An array of base64 encoded certificates.  The first element is

a JSON text array?

>       the leaf certificate, the second chains to the first and so on to
>       the last, which is either the root certificate or a certificate
>       that chains to a known root certificate.
>    Outputs
>    sct_version  The version of the SignedCertificateTimestamp structure,
>       in decimal.  A compliant v1 implementation MUST NOT expect this to
>       be 0 (i.e. v1).
>    id The log ID, base64 encoded.  Since clients who request an SCT for

s/clients/log clients/  ?

>       inclusion in the TLS handshake are not required to verify it, we

s/the TLS handshake/subsequent TLS handshakes/   ?

>       do not assume they know the ID of the log.
>    timestamp  The SCT timestamp, in decimal.
>    extensions  An opaque type for future expansion.  It is likely that
>       not all participants will need to understand data in this field.
>       Logs should set this to the empty string.  Clients should decode
>       the base64 encoded data and include it in the SCT.
>    signature  The SCT signature, base64 encoded.

"The SCT signature" means a SignedCertificateTimestamp structure ?

>    If the "sct_version" is not v1, then a v1 client may be unable to
>    verify the signature.  It MUST NOT construe this as an error.  [Note:
>    log clients don't need to be able to verify this structure, only TLS
>    clients do - if we were to serve the structure binary, then we could
>    completely change it without requiring an upgrade to v1 clients].

Does this "if we were to serve the structure binary...."  statement mean to say 
that since v1 log clients don't need to be able to verify the SCT signature over 
the various returned data items, that this operation could instead return an 
opaque binary blob?

O-16.  Some comments on S5:

> 5. Clients
>    There are various different functions clients of logs might perform.

Perhaps this section should be entitled "Log Client Roles" ?

this section doesn't mention the role of a (CA) log client that submits "certs 
and cert chains" to logs. Even though the latter role is mentioned elsewhere in 
the spec it should perhaps be mentioned here also.

> 5.1. Monitor
>    Monitors watch logs and check that they behave correctly.  They also
>    watch for certificates of interest.

"Monitor" should be "Log Monitor" ?

> 5.2. Auditor
>    Auditors take partial information about a log as input and verify
>    that this information is consistent with other partial information
>    they have.  An auditor might be an integral component of a TLS
>    client, it might be a standalone service or it might be a secondary
>    function of a monitor.

"Auditor" should be "Log Auditor" ?

> 8. Efficiency Considerations
>    The Merkle tree design serves the purpose of keeping communication
>    overhead low.
>    Auditing logs for integrity does not require third parties to
>    maintain a copy of each entire log.  The Signed Tree Heads can be
>    updated as new entries become available, without recomputing entire
>    trees.  Third party auditors need only fetch the Merkle consistency
>    proofs against a log's existing STH to efficiently verify the append-
>    only property of updates to their Merkle Trees, without auditing the
>    entire tree.

The above could be explained in more detail, and S5.1 should be 
cross-referenced. Is the last sentence above essentially a summary of step #8 in 
S5.1? Or are there differences?