Re: LC comments on draft-laurie-pki-sunlight-05

=JeffH <> Tue, 22 January 2013 03:12 UTC

Return-Path: <>
Received: from localhost (localhost []) by (Postfix) with ESMTP id 7C35521F8588 for <>; Mon, 21 Jan 2013 19:12:02 -0800 (PST)
X-Virus-Scanned: amavisd-new at
X-Spam-Flag: NO
X-Spam-Score: -102.265
X-Spam-Status: No, score=-102.265 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, IP_NOT_FRIENDLY=0.334, USER_IN_WHITELIST=-100]
Received: from ([]) by localhost ( []) (amavisd-new, port 10024) with ESMTP id EeygjaoGuLlh for <>; Mon, 21 Jan 2013 19:11:59 -0800 (PST)
Received: from ( []) by (Postfix) with SMTP id C7B7921F85A2 for <>; Mon, 21 Jan 2013 19:11:51 -0800 (PST)
Received: (qmail 27349 invoked by uid 0); 22 Jan 2013 03:11:27 -0000
Received: from unknown (HELO ( by with SMTP; 22 Jan 2013 03:11:26 -0000
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;; s=default; h=Content-Transfer-Encoding:Content-Type:Subject:CC:To:MIME-Version:From:Date:Message-ID; bh=yoBVQjVkIn6KnMFBJ7sv9gF6m0/KqEfO9qov+J5ANWQ=; b=HB2hMVNSqDNBPY9m04qaCrxdjsPdzf5OPOIAI6jZ3VNBJ5whKH8OyuSl5NlbE/cRxQ8BXkbOG1OWSJzgxlIuCfrmmHWvvHT7RA6xnnOvHIMWyOBaAmjuy3y+F0GWFN+D;
Received: from [] (port=35782 helo=[]) by with esmtpsa (TLSv1:CAMELLIA256-SHA:256) (Exim 4.80) (envelope-from <>) id 1TxUGj-0008Uo-JY; Mon, 21 Jan 2013 20:11:25 -0700
Message-ID: <>
Date: Mon, 21 Jan 2013 19:11:24 -0800
From: =JeffH <>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130106 Thunderbird/17.0.2
MIME-Version: 1.0
To: Ben Laurie <>
Subject: Re: LC comments on draft-laurie-pki-sunlight-05
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
Content-Transfer-Encoding: 7bit
X-Identified-User: {} {sentby:smtp auth authed with}
Cc: Emilia Kasper <>, "" <>, IETF Discussion List <>, Adam Langley <>
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: IETF-Discussion <>
List-Unsubscribe: <>, <>
List-Archive: <>
List-Post: <>
List-Help: <>
List-Subscribe: <>, <>
X-List-Received-Date: Tue, 22 Jan 2013 03:12:02 -0000

apologies for latency, many meetings and a conference in the last couple of weeks.

BenL replied:
 > On 1 January 2013 21:50, =JeffH <> wrote:

[ in the below discussion:

  "the spec", "this spec" refers to draft-laurie-pki-sunlight-05.

"TLS-CT client"  refers to a TLS client capable of processing CT information 
that is included in the TLS handshake in any of the specified manners.

"ok" means in general: "ok, will check this in next rev of the spec..".


 >> comments on draft-laurie-pki-sunlight-05
 >> substantive comments (in somewhat arbitrary order)
 >> --------------------------------------------------

[ I demoted the comments wrt "JSON object" terminology and put them down at the 
end of this msg ]

 >> Also, the syntax for GETS isn't fully specified. Are the URL parameters to
 >> be encoded as (order independent) key-value pairs, or just as
 >> order-dependent values?  Which separator character is to be used between
 >> parameters? RFC3986 should be cited.
 > RFC 3986 says nothing about parameter format, though

correct, it doesn't, and I wasn't trying to imply that it did, sorry.  I was 
just trying to say that RFC3986 should be cited in this spec because this spec 
normatively employs URLs, but perhaps referencing RFC2616 HTTP is better because 
it defines "http_URL".

 >  - is there a
 > standard reference for that? I've refereced HTML 4.01, but perhaps
 > there's a better one?

hm, AFAICT, there is not a standard for URI query component formating and thus 
parameter encoding, so this spec will have to explicitly specify something. 
Section 3.4 of RFC3986 gives allowed chars for the query component, but that's 
about it.

Have you mocked up code that parses the log client messages? If so, what query 
component syntax does it handle?

 >> 2. "4. Client Messages" doesn't define error handling, i.e., responses to
 >> inputs the log service doesn't understand and/or is unable to parse, and/or
 >> have other errors. If the log service is to simply return a 4xx or 5xx error
 >> code, this should at least be mentioned.
 > For now, I will specify 4xx/5xx. We may have more to say once we've
 > gained some experience.


 >> 3. There appear to be three defined methods for TLS servers to provide TLS
 >> clients with CT data, in S3.2.  For this experiment, which approach is
 >> mandatory to implement for servers and clients?  Or, is it the case that
 >> participating TLS clients (ie web browsers etc) implement all three methods,
 >> and TLS servers can choose any of them?
 > The latter.

That should be made very clear. Is the reason for doing so to obtain operational 
experience wrt the three defined methods such that they perhaps can be narrowed 
down in the future, or is the expectation that TLS-CT clients will need to 
support all three methods in perpetuity?

 >> 4. "Leaf Hash" as used in S4.5 appears to be formally undefined. It
 >> apparently would be:
 >>       SHA-256(0x00 || MerkleTreeLeaf)
 >> should also be noted in S3.3.
 > You are right.


 >> 5. The recursive equations in S2.1 describe how to calculate a Merkle Tree
 >> Hash (MTH) (aka "root hash"), and thus as a side effect generate a Merkle
 >> Tree, for a given set of input data. However, there doesn't seem to be a
 >> defined algorithm (or even hints, really) for adding further inputs to an
 >> existing tree. Even though this may be reasonably left as an exercise for
 >> implementers, it should probably be discussed to some degree in the spec.
 >> E.g., note that leaf hashes are "frozen" and various interior tree node
 >> hashes become "frozen" as the tree grows. Is it not sub-optimal to employ
 >> the obvious default brute-force mechanism of rebuilding a tree entirely from
 >> scratch when new inputs are available?  Would not a recursive algorithm for
 >> adding new inputs to an existing tree be straightforward to provide?
 > I dunno about straightforward.

yeah, agreed.

 > I'll think about it.

ok. At least providing some hints would be useful it seems.

 >> 6. Signed tree heads (STHs) are denoted in terms of "tree size" (number of
 >> entries), but SCTs are denoted in terms of a timestamp.  Should there be a
 >> log client message supporting the return of the nearest STH (and thus tree
 >> size) to a given timestamp?
 > I'm not sure why? Any STH (that includes that SCT) will do.

Hm, it was sort of a gut feel that it might be useful, but perhaps not.

S5.2. Auditor says..

    A certificate accompanied by an SCT can be verified against any STH
    dated after the SCT timestamp + the Maximum Merge Delay by requesting
    a Merkle Audit Proof using Section 4.5.

S4.5 get-proof-by-hash stipulates tree_size as an input,  but
if a log auditor doesn't already have tree_size, then I suppose it first calls 
S4.3 get-sth, which will return a timestamp and a tree_size, which if generated 
max merge delay (MMD) after the SCT was gen'd, ought to be sufficient, yes?

I don't see in the spec where/how MMD is published.  Does MMD vary per log 
service?  The latter isn't stipulated in the spec it seems AFAICT ?

 >> 7. S3 paragraph 2 states that "TLS clients MUST reject certificates that do
 >> not have a valid SCT for the end-entity certificate" (i.e., hard-fail).
 >> Presummably this requirement is only for TLS clients participating in the CT
 >> experiment and that understand this protocol.
 > Of course - what other way could it be? In other words, all RFCs can
 > only say what implementations that conform with them do.
 >> This, or whatever the
 >> requirement actually is, should be further explained.

I guess what I was getting at is that CT-conformant TLS clients should be 
differentiated from non-CT-conformant ones. Stipulating a name for CT-conformant 
TLS clients would clarify this (seems to me), e.g. "TLS-CT client" or something 

 >> For example, does the simple presence of SCT(s) in the TLS handshake serve
 >> to signal to participating TLS clients that hard-fail is expected if there
 >> are any issues with CT validation?
 > We are not saying what action a client should take when it rejects a
 > certificate, as is, I believe the usual practice.


 >> 8. The spec implies, but doesn't clearly describe, especially in S3.1, that
 >> the hashes are "labels" for tree entries, and that given a leaf hash, the
 >> log implementation should be able to look up and present the LogEntry data
 >> defined in that section.
 > We actually only require an entry to be retrievable by hash (in
 > effect) for the message in 4.7, which we (at least currently) label as
 > a debugging message - so I am not sure that logs really are required
 > to be able to do that - certainly the system would work fine if they
 > couldn't, I believe (other than being unable to provide the debugging
 > data).

I think what I was getting at was that this is stated at the end of S3.2..

    The leaves of the Merkle Tree are the hashes of the corresponding
    "MerkleTreeLeaf" structures.

..but it is somewhat confusing because (abstractly) each leaf also contains (or 
has a reference to) the LogEntry data (that's defined back in S3.1)  ...if I 
understand this correctly.

 >> Various overall comments:
 >> --------------------------
 >> O-1. The phrase "this experiment" is used in S2.1 -- should describing this
 >> as an experiment be more explicitly done in the abstract and introduction
 >> sections?  What about the document title?
 > The plan is this will be an Experimental RFC, which seems clear enough to me?

well, lots of people don't perceive the difference between the various RFC 
tracks, and I'd be inclined to make it abundantly pedantically clear that this 
spec is documenting a particular actual experiment (modulo what the sponsoring 
AD thinks/advises (hi Stephen :) )

 >> O-3. The bare term "client" is used in various places where either the term
 >> "log client" or "TLS client" is being implied -- these should be made
 >> explicit. Also the roles of log clients and TLS clients should be more
 >> thoroughly presented/explored, in part because they can intersect.
 > Section 5 "Clients" is an attempt to document the various client
 > roles, one or more of which may be embodied by any particular client
 > implementation, rather than try to deal with this intersection.
 > I have gone over mentions of "client" elsewhere to try to deal with
 > your point, though. Sometimes it just means "any client" so not all
 > instances have been changed.


 >> O-4. These things seem to be duplicate names:
 >>      "root hash" and "Merkle Tree Hash (MTH)"
 > Yes,
 >>      "Tree Head Signature" and "Signed Tree Head (STH)"
 > These are not the same - you are probably confused by the
 > digitally-signed struct, which is only used as input for a signature
 > and never appears in its own right. However, we haven't been super
 > clear about what's going on here and I've tried to clean that up.


 >> O-6. I found the "history tree" paper (aka "[1]", cited here as
 >> [CrosbyWallach]) helpful in understanding how such trees are constructed,
 >> perhaps it should be more prominently mentioned. Plus the differences
 >> between the two algorithms should perhaps be more explicitly mentioned. E.g.
 >> in [CrosbyWallach] version-n tree stores n+1 inputs, while in CT a version-n
 >> tree (D[n]) stores n inputs.
 >> [CrosbyWallach]  <>
 >> O-7. The note mentioning "dummy leaves" in [CrosbyWallach] seems misleading.
 >> The difference is AFAICT that in [CrosbyWallach] all nodes at layer 1 and
 >> above (leaf entries are at layer 0), are "interior nodes", and have hashes
 >> created using 0x01. Thus in a tree with an odd number of entries (ie leaf
 >> nodes at layer 0), there will be one leaf node under an interior node having
 >> only that one child. It's not that there is a "dummy leaf", it's that such
 >> an interior node's hash is constructed from just one child rather than two.
 > Not sure I really agree with this, but in any case I have reworded it
 > slightly. I've also changed the reference to be a "proper" one, shown
 > at the end of the I-D.


 >> While in CT, if the input set is an odd number of entries, then the hash of
 >> the final single leaf is at layer 1, and is calculated as a leaf hash using
 >> 0x00. Thus CT "interior nodes" always have two children, but if the tree has
 >> an odd number of entries, the rightmost hash at layer 1 ("j" in the "binary
 >> Merkle tree with 7 leaves" figure) is a leaf node hash rather than an
 >> interior node hash.
 >> O-8. [CrosbyWallach] discusses auditing and gossiping and could be cited as
 >> a source for further discussion on those topics.
 >> O-9. The notion of "commitments" isn't well defined, and where "add a
 >> commitment to D[k:n]"  couldn't  "add an interior/intermediate node to
 >> D[k:n}"  be used?
 >> Is not the term "commitment" used in [CrosbyWallach] equivalent to the
 >> sha256_root_hash (an STH component) in the spec?
 >> [CrosbyWallach] uses the term "interior node(s)" while the spec uses
 >> "intermediate nodes" (in one place).
 > CT is not actually derived from [CrosbyWallach], we just mention it as
 > a useful reference.

yeah, it is useful, it would've been much harder to figure out the spec without 
it. Is there an existing paper that the spec is based more closely on?

The differences between the spec and [CrosbyWallach] doesn't seem to be that 
large, and so if there aren't more closely matching paper(s) to cite (?), it may 
be worth summarizing the differences between the spec and  [CrosbyWallach] e.g. 
in an appendix.

 > "commitment" is a term of cryptographic art.

so it is, but its definitions in the literature seem to vary somewhat depending 
on context.  IIUC, the spec uses "commitment" to refer to the hash labels of 
intermediate nodes ('interior nodes' in [CrosbyWallach]) as well as the root 
hash, yes?

In S2.1.1, where it says..

          We prove that the left subtree entries D[0:k] are consistent
    and add a commitment to D[k:n]:'s not clear just what "add a commitment" means. add it (an 
intermediate-node hash?) to the list of hashes that the recursive algorthm is 

Overall, I'm finding S2.1.1 pretty difficult to parse and sort out.

E.g., it isn't clear to me how/why/when the apparently boolean 3d parameter of 
SUBPROOF is either true or false, and whether there's an implied "if b then ... 
else ..."  in there somewhere.

 >> O-10. The recursive algorithms in S2 are dense and take effort to work
 >> through, perhaps adding simplistic example code (in an appendix) which
 >> implements, and/or actually working through the algorithms to arrive at some
 >> of the audit paths and consistency proofs in S2.1.3, would be helpful.
 > We have actual working code - would a reference to that be better?

I don't know if it would be better (I've compiled it and am poking through the 
code). I sorta think pseudo code in an appendix that would generate the trees in 
S2.1.3. would be helpful.

 >> I desk-checked S2.1, and it seems correct, but didn't do S2.1.1 or S2.1.2.
 >> The examples in S2.1.3 appear nominally correct but I didn't desk check
 >> them.
 >> Should there be a reference to
 >> <> ?   And/or a note
 >> regarding available code and to contact the authors for more information?
 >> (as is done in RFC 2154)
 > I couldn't find such a thing in RFC 2154.

see end of S2 on page 4 of RFC2154.

 >  My only concern about such a
 > reference is whether it would live as long as the RFC does :-)

hence the desire to have pseudo code in spec :)

 >> O-11. S3.3 should mention the Maximum Merge Delay MMD where it says
 >> "periodically append". Also, in S3.3, "Signed Merkle Tree Update" should be
 >> a "Tree Head Signature" aka "signed tree head (STH)"?
 > I just removed that, as it is immediately repeated in the next section.


 >> O-12. S3.3, S3.4, S4.4, S4.5, and S4.7 mention the notion of logs
 >> "publishing" STHs, but no mechanism is described for explicitly
 >> "publishing". Is this meant to mean only that a "published" STH is available
 >> for retrieval by clients using the "Retrieve Latest Signed Tree Head" log
 >> client message?
 >> Or, would there be a use case, eg introducing an existing log service to a
 >> log monitor, for requesting (or being able to enumerate) all published STHs
 >> from the log?
 > I don't believe there is: the latest STH tells you everything you need
 > to know at that point. I have removed mention of publishing.


 >> O-14. Detailed comments on S2...
 >> ------------------------------
 >>> 2. Cryptographic components
 >>> 2.1. Merkle Hash Trees
 >>>    Logs use a binary Merkle hash tree for efficient auditing.  The
 >>>    hashing algorithm is SHA-256 (note that this is fixed for this
 >>>    experiment but it is anticipated that each log would be able to
 >>>    specify a hash algorithm).  The input to the Merkle tree hash is a
 >>>    list of data entries; these entries will be hashed to form the leaves
 >>>    of the Merkle hash tree.  The output is a single 32-byte root hash.
 >>>    Given an ordered list of n inputs, D[n] = {d(0), d(1), ..., d(n-1)},
 >>>    the Merkle Tree Hash (MTH) is thus defined as follows:
 >>>    The hash of an empty list is the hash of an empty string:
 >>>    MTH({}) = SHA-256().
 >> This MTH({}) construct doesn't appear to be used anywhere else in the spec
 >> (yes?), and so does it really need mentioning?
 > If it is not defined, then we cannot represent an empty tree.

yeah, I agree from a mathematical completeness perspective.  but I still found 
it sort of distracting in that I don't know that it's actually germane from an 
implementor's perspective. maybe it is because

 >>>    The hash of a list with one entry is:
 >>>    MTH({d(0)}) = SHA-256(0x00 || d(0)).
 >> The immediately above equation is for leaf entries (yes?),
 > Yes.
 >> where in this
 >> notation n = 1, perhaps it should be stated explicitly:
 >>     When n = 1, a leaf entry is denoted, and D[1] = {d(0)}. The leaf hash
 >>     (LH) for a leaf entry is calculated as:
 >>     MTH(D[1]) = LH(D[1]) = SHA-256( 0x00 || d(0) )
 > Ugh. LH(D[1]) seems meaningless to me. A leaf hash is always of a "1
 > entry tree".

yeah, for those who have grokked this stuff. for others it isn't immediately 
obvious. also,  the above comment was motivated in part due to the missing 
formal definition for "Leaf Hash" noted in item (4) above. even if the LH(D[1]) 
notation isn't used, some additional prose along the lines suggested above would 
be helpful to less initiated readers imv.

 >>>    For n > 1, let k be the largest power of two smaller than n.
 >> The unqualified "power of two" phrase is arguably ambiguous.
 > It is?

uh, yeah.

but "let k be a number which is the largest power of two such that..." isn't.

 >> Suggested rephrase for this where it occurs throughout section 2..
 >>     For n > 1, let k be a number which is the largest power of two
 >>     such that k = 2^i, 0 <= i < n, and k < n.
 > If we're going to go down that path, then it should say:
 > For n > 1, let k be the largest number such that k = 2^i and k < n.
 > or
 > For n > 1, let k = 2^i s.t. k < n and 2k >= n.
 > surely?

well, yes (i prefer the former), but i think it's also important to explicitly 
state 0 <= i < n

 >>>    The Merkle Tree Hash of an n-element list D[n] is then defined
 >>>    recursively as
 >> The above statement applies to the combination of the n = 1 equation above
 >> and the equation below, and so should perhaps be moved up above the n = 1
 >> equation.
 > ? It says n > 1, so doesn't apply to n = 1?

oh yeah huh.

in any case, I found the prose separation of the "list with one entry" and the 
"n > 1" case to be confusing because the former is part of the recursive 
definition of MTH(), yes?

 >>>    MTH(D[n]) = SHA-256(0x01 || MTH(D[0:k]) || MTH(D[k:n])),
 >>>    where || is concatenation and D[k1:k2] denotes the length (k2 - k1)
 >>>    list {d(k1), d(k1+1),..., d(k2-1)}.
 >> The above phrase doesn't parse well and is somewhat ambiguous, here it is
 >> extracted for clarity:
 >>  "D[k1:k2] denotes the length (k2 - k1) list {d(k1), d(k1+1),..., d(k2-1)}"
 >> How about rephrasing it along the lines of this:
 >>     D[k1:k2] denotes a sublist {d(k1), d(k1+1),..., d(k2-1)}, having
 >>     (k2 - k1) elements, of the original input list D[n]. When (k2 - k1)
 >>     is 1, a leaf hash is calculated.
 > We tried lots of different ways of saying this and they were all a
 > little messy. Yours mixes concerns and is rather verbose, so not
 > convinced it is actually an improvement.

well, the way it's presently said in the spec is (to me) pretty darn hard to 

which concerns are missed in the suggested reformulation?  I tend to think 
concocting something not so terse would be a service to the reader/implementer.

 >>                                          (Note that the hash calculation
 >>>    for leaves and nodes differ.  This domain separation is required to
 >>>    give second preimage resistance.)
 >>>    Note that we do not require the length of the input list to be a
 >>>    power of two.  The resulting Merkle tree may thus not be balanced,
 >>>    however, its shape is uniquely determined by the number of leaves.
 >>>    [This Merkle tree is essentially the same as the history tree [1]
 >>>    proposal, except our definition omits dummy leaves.]
 >> I suggest re-writing the first above Note along with the next paragraph in
 >> light of all above comments on S2 and [CrosbyWallach].
 > It was already partly rewritten as a result of above comments, so
 > let's see how you like the next version?


 >> O-15.  Some comments on S3:
 >> ------------------------------------
 >>> 3. Log Format
 >> this section isn't just about "format" of log - it's also about log
 >> behavior/operation
 > Good point.
 >>>    Anyone can submit certificates to certificate logs for public
 >>>    auditing, however, since certificates will not be accepted by clients
 >>>    unless logged, it is expected that certificate owners or their CAs
 >>>    will usually submit them.  A log is a single, ever-growing, append-
 >>>    only Merkle Tree of such certificates.
 >>>    When a valid certificate is submitted to a log, the log MUST
 >>>    immediately return a Signed Certificate Timestamp (SCT).  The SCT is
 >>>    the log's promise to incorporate the certificate in the Merkle Tree
 >>>    within a fixed amount of time known as the Maximum Merge Delay (MMD).
 >>>    If the log has previously seen the certificate, it MAY return the
 >>>    same SCT as it returned before.
 >> What if the submitted end entity cert is the same, but the certificate chain
 >> is different (yet valid)?
 > The purpose of the chain is to:
 > a) Prevent spam, and
 > b) Identify who to blame in the event of a misissue.
 > Alternate chains presumably don't actually change the direct blame,
 > and so I see no reason to do other than what the I-D says - i.e.
 > return the same SCT as before.

ok. tho i wonder if there's any value in caching the alternate chain or the new 
portions thereof.

 >>>                                     TLS servers MUST present an SCT from
 >>>    one or more logs to the client together with the certificate.  TLS
 >>>    clients MUST reject certificates that do not have a valid SCT for the
 >>>    end-entity certificate.
 >> [ see comment (7) above ]
 >>>    Periodically, each log appends all its new entries to the Merkle
 >>>    Tree, and signs the root of the tree.  Clients and auditors can thus
 >> Should "Clients and auditors" actually be "TLS Clients, log monitors, and
 >> log auditors" ?
 > Bearing in mind that these are actually roles rather than distinct
 > entities, it should probably just say "auditors".

I dunno, that may loose some readers.  in any case, the distinction between 
roles and distinct entities should probably be mentioned/discussed e.g. in the 
Introduction or thereabouts.


 >>> 3.1. Log Entries
 >>>    Anyone can submit a certificate to any log.  In order to enable
 >>>    attribution of each logged certificate to its issuer, the log SHALL
 >>>    publish a list of acceptable root certificates (this list might
 >>>    usefully be the union of root certificates trusted by major browser
 >>>    vendors).  Each submitted certificate MUST be accompanied by all
 >>>    additional certificates required to verify the certificate chain up
 >>>    to an accepted root certificate.  The root certificate itself MAY be
 >>>    omitted from this list.
 >>>    Alternatively, (root as well as intermediate) Certificate Authorities
 >> Additionally?  which manner is the experiment going to operate, or is it TBD
 >> ?
 > Not sure what you mean? The log will accept either type of submission.

oh, sorry, I meant s/Alternatively/Additionally/

 >>>    may submit a certificate to logs prior to issuance.  To do so, a
 >>>    Certificate Authority constructs a Precertificate by adding a special
 >>>    critical poison extension (OID, whose
 >>>    extnValue OCTET STRING contains ASN.1 NULL data (0x05 0x00)) to the
 >>>    leaf TBSCertificate (this extension is to ensure that the
 >> leaf == end entity ?
 >> s/leaf certificate/end entity certificate/g   ?
 > Yes.
 >>>    Precertificate cannot be validated by a standard X.509v3 client), and
 >>>    signing the resulting TBSCertificate [RFC5280] with either
 >>>    o  a special-purpose (Extended Key Usage: Certificate Transparency,
 >>>       OID Precertificate Signing Certificate.
 >>>       The Precertificate Signing Certificate MUST be certified by the CA
 >>>       certificate that will ultimately sign the leaf TBSCertificate
 >> "sign the leaf TBSCertificate"  means to say "sign the actual
 >> issued-to-the-customer TBSCertificate component of the End Entity
 >> certificate" ?
 > Well, it means something like that, I have added some words.


 >>>       (note that the log may relax standard validation rules to allow
 >>>       this, so long as the final signed certificate will be valid),
 >>>    o  or, the CA certificate that will sign the final certificate.
 >> "final certificate" is the "issued-to-the-customer End Entity certificate" ?
 > I have changed this to "issued certificate".

ok, tho should it be "issued end entity certificate" ?

 >>>    Structure of the Signed Certificate Timestamp:
 >> The SCT discussion here should probably be its own subsection.
 > OK.
 >>>        enum { certificate_timestamp(0), tree_hash(1), 255 }
 >>>          SignatureType;
 >>>        enum { v1(0), 255 }
 >>>          Version;
 >>>          struct {
 >>>              opaque key_id[32];
 >>>          } LogID;
 >>>          opaque CtExtensions<0..2^16-1>;
 >>>    "key_id" is the SHA-256 hash of the log's public key, calculated over
 >>>    the DER encoding of the key represented as SubjectPublicKeyInfo.
 >> I'd place the above paragraph regarding "key_id" down below the
 >> SignedCertificateTimestamp definition.
 >>>        struct {
 >>>            Version sct_version;
 >>>            LogID id;
 >>>            uint64 timestamp;
 >>>            CtExtensions extensions;
 >>>            digitally-signed struct {
 >>>                Version sct_version;
 >>>                SignatureType signature_type = certificate_timestamp;
 >>>                uint64 timestamp;
 >>>                LogEntryType entry_type;
 >>>                select(entry_type) {
 >>>                    case x509_entry: ASN.1Cert;
 >>>                    case precert_entry: ASN.1Cert;
 >>>                } signed_entry;
 >>>               CtExtensions extensions;
 >>>            };
 >>>        } SignedCertificateTimestamp;
 >>>    The encoding of the digitally-signed element is defined in [RFC5246].
 >> I would add a few words here summarizing that what happens here is that the
 >> digitally-signed struct here is replaced in the actual serialized binary
 >> structure by a struct DigitallySigned and cross-ref to S4.7 of RFC5246.
 > Except it isn't :-)

ok, then I don't understand what's going on here.  RFC5246 S4.7 sez...

    A digitally-signed element is encoded as a struct DigitallySigned:

       struct {
          SignatureAndHashAlgorithm algorithm;
          opaque signature<0..2^16-1>;
       } DigitallySigned;

..and I take the "digitally-signed struct" within SignedCertificateTimestamp to 
be a "digitally-signed element".  Please help me understand what am I missing?

 > And we already reference RFC5246.

we (IETF spec editors/authors) reasonably often reference particular sections of 
other RFCs in order to aid the reader. this is a sufficiently subtle construct 
to merit such an explicit ref (to S4.7 of RFC5246 if I'm correct) in my view.


 >> O-15.  Some comments on S4:
 >> ---------------------------
 >>> 4. Client Messages
 >> title should be "Log Client Messages" ?
 > Yes.
 >>>    Messages are sent as HTTPS GET or POST requests.  Parameters for
 >>>    POSTs and all responses are encoded as JSON objects.  Parameters for
 >> s/JSON objects/JSON texts/
 > I don't agree with this. See above.

see below.


 >>>       inclusion in the TLS handshake are not required to verify it, we
 >> s/the TLS handshake/subsequent TLS handshakes/   ?

 >>>       do not assume they know the ID of the log.
 >>>    timestamp  The SCT timestamp, in decimal.
 >>>    extensions  An opaque type for future expansion.  It is likely that
 >>>       not all participants will need to understand data in this field.
 >>>       Logs should set this to the empty string.  Clients should decode
 >>>       the base64 encoded data and include it in the SCT.
 >>>    signature  The SCT signature, base64 encoded.
 >> "The SCT signature" means a SignedCertificateTimestamp structure ?
 > No, the signature that is a component of the structure.

Oh, you mean this..

            digitally-signed struct {
                Version sct_version;
                SignatureType signature_type = certificate_timestamp;
                uint64 timestamp;
                LogEntryType entry_type;
                select(entry_type) {
                    case x509_entry: ASN.1Cert;
                    case precert_entry: ASN.1Cert;
                } signed_entry;
               CtExtensions extensions;

..which automagically transforms to

       struct {
          SignatureAndHashAlgorithm algorithm;
          opaque signature<0..2^16-1>;
       } DigitallySigned;

..if I understand RFC5246 correctly?  If not, please direct us to the correct 
formulation :)

 >>>    If the "sct_version" is not v1, then a v1 client may be unable to
 >>>    verify the signature.  It MUST NOT construe this as an error.  [Note:
 >>>    log clients don't need to be able to verify this structure, only TLS
 >>>    clients do - if we were to serve the structure binary, then we could
 >>>    completely change it without requiring an upgrade to v1 clients].
 >> Does this "if we were to serve the structure binary...."  statement mean to
 >> say that since v1 log clients don't need to be able to verify the SCT
 >> signature over the various returned data items, that this operation could
 >> instead return an opaque binary blob?
 > Indeed.

Ok, then perhaps it could use some added text to more explicitly state that?

 >> O-16.  Some comments on S5:
 >> ---------------------------
 >>> 5. Clients
 >>>    There are various different functions clients of logs might perform.
 >> Perhaps this section should be entitled "Log Client Roles" ?
 > Since you persuaded me to include TLS clients, no :-)


 >> this section doesn't mention the role of a (CA) log client that submits
 >> "certs and cert chains" to logs. Even though the latter role is mentioned
 >> elsewhere in the spec it should perhaps be mentioned here also.
 > OK.

ok :)

 >>> 5.1. Monitor
 >>>    Monitors watch logs and check that they behave correctly.  They also
 >>>    watch for certificates of interest.
 >> "Monitor" should be "Log Monitor" ?
 > There's no other kind of monitor :-)

in the explicit context of this spec, agreed.  but in general I favor/advocate 
creation and use of more fully descriptive words/phrases so at least one is more 
likely to find them with a search when you're wondering what sort of "monitor" 
someone down the road is yammering on about.

[ I would add a terminology section to the spec ]

 >>> 5.2. Auditor
 >>>    Auditors take partial information about a log as input and verify
 >>>    that this information is consistent with other partial information
 >>>    they have.  An auditor might be an integral component of a TLS
 >>>    client, it might be a standalone service or it might be a secondary
 >>>    function of a monitor.
 >> "Auditor" should be "Log Auditor" ?
 > And there's no other kind of auditor.

see above :)

e.g. in the overall webpki world, there /are/ other forms of auditors (e.g. the 
ones noted here <>) and so it's 
worth using a more descriptive term, imv.

 >>> 8. Efficiency Considerations
 >>>    The Merkle tree design serves the purpose of keeping communication
 >>>    overhead low.
 >>>    Auditing logs for integrity does not require third parties to
 >>>    maintain a copy of each entire log.  The Signed Tree Heads can be
 >>>    updated as new entries become available, without recomputing entire
 >>>    trees.  Third party auditors need only fetch the Merkle consistency
 >>>    proofs against a log's existing STH to efficiently verify the append-
 >>>    only property of updates to their Merkle Trees, without auditing the
 >>>    entire tree.
 >> The above could be explained in more detail, and S5.1 should be
 >> cross-referenced. Is the last sentence above essentially a summary of step
 >> #8 in S5.1? Or are there differences?
 > It is a summary of 5.1

ok, but it'd be helpful to state that and crossref S5.1.

wrt he JSON stuff...

 >> 1. The client messages S4 don't explicitly lay out the syntax for request
 >> messages or responses. E.g., for S4.1 "Add Chain to Log", is the input a
 >> stand-alone JSON text array, or a JSON text object containing a JSON text
 >> array?
 >> The term "JSON object" as used in the first paragraph is ambiguous and
 >> perhaps what is mean is simply "JSON texts" or "JSON text objects or JSON
 >> text arrays". RFC4627 clearly defines "JSON text", and should be cited. But
 >> RFC4627 is a little ambiguous itself regarding "JSON object" and so I
 >> suggest these definitions:
 >>     JSON text object:   A JSON text matching the "object" ABNF production
 >>        in Section 2.2 of [RFC4627].
 >>     JSON text array:   A JSON text matching the "array" ABNF production
 >>        in Section 2.3 of [RFC4627].
 > I agree that RFC 4627 should be cited and I will correct that.


 > The
 > rest of this confuses me: JSON is a textual representation of
 > structured data, as it states in the RFC. It defines an object quite
 > clearly
 > " An object is an unordered collection of zero or more name/value
 >    pairs, where a name is a string and a value is a string, number,
 >    boolean, null, object, or array."

well, yes, but that's in just the introduction of RFC 4627.

 > Defining a "JSON text object" seems pointless to me - clearly a JSON
 > object is an object as defined by JSON, surely? Introducing another
 > term seems like to add confusion rather than remove it.

note that "JSON text" is very explicitly defined in RFC4627 at the beginning of 
S2 as..

    A JSON text is a sequence of tokens.  The set of tokens includes six
    structural characters, strings, numbers, and three literal names.

    A JSON text is a serialized object or array.

       JSON-text = object / array

the reason I flagged this issue is that I just recently reviewed a different 
internet-draft where they'd confused a JSON object -- in the sense of a JSON 
text matching the object production of S2.2 RFC4627 -- and a "JSON object" in 
the sense of an abstract programming construct, and they didn't understand that 
in the protocol on-the-wire world a JSON object /is a string/  (i.e. it is 
string-serialized according to the grammar of RFC4627).

It seems the term "JSON object" is ambiguous depending on whether you're looking 
at it from a programming perspective or an on-the-wire protocol perspective (eg 
see <> which talks about 
"internal form" and "external form" (ugh)), hence my (perhaps feeble) attempt to 
invent a more explicit term for the on-the-wire protocol form.

So maybe a more palatable definition would be..

   JSON object: A JSON text matching the "object" ABNF production
                in Section 2.2 of [RFC4627].