Re: LC comments on draft-laurie-pki-sunlight-05

Ben Laurie <benl@google.com> Tue, 08 January 2013 17:54 UTC

Date: Tue, 08 Jan 2013 17:54:08 +0000
Message-ID: <CABrd9STRdENwQanrk4BuVA7Taz_r=vC6VeiZrOaLprfhfucYug@mail.gmail.com>
Subject: Re: LC comments on draft-laurie-pki-sunlight-05
From: Ben Laurie <benl@google.com>
To: =JeffH <Jeff.Hodges@kingsmountain.com>
Cc: Emilia Kasper <ekasper@google.com>, "therightkey@ietf.org" <therightkey@ietf.org>, IETF Discussion List <ietf@ietf.org>, Adam Langley <agl@google.com>

On 1 January 2013 21:50, =JeffH <Jeff.Hodges@kingsmountain.com> wrote:
> Hi,
>
> Here are some last call comments on draft-laurie-pki-sunlight-05.
>
> Overall the spec is in basically reasonable shape but I do have some
> substantive comments that if I'm not totally misunderstanding things (which
> could be the case) ought to be discussed and addressed in some fashion.
>
> The plain overall comments are to some degree "take 'em or leave 'em"
> depending upon folks' sense of urgency to get the spec through the IETF
> pipeline, but the degree likely depends upon the observer.
>
> I hope this is helpful,

It is indeed, thank you.

> =JeffH
> ------
>
> comments on draft-laurie-pki-sunlight-05
>
> substantive comments (in somewhat arbitrary order)
> --------------------------------------------------
>
> 1. The client messages S4 don't explicitly lay out the syntax for request
> messages or responses. E.g., for S4.1 "Add Chain to Log", is the input a
> stand-alone JSON text array, or a JSON text object containing a JSON text
> array?
>
> The term "JSON object" as used in the first paragraph is ambiguous and
> perhaps what is meant is simply "JSON texts" or "JSON text objects or JSON
> text arrays". RFC4627 clearly defines "JSON text", and should be cited. But
> RFC4627 is a little ambiguous itself regarding "JSON object" and so I
> suggest these definitions:
>
>     JSON text object:   A JSON text matching the "object" ABNF production
>        in Section 2.2 of [RFC4627].
>
>     JSON text array:   A JSON text matching the "array" ABNF production
>        in Section 2.3 of [RFC4627].

I agree that RFC 4627 should be cited and I will correct that. The
rest of this confuses me: JSON is a textual representation of
structured data, as it states in the RFC. It defines an object quite
clearly

" An object is an unordered collection of zero or more name/value
   pairs, where a name is a string and a value is a string, number,
   boolean, null, object, or array."

Defining a "JSON text object" seems pointless to me - clearly a JSON
object is an object as defined by JSON, surely? Introducing another
term seems likely to add confusion rather than remove it.
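
To make the question concrete, here is a minimal sketch of an add-chain
request body, assuming the input is a JSON object with a single "chain"
field whose value is an array (which is how the reply above reads S4.1).
The certificate bytes are placeholders and the helper name
build_add_chain_body is hypothetical.

```python
import base64
import json

def build_add_chain_body(der_certs):
    """Return a JSON object (not a bare array) with one field, "chain"."""
    # Each certificate is base64 encoded, end-entity certificate first.
    chain = [base64.b64encode(der).decode("ascii") for der in der_certs]
    return json.dumps({"chain": chain})

# Placeholder bytes standing in for DER certificates (not real certificates).
body = build_add_chain_body([b"leaf-cert-der", b"intermediate-cert-der"])
parsed = json.loads(body)
```

Under this reading the top-level value is always an object, and the array
only ever appears as the value of a named field.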

>
> Also, the syntax for GETS isn't fully specified. Are the URL parameters to
> be encoded as (order independent) key-value pairs, or just as
> order-dependent values?  Which separator character is to be used between
> parameters? RFC3986 should be cited.

RFC 3986 says nothing about parameter format, though - is there a
standard reference for that? I've referenced HTML 4.01, but perhaps
there's a better one?
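
As a sketch of the key-value reading, the HTML 4.01 form encoding
(name=value pairs joined by "&") is what Python's urllib produces. The
host, path and parameter names below are hypothetical and are not taken
from the draft.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint and parameter names, for illustration only.
base = "https://log.example.com/ct/v1/get-sth-consistency"
url = base + "?" + urlencode({"first": 4, "second": 7})

# Once parsed, the parameters are addressed by name, so their order in
# the URL does not matter to the receiver.
params = parse_qs(urlsplit(url).query)
```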

> Examples for both JSON text inputs and outputs, as well as URL parameters
> would be helpful.

Yes, we will provide some.

> 2. "4. Client Messages" doesn't define error handling, i.e., responses to
> inputs the log service doesn't understand and/or is unable to parse, and/or
> have other errors. If the log service is to simply return a 4xx or 5xx error
> code, this should at least be mentioned.

For now, I will specify 4xx/5xx. We may have more to say once we've
gained some experience.

> 3. There appear to be three defined methods for TLS servers to provide TLS
> clients with CT data, in S3.2.  For this experiment, which approach is
> mandatory to implement for servers and clients?  Or, is it the case that
> participating TLS clients (ie web browsers etc) implement all three methods,
> and TLS servers can choose any of them?

The latter.

>
> Also, S3.2 probably doesn't belong in S3 and perhaps should be a separate
> top-level section on its own, and have three subsections, one for each
> method.

Maybe.

> 4. "Leaf Hash" as used in S4.5 appears to be formally undefined. It
> apparently would be:
>
>       SHA-256(0x00 || MerkleTreeLeaf)
>
> ..it should also be noted in S3.3.

You are right.
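
The formula quoted above translates directly into code. A minimal
sketch, with placeholder bytes standing in for a serialized
MerkleTreeLeaf (the function name is an assumption):

```python
import hashlib

def leaf_hash(merkle_tree_leaf: bytes) -> bytes:
    # The 0x00 prefix marks a leaf; interior nodes use 0x01 instead.
    return hashlib.sha256(b"\x00" + merkle_tree_leaf).digest()

digest = leaf_hash(b"placeholder MerkleTreeLeaf bytes")
```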

> 5. The recursive equations in S2.1 describe how to calculate a Merkle Tree
> Hash (MTH) (aka "root hash"), and thus as a side effect generate a Merkle
> Tree, for a given set of input data. However, there doesn't seem to be a
> defined algorithm (or even hints, really) for adding further inputs to an
> existing tree. Even though this may be reasonably left as an exercise for
> implementers, it should probably be discussed to some degree in the spec.
> E.g., note that leaf hashes are "frozen" and various interior tree node
> hashes become "frozen" as the tree grows. Is it not sub-optimal to employ
> the obvious default brute-force mechanism of rebuilding a tree entirely from
> scratch when new inputs are available?  Would not a recursive algorithm for
> adding new inputs to an existing tree be straightforward to provide?

I dunno about straightforward. I'll think about it.

> 6. Signed tree heads (STHs) are denoted in terms of "tree size" (number of
> entries), but SCTs are denoted in terms of a timestamp.  Should there be a
> log client message supporting the return of the nearest STH (and thus tree
> size) to a given timestamp?

I'm not sure why? Any STH (that includes that SCT) will do.

> 7. S3 paragraph 2 states that "TLS clients MUST reject certificates that do
> not have a valid SCT for the end-entity certificate" (i.e., hard-fail).
> Presumably this requirement is only for TLS clients participating in the CT
> experiment and that understand this protocol.

Of course - what other way could it be? In other words, all RFCs can
only say what implementations that conform with them do.

> This, or whatever the
> requirement actually is, should be further explained.

?

> For example, does the simple presence of SCT(s) in the TLS handshake serve
> to signal to participating TLS clients that hard-fail is expected if there
> are any issues with CT validation?

We are not saying what action a client should take when it rejects a
certificate, as is, I believe, the usual practice.

> 8. The spec implies, but doesn't clearly describe, especially in S3.1, that
> the hashes are "labels" for tree entries, and that given a leaf hash, the
> log implementation should be able to look up and present the LogEntry data
> defined in that section.

We actually only require an entry to be retrievable by hash (in
effect) for the message in 4.7, which we (at least currently) label as
a debugging message - so I am not sure that logs really are required
to be able to do that - certainly the system would work fine if they
couldn't, I believe (other than being unable to provide the debugging
data).

> 9. Validating an SCT presumably requires having the Log Service's public
> key, yes?  This isn't clearly discussed, and also the mention of how one
> obtains a log service's public key is out of scope is buried in 2nd para of
> S4 -- it should be discussed in a separate clearly entitled subsection.

Good point.

>
>
> 10. Unless I'm totally missing it, there isn't an explicit description of
> how one (eg a TLS client) goes about validating/verifying an SCT.

Indeed, fixing that also fixes the above point.

> Various overall comments:
> --------------------------
>
> O-1. The phrase "this experiment" is used in S2.1 -- should describing this
> as an experiment be more explicitly done in the abstract and introduction
> sections?  What about the document title?

The plan is this will be an Experimental RFC, which seems clear enough to me?

> O-2. Should explicitly say in abstract and introduction that operationally,
> the logs are to be materialized as (experimental?) network services having
> the protocol operations for submissions and queries that are defined in this
> spec.

Seems a little redundant, but I have added something anyway.

>
>
> O-3. The bare term "client" is used in various places where either the term
> "log client" or "TLS client" is being implied -- these should be made
> explicit. Also the roles of log clients and TLS clients should be more
> thoroughly presented/explored, in part because they can intersect.

Section 5 "Clients" is an attempt to document the various client
roles, one or more of which may be embodied by any particular client
implementation, rather than try to deal with this intersection.

I have gone over mentions of "client" elsewhere to try to deal with
your point, though. Sometimes it just means "any client" so not all
instances have been changed.

>
>
> O-4. These things seem to be duplicate names:
>
>      "root hash" and "Merkle Tree Hash (MTH)"

Yes.

>
>      "Tree Head Signature" and "Signed Tree Head (STH)"

These are not the same - you are probably confused by the
digitally-signed struct, which is only used as input for a signature
and never appears in its own right. However, we haven't been super
clear about what's going on here and I've tried to clean that up.

>
> ..which makes parsing the spec more difficult than if one name is used
> consistently for each.
>
>
> O-5. The terms "leaf certificate" and "final certificate" appear to be used
> where..
>
>   End Entity certificate
>
>   final End Entity certificate
>
> ..would be clearer and more consistent with TLS and PKIX terminology, and
> perhaps less confusing with the terms "leaf" and "leaf node" which are used
> when discussing the Merkle trees and their components.

Good point.

> O-6. I found the "history tree" paper (aka "[1]", cited here as
> [CrosbyWallach]) helpful in understanding how such trees are constructed,
> perhaps it should be more prominently mentioned. Plus the differences
> between the two algorithms should perhaps be more explicitly mentioned. E.g.
> in [CrosbyWallach] version-n tree stores n+1 inputs, while in CT a version-n
> tree (D[n]) stores n inputs.
>
> [CrosbyWallach]  <http://tamperevident.cs.rice.edu/Logging.html>
>
>
> O-7. The note mentioning "dummy leaves" in [CrosbyWallach] seems misleading.
> The difference is AFAICT that in [CrosbyWallach] all nodes at layer 1 and
> above (leaf entries are at layer 0), are "interior nodes", and have hashes
> created using 0x01. Thus in a tree with an odd number of entries (ie leaf
> nodes at layer 0), there will be one leaf node under an interior node having
> only that one child. It's not that there is a "dummy leaf", it's that such
> an interior node's hash is constructed from just one child rather than two.

Not sure I really agree with this, but in any case I have reworded it
slightly. I've also changed the reference to be a "proper" one, shown
at the end of the I-D.

> While in CT, if the input set is an odd number of entries, then the hash of
> the final single leaf is at layer 1, and is calculated as a leaf hash using
> 0x00. Thus CT "interior nodes" always have two children, but if the tree has
> an odd number of entries, the rightmost hash at layer 1 ("j" in the "binary
> Merkle tree with 7 leaves" figure) is a leaf node hash rather than an
> interior node hash.
>
>
> O-8. [CrosbyWallach] discusses auditing and gossiping and could be cited as
> a source for further discussion on those topics.
>
>
> O-9. The notion of "commitments" isn't well defined, and where "add a
> commitment to D[k:n]"  couldn't  "add an interior/intermediate node to
> D[k:n]"  be used?
>
> Is not the term "commitment" used in [CrosbyWallach] equivalent to the
> sha256_root_hash (an STH component) in the spec?
>
> [CrosbyWallach] uses the term "interior node(s)" while the spec uses
> "intermediate nodes" (in one place).

CT is not actually derived from [CrosbyWallach], we just mention it as
a useful reference.

"commitment" is a term of cryptographic art.

> O-10. The recursive algorithms in S2 are dense and take effort to work
> through, perhaps adding simplistic example code (in an appendix) which
> implements, and/or actually working through the algorithms to arrive at some
> of the audit paths and consistency proofs in S2.1.3, would be helpful.

We have actual working code - would a reference to that be better?
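
In the meantime, a minimal sketch of the kind of worked example O-10
asks for. It assumes the audit path recursion of S2.1.1 (not quoted in
this thread) has the form PATH(m, D[0:k]) followed by MTH(D[k:n]) when
m < k, and PATH(m - k, D[k:n]) followed by MTH(D[0:k]) otherwise. The
function names are illustrative, and this is not the working code
referred to above.

```python
import hashlib

def split(n: int) -> int:
    """Largest power of two strictly smaller than n (n > 1)."""
    k = 1
    while k * 2 < n:
        k *= 2
    return k

def leaf(d: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + d).digest()

def node(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()

def mth(entries) -> bytes:
    if not entries:
        return hashlib.sha256(b"").digest()
    if len(entries) == 1:
        return leaf(entries[0])
    k = split(len(entries))
    return node(mth(entries[:k]), mth(entries[k:]))

def audit_path(m: int, entries):
    """Sibling hashes needed to recompute the root from leaf m."""
    n = len(entries)
    if n == 1:
        return []
    k = split(n)
    if m < k:
        return audit_path(m, entries[:k]) + [mth(entries[k:])]
    return audit_path(m - k, entries[k:]) + [mth(entries[:k])]

def root_from_path(m: int, n: int, leaf_hash: bytes, path) -> bytes:
    """Recompute the root from a leaf hash, its index, and its audit path."""
    if n == 1:
        return leaf_hash
    k = split(n)
    if m < k:
        return node(root_from_path(m, k, leaf_hash, path[:-1]), path[-1])
    return node(path[-1], root_from_path(m - k, n - k, leaf_hash, path[:-1]))

entries = [f"d{i}".encode() for i in range(7)]
path_d6 = audit_path(6, entries)
recomputed = root_from_path(6, 7, leaf(entries[6]), path_d6)
```

For the 7-leaf tree the path for d6 has two elements, MTH(D[4:6]) and
MTH(D[0:4]), and recomputed equals mth(entries).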

> I desk-checked S2.1, and it seems correct, but didn't do S2.1.1 or S2.1.2.
> The examples in S2.1.3 appear nominally correct but I didn't desk check
> them.
>
> Should there be a reference to
> <https://code.google.com/p/certificate-transparency/> ?   And/or a note
> regarding available code and to contact the authors for more information?
> (as is done in RFC 2154)

I couldn't find such a thing in RFC 2154. My only concern about such a
reference is whether it would live as long as the RFC does :-)

> O-11. S3.3 should mention the Maximum Merge Delay MMD where it says
> "periodically append". Also, in S3.3, "Signed Merkle Tree Update" should be
> a "Tree Head Signature" aka "signed tree head (STH)"?

I just removed that, as it is immediately repeated in the next section.

>
>
> O-12. S3.3, S3.4, S4.4, S4.5, and S4.7 mention the notion of logs
> "publishing" STHs, but no mechanism is described for explicitly
> "publishing". Is this meant to mean only that a "published" STH is available
> for retrieval by clients using the "Retrieve Latest Signed Tree Head" log
> client message?
>
> Or, would there be a use case, eg introducing an existing log service to a
> log monitor, for requesting (or being able to enumerate) all published STHs
> from the log?

I don't believe there is: the latest STH tells you everything you need
to know at that point. I have removed mention of publishing.

>
>
> O-13. signed tree heads (STHs) are denoted in terms of "tree size" (number
> of entries), but SCTs are denoted in terms of a timestamp.  Should there be
> a log client message supporting the return of the nearest STH (and thus tree
> size) to a given timestamp?

I don't believe this is needed.

>
>
>
> O-14. Detailed comments on S2...
> ------------------------------
>
>> 2. Cryptographic components
>>
>>
>> 2.1. Merkle Hash Trees
>>
>>
>>    Logs use a binary Merkle hash tree for efficient auditing.  The
>>    hashing algorithm is SHA-256 (note that this is fixed for this
>>    experiment but it is anticipated that each log would be able to
>>    specify a hash algorithm).  The input to the Merkle tree hash is a
>>    list of data entries; these entries will be hashed to form the leaves
>>    of the Merkle hash tree.  The output is a single 32-byte root hash.
>>    Given an ordered list of n inputs, D[n] = {d(0), d(1), ..., d(n-1)},
>>    the Merkle Tree Hash (MTH) is thus defined as follows:
>>
>>    The hash of an empty list is the hash of an empty string:
>>
>>    MTH({}) = SHA-256().
>
>
> This MTH({}) construct doesn't appear to be used anywhere else in the spec
> (yes?), and so does it really need mentioning?

If it is not defined, then we cannot represent an empty tree.

>>    The hash of a list with one entry is:
>>
>>    MTH({d(0)}) = SHA-256(0x00 || d(0)).
>
>
> The immediately above equation is for leaf entries (yes?),

Yes.

> where in this
> notation n = 1, perhaps it should be stated explicitly:
>
>     When n = 1, a leaf entry is denoted, and D[1] = {d(0)}. The leaf hash
>     (LH) for a leaf entry is calculated as:
>
>     MTH(D[1]) = LH(D[1]) = SHA-256( 0x00 || d(0) )

Ugh. LH(D[1]) seems meaningless to me. A leaf hash is always of a "1
entry tree".

>
>
>
>>    For n > 1, let k be the largest power of two smaller than n.
>
>
> The unqualified "power of two" phrase is arguably ambiguous.

It is?

> Suggested rephrase for this where it occurs throughout section 2..
>
>     For n > 1, let k be a number which is the largest power of two
>     such that k = 2^i, 0 <= i < n, and k < n.

If we're going to go down that path, then it should say:

For n > 1, let k be the largest number such that k = 2^i and k < n.

or

For n > 1, let k = 2^i s.t. k < n and 2k >= n.

surely?
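
The two phrasings above can be checked against each other mechanically.
A small sketch (the function name is an assumption):

```python
def largest_power_of_two_below(n: int) -> int:
    """k = 2^i with k < n and i as large as possible; needs n > 1."""
    if n <= 1:
        raise ValueError("k is only defined for n > 1")
    return 1 << ((n - 1).bit_length() - 1)

# n -> k for a few sizes: {2: 1, 3: 2, 4: 2, 5: 4, 7: 4, 8: 4, 9: 8}
table = {n: largest_power_of_two_below(n) for n in (2, 3, 4, 5, 7, 8, 9)}

# The second phrasing: k is a power of two with k < n and 2k >= n.
both_agree = all(
    largest_power_of_two_below(n) < n <= 2 * largest_power_of_two_below(n)
    for n in range(2, 1000)
)
```

Note that for n itself a power of two, such as n = 8, k is n/2 and not
n, which is the case the word "smaller" is there to cover.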

>>    The Merkle Tree Hash of an n-element list D[n] is then defined
>>    recursively as
>
>
> The above statement applies to the combination of the n = 1 equation above
> and the equation below, and so should perhaps be moved up above the n = 1
> equation.

? It says n > 1, so doesn't apply to n = 1?

>>    MTH(D[n]) = SHA-256(0x01 || MTH(D[0:k]) || MTH(D[k:n])),
>>
>>    where || is concatenation and D[k1:k2] denotes the length (k2 - k1)
>>    list {d(k1), d(k1+1),..., d(k2-1)}.
>
>
> The above phrase doesn't parse well and is somewhat ambiguous, here it is
> extracted for clarity:
>
>  "D[k1:k2] denotes the length (k2 - k1) list {d(k1), d(k1+1),..., d(k2-1)}"
>
>
> How about rephrasing it along the lines of this:
>
>     D[k1:k2] denotes a sublist {d(k1), d(k1+1),..., d(k2-1)}, having
>     (k2 - k1) elements, of the original input list D[n]. When (k2 - k1)
>     is 1, a leaf hash is calculated.

We tried lots of different ways of saying this and they were all a
little messy. Yours mixes concerns and is rather verbose, so not
convinced it is actually an improvement.
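
For what it is worth, D[k1:k2] behaves exactly like a half-open slice,
which may be the shortest way to explain it. A minimal sketch that
applies the split rule to index ranges only (no hashing) and returns
the resulting tree shape; the function name is hypothetical:

```python
def shape(k1: int, k2: int):
    """Tree shape over d(k1) .. d(k2-1): a leaf index or a (left, right) pair."""
    n = k2 - k1
    if n == 1:
        return k1  # a single entry is a leaf
    k = 1
    while k * 2 < n:  # largest power of two smaller than n
        k *= 2
    return (shape(k1, k1 + k), shape(k1 + k, k2))

# shape(0, 7) -> (((0, 1), (2, 3)), ((4, 5), 6))
seven = shape(0, 7)
```

The result depends only on the number of entries, which is the "shape is
uniquely determined by the number of leaves" remark in the quoted text.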

>
>
>                                          (Note that the hash calculation
>>
>>    for leaves and nodes differ.  This domain separation is required to
>>    give second preimage resistance.)
>>
>>    Note that we do not require the length of the input list to be a
>>    power of two.  The resulting Merkle tree may thus not be balanced,
>>    however, its shape is uniquely determined by the number of leaves.
>>    [This Merkle tree is essentially the same as the history tree [1]
>>    proposal, except our definition omits dummy leaves.]
>
>
> I suggest re-writing the first above Note along with the next paragraph in
> light of all above comments on S2 and [CrosbyWallach].

It was already partly rewritten as a result of above comments, so
let's see how you like the next version?

> O-15.  Some comments on S3:
> ------------------------------------
>
>> 3. Log Format
>
>
> this section isn't just about "format" of log - it's also about log
> behavior/operation

Good point.

>
>
>>    Anyone can submit certificates to certificate logs for public
>>    auditing, however, since certificates will not be accepted by clients
>>    unless logged, it is expected that certificate owners or their CAs
>>    will usually submit them.  A log is a single, ever-growing, append-
>>    only Merkle Tree of such certificates.
>>
>>    When a valid certificate is submitted to a log, the log MUST
>>    immediately return a Signed Certificate Timestamp (SCT).  The SCT is
>>    the log's promise to incorporate the certificate in the Merkle Tree
>>    within a fixed amount of time known as the Maximum Merge Delay (MMD).
>>    If the log has previously seen the certificate, it MAY return the
>>    same SCT as it returned before.
>
>
> What if the submitted end entity cert is the same, but the certificate chain
> is different (yet valid)?

The purpose of the chain is to:

a) Prevent spam, and

b) Identify who to blame in the event of a misissue.

Alternate chains presumably don't actually change the direct blame,
and so I see no reason to do other than what the I-D says - i.e.
return the same SCT as before.

>>                                     TLS servers MUST present an SCT from
>>    one or more logs to the client together with the certificate.  TLS
>>    clients MUST reject certificates that do not have a valid SCT for the
>>    end-entity certificate.
>
>
> [ see comment (7) above ]
>
>
>>    Periodically, each log appends all its new entries to the Merkle
>>    Tree, and signs the root of the tree.  Clients and auditors can thus
>
>
> Should "Clients and auditors" actually be "TLS Clients, log monitors, and
> log auditors" ?

Bearing in mind that these are actually roles rather than distinct
entities, it should probably just say "auditors".

>
>
>>    verify that each certificate for which an SCT has been issued indeed
>>    appears in the log.
>
>
> Add forward reference here to S4 and S5 ?
>
>
>
>>                         The log MUST incorporate a certificate in its
>>    Merkle Tree within the Maximum Merge Delay period after the issuance
>>    of the SCT.
>>
>>    Logs MUST NOT impose any conditions on copying data retrieved from
>>    the log.
>
>
> s/copying data retrieved/retrieving or sharing data/

OK.

>> 3.1. Log Entries
>>
>>
>>    Anyone can submit a certificate to any log.  In order to enable
>>    attribution of each logged certificate to its issuer, the log SHALL
>>    publish a list of acceptable root certificates (this list might
>>    usefully be the union of root certificates trusted by major browser
>>    vendors).  Each submitted certificate MUST be accompanied by all
>>    additional certificates required to verify the certificate chain up
>>    to an accepted root certificate.  The root certificate itself MAY be
>>    omitted from this list.
>>
>>    Alternatively, (root as well as intermediate) Certificate Authorities
>
>
> Additionally?  which manner is the experiment going to operate, or is it TBD
> ?

Not sure what you mean? The log will accept either type of submission.

>>    may submit a certificate to logs prior to issuance.  To do so, a
>>    Certificate Authority constructs a Precertificate by adding a special
>>    critical poison extension (OID 1.3.6.1.4.1.11129.2.4.3, whose
>>    extnValue OCTET STRING contains ASN.1 NULL data (0x05 0x00)) to the
>>    leaf TBSCertificate (this extension is to ensure that the
>
>
> leaf == end entity ?
>
> s/leaf certificate/end entity certificate/g   ?

Yes.
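
As an illustration of the poison extension described in the quoted text
above, here is a minimal sketch of its DER encoding: a critical
extension with the given OID whose extnValue OCTET STRING wraps an
ASN.1 NULL. The hand-rolled encoder is an assumption made for brevity
(it only handles lengths below 128); a real implementation would use an
ASN.1 library.

```python
def encode_oid(dotted: str) -> bytes:
    """DER-encode an OBJECT IDENTIFIER (short-form length only)."""
    arcs = [int(a) for a in dotted.split(".")]
    body = bytearray([40 * arcs[0] + arcs[1]])
    for arc in arcs[2:]:
        chunk = [arc & 0x7F]
        arc >>= 7
        while arc:
            chunk.append(0x80 | (arc & 0x7F))  # continuation bit set
            arc >>= 7
        body.extend(reversed(chunk))
    return bytes([0x06, len(body)]) + bytes(body)

oid = encode_oid("1.3.6.1.4.1.11129.2.4.3")
critical = bytes([0x01, 0x01, 0xFF])           # BOOLEAN TRUE
extn_value = bytes([0x04, 0x02, 0x05, 0x00])   # OCTET STRING { NULL }
content = oid + critical + extn_value
poison = bytes([0x30, len(content)]) + content  # Extension SEQUENCE
```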

>>    Precertificate cannot be validated by a standard X.509v3 client), and
>>    signing the resulting TBSCertificate [RFC5280] with either
>
>
>
>>    o  a special-purpose (Extended Key Usage: Certificate Transparency,
>>       OID 1.3.6.1.4.1.11129.2.4.4) Precertificate Signing Certificate.
>>       The Precertificate Signing Certificate MUST be certified by the CA
>>       certificate that will ultimately sign the leaf TBSCertificate
>
>
> "sign the leaf TBSCertificate"  means to say "sign the actual
> issued-to-the-customer TBSCertificate component of the End Entity
> certificate" ?

Well, it means something like that, I have added some words.

>
>>       (note that the log may relax standard validation rules to allow
>>       this, so long as the final signed certificate will be valid),
>>
>>    o  or, the CA certificate that will sign the final certificate.
>
>
> "final certificate" is the "issued-to-the-customer End Entity certificate" ?

I have changed this to "issued certificate".

>
>
>>    Structure of the Signed Certificate Timestamp:
>
>
> The SCT discussion here should probably be its own subsection.

OK.

>
>
>>
>>        enum { certificate_timestamp(0), tree_hash(1), 255 }
>>          SignatureType;
>>
>>        enum { v1(0), 255 }
>>          Version;
>
>>
>>
>>          struct {
>>              opaque key_id[32];
>>          } LogID;
>>
>>          opaque CtExtensions<0..2^16-1>;
>>
>>    "key_id" is the SHA-256 hash of the log's public key, calculated over
>>    the DER encoding of the key represented as SubjectPublicKeyInfo.
>
>
> I'd place the above paragraph regarding "key_id" down below the
> SignedCertificateTimestamp definition.
>
>
>>        struct {
>>            Version sct_version;
>>            LogID id;
>>            uint64 timestamp;
>>            CtExtensions extensions;
>>            digitally-signed struct {
>>                Version sct_version;
>>                SignatureType signature_type = certificate_timestamp;
>>                uint64 timestamp;
>>                LogEntryType entry_type;
>>                select(entry_type) {
>>                    case x509_entry: ASN.1Cert;
>>                    case precert_entry: ASN.1Cert;
>>                } signed_entry;
>>               CtExtensions extensions;
>>            };
>>        } SignedCertificateTimestamp;
>>
>>    The encoding of the digitally-signed element is defined in [RFC5246].
>
>
> I would add a few words here summarizing that what happens here is that the
> digitally-signed struct here is replaced in the actual serialized binary
> structure by a struct DigitallySigned and cross-ref to S4.7 of RFC5246.

Except it isn't :-)

And we already reference RFC5246.
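
To illustrate what ends up being signed, here is a minimal sketch that
serializes the fields of the digitally-signed struct quoted above using
the RFC 5246 presentation language rules. It covers only the input to
the signature, not how the signature itself is carried. Assumptions not
visible in the quoted text: LogEntryType is a 2-byte enum with
x509_entry = 0, and ASN.1Cert is a vector with a 3-byte length prefix
as in TLS. The function name and the input bytes are placeholders.

```python
import hashlib
import struct

def sct_signed_input(timestamp: int, cert_der: bytes,
                     extensions: bytes = b"", entry_type: int = 0) -> bytes:
    # sct_version = v1(0), signature_type = certificate_timestamp(0),
    # uint64 timestamp, then entry_type (assumed to be 2 bytes wide).
    out = struct.pack(">BBQH", 0, 0, timestamp, entry_type)
    # ASN.1Cert, assumed to be opaque<1..2^24-1>: 3-byte length prefix.
    out += len(cert_der).to_bytes(3, "big") + cert_der
    # CtExtensions is opaque<0..2^16-1>: 2-byte length prefix.
    out += struct.pack(">H", len(extensions)) + extensions
    return out

# key_id: SHA-256 over the DER SubjectPublicKeyInfo of the log's key.
placeholder_spki_der = b"not a real SubjectPublicKeyInfo"
key_id = hashlib.sha256(placeholder_spki_der).digest()

signed_input = sct_signed_input(1357667648000, b"\x30\x00")
```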

>> 3.2. Including the Signed Certificate Timestamp in the TLS Handshake
>
>
> This should be its own top-level section as mentioned in comment (3).
>
>
>>    The SCT data from at least one log must be included in the TLS
>>    handshake, either by using an Authorization Extension [RFC5878] with
>>    type 182, or by using OCSP Stapling (section 8 of [RFC6066]),
>
>
> add to above sentence:
>
>   or by embedding the SCT(s) in the presented End Entity cert,

Addressed earlier.

>>                                                                  where
>>    the response includes an OCSP extension with OID
>>    1.3.6.1.4.1.11129.2.4.5 (see [RFC2560]) and body:
>>
>>        SignedCertificateTimestampList ::= OCTET STRING
>>
>>    At least one SCT MUST be included.  Server operators MAY include more
>>    than one SCT.
>>
>>    Similarly, a Certificate Authority MAY submit the precertificate to
>
>
> s/the precertificate/a precertificate/

Yes.

>>    more than one log, and all obtained SCTs can be directly embedded in
>>    the final certificate, by encoding the SignedCertificateTimestampList
>
>
> s/final certificate/actual End Entity certificate/   ?

Addressed earlier.

>>    structure as an ASN.1 OCTET STRING and inserting the resulting data
>>    in the TBSCertificate as an X.509v3 certificate extension (OID
>>    1.3.6.1.4.1.11129.2.4.2).  Upon receiving the certificate, clients
>>    can reconstruct the original TBSCertificate to verify the SCT
>>    signature.
>
>
> This last step of "clients can reconstruct the original TBSCertificate"
> probably should be more thoroughly explained.

Yeah, it probably should.

> O-15.  Some comments on S4:
> ---------------------------
>
>
>> 4. Client Messages
>
>
> title should be "Log Client Messages" ?

Yes.

>>    Messages are sent as HTTPS GET or POST requests.  Parameters for
>>    POSTs and all responses are encoded as JSON objects.  Parameters for
>
>
> s/JSON objects/JSON texts/

I don't agree with this. See above.

> see <https://tools.ietf.org/html/rfc4627>  (it should be cited)
>
>>    GETs are encoded as URL parameters.  Binary data is base64 encoded as
>>    specified in the individual messages.
>>
>>    The <log server> prefix can include a path as well as a server name
>>    and a port.  It must map one-to-one to a known public key (how this
>>    mapping is distributed is out of scope for this document).
>
>
> s/distributed/constructed and distributed/  ?
>
>
>>    In general, where needed, the "version" is v1 and the "id" is the log
>>    id for the log server queried.
>
>
>
>
>
>> 4.1. Add Chain to Log
>>
>>
>>    POST https://<log server>/ct/v1/add-chain
>>
>>    Inputs
>>
>>    chain  An array of base64 encoded certificates.  The first element is
>
>
> a JSON text array?

It is already defined to be a field in a JSON object and hence that is
all it could be.

>>       the leaf certificate, the second chains to the first and so on to
>>       the last, which is either the root certificate or a certificate
>>       that chains to a known root certificate.
>>
>>    Outputs
>>
>>    sct_version  The version of the SignedCertificateTimestamp structure,
>>       in decimal.  A compliant v1 implementation MUST NOT expect this to
>>       be 0 (i.e. v1).
>>
>>    id The log ID, base64 encoded.  Since clients who request an SCT for
>
>
> s/clients/log clients/  ?

That seems obvious, but I have added it anyway.

>>       inclusion in the TLS handshake are not required to verify it, we
>
>
> s/the TLS handshake/subsequent TLS handshakes/   ?
>
>
>
>>       do not assume they know the ID of the log.
>>
>>    timestamp  The SCT timestamp, in decimal.
>>
>>    extensions  An opaque type for future expansion.  It is likely that
>>       not all participants will need to understand data in this field.
>>       Logs should set this to the empty string.  Clients should decode
>>       the base64 encoded data and include it in the SCT.
>>
>>    signature  The SCT signature, base64 encoded.
>
>
> "The SCT signature" means a SignedCertificateTimestamp structure ?

No, the signature that is a component of the structure.

>>    If the "sct_version" is not v1, then a v1 client may be unable to
>>    verify the signature.  It MUST NOT construe this as an error.  [Note:
>>    log clients don't need to be able to verify this structure, only TLS
>>    clients do - if we were to serve the structure binary, then we could
>>    completely change it without requiring an upgrade to v1 clients].
>
>
> Does this "if we were to serve the structure binary...."  statement mean to
> say that since v1 log clients don't need to be able to verify the SCT
> signature over the various returned data items, that this operation could
> instead return an opaque binary blob?

Indeed.
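
A minimal sketch of how a log client might handle the add-chain outputs
listed above, using the field names from the quoted S4.1 text. The
response values are made up, and the helper name and the "can_verify"
key are hypothetical.

```python
import base64
import json

# Made-up response for illustration; the base64 values are placeholders.
response_text = json.dumps({
    "sct_version": 0,
    "id": base64.b64encode(bytes(32)).decode("ascii"),
    "timestamp": 1357667648000,
    "extensions": "",
    "signature": base64.b64encode(b"placeholder-signature").decode("ascii"),
})

def parse_add_chain_response(text: str) -> dict:
    fields = json.loads(text)
    version = int(fields["sct_version"])
    return {
        "sct_version": version,
        "log_id": base64.b64decode(fields["id"]),
        "timestamp": int(fields["timestamp"]),
        "extensions": base64.b64decode(fields["extensions"]),
        "signature": base64.b64decode(fields["signature"]),
        # A version other than v1 (0) is not an error for a log client;
        # it only means this client may be unable to check the signature.
        "can_verify": version == 0,
    }

sct = parse_add_chain_response(response_text)
```

The log client passes the decoded pieces on for use in the TLS
handshake without needing to verify them itself.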

> O-16.  Some comments on S5:
> ---------------------------
>
>
>> 5. Clients
>>
>>
>>    There are various different functions clients of logs might perform.
>
>
> Perhaps this section should be entitled "Log Client Roles" ?

Since you persuaded me to include TLS clients, no :-)

> this section doesn't mention the role of a (CA) log client that submits
> "certs and cert chains" to logs. Even though the latter role is mentioned
> elsewhere in the spec it should perhaps be mentioned here also.

OK.

>
>
>> 5.1. Monitor
>>
>>
>>    Monitors watch logs and check that they behave correctly.  They also
>>    watch for certificates of interest.
>
>
> "Monitor" should be "Log Monitor" ?

There's no other kind of monitor :-)

>
>>
>> 5.2. Auditor
>>
>>
>>    Auditors take partial information about a log as input and verify
>>    that this information is consistent with other partial information
>>    they have.  An auditor might be an integral component of a TLS
>>    client, it might be a standalone service or it might be a secondary
>>    function of a monitor.
>
>
> "Auditor" should be "Log Auditor" ?

And there's no other kind of auditor.

>
>
>
>> 8. Efficiency Considerations
>>
>>
>>    The Merkle tree design serves the purpose of keeping communication
>>    overhead low.
>>
>>    Auditing logs for integrity does not require third parties to
>>    maintain a copy of each entire log.  The Signed Tree Heads can be
>>    updated as new entries become available, without recomputing entire
>>    trees.  Third party auditors need only fetch the Merkle consistency
>>    proofs against a log's existing STH to efficiently verify the append-
>>    only property of updates to their Merkle Trees, without auditing the
>>    entire tree.
>
>
> The above could be explained in more detail, and S5.1 should be
> cross-referenced. Is the last sentence above essentially a summary of step
> #8 in S5.1? Or are there differences?

It is a summary of 5.1

>
>
> ---
> end
>
>