Re: [lamps] draft-ietf-lamps-rfc3709bis-01 security, reliability, and privacy considerations

Russ Housley <housley@vigilsec.com> Wed, 01 June 2022 21:38 UTC
From: Russ Housley <housley@vigilsec.com>
In-Reply-To: <87sfoqdk94.fsf@fifthhorseman.net>
Date: Wed, 01 Jun 2022 17:38:06 -0400
Cc: LAMPS <spasm@ietf.org>
Message-Id: <DC7741A3-D11A-4908-8A67-5A44007B4054@vigilsec.com>
References: <877d8hx019.fsf@fifthhorseman.net> <7BA047D3-B499-4395-A8BB-99D5C816ADC6@vigilsec.com> <87sfoqdk94.fsf@fifthhorseman.net>
To: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/spasm/0b6DBbgXhEaNgJSNF_udP2lQ1Ck>
DKG:

> Hi Russ and other LAMPSers--

Hi.

> Russ, thanks for taking a look at these concerns and restructuring some
> of the guidance in the draft in response.
> 
> You've clearly read both of my earlier lengthy e-mails and resolved
> several of the issues that i raised, and i appreciate that work.
> 
> That said, a large number of the issues appear to remain unaddressed,
> and have not even been commented on.  I did a time-consuming re-review
> of my earlier message, with the results outlined below.
> 
> I do not think this draft is ready to go to the larger community.  There
> remain several internal contradictions, broken or confusing references,
> at least one outstanding security concern, and the guidance to
> implementers is extremely minimal.  It would be easy for a naive
> implementer working from this draft to miss some of the subtleties
> inherent in this data structure and to produce or consume these objects
> in a way that has a problematic outcome in terms of security,
> reliability, interoperability, and/or privacy.
> 
> Some notes on the current security and privacy considerations are below,
> followed by a re-review of my earlier raised points.
> 
> I understand that the authors may not be able to (or want to) adopt all
> of the points raised, but given that some otherwise
> easy/trivial/non-controversial points have not been addressed
> (e.g. fixing internal section references, breaking out subsections), i
> would appreciate a point-by-point response to indicate that they have at
> least been considered and explicitly declined.

DKG, I greatly appreciate the time you have given for your review.  I'm sorry that our responses were not well coordinated.  We tried to divide up the work, but it sounds like some things got missed.  We will do better on this reply.

> On Tue 2022-05-17 16:58:25 -0400, Russ Housley wrote:
>> I think the best way to respond is to offer the revised Security Considerations and the new Privacy Considerations.
>> 
>> 9.  Security Considerations
> […]
>>   When a relying party fetches remote logotype data, a mismatch between
>>   the media type provided in the mediaType field of the LogotypeDetails
>>   and the Content-Type HTTP header of the retrieved object should be
>>   treated as a failure and the fetched logotype data should not be
>>   presented to the user.
> 
> Is there a reason that this is a lower-case should, and not a SHOULD or
> even a MUST?  In what case would it be reasonable for a relying party to
> display an image with a mismatched Content-Type?

I agree.  This one should be a MUST.
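
For illustration only (this is not proposed draft text), a minimal Python sketch of the mismatch check under discussion.  It assumes the comparison is made on the type/subtype alone, ignoring parameters and case; the draft does not spell that out, and the function names are hypothetical.

```python
def media_types_match(expected_media_type: str, content_type_header: str) -> bool:
    """Compare the mediaType from LogotypeDetails with an HTTP Content-Type value.

    Assumption: only the type/subtype is compared; parameters such as
    "; charset=..." are ignored and the comparison is case-insensitive.
    """
    received = content_type_header.split(";", 1)[0].strip().lower()
    return received == expected_media_type.strip().lower()


def accept_logotype(expected_media_type: str, content_type_header: str, body: bytes) -> bytes:
    """Return the fetched body only when the media types agree; otherwise fail."""
    if not media_types_match(expected_media_type, content_type_header):
        # A mismatch is treated as a failure: nothing is handed to a renderer.
        raise ValueError("Content-Type does not match mediaType in LogotypeDetails")
    return body
```

A caller would pass the result of accept_logotype() to the renderer for the expected media type only, and would display nothing when the ValueError is raised.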

>>   When a subscriber requests the inclusion of remote logotype data in a
>>   certificate, the CA cannot be sure that any logotype data will be
>>   available at the provided URI for the entire validity period of the
>>   certificate.  To mitigate this concern, the CA may provide the
>>   logotype data from a server under its control, rather than a
>>   subscriber-controlled server.
> 
> This paragraph (and other paragraphs, and even the section above with
> the header "Logotype Data") use the term "logotype data", which again is
> ambiguous between the explicitly-defined LogotypeData object, and the
> underlying images or audio objects.  Perhaps some framing text in the
> entire document (in the terminology section?) might make this clearer? I
> recognize that you've cleaned it up some in the document compared to the
> previous draft, but there are many more places where this can get
> confusing.  Do you think the privacy or security concerns are any
> different for the LogotypeData data structure as compared to the data
> that the LogotypeData points to?  or do you think they all share the
> same risks?

I'm trying to achieve a balance in the changes.  I'd really like to preserve as much of the original text in RFC 3709 as possible.  I'm not trying to avoid clarity, but I am trying to avoid changes that are purely editorial.

The first sentence of the section that you reference defines the term "logotype data"; it says: "This specification defines two types of logotype data: image data and audio data."

When the document uses "LogotypeData", it is always talking about the ASN.1 structure.

I reviewed every instance of "LogotypeData" in the document, and I only found one place that was confusing (to me) in the Privacy Considerations.  You poke at the same sentence later, so I'll offer alternative text when responding to that comment.

>>   The controls available to a parent CA to protect itself from rogue
>>   subordinate CAs are non-technical.  They include:
>> 
>>   *  Contractual agreements of suitable behavior, including terms of
>>      liability in case of material breach.
>> 
>>   *  Control mechanisms and procedures to monitor and follow-up
>>      behavior of subordinate CAs.
> 
> Do we want an informational reference to Certificate Transparency (RFC
> 9162) here?

Sure.  I propose:

   *  Control mechanisms and procedures to monitor and follow the
      behavior of subordinate CAs, including Certificate Transparency
      [RFC9162].

>>   *  Use of certificate policies to declare an assurance level of
>>      logotype data, as well as to guide applications on how to treat
>>      and display logotypes.
>> 
>>   *  Use of revocation functions to revoke any misbehaving CA.
>> 
>>   There is not a simple, straightforward, and absolute technical
>>   solution.  Rather, involved parties must settle some aspects of PKI
>>   outside the scope of technical controls.  As such, issuers need to
>>   clearly identify and communicate the associated risks.
> 
> I can't tell whether the paragraph above is related to parent CAs
> defending themselves against rogue intermediates, or about the overall
> "security considerations" section.

It is not just about the bullets above.  Do you have a proposal for improvement?

>> 10.  Privacy Considerations
>> 
>>   Certificates, and hence their logotype images, are commonly public
>>   objects and as such usually will not contain privacy-sensitive
>>   information.  However, when a logotype image that is referenced from
>>   a certificate contains privacy-sensitive information, appropriate
>>   security controls should be in place to protect the privacy of that
>>   information.  Details of such controls are outside the scope of this
>>   document.
> 
> Is this true for certificates generally?  I'm not sure what the
> definition of "privacy-sensitive information" is, but (for example) an
> X.509 certificate intended for use in S/MIME often contains a legal name,
> an e-mail address, and in many cases some sort of employment status
> (e.g. when the Subject DN contains an O= field).  Does this draft
> consider legal name, e-mail address, or employment status to be
> "privacy-sensitive information"?

If the combination of email address and subject name is privacy sensitive, then do not put them in the same certificate.  Many S/MIME certificates only have the email address for this reason.

Maybe it would be better to approach this differently:

   Certificates are commonly public objects, so the inclusion of
   privacy-sensitive information in certificates should be avoided.  The
   more information that is included in a certificate, the greater the
   likelihood that the certificate will reveal privacy-sensitive
   information.  The inclusion of logotype data needs to be considered
   in this context.

I realize this does not directly answer your question.  I do not think this is the document to say whether an email address and subject name are privacy sensitive, and the real answer is "sometimes".

>>   In cases where logotype data is cached, monitoring would reveal when
>>   a remote LogotypeData, image, or audio sequence is fetched for the
>>   first time.
> 
> Which monitoring?  This paragraph could be talking about monitoring done
> by a network observer, or monitoring done by the operator of data
> server.
> 
> If the considerations here are about the network observer, then this
> draft which recommends both http and https should be clear that https
> will hide the content of the data request.  It probably should also
> mention that if data objects are different sizes then they can still be
> distinguishable, so that webservers that offer resources consumed by
> this protocol should pad their HTTPS responses to common sizes.
> 
> Clients observing this protocol that care about network observability
> should also pad their queries to a common size.

Both are able to monitor, but of course, the server operator can peek after the HTTPS is stripped away.

I think two paragraphs might be easier for the reader, one about network monitoring and another about the server operator.

   Logotype data might be fetched from a server when it is needed.  By
   watching activity on the network, an observer can determine which
   clients are making use of certificates that contain particular
   logotype data.  Since clients are expected to locally cache logotype
   data, network traffic to the server containing the logotype data will
   not be generated every time the certificate is used.  Further, when
   logotype data is not cached, activity on the network would reveal
   certificate usage frequency.  Even when logotype data is cached,
   regardless of whether direct or indirect addressing is employed,
   network traffic monitoring could reveal when logotype data is fetched
   for the first time.  Implementations MAY encrypt fetches of logotype
   data using HTTPS and pad them to a common size to reduce visibility
   into the data that is being fetched.  Likewise, servers MAY reduce
   visibility into the data that is being returned by encrypting with
   HTTPS and padding to a few common sizes.

   Similarly, when fetching logotype data from a server, the server
   operator can determine which clients are making use of certificates
   that contain particular logotype data.  As above, locally caching
   logotype data will eliminate the need to fetch the logotype data each
   time the certificate is used, and lack of caching would reveal usage
   frequency.  Even when implementations cache logotype data,
   regardless of whether direct or indirect addressing is employed, the
   server operator could observe when logotype data is fetched for the
   first time.
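
For illustration only, a minimal Python sketch of what "padding to a few common sizes" could mean in terms of choosing a target length.  The bucket sizes below are arbitrary assumptions, not values from the draft, and the sketch does not address how the padding bytes are actually carried in an HTTPS exchange.

```python
from typing import Sequence


def padded_length(actual_length: int,
                  buckets: Sequence[int] = (4096, 16384, 65536)) -> int:
    """Return the size that a request or response would be padded up to.

    Assumption: the bucket sizes are hypothetical; a deployment would pick
    its own.  Objects larger than the largest bucket are rounded up to the
    next multiple of that bucket.
    """
    if actual_length < 0:
        raise ValueError("length must be non-negative")
    for size in sorted(buckets):
        if actual_length <= size:
            return size
    largest = max(buckets)
    return ((actual_length + largest - 1) // largest) * largest
```

With these assumed buckets, every logotype of at most 4096 bytes looks the same size to a network observer, which is the property the proposed text is after.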

>>   When the the "data" URI scheme is used, there is no network traffic
>>   to fetch logotype data, which avoids the concerns described above,
>>   but the certificate will likely be larger than one that contains a
>>   URL.  For this reason, the "data" URI scheme will be the only one
>>   that is supported by some CAs.
> 
> This is unclear and incomplete, for several reasons:
> 
> - it offers two conflicting reasons (concerns about privacy risks of network
>   traffic, and concerns about certificate size), and then says "for
>   this reason".  I'm assuming that it means the former, and not the
>   latter, but it would be better to be clear.
> 
> - This doesn't address the tradeoffs, or who would be responding to
>   them. Is it really just CAs that might limit these choices?  what
>   about subscribers, whose privacy might be more at stake than the CAs?
>   What about consuming implementations, that might have the relying
>   party's interests in mind?
> 
> - At least part of the privacy risk due to network traffic seems to
>   still be present -- even when the data: URI scheme is used -- as long
>   as the indirect method is used.  that is "there is no network
>   traffic" isn't quite right.

The point is that the "data" URI scheme with direct addressing offers better privacy, so this is the only way that some CAs are willing to include logotype data.

I suggest:

   When the "data" URI scheme is used with direct addressing, there is
   no network traffic to fetch logotype data, which avoids the
   observations of network traffic or server operations described above.
   To obtain this benefit, the certificate will be larger than one that
   contains a URL.  Due to the improved privacy posture, the "data" URI
   scheme with direct addressing will be the only one that is supported
   by some CAs.  Privacy-aware certificate subscribers MAY wish to
   insist on their logotype data being embedded in the certificate with
   the "data" URI scheme with direct addressing.
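
For illustration only, a minimal Python sketch of how a relying party could recover embedded logotype data from a "data" URI without any network traffic.  It follows the general data:[<mediatype>][;base64],<data> layout; the function name is hypothetical, and the default media type applied when none is given is an assumption taken from the "data" URI scheme itself rather than from the draft.

```python
import base64
from typing import Tuple
from urllib.parse import unquote_to_bytes


def decode_data_uri(uri: str) -> Tuple[str, bytes]:
    """Split a "data" URI into (media type, decoded bytes) with no network access."""
    if not uri.lower().startswith("data:"):
        raise ValueError("not a data URI")
    header, comma, payload = uri[len("data:"):].partition(",")
    if not comma:
        raise ValueError("data URI has no ',' separator")
    parts = header.split(";")
    # Assumption: an omitted media type falls back to text/plain.
    media_type = parts[0].strip().lower() or "text/plain"
    is_base64 = any(p.strip().lower() == "base64" for p in parts[1:])
    raw = unquote_to_bytes(payload)  # undo any percent-encoding first
    return media_type, (base64.b64decode(raw) if is_base64 else raw)
```

The decoded bytes can then be hashed and compared against the hash carried in the certificate in the same way as remotely fetched data.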

>>   In cases where logotype data is cached, the cache index should
>>   include the hash values of the associated object with the goal of
>>   fetching the object only once, even when it is referenced by multiple
>>   URIs.  The index should include hash values for all supported hash
>>   algorithms.  Give preference to logotype data that is already in the
>>   cache when multiple alternative are offered in the LogotypeExtn
> 
> should be "multiple alternatives"

Fixed.

>>   certificate extension.
> 
> This makes sense as long as we're talking about the relying party's cache.

Okay.  I changed the first sentence to make that clear:

   In cases where logotype data is cached by the relying party, the
   cache ...

>>   When fetching remote logotype data, relying parties should used the
>>   most privacy-preserving options that are available to minimize the
>>   opportunities for servers to "fingerprint" clients.  For example,
>>   avoid cookies, e-tags, and client certificates.
>> 
>>   When a relying party encounters a new certificate, the lack of
>>   network traffic to fetch logotype data might indicate that a
>>   certificate with references to the same logotype data has been
>>   previously processed and cached.
> 
> This might want an informational reference to Section 14.9 of rfc6797,
> as an example of the kind of attack possible.

HSTS does not feel like a good fit for an example, especially since that section talks about HSTS Host selecting its own host name and subdomains.

> I really appreciate this kind of specific guidance for the relying
> party.
> 
> Perhaps this section could be enhanced by providing specific subsections
> with guidance for the different players if they are interested in
> privacy-oriented defenses.
> 
> Perhaps four subsections are in order:
> 
> - Privacy-sensitive relying parties (how defend the user's privacy when
>   encountering these extensions)

I think the paragraph above already covers this.

> - Privacy-sensitive issuers (how to minimize information leakage by
>   structuring certificates)

I think the point about using the "data" URI scheme with direct addressing is the best that we have to offer.

> - Privacy-conscious subscribers (how to request a certificate that
>   minimizes privacy risks, and reasonable checks to confirm that an
>   issued certificate conforms to this guidance)

I think the point about using the "data" URI scheme with direct addressing is the best that we have to offer, and that was added above.

> - Privacy-conscious LogotypeData and logotype data hosts (how to serve
>   this data to minimize privacy risks to subscribers and relying
>   parties)

Again, I think the point about using the "data" URI scheme with direct addressing is the best that we have to offer, which eliminates this aspect altogether.

> Each subsection might only need a paragraph or two, but it would make it
> easy for an implementer to understand what kind of guidance is relevant
> to them.
> 
> Below i review my earlier comments and highlight those which have not
> been addressed.  This would be easier if there were an issue tracker,
> but i've still heard nothing about an issue tracker from any of the
> authors.  If you'd like, i again offer the use of
> https://gitlab.com/dkg/lamps-rfc3709bis to keep track (i've updated the
> git tree with the current draft's markdown source).  Let me know on list
> here if you'd like to use it.

I hope that we will not need the issue tracker, but if this gets complicated, then we will take you up on that.

> On Fri 2022-03-25 16:14:10 -0400, Daniel Kahn Gillmor wrote:
> 
>> i) not clear how to map the elements in the logotypeHash sequence of
>>    hashes to the elements in the logotypeURI sequence of URIs.  Should
>>    every reference within a given logotypeDetails refer to the same
>>    underlying object?  (this is related to (c) above).
> 
> I still think that the draft is unclear about this.  while the "Logotype
> Data" section describes how some images must be "variants" of the same
> underlying visual representation, for example, it doesn't make it clear
> that the "LogotypeData" object (either directly included or pointed to
> by the "LogotypeReference" object) is the SEQUENCE container that holds
> the variants, but that the communityLogos or otherLogos SEQUENCEs at the
> higher level are *not* required to hold variants of the same underlying
> visual representation.

I'm not sure I understand this comment.  Maybe you are asking why the community logotype is SEQUENCE OF but the issuer organization logotype and subject organization logotype are not?  If so, this was sorted many years ago when RFC 3709 was in the works.

We could add the following if I have correctly guessed your concern:

   Certificate issuers may include more than one community logotype to
   indicate participation in more than one global community.

>> l) when fetching an indirect logotype, what Content-Type should the
>>    relying party expect for the LogotypeData?   I'm assuming it's
>>    supposed to be DER-encoded?
> 
> I still don't understand why this document doesn't define an explicit
> MIME type for this data.  servers can serve it either with the
> registered MIME type or with application/octet-stream, but this scenario
> appears to be *exactly* why we register MIME types.

RFC 3709 did not define a MIME type for a LogotypeData file, but it did have a convention for the filename.  This document removes the file extension of ".LTD".  This change resolves Errata ID 2325.

>> m) no security considerations for folding polymorphic image renderers
>>    into certificate handling code?  we have recent evidence that there
>>    can be pretty disastrous bugs lurking there:
>>    https://googleprojectzero.blogspot.com/2021/12/a-deep-dive-into-nso-zero-click.html
>>    guidance about ensuring narrow, well-audited codepaths would be
>>    great.
> 
> Please include a mention of this.  If the image is a jpeg, the rendering
> utility SHOULD NOT attempt .gif rendering (for example).

I have a hard time with this one.  There are many, many sources of potential bugs.  I have a hard time seeing why this one deserves a special spotlight.

>> PS the Security Considerations section contains the word "cashed" when i
>>   think it means "cached"
> 
> This is still the case.

I rewrote that paragraph above, so I could not find it.  Anyway, it is fixed now.

> On Tue 2022-03-29 18:17:02 -0400, Daniel Kahn Gillmor wrote:
>> Furthermore, the draft doesn't do much to try to alleviate size concerns
>> at all -- it almost looks like an afterthought.  For example, if the
>> draft really wanted to prioritize minimizing certificate size, then it
>> would offer an ASN.1 native OCTET STRING representation for the embedded
>> objects, rather than stuffing another layer of base64-encoding with the
>> data: URL.
> 
> I have seen no attempt to address this tradeoff.  The draft could
> mitigate at least some of the size concern by specifying a compact way
> to encode in-band data.  It does not do so, and it does not even try to
> justify why it does not do so.

I do not think we should address this one.  Doing so would be a move away from the syntax in RFC 3709.  At least that was the outcome from my perspective from the exchange between you and Stefan on the LAMPS WG mail list a few weeks ago.

>> Even worse, the text in the draft says:
>> 
>>   There is no need to significantly increase the size of the
>>   certificate by including image and audio data of logotypes when a URI
>>   identifying the location to the logotype data and a one-way hash of
>>   the referenced data is included in the certificate.
>> 
>> This "no need" appears to encompass the privacy rationale, if i'm
>> reading it right.
> 
> "no need" still stands in the draft.  That is unfortunate.

When RFC 3709 was written, the PKIX WG was very concerned about the impact on the size of the certificate.  As demonstrated by RFC 6170, privacy can be improved by increasing the size of the certificate.

Given that this is a merger of RFC 3709 and RFC 6170, I suggest:

   Image and audio data for logotypes can be remote by including a URI
   that identifies the location to the logotype data and a one-way hash
   of the referenced data in the certificate.  The privacy-related
   properties for remote logotype data depend on four parties: the
   certificate relying parties that use the information in the
   certificate extension to fetch the logotype data, the certificate
   issuers that populate the certificate extension, certificate
   subscribers that request certificates that include the certificate
   extension, and server operators that provide the logotype data.

   Alternatively, embedding the logotype data in the certificate with
   direct addressing (as defined in Section 4.3) provides improved
   privacy properties and depends upon fewer parties.  However, this
   approach can significantly increase the size of the certificate.

Note that it pulls in some of your thoughts related to the Privacy Considerations, but of course, the introduction to this section should not contain too many details.

>> It even warns against embedding with scary-but-unquantified,
>> non-actionable guidance like:
>> 
>>     NOTE: Implementations need to be cautious about the size of images
>>     included in a certificate in order to ensure that the size of the
>>     certificate does not prevent the certificate from being used as
>>     intended.
>> 
>> What are these limits?  Where do they come from?  Do they also apply to
>> the size of images fetched over the network?  I'm fine with establishing
>> sensible, even arbitrary limits for the sake of interoperability.  But
>> then they need to be well-defined and measurable limits with clear scope
>> as to where they apply.
> 
> No additional guidance has been given here.

I do not think that hard numbers are available.  Given the other pressures to make certificates bigger, like PQC, I'm tempted to drop that NOTE.  I would like to hear what my co-authors think before doing so.

>>>> f) i didn't see any guidance to relying parties about how to handle a
>>>>    mismatch between the MIME type referenced in the mediaType field of
>>>>    the logotypeData and the Content-Type HTTP header of the retrieved
>>>>    resource.  Consider a polymorphic bytestream that could be
>>>>    interpreted by either a png renderer or a pdf renderer.  As the spec
>>>>    is currently written, it looks like a client could accept a
>>>>    Content-Type from the https server that doesn't match the mediaType
>>>>    field, and it might accept the HTTP header as ground truth.
>> 
>> This point didn't seem to get a response.  can you comment on it?  Did i
>> misunderstand the draft?
> 
> I see that -02 recommends that a mismatch between the served MIME type
> and the proposed MIME-type should result in a failure to render that
> object.  However, this choice raises a few new questions:
> 
> - When such a failure arises, should the relying party retry other
>   members of the logotypeURI in hopes of getting a match? or should it
>   treat the entire LogotypeDetails as invalid?  if the LogotypeDetails
>   has no valid members, should the relying party try the next
>   LogotypeDetails in the LogotypeImage or LogotypeAudio, or should the
>   entire LogotypeImage or LogotypeAudio be treated as invalid?  if the entire
>   LogotypeImage or LogotypeAudio is invalid, should the relying party
>   fall back to the next LogotypeImage or LogotypeAudio, or should it
>   treat the entire surrounding LogotypeData as invalid?

I think that the relying party MAY try others.  The implementations that I know about do not include multiple choices, so I cannot draw on any real world experience.

I suggest:

   ...  However, if more than one location for the
   remote logotype data is provided in the certificate extension, the
   relying party MAY try to fetch the remote logotype data from an
   alternate location to resolve the failure.
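
For illustration only, a minimal Python sketch of the "MAY try an alternate location" behavior.  It assumes the relying party already has the list of URIs, the expected media type, and a SHA-256 value from the certificate; the fetch callable is a hypothetical stand-in for whatever HTTP client is used, and SHA-256 is chosen here only as an example hash algorithm.

```python
import hashlib
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical fetcher: takes a URI, returns (Content-Type header value, body).
Fetcher = Callable[[str], Tuple[str, bytes]]


def fetch_first_valid(uris: Iterable[str],
                      expected_media_type: str,
                      expected_sha256: bytes,
                      fetch: Fetcher) -> Optional[bytes]:
    """Try each location in order; return the first body that passes both checks."""
    for uri in uris:
        try:
            content_type, body = fetch(uri)
        except OSError:
            continue  # location unreachable: try the next one
        received = content_type.split(";", 1)[0].strip().lower()
        if received != expected_media_type.strip().lower():
            continue  # media type mismatch is a failure for this location
        if hashlib.sha256(body).digest() != expected_sha256:
            continue  # hash mismatch is a failure for this location
        return body
    return None  # every location failed: present nothing to the user
```

Returning None when every location fails keeps the failure handling in one place for the caller.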

> - how does this mediatype mismatch interact with the cache of data?
>   for example, consider one LogotypeImage with a sha256 digest X with a
>   mime-type of image/jpeg.   When a relying party encounters a
>   certificate that contains a LogotypeImage with the same sha256 digest
>   X, but with mime-type image/png, it doesn't do the network fetch,
>   but instead relies on its cache.  does it now feed the jpeg data to
>   its png renderer?  or does the cache index by mime type as well as
>   cryptographic digest?

The hash algorithm is very broken if this is a concern.  That said, I propose:

   In cases where logotype data is cached by the relying party, the
   cache index should include the hash values of the associated logotype
   data with the goal of fetching the logotype data only once, even when
   it is referenced by multiple URIs.  The index should include hash
   values for all supported hash algorithms.  The cached data should
   include the media type as well as the logotype data.  Implementations
   should give preference to logotype data that is already in the cache
   when multiple alternatives are offered in the LogotypeExtn
   certificate extension.
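
For illustration only, a minimal Python sketch of a relying-party cache shaped like the text above: indexed by hash value for each supported algorithm, with the media type stored alongside the data.  The class name, the choice of SHA-256 and SHA-384 as the supported algorithms, and the decision to treat a media type difference as a cache miss are all assumptions, not requirements from the draft.

```python
import hashlib
from typing import Dict, Optional, Sequence, Tuple


class LogotypeCache:
    """In-memory cache indexed by (hash algorithm, digest)."""

    def __init__(self, algorithms: Sequence[str] = ("sha256", "sha384")) -> None:
        self._algorithms = tuple(algorithms)
        self._index: Dict[Tuple[str, bytes], Tuple[str, bytes]] = {}

    def add(self, media_type: str, data: bytes) -> None:
        """Store the data under its digest for every supported algorithm."""
        entry = (media_type.strip().lower(), data)
        for algorithm in self._algorithms:
            digest = hashlib.new(algorithm, data).digest()
            self._index[(algorithm, digest)] = entry

    def lookup(self, algorithm: str, digest: bytes, media_type: str) -> Optional[bytes]:
        """Return cached data only when both the digest and the media type agree."""
        entry = self._index.get((algorithm, digest))
        if entry is None or entry[0] != media_type.strip().lower():
            return None  # assumption: a media type difference counts as a miss
        return entry[1]
```

Under this sketch, a certificate that names the same digest with a different media type does not cause cached JPEG bytes to be handed to a PNG renderer.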

> For that matter, should a relying party cache image, audio, or
> LogotypeData objects that are received in-band, so that if they appear
> later as a reference they can be accessed?

Appendix B.3 has an example of the "data:" URI.  The addition of this logotype data to the cache will not improve privacy for the processing of this certificate.  However, it _might_ help if another certificate contains a URL for the exact same logotype data.

I propose:

   When the "data" URI scheme is used, the relying party MAY add the
   embedded logotype data to the local cache, which could avoid the need
   to fetch the logotype data if it is referenced by a URL in another
   certificate.

>>>> i) not clear how to map the elements in the logotypeHash sequence of
>>>>    hashes to the elements in the logotypeURI sequence of URIs.  Should
>>>>    every reference within a given logotypeDetails refer to the same
>>>>    underlying object?  (this is related to (c) above).
>>> 
>>> The text says: "Both direct and indirect addressing accommodate
>>> alternative URIs to obtain exactly the same item"
>> 
>> the text is confusing as written because:
>> 
>> - there are multiple layers of plurals involved (do direct and indirect
>>  addresses need to both point to the exact same item?  what about the
>>  layers between the LogotypeReference and the LogotypeDetails?), and
>> 
>> - "exactly the same item" is unclear.

I changed it to "exactly the same logotype data."

>> Does "exactly the same item" mean that the contents of the HTTP
>> responses should be byte-identical?  or that their *headers* are also
>> byte-identical?  what if the URL points to a resource that has different
>> body content based on the Accept: or Accept-Language: (or Accept-*?)
>> headers in the HTTP request?
> 
> This would be cleared up if it indicated that it is talking specifically
> about the logotypeURI field of LogotypeDetails object.  Retrieving any
> member reference of the logotypeURI's SEQUENCE MUST yield a response
> body that is bytewise identical to that retrieved from any other member
> reference.

The media type needs to match, and the hash of the logotype data needs to match.  I think the text is clear.

>> what i still don't understand, after several re-reads of the draft, is
>> why the LogotypeData can have more than one image and more than one
>> audio element.  Or maybe LogotypeData offers multiple distinct digital
>> representations of the same object by enclosing a sequence of
>> LogotypeImage objects?  but then i don't understand why there are
>> multiple LogotypeDetails objects within a LogotypeImage object.
>> 
>> Maybe the draft itself could offer a succinct explanation of why each of
>> these layers is present?
> 
> The rationale for every layer remains unclear to me after several
> re-reads of the draft and having this e-mail exchange.

Again, changes to this are not on the table.  We are preserving compatibility with the syntax in RFC 3709.

>> It's pretty weird that the document is titled with the word "logotypes",
>> and it has a terminology section, and yet there is no direct
>> acknowledgement of the word "image" or "audio" or even "media" until §3.
>> It appears to offer a generic mechanism to embed media resources in
>> X.509 certificates, which happens (by historical accident) to be named
>> "logotype".
>> 
>> If i didn't already know what i was looking at, it would be very hard to
>> make sense of the draft.  I'm not saying to drop the term "logotype"
>> entirely, but at least give the document a title like:
>> 
>>   Internet X.509 Public Key Infrastructure: Media ("Logotypes") in X.509 Certificates
>> 
>> And explain the historical accident in the introduction and/or in the
>> terminology section.
> 
> I still think this kind of terminology cleanup would help to clarify the
> document.

Section 1 introduces the idea.  It says:

   A big part of this process is branding.  Service providers and
   product vendors invest a lot of money and resources into creating a
   strong relation between positive user experiences and easily
   recognizable trademarks, servicemarks, and logotypes.

But you are right that image and audio come later.  I'd really rather not introduce too much change into the structure of the document.  Doing so will make it harder for a reader to compare the new RFC with RFC 3709.

>> TLS 1.3 only recently got finished protecting the server's certificate
>> in the TLS handshake's encryption layer, so that a network observer
>> can't see the offered endpoint identity.
>> 
>> If this draft expects the client to make a separate network fetch for a
>> given resource, potentially over http(!) then it risks undoing that
>> metadata minimization work.  Even if the resource is fetched over https,
>> if the size of the request/response pair for the resource is unique to
>> the host in question, then a network observer can guess with high
>> confidence the identity of the service being visited.
>> 
>> Furthermore, if the cert-specific media is reused across certificates
>> over time, then it can produce a long-lived data trail, capable of
>> linking otherwise unlinkable identities.
> 
> These concerns -- in particular related to the work done to hide the TLS
> 1.3 server certificate -- are not mentioned in the draft.

While the privacy improvements in TLS 1.3 are important, the Web PKI has seen no adoption of RFC 3709.  That said, I think an additional paragraph is justified:

   TLS 1.3 [RFC8446] includes the ability to encrypt the server's
   certificate in the TLS handshake, which helps hide the server's
   identity from anyone that is watching activity on the network.  If
   the server's certificate includes remote logotype data, the client
   fetching that data might disclose the otherwise protected server
   identity.

>> n) There seems to be a mismatch between the ASN.1 and the text about
>>   whether an image must be present.  The text says yes:
>> 
>>      Each logotype present in a certificate MUST be represented by at least one image data object.
>> 
>>   but the ASN.1 seems to say that audio-only is OK:
>> 
>>    LogotypeData ::= SEQUENCE {
>>       image           SEQUENCE OF LogotypeImage OPTIONAL,
>>       audio           [1] SEQUENCE OF LogotypeAudio OPTIONAL }
>>          -- At least one of the OPTIONAL components MUST be present
>>          ( WITH COMPONENTS { ..., image PRESENT } |
>>            WITH COMPONENTS { ..., audio PRESENT } )
> 
> This confusion remains in the document.

You are right.  Thanks for catching this.  Given the introduction of text-to-speech capability, I think the MUST statement should be changed to a SHOULD statement.

I suggest:

   Each logotype present in a certificate SHOULD be represented by at
   least one image data object.
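
For illustration only (this is not text from the draft), the ASN.1 constraint quoted above can be modeled as follows.  The Python class and function names are hypothetical stand-ins, with None modeling an absent OPTIONAL component.  The sketch shows that an audio-only LogotypeData satisfies the ASN.1 constraint, which is why a SHOULD for the image fits better than a MUST:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LogotypeData:
    # Hypothetical stand-in for the ASN.1 LogotypeData structure.
    # None models an absent OPTIONAL component.
    image: Optional[List[bytes]] = None
    audio: Optional[List[bytes]] = None


def satisfies_asn1_constraint(ld: LogotypeData) -> bool:
    # "At least one of the OPTIONAL components MUST be present"
    return ld.image is not None or ld.audio is not None


audio_only = LogotypeData(audio=[b"audio-object"])
empty = LogotypeData()
```

Here satisfies_asn1_constraint(audio_only) is true even though no image is present, while satisfies_asn1_constraint(empty) is false.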

>> o) It would be great if each example were supplied within an simple
>>   X.509v3 certificate represented in PEM form, in addition to whatever
>>   dumpasn1 data and commentary the draft provides.  This would make the
>>   draft something actionable, which implementers could confirm works
>>   with their tooling (or doesn't).
>> 
>>   I recommend using the sample certificate authorities in
>>   draft-ietf-lamps-samples to create the sample certificates.
> 
> draft-ietf-lamps-samples is now RFC 9216.  Please include some full
> X.509 test vectors.

Some implementers have found these examples sufficient.  I'd like to hear from an implementer that the additional work would really be helpful before putting in the effort.

>> p) It would be great to see more examples/test vectors, including:
>> 
>>    - examples that exercise the audio functionality
>> 
>>    - examples where any given SEQUENCE in the twisty maze of SEQUENCEs
>>      has more than one element
>> 
>>    - examples of all Logotypes and "Other logotype type OIDs" defined
>>      in the draft
>> 
>>   If there isn't enough space (?) to include these examples in an
>>   appendix, then at least point to a place where these things will
>>   reliably exist so that an implementer can confirm the rough consensus
>>   with their own running code.
> 
> I would really appreciate having a wider corpus that demonstrates how
> the different variants might be used.  If that corpus is not in the
> draft, where is it?  if some implementer has actually implemented all
> this, surely they have a test suite that covers these concerns.  can the
> document point to that test suite?

Since implementers are showing more interest these days in the "data" URI scheme, I am not convinced that the effort you suggest will really help an actual implementer.  That said, if other people see the value, please speak up.  I'll do the work if people that are writing code will find it valuable.

>> q) §3 ("Logotype Data") says: "If a logotype of a certain type (as
>>   defined in Section 1.1)" but section 1.1 is "Certificate-based
>>   Identification".  Maybe it means "Section 2" or some subpart of
>>   section 4.1?  I recommend including these references in the
>>   underlying source by symbolic identifier, so that when sections get
>>   renamed, the output matches automatically.
>> 
>>   The security considerations section also references section 1.1 when
>>   it appears to be talking about either section 2 or section 4.1.
> 
> This section numbering confusion is still present in the document.

You are correct.  This should refer to Section 2.  Fixed.

>> r) https://www.ietf.org/archive/id/draft-ietf-lamps-rfc3709bis-01.html#section-4.4.3 says:
>> 
>>     When a certificate image logotype appears in the otherLogos, it
>>     MUST be identified by the id-logo-background object identifier.
>> 
>>   I think this is meant to be:
>> 
>>     When a certificate image logotype appears in the otherLogos, it
>>     MUST be identified by the id-logo-certImage object identifier.
>> 
>>   (that is, the OID name appears to have been copy/pasted from the
>>   §4.4.2)
> 
> This error is still in the document.

You are correct.  Fixed.

>> s) the ASN.1 for refStructURI contains this comment:
>> 
>>                    -- Places to get the same LogotypeData
>>                    -- image or audio object
>> 
>>   is it really "image or audio"? or is it "image and/or audio" or is it
>>   just a LogotypeData object, since that object itself might refer to
>>   multiple different image or audio objects?
> 
> I have seen no resolution for this question.

It is an inclusive or.

>> t)  "4.3. Embedded Images" says:
>> 
>>      If the logotype image is provided through direct addressing, then
>>      the image MAY be stored within the logotype certificate extension
>>      using the "data" scheme [RFC2397]. The syntax of the "data" URI
>>      scheme defined is included here for convenience:
>> 
>>   The implication here seems to be that if the LogotypeData object
>>   included by reference ("indirect addressing") then the LogotypeURI
>>   within that LogotypeData *can't* use the "data" URI.  If this is
>>   right, it means that indirect addressing will actually incur two
>>   distinct network lookups if the cache is empty, resulting in both
>>   latency and privacy losses.
>> 
>>   Why not allow the LogotypeURI contained with remotely-fetched
>>   LogotypeData to use a data: URI?
>> 
>>   It seems possible that this text was introduced in confusion over the
>>   multiple layers of indirection that are possible in this draft: i can
>>   understand why there's no point in sending a data: URI in a
>>   LogotypeReference, because you might as well just send a LogotypeData
>>   directly.
> 
> I haven't seen the issue raised in (t) addressed at all.

This is not changed from RFC 3709.  By using a "data" URI scheme with direct addressing, one gets the most privacy-preserving outcome.  Further, there is no value in using the "data" URI to embed a LogotypeData.  Direct addressing already does that.
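
As a minimal sketch (not taken from the draft), a client could recover an embedded image from a "data" URI without any network fetch along the following lines.  It assumes the basic RFC 2397 form data:[<mediatype>][;base64],<data>; the function name decode_data_uri is hypothetical, and a real implementation would also need to validate the media type and bound the size of the decoded object:

```python
import base64
from urllib.parse import unquote_to_bytes


def decode_data_uri(uri: str):
    """Return (media_type, raw_bytes) for an RFC 2397 "data" URI."""
    if not uri.lower().startswith("data:"):
        raise ValueError("not a data URI")
    # Everything before the first comma is the optional media type
    # plus the optional ";base64" marker.
    header, sep, payload = uri[5:].partition(",")
    if not sep:
        raise ValueError("data URI has no comma separator")
    is_base64 = header.lower().endswith(";base64")
    if is_base64:
        header = header[: -len(";base64")]
    # RFC 2397 default when the media type is omitted.
    media_type = header or "text/plain;charset=US-ASCII"
    raw = unquote_to_bytes(payload)
    data = base64.b64decode(raw) if is_base64 else raw
    return media_type, data


example_uri = "data:image/gif;base64," + base64.b64encode(b"GIF89a").decode("ascii")
media_type, image_bytes = decode_data_uri(example_uri)
```

Because the bytes come straight out of the certificate extension, no request ever leaves the client, which is the privacy property described above.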

>> u) §4.4.3 says:
>> 
>>     Applications providing a Graphical User Interface (GUI) to the
>>     certificate user MAY present a certificate image according to this
>>     standard in any given application interface, as the only visual
>>     representation of a certificate.
>> 
>>   I'm assuming here that "certificate user" may be either the
>>   subscriber or a relying party.  I think this text is problematic for
>>   a relying party.

There are three bullets in section 4.4.3.  The first two are examples of a human user that is relying on a certificate.  The last one is an example of a subscriber selecting the certificate that they want to use.  I have seen the last one in a proof-of-concept implementation.

>>   As a relying party, the application has to have some level of
>>   knowledge about how it will *act* on the cert (the draft calls this
>>   "automated certificate path validation").  The draft makes it very
>>   clear that the images here have no bearing on the functional aspect
>>   of the certificate.
>> 
>>   I might trust a given CA (or its intermediate CAs) to validate the
>>   functional parts of a certificate mechanically (via ACME, for
>>   example; i can rely on a CA to say "we know this cert belongs to
>>   foo.example because we executed ACME challenge X").  That validates
>>   the functional parts, which are what my tooling relies on.  But i
>>   have *no idea* how any of my "trusted" root CAs (or their
>>   intermediaries) are going to produce or vet these images. Indeed,
>>   since the spec says that the images won't be used in path validation,
>>   the CAs might deliberately not vet them much, because there are no
>>   CA/B Forum Baseline Requirements about them (are there?)

The CA/B Forum does not cover this extension in the baseline requirements.  As I said above, I have not seen this extension used in the Web PKI.

>>   But when i ask my tooling to render information about a peer's
>>   certificate, the most salient features that i'm looking for are the
>>   verified, functional elements, not just the branding.

In some contexts, you are right.  In others, the branding is very important.  That is what led to the creation of RFC 3709 in the first place.

>>   Consider a certificate with a subjectAltName of foo.example, but an
>>   embedded certificate image that prominently displays the VISA logo
>>   and say "foo.example" in 8 point font near the bottom right.  If the
>>   operator of foo.example knows that Alice has a Visa credit card and
>>   will use this mechanism to confirm the identity of whatever site she
>>   visits, this strikes me as a clear phishing risk.  the operator can
>>   just make, say, https://visa.foo.example mimic the visa webpage and
>>   when the user clicks on the site info button in their browser to try
>>   to understand what's going on, there's the visa logo, with no other
>>   info!
>> 
>>   So I can't tell whether the cited paragraph is acknowledging that some
>>   implementations will fail to give the user visible detail about the
>>   functional mechanics of a certificate, or if it is actually
>>   encouraging implementations to hide their functional interpretation
>>   of the cert.

I think you have already accepted this is an appropriate GUI for the selection among several certificates held by a subscriber.

>>   If the cited paragraph is just an acknowledgement that some
>>   implementations will be dangerously insecure, I think we should (a)
>>   offer a more explicit lamentation, and (b) add a sentence that
>>   indicates that an implementation that can only offer this visual
>>   representation MUST at least offer a certificate export mechanism
>>   (coupled with an export of any related cached LogotypeData or media
>>   objects that were retrieved over the network), so that an interested
>>   user can debug the underlying certificate with different tooling.
>>   This is more of a "MUST (BUT WE KNOW YOU WON'T)" from RFC 6919§1 but
>>   hey it's better than an idle lamentation.  Even better would be to
>>   drop this MAY entirely, and convert it to a MUST NOT.
> 
> (u) above describes a significant security concern, not yet addressed in
> draft -02.

I suggest:

   Applications providing a Graphical User Interface (GUI) to the
   certificate user MAY present a certificate image as the only visual
   representation of a certificate; however, the certificate user SHOULD
   be able to easily obtain the details of the certificate content.

>> v) "5. Type of Certificates" says:
>> 
>>     logotypes MUST NOT be part of certification path validation or any
>>     type of automated processing.
>> 
>>   but this seems to be in conflict with requirements like "The
>>   background image MUST allow black text to be clearly read when placed
>>   on top of the background image." which would presumably be enforced
>>   by the certificate issuer via automated processing.
>> 
>>   Perhaps this is supposed to mean "any type of automated processing by
>>   the relying party" or "any type of automated processing during
>>   certificate validation"?
> 
> This item (v) is also still not addressed.  Are we really saying that an
> issuer MUST NOT attempt to confirm (for example) that two formats of
> visual representation are actually roughly visually similar, using (for
> example) a conversion to raw pixels and then some sort of distance metric?

I think you are misreading this one.  Path validation is specified in Section 6 of RFC 5280.  I'll add an explicit reference.  This is not saying anything whatsoever about the vetting that a CA might perform prior to including the logotype data in the certificate.

>> x) section 4.1 says:
>> 
>>    The LogotypeReference and LogotypeDetails structures explicitly
>>    identify one or more one-way hash functions employed to authenticate
>>    referenced image or audio objects. CAs MUST include a hash value for
>>    each referenced object, calculated on the whole object. CAs SHOULD
>>    include a hash value that computed with the one-way hash function
>>    associated with the certificate signature, and CAs MAY include other
>>    hash values. Clients MUST compute a one-way hash value using one of
>>    the identified functions, and clients MUST discard the logotype data
>>    if the computed hash value does not match the hash value in the
>>    certificate extension.
>> 
>>  This paragraph is doing too much work at once and mixing up (or
>>  forgetting about) what it's talking about.  LogotypeReference doesn't
>>  refer to "image or audio objects" -- it refers to a LogotypeData
>>  object.  But of course LogotypeDetails *does* refer to image or audio
>>  objects, but not to LogotypeData objects.  Likewise, the last sentence
>>  seems to imply that hash mismatch should trigger rejection only for
>>  LogotypeReferences ("discard the logotype data") but not for
>>  LogotypeDetails.
>> 
>>  Functionally, it also seems to imply that an object should be rejected
>>  when any hash doesn't match, but verification terminates successfully
>>  when any hash tried does match.  The implication here is that when
>>  multiple hashes are offered, and some of them are miscalculated, the
>>  rejection or acceptance will depend on the order in which the client
>>  tries the hashes.  the outcome would be more stable if the requirement
>>  was either "try all hashes you have an implementation for, accept if
>>  one of them matches" or "try all hashes you have an implementation
>>  for, reject if one of them doesn't match".  Then the outcome for a set
>>  of mismatched hashes will only vary based on which hashes are
>>  implemented, not on the implementation's choice of ordering.
> 
> The ambiguity described by (x) is still present in the document, and
> seems likely to lead to non-interoperable implementations (a certificate
> can be produced that will be rendered in one way by an implementing
> relying party, and in another way by a different relying party).

Yes, the logotype data should be rejected if it does not match the hash.  During the development of RFC 3709, there was a concern that the CA would perform some vetting of the logotype data, and then the server operator would change the image returned from the URL.  The hash value detects such a change.
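
A rough sketch of such a check follows.  It is not text from the draft, and it assumes one of the two order-independent policies raised in (x): compute every declared hash the client implements, and reject if any of them fails to match or if none is implemented.  The function name, the SUPPORTED table, and the use of plain algorithm names in place of AlgorithmIdentifier OIDs are all hypothetical simplifications:

```python
import hashlib
import hmac

# Hash functions this hypothetical client implements.
SUPPORTED = {
    "sha256": hashlib.sha256,
    "sha384": hashlib.sha384,
    "sha512": hashlib.sha512,
}


def logotype_hashes_ok(obj: bytes, declared: dict) -> bool:
    """declared maps an algorithm name to the hash value from the certificate."""
    checked = 0
    for name, expected in declared.items():
        constructor = SUPPORTED.get(name)
        if constructor is None:
            continue  # algorithm not implemented by this client; skip it
        checked += 1
        if not hmac.compare_digest(constructor(obj).digest(), expected):
            return False  # any mismatch rejects, whatever the ordering
    # Accept only if at least one implemented hash was actually verified.
    return checked > 0


vetted = b"image bytes vetted by the CA"
declared = {"sha256": hashlib.sha256(vetted).digest()}
```

With these values, logotype_hashes_ok(vetted, declared) accepts the object, while any object substituted at the URL after vetting is rejected.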

>> y) section 6 ("Use in Clients") says:
>> 
>>     The logotype is to be displayed in conjunction with other identity
>>     information contained in the certificate. The logotype is not a
>>     replacement for this identity information.
>> 
>>   but section 4.4.3 ("Certificate Image Logotype") says:
>> 
>>     Applications providing a Graphical User Interface (GUI) to the
>>     certificate user MAY present a certificate image according to this
>>     standard in any given application interface, as the only visual
>>     representation of a certificate.
>> 
>>   These seem to be in direct contradiction to one another.
> 
> The internal contradiction indicated by (y) is still present in draft -02.

The logotype is not a replacement for the identity presented to the relying party.  It augments it, and I think that is what the quoted text is saying.

I think you have already accepted this is an appropriate GUI for the selection among several certificates held by a subscriber.

>> z) section 3 "logotype data" says:
>> 
>>     Compliant applications MUST NOT display more than one of the image
>>     objects and MUST NOT play more than one of the audio object for any
>>     logotype type at the same time.
>> 
>>     A client MAY simultaneously display multiple logotypes of different
>>     logotype types. For example, it may display one subject
>>     organization logotype while also displaying a community logotype,
>>     but it MUST NOT display multiple image variants of the same
>>     community logotype.
>> 
>>   but section 4.4.1 ("loyalty logotype") says "The logotype extension
>>   MAY contain more than one Loyalty logotype." and section 4.1 also
>>   says:
>> 
>>      If more than one Community logotype is present, they MUST be
>>      placed in order of preferred appearance. Some clients MAY choose
>>      to display a subset of the present community logos; therefore the
>>      placement within the sequence aids the client selection. The most
>>      preferred logotype MUST be first in the sequence, and the least
>>      preferred logotype MUST be last in the sequence.
>> 
>>   This is pretty unclear.  It sounds like multiple logotypes of the same
>>   type can be displayed, but also only one logotype of any given type
>>   can be displayed.

There can be more than one community logotype, and there can be more than one loyalty logotype.  I think that we need to provide similar advice regarding the display of loyalty logotype data:

   If more than one loyalty logotype is present, they MUST be placed in
   order of preferred appearance.  Some clients MAY choose to display a
   subset of the present loyalty logotype data; therefore the placement
   within the sequence aids the client selection.  The most preferred
   loyalty logotype data MUST be first in the sequence, and the least
   preferred loyalty logotype data MUST be last in the sequence.

>> zz) section 4.4.2 says
>> 
>>    "The logotype extension MUST NOT contain more than one certificate
>>    background logotype."
>> 
>>   But section 4.4.3 doesn't offer any such constraint for the
>>   "Certificate Image" logotype.  Surely it should have the same
>>   constraint?
>> 
>>   While we're looking at it, does "logotype" in the cited sentence
>>   refer to a LogotypeDetails, a LogotypeData, a LogotypeImage, a
>>   LogotypeReference, some combination, or something else?
> 
> Neither (z) nor (zz) appear to have been addressed in -02.

I think it is a good idea to limit the certificate image in the same manner as the background.  I suggest:

   The logotype extension MUST NOT contain more
   than one certificate image logotype.

>> zzz) The draft should have distinct subsections for each logotype type,
>>     as well as for each otherType OID.  This will make it easier to
>>     refer to things in the future.
> 
> Please provide the subsections suggested in (zzz), or explain why they
> would be inappropriate.  This is simple editorial work for making the
> document easier to use in the future.

There already are subsections for each of the otherType OIDs.

There is really one paragraph for each of the "three standard logotype types".  I cannot see how making those subsections would be helpful.

> Doing this level of detailed review takes time.  Re-reading it to try to
> map the changes in the document to see whether they were resolved also
> takes time.  I appreciate that some issues raised have been addressed,
> but i find it surprising that so many of them have *not* been addressed
> (or even considered?) and the document was put back into WGLC by the
> chairs.

I greatly appreciate your review.  You have improved the document.

> I don't know how much more time i can continue to devote to this
> particular document, but i hope that the WG chairs or the responsible AD
> will at least attempt to track which concerns have been addressed and
> which have not.  I think an issue tracker makes this easier to do, but i
> recognize that other people may prefer a different approach.  Whatever
> approach is chosen, i'd hope that issues raised in good faith are not
> accidentally discarded.

I hope you can tell from this comprehensive response that your comments are appreciated.

Russ