Re: [Ace] AD Evaluation of draft-ietf-ace-mqtt-tls-profile-12

Benjamin Kaduk <kaduk@mit.edu> Sat, 14 August 2021 00:37 UTC

Return-Path: <kaduk@mit.edu>
X-Original-To: ace@ietfa.amsl.com
Delivered-To: ace@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 947013A2DE7; Fri, 13 Aug 2021 17:37:25 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.498
X-Spam-Level:
X-Spam-Status: No, score=-1.498 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, KHOP_HELO_FCRDNS=0.399, SPF_HELO_NONE=0.001, SPF_NONE=0.001, URIBL_BLOCKED=0.001] autolearn=no autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 0a9pusSBMweK; Fri, 13 Aug 2021 17:37:20 -0700 (PDT)
Received: from outgoing.mit.edu (outgoing-auth-1.mit.edu [18.9.28.11]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 48E023A2DE6; Fri, 13 Aug 2021 17:37:19 -0700 (PDT)
Received: from kduck.mit.edu ([24.16.140.251]) (authenticated bits=56) (User authenticated as kaduk@ATHENA.MIT.EDU) by outgoing.mit.edu (8.14.7/8.12.4) with ESMTP id 17E0bAWo011307 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 13 Aug 2021 20:37:16 -0400
Date: Fri, 13 Aug 2021 17:37:10 -0700
From: Benjamin Kaduk <kaduk@mit.edu>
To: Cigdem Sengul <cigdem.sengul@gmail.com>
Cc: draft-ietf-ace-mqtt-tls-profile.all@ietf.org, Ace Wg <ace@ietf.org>
Message-ID: <20210814003710.GA50759@kduck.mit.edu>
References: <20210805223931.GI50759@kduck.mit.edu> <CAA7SwCM1Gjqb+ydTz2dmxA33FKYimwegWcpCckCZWiO3qeyH8w@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <CAA7SwCM1Gjqb+ydTz2dmxA33FKYimwegWcpCckCZWiO3qeyH8w@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ace/PMUOIo4whLkJR9zuIBGcez6afgo>
Subject: Re: [Ace] AD Evaluation of draft-ietf-ace-mqtt-tls-profile-12
X-BeenThere: ace@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Authentication and Authorization for Constrained Environments \(ace\)" <ace.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/ace>, <mailto:ace-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/ace/>
List-Post: <mailto:ace@ietf.org>
List-Help: <mailto:ace-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/ace>, <mailto:ace-request@ietf.org?subject=subscribe>
X-List-Received-Date: Sat, 14 Aug 2021 00:37:26 -0000

Hi Cigdem,

Hopefully you have not gotten too far along on the few items where I reply
and say that your proposed change may not be needed; I had hoped to write
this message several days ago.  (That said, there really are only a few
such places; the bulk of your proposals look good.)

On Sat, Aug 07, 2021 at 02:54:16PM +0100, Cigdem Sengul wrote:
> Hello Ben,
> Thank you very much for the review, and I will look into the pull request
> as soon as possible.
> Below I've gone over your comments and categorised them   as "Discussion"
> if I had questions/clarifications/comments, and "Will Do" for the rest.
> 
> Discussion:
> 
> 
> > Specifically, there's no requirement for ACE access tokens to be
> > self-deliniating, so we can't actually programmatically tell whether
> > there's content after the token in a CONNECT message; the mechanisms in
> > Sections 2.2.4.1/2.2.4.2 seem to assume that we can do so for
> > determining whether the CONNECT is just providing a token or also
> > providing PoP over a TLS exporter value.  I think that this just means
> > we need to have an explicit "token length" field (or similar).
> >
> 
> [CS: Those sections do make use of an explicitly token length field Both
> figures 4 and 5 show a token length in the Authentication Data.
> 2.2.4.1 - "For this option, the Authentication Data MUST contain the
> two-byte
>    integer token length, the token, and the keyed message digest (MAC)
>    or the Client signature (as shown in Figure 4)."
> Am I misunderstanding something?
> ]

I think I am misunderstanding (or misreading) something, and you have it
right.  Oops.
I'm not sure what I was looking at when I wrote that comment, but what I'm
looking at now doesn't have the behavior I was apparently concerned about.
Sorry about that.

> >
> > There are also a few places where we seem to be putting requirements on
> > Broker behavior that are in direct conflict with normative requirements
> > of the MQTT specification.  We can't override the external spec, so
> > we'll need to check and reword in any places where there are conflicts.
> > (I'm not an expert on MQTT and only read the spec as part of doing this
> > review, so it's entirely possible that I'm misinterpreting the MQTT spec
> > in some or all of these locations.)
> >
> 
> [CS: Noted, I will respond to them where you have raised them specifically.]
> 
> 
> > A few other general notes before the section-by-section notes:
> >
> > There is very little in this document about the HTTP-based interactions
> > with the AS.  I think the intent is to defer to the core framework for
> > that, but being a little more explicit about what is being pulled in and
> > how would be helpful.
> >
> 
> [CS: OK. Does this mean refer/cite the relevant sections in the core
> framework?]

I think that would help, yes.

> Section 1
> 
>    their subscribers.  In the rest of the document the terms "RS", "MQTT
>    Server" and "Broker" are used interchangeably.
> 
> We will probably get a reviewer asking why we can't pick a single term
> and standardize on it.  However, I expect that there are places where we
> want to emphasize on one aspect or another of its behavior, so don't
> think we should actually do so.  Similarly for the places where we
> mention that CoAP can be used (but don't reference a concrete
> specification).
> 
> [CS: This was brought up before. In practice, MQTT Broker or Broker is
> widely used, MQTT Server is in MQTT spec, and the
> RS is the additional functionality we are putting on the broker with this
> spec. I am happy to keep to one if interchangeable use is an issue -
> possibly select "MQTT Server" then. ]

Let's leave things as-is for now and see who complains.  I just didn't want
it to come as a surprise later, and it sounds like we have a plan for what
we can do if necessary.

> Section 2.2.1
> 
> The way we name these authentication options with specific (quoted)
> strings suggest that they will be used as a protocol element.  But
> where?  Is the literal string "Known(RPK/PSK)" used in both cases (vs.
> having distinct strings for RPK and PSK)?
> 
> Also, the hyphen character tends to more often be used as a joiner than
> a separator, so it's easy to misread this as a triple of "TLS",
> "Anon-MQTT", "None"/etc..  (I originally was going to ask why these all
> had a "TLS" prefix...)  It might be better to use a semicolon or comma
> instead of hyphen.
> 
> [CS: No they are not protocol elements. These were all the cases that we
> came up with Jim thinking about the different possibilities for pushing the
> token (Client may push the token before/during TLS session set-up; or
> client may defer it to MQTT connect. ). I am open to suggestions to present
> them better. I guess removing the "" would help? or any other suggestions
> on naming?

Yes, removing the quotation marks would help.
I'd also replace the hyphen with a comma in each pairing, and change the
sentence introducing the list to be something like "Thus, client
authentication procedures involve the following possible combinations:"
(s/options/combinations/ is the main goal, but I reworded further to avoid
having both "combined" and "combinations" in the same sentence.)

>  o  "TLS:Anon-MQTT:None": This option is used only for the topics that
>       do not require authorization, including the "authz-info" topic.
> 
> Are there topics other than "authz-info" that don't require
> authorization?  We might need to add some heding language to the earlier
> statement that "Client and the Broker MUST perform mutual
> authentication" if so.
> 
> [CS: Yes, actually we should think that the Broker may be hosting both
> public and private topics, and hence, it may accept unauthorised clients if
> that's the case. ]

I suspected there could be some other public topics, but wasn't sure.
Maybe start the section like:

  The Broker always authenticates itself to the Client.  Some topics may be
  "public" and not require client authentication to subscribe to (or even
  write to, as in the case of "authz-info"), but in order to perform
  operations that require authorization, the Client and Broker MUST perform
  mutual authentication.

>    It is RECOMMENDED that the Client implements "TLS:Anon-MQTT:ace" as a
>    first choice when working with protected topics.  However, depending
>    on the Client capability, Client MAY implement "TLS:Known(RPK/PSK)-
>    MQTT:none", and consequently "TLS:Anon-MQTT:None" to submit its token
>    to "authz-info".
> 
> It's good that we provide guidance on which of the authentication
> schemes are preferred (since we offer both ACE-layer and TLS-layer
> schemes).  However, we will surely be asked to defend why there are two
> possible ways of doing it instead of just one, and this text doesn't
> really do a good job of that.  What might cause a need to implement
> TLS:Known(RPK/PSK)-MQTT:none?
> 
> [CS: This is I think historical - at the beginning we wrote the spec for
> MQTT v.3.1 where we are overloading
> the username/password.  We thought this may not be always preferred, as the
> ace framework already talked about the
> authz-info endpoint. Hence, we gave two options for providing the token and
> tried to replicate the "authz-info" behaviour in MQTT.]

Let's try to come up with some additional text we can put in the document
to support the less-preferred behavior, even if it's just so that some
architecture can use a consistent implementation strategy across profiles.

>   The Broker MUST support "TLS:Anon-MQTT:ace".  To support Clients with
> >    different capabilities, the Broker MAY provide multiple client
> >    authentication options, e.g. support "TLS:Known(RPK)-MQTT:none" and
> >    "TLS:Anon-MQTT:None", to enable RPK-based client authentication, but
> >    fall back to "TLS:Anon-MQTT:ace" if the Client does not send a client
> >    certificate (i.e. it sends an empty Certificate message) during the
> >    TLS handshake.
> >
> > [Just a potential nit about the wording: "fall back" implies some
> > ordering requirements, but IIRC use of RPK requires sending the token to
> > authz-info before starting the new TLS connection, which doesn't quite
> > match the steps as ordered in this description.]
> >
> 
> [CS: True that is not well-written.
> That was added because to potentially handle the Brokers having optional
> client authentication for TLS.
> So, client authentication in TLS doesn't work, the broker would expect a
> token in CONNECT.
> But it doesn't look needed as we do have a MUST for a token to be provided
> for TLS-level client authentication.
> ]

I think that makes sense to not require the broker to support TLS-level
client authentication, yes.

>    In MQTT v5.0, the Client signals a clean session (i.e. the session
> >    does not continue an existing session), by setting the Clean Start
> >    Flag to 1 and, the Session Expiry Interval to 0 in the CONNECT
> >    message.  [...]
> >
> > I don't understand why setting the Session Expiry Interval to 0 is
> > needed to produce a clean session.  As I understand it, setting this
> > interval to 0 merely directs the server to not store session state after
> > the client disconnects from this session, which is unrelated to whether
> > or not this is a new session or a reused session.
> >
> 
> [CS: This was when we switched from MQTT v3.1 to induce MQTT v5 in the
> draft.
> We were looking for some equivalence between two specs and MQTTv5 had:
> "Setting of Clean Start to 1 and Session Expiry Interval to 0 is equivalent
> in MQTT v3.1.1 of setting Clean Session to 1."
> But we should acknowledge that this got separated and a clean start is the
> way to signal to start a new session. Will fix.]

Okay, and that makes sense how we would have gotten here!  (I only read
MQTT v5, not v3.1, so I think I internalized the nice/idealized version of
MQTT and missed out on all the messy bits of v3.1...)

>        In this profile, the Broker SHOULD always start with a
>    clean session regardless of how these parameters are set.  Starting a
>    clean session helps the Broker avoid keeping unnecessary session
>    state for unauthorised clients.  If the Broker starts a clean
> 
> This SHOULD seems highly problematic to me.  It looks like it
> contradicts a hard requirement of MQTT 5.0 ("If a CONNECT packet is
> received with Clean Start set to 0 and there is a Session associated
> with the Client Identifier, the Server MUST resume communications with
> the Client based on state from the existing Session").  If we want to
> recommend that the broker does not maintain session state, that should
> be implemented by setting the Session Expiry Interval in the CONNACK,
> not as part of the CONNECT processing.
> 
> [CS: yes, you are right. the CONNACK can set the session expiry interval to
> 0.
> Actually, then it doesn't matter what clean start flag is.
> I will change to the "Session Expiry Interval SHOULD be set to 0" so that
> no session state is kept.
> And yes the server can override the client's session expiry interval in
> connect by
> sending a different value in connack. ]

Thanks for confirming that I was reading the MQTT spec properly (in this
instance, at least)!  This proposal sounds good to me.

>    When reconnecting to a Broker that supports session continuation, the
> >    Client MUST still provide a token, in addition to using the same
> >    Client identifier, setting the Clean Start to 0 and supplying a
> >    Session Expiry interval in the CONNECT message.  The Broker MUST
> >
> > (As above, setting a Session Expiry interval seems to relate to the
> > subsequent connection, not the current one, so this feels
> > over-specified.)
> >
> 
> [CS: Overspecified for current connection? As this is something that is
> communicated in the CONNECT, my preference would be to include it and be
> clear about it. At the moment, the client is allowed to provide a session
> expiry interval (to say for how long it wants the session state to be kept
> at the broker),  but the server can connack a different value. It may
> actually be better to recommend a broker to always dictate session expiry
> in connack and not accept client value? ]

I think we probably should focus more on broker behavior, yes.
So this could be "client SHOULD set to zero" and "broker MUST dictate
expire in connack" as you suggest.

>    Note that, according to the MQTT standard, the Broker uses the Client
> >    identifier to identify the session state.  In the case of a Client
> >    identifier collision, a client may take over another client's
> >    session.  [...]
> >
> > Just to confirm: the ACE token is not used to provide authorization to
> > use a given client identifer; the client identifier is just used as an
> > unauthenticated identifier?  We might consider calling that out
> > explicitly.
> >
> 
> [OK. Yes, the MQTT client id is considered different than the Client
> Identifier that may be in the token.]

They are clearly different identifiers, yes.  I was trying to ask about
something slightly different, though.  One might imagine treating MQTT
Client Identifiers as resources in their own right, and having the AS grant
authorization to use them.  In that case the Client Identifier could be
encoded in the token and the broker could reject attempts to use a client
identifier that were not accompanied by a token authorizing use of that
identifier.  However, such an architecture seems to have some layering
issues and could be messy to implement, so I was assuming that we were not
proposing to do so.

>    topics.  Therefore, while this issue is not expected to affect
> >    security, it may affect QoS (i.e.  PUBLISH or QoS messages saved for
> >    Client A may be delivered to a Client B).  [...]
> >
> > I think this is just an aside and not something we need to cover in this
> > document, but what happens if (e.g.) PUBCOMP goes to client B when
> > PUBREC went to client A?  Does the message actually get delivered?  Does
> > anything deadlock?
> >
> 
> [CS: Is your example assuming client A sent a PUBREL, but the broker's
> PUBCOMP sent and failed to client A?

Either sent and failed or the sending was delayed and disconnection
detected before it got sent.

> From the spec: When Client B reconnects, it won't have a state of sending
> PUBREL, so it won't retry, whereas client A would have retried ("When a
> Client reconnects with Clean Start set to 0 and a session is present, both
> the Client and Server MUST resend any unacknowledged PUBLISH packets (where
> QoS > 0) and PUBREL packets using their original Packet Identifiers. This
> is the only circumstance where a Client or Server is REQUIRED to resend
> messages. Clients and Servers MUST NOT resend messages at any other time")
> So, I can't see how PUBCOMP could have gone to client B.  But, the
> Broker may release the message to the subscribers, so the packet would be
> delivered to subscribers, as with PUBREC/PUBREL/PUBCOMP it is trying to
> ascertain only-once delivery. (Figure 4.3 in MQTT spec shows the
> receiver Initiates
> onward delivery of the Application Message) but with a note: "The receiver
> does not need to complete the delivery of the Application Message before
> sending the PUBREC or PUBCOMP." So, it will be up to broker implementation
> if message gets delivered.

I think I was assuming that the server would keep the PUBCOMP queued up and
do its own (timer-driven?) retransmissions, but your description makes it
sound like the server only stores the session state but doesn't send a
PUBCOMP until it gets a repeated PUBREL.  That makes sense given the
TCP-based connection scheme, so I think I was just misunderstanding how
MQTT works.  Thanks for helping me un-confuse myself :)

> Section 2.2.4.1, 2.2.4.2
> >
> > (Expounding on the high-level comment from above,)
> > It seems that we're using the presence/absence of extra data after the
> > token to indicate whether or not the MAC/Signature over exporter content
> > is present and thus which message flow is being used for authentication.
> > However, this is only possible if the token itself is self-describing,
> > which I do not think is guaranteed.  Consider, for example, the case of
> > a token that's a reference value that must be introspected in order to
> > retrieve claim information.  Such tokens can be arbitrary byte strings,
> > so I think we need some other in-band way to differentiate between
> > authentication methods.
> >
> 
> [CS: But, There is authentication data length then 2-B token length, then
> token.
> So, using this information, the beginning of MAC can be computed. so, if it
> ends up being the end of authentication data, there is no MAC?]
> 
> 
> > Section 2.2.4.1
> >
> >    For this option, the Authentication Data MUST contain the two-byte
> >    integer token length, the token, and the keyed message digest (MAC)
> >    or the Client signature (as shown in Figure 4).  [...]
> >
> > Does this go in the AUTH (which is what §2.2.4 claims to cover) or the
> > initial CONNECT?  (Hint: later in the paragraph the broker replies to it
> > with a CONNACK.)
> >
> 
> 
> [CS: I guess this is not clear? This is the AUTH property in the client's
> CONNECT. Only if a challenge mechanism is used the broker generates an AUTH
> packet. ]

Right, this was not clear, at least on first reading, since the overall
2.2.4 section heading primes the reader to be thinking about AUTH and I
apparently didn't internalize the first paragraph of §2.2.4 and connect it
to this description.  I'd suggest "the Authentication Data in the initial
CONNECT MUST contain".


>  Section 7.5 of [RFC8446]).  This content is exported from the TLS
> >    session using the exporter label 'EXPORTER-ACE-MQTT-Sign-Challenge',
> >    an empty context, and length of 32 bytes.  [...]
> >
> > While an empty context should provide ample protection, it seems to me
> > that we could consider using the client identifier as the context.
> > There can also be value in incorporating information on the server
> > identity in the output.  If the SNI extension is used, that information
> > would already be included in the key schedule, though we do not
> > currently seem to mandate SNI usage.  Current best practices for new
> > deployments are to always use SNI and always use ALPN, so we should
> > probably consider both of those.
> >
> > [CS: Happy to add all those considerations. Client ID assuming the MQTT
> client id in the CONNECT message?]

Yup, the MQTT client id from CONNECT.

>    a CONNACK with the appropriate response code.  The client cannot
> >    reauthenticate using this method during the same session ( see
> >    Section 4).
> >
> > Depending on what "session" means, this restriction may be too strict.
> > We should probably be more clear about what "session" means...if it's
> > the MQTT session, I think it's okay to use this method when
> > re-connecting to take over the session.
> >
> 
> [CS: This is the MQTT session - I believe we discussed this before, and the
> agreement was that this should not be allowed.
> The reasoning was the client was not necessarily proving anything new with
> the challenge from TLS-Exporter already used.]

I agree that re-using the same challenge value from a TLS-Exporter is
problematic for that reason.

But consider the case where a client makes a new TCP+TLS connection and
then sends MQTT CONNECT with Clean Start=0.  The new TLS connection, even
if using TLS session resumption, will have [*] a different TLS exporter
value than any other TLS connection, so it would be a safe way to
authenticate, but this text seems to prohibit us from using it.
(I guess I commented on basically the same thing later on, too...)

[*] modulo things like the "triple-handshake attack" that motivated the
extended master secret extension of RFC 7627.  See also my comment that we
should incorporate current TLS best practices, which will include mandating
the use of extended master secret.

> Section 2.2.4.2
> 
> I a little bit wonder if we need to hardcode the nonce length or could
> let it be variable, with the corresponding change in level of protection
> provided.  In non-constrained setups we would typically use a 128- or
> even 256-bit bit nonce for this purpose, not a 64-bit one, and it's
> somewhat surprising to preclude the stronger usage.  (Especially so
> since we use a 256-bit value from the TLS exporter.)  If we do allow for
> length variation, we'll need to add length prefixes to the MAC/signature
> input (we should probably do that anyway since the client nonce is
> variable-length).
> 
> [CS: I believe this was discussed before as well. And the general feeling
> was we did not want to do variable nonce, and then complicate with agreeing
> on which lengths etc. If the feeling has changed, happy to make that
> change, which was my initial proposal.)

I don't feel strongly enough to reopen a topic that was previously
discussed.  The nonce is presumably strongly time-limited anyway, so I
think in practice it will provide plenty of security strength.

>    Validation of the signature or MAC MUST fail if the signature
> >    algorithm is set to "none", when the key used for the signature
> >    algorithm cannot be determined, or the computed and received
> >    signature/MAC do not match.
> >
> > Where would the "none" appear?  We haven't said anything about a COSE
> > encoding for the signature or MAC value, or anything like that...I
> > assumed it was going to be the "raw" output from the relevant primitive
> > (EdDSA, HMAC, etc.).
> >
> 
> [CS: Yes. This is written assuming a header JOSE/COSE associated with the
> token or,
> may have received as a result of introspection response. Will revise.]

Thanks!

> 
> >    a new session, it MUST also set Session Present to 0 in the CONNACK
> >    packet to signal a clean session to the Client.  Otherwise, it MUST
> >    set Session Present to 1.
> >
> > (As above,) my understanding is that the Broker does not have agency
> > over whether a new session is started, and must honor the client's
> > request.  So these "MUST set" seem out of place.
> >
> 
> [CS:This will be revised, but the Broker can signal if it can support
> continuing a session or not by sending a CONNACK with session expiry
> interval 0.]

Yup, and that's fine -- I have no problem telling the broker that it will
not actually create new sessions, just with requiring it to put bits on the
wire that OASIS says to not do.

> 
>    If the Broker accepts the connection, it MUST store the token until
>    the end of the connection.  On Client or Broker disconnection, the
>    Client is expected to transport a token again on the next connection
>    attempt.
> 
> This seems to deviate somewhat from the framework, that expects the RS
> to be prepared to store at least one token for future use, and
> recommends storing one token per PoP key.
> [CS: OK I will need to update the draft if I cannot have a deviation.
> Then,I  need to think about how to implement it - I assume PoP will be done
> again
> if the client connects without a token, but RS believes there is a token
> associated with it?]

It may be okay to have a deviation, but I don't want to just "silently"
have a deviation -- it should be noted as such, which indicates that we've
actually thought about it.
(The framework text seems to be "MUST be prepared to store at least one
access token", not "MUST store at least one access token"...)

> 
> >                                                              The Broker
> >    SHOULD also use a cache time out to introspect tokens regularly.
> >
> > We will surely be asked to provide guidance on what timescale
> > "regularly" indicates, if we do not proactively provide some guidance.
> >
> > [CS: I would expect that is based on the dynamicity of the application and
> risk associated, but I was considering
> ranges like every 12 hours; every 24 hours]
> 
>  If the Will Flag is set, then the Broker MUST check that the token
> >    allows the publication of the Will message (i.e. the Will Topic
> >    filter is in the scope array).
> >
> > We might want to say a little more about when this check happens.  My
> > intuition is that it occurs during the CONNECT processing where the
> > actual Will message is provided, and that the connection would be
> > rejected if the Will message is unauthorized.  But this section is
> > titled "Authorizing PUBLISH and SUBSCRIBE Messages", so one might be
> > forgiven for assuming that CONNECT is out of scope...
> >
> 
> [CS: I assumed the check will happen when the Will message needs to be
> published (information about the Will topic is in the CONNECT).
> And the Broker will refuse to publish if it is not covered in the token.
> This can be changed to checking in CONNECT but then, we add "Will message
> not authorised" as a reason to refuse Connection with the rationale that
> even if it is not an actual publication, it may result in publishing a
> message (the will) in future.
> (I've written more on this in "Will do" section below, because it came up
> in multiple places, and I did not want to repeat the same discussion)]

(Definitely don't repeat the same discussion in the same email!)

I think my assumption was that the client sets a Will message and assumes
that it will always go out if the client goes away uncleanly, functioning
as a sort of "dead-man's switch".  If the client disconnects and the broker
prepares to send the Will message only to discover that it's unauthorized,
there's no way to indicate the error to the client, and so the client's
assumption that the message will go out is broken.  Maybe I'm wrong about
what Will messages are used for, though, and this is not actually a problem
in practice.  However, if there is a need to provide some "guarantee of
future delivery" for the Will message, we need to check authorization while
the client is still around, and at CONNECT time seems to be the natural (or
even only?) option for when to do that.  If we do move the checking to
CONNECT, I think we would also want to mention that "Will message not
authorised" can happen at the periodic re-authorisation checks that we
already talk about.

> 
> > Section 4
> >
> >    Authentication Data.  The Broker accepts reauthentication requests if
> >    the Client has already submitted a token (may be expired) and
> >    validated via the challenge-response PoP.  Otherwise, the Broker MUST
> >    deny the request.  If the reauthentication fails, the Broker MUST
> >    send a DISCONNECT with the reason code "0x87 (Not Authorized)".
> >
> > Is this correct?  It seems to say that if the initial CONNECT used the
> > TLS exporter for PoP, then it's forbidden to use the challenge-response
> > method for PoP and thus impossible to reauthenticate on that connection.
> > I don't understand why such a limitation would be needed or useful.
> 
> [CS: Yes, at the time, in the workgroup meetings, I believe we agreed with
> authentication and re-authentication to use the same methods.
> I cannot think of why now. It should be OK to resubmit a token without a
> PoP MAC, and trigger challenge-response. Will fix.
> ]

(just noting that this is the comment I had referred to above as being on
basically the same topic)

> Section 6.2
> >
> >    o  RS-Client PUBLISH authorization failure: When RS is forwarding
> >       PUBLISH messages to the subscribed Clients, it may discover that
> >       some of the subscribers are no more authorized due to expired
> >       tokens.  These token expirations SHOULD lead to disconnecting the
> >       Client rather than silently dropping messages.
> >
> > (I'm not actually sure how much the MQTT spec says about this type of
> > scenario, and thus whether the "SHOULD" is the right term to use.)
> >
> 
> [CS: MQTT v3.1 would not consider the clients losing authentication in the
> middle of a session. I am not sure we can find guidance there on this.]

It makes sense that we won't get much guidance, given that v3.1 was
basically assuming password auth, which is "one and done" for the whole
connection.  I should have been more clear, though -- I think this is a
candidate for "MUST lead to disconnecting the client rather than silently
dropping messages", and I wanted to know if you had a reason to make it
optional.

> Section 10.1
> 
> > The normative use of AIF means that we'll have to wait for AIF to catch
> > up to us at some point, whether at the RFC Editor or sooner.
> >
> 
> [CS: Advice is not to make it normative? There was a strong sentiment in
> the workgroup that we should standardise the token scope]

No, having it normative is fine (good, actually).  I'm just pointing out
that we won't get an RFC number until AIF does, but the datatracker doesn't
even know that AIF had a WGLC yet (though
https://mailarchive.ietf.org/arch/msg/ace/WJ_GjkSgsuVKspK5QDlf4U-VcxI/
thinks otherwise).

> 
> 
> Will Do:
> -----------
> 
> 
> >
> > If we're using TLS Exporters and allow TLS-not-1.3, we need to make some
> > additional requirements on TLS usage in order for the exporter values to
> > be safe.  Typically this takes the form of requiring the extended master
> > secret extension along with guidance on what cipher suites to use; I
> > guess RFC 7925 (rather than 7525) would be the default reference for
> > other TLS usage.
> >
> [CS: OK ]
> 
> 
> > This document seems to mostly use British English.  AFAIR, that's okay,
> > but if it's inconsistent the RFC Editor will prefer American English.  I
> > didn't attempt to check for this (though there are tools like
> > https://github.com/larseggert/ietf-reviewtool that will do so).
> >
> 
> [CS: Will check and Americanize. :) ]

I *think* you don't have to Americanize :)
It's just that if you have a mix, the RFC Editor will pick American every
time.  If it's consistently British they will leave it alone, is my
understanding.

> 
> >
> >    Clients use MQTT PUBLISH message to publish to a topic.  This
> >    document does not protect the payload of the PUBLISH message from the
> >    Broker.  Hence, the payload is not signed or encrypted specifically
> >    for the subscribers.  This functionality MAY be implemented using the
> >    proposal outlined in the ACE Pub-Sub Profile
> >    [I-D.ietf-ace-pubsub-profile].
> >
> > I suggest s/MAY/may/, which would avoid any need to make
> > ace-pubsub-profile a normative reference.
> >
> 
> [CS: OK.]
> 
> 
> >
> >    reference or JSON Web Token (JWT) [RFC7519].  For JWTs, this document
> >    follows [RFC7800] for PoP semantics for JWTs.  The Client-AS and RS-
> >    AS MAY also use protocols other than HTTP, e.g.  Constrained
> >    Application Protocol (CoAP) [RFC7252] or MQTT; it is recommended that
> >    TLS is used to secure these communication channels between Client-AS
> >    and RS-AS.  Implementations MAY also use "application/ace+cbor"
> >    content type, and CBOR encoding [RFC8949], and CBOR Web Token (CWT)
> >    [RFC8392] and associated PoP semantics to reduce the protocol memory
> >    and bandwidth requirements.  For more information, see Proof-of-
> >    Possession Key Semantics for CBOR Web Tokens (CWTs) [RFC8747].
> >
> > One thing that can be surprising to readers not versed in the ecosystem
> > is that RFCs 7800 and 8747 only talk about the token claims that are
> > used to build the PoP system, and don't actually define mechanisms for
> > providing the proof of possession.  We might want a forward-reference to
> > § 2 where we do actually specify mechansisms to prove possession of the
> > indicated key.
> >
> 
> [CS: OK.]
> 
>    o  "TLS:Anon-MQTT:ace": The token is transported inside the CONNECT
> >       message, and MUST be validated using one of the methods described
> >       in Section 2.2.2.  This option also supports a tokenless
> >       connection request for AS discovery.
> >
> > We should probably look carefully at the "for AS discovery" phrasing, in
> > light of the late changes in how the framework talks about the AS
> > Request Creation Hints.
> >
> 
> [CS: OK, I will double-check, but I believe I have included this as an
> option only following the framework.]
> 
> 
> >
> >    o  "TLS:Known(RPK/PSK)-MQTT:none": For the RPK, the token MUST have
> >       been published to the "authz-info" topic.  For the PSK, the token
> >       MAY be, alternatively, provided as an "identity" in the
> >       "identities" field in the client's "pre_shared_key" extension in
> >       TLS 1.3.  The TLS session set-up is as described in DTLS profile
> >       for ACE [I-D.ietf-ace-dtls-authorize].
> >
> > A couple notes here:  First, the DTLS profile primarily uses (D)TLS 1.2
> > terminology, with the DTLS 1.3 flow as an addendum.  I'm actually happy
> > for us to only or primarily talk about the TLS 1.3 idiom here in this
> > document, but we may want to think about what phrasing we use for the
> > relationship between our usage and the DTLS profile.  Second, the DTLS
> > profile is for, well, DTLS, not TLS.  We should acknowledge that and
> > state the (lack of) effect on the procedures to follow.
> >
> 
> [CS: Agreed. I will look into phrasing that.]
> 
> 
> >
> >    The Broker MUST be authenticated during the TLS handshake.  If the
> >    Client authentication uses TLS:Known(RPK/PSK), then the Broker is
> >    authenticated using the respective method.  Otherwise, to
> >    authenticate the Broker, the client MUST validate a public key from a
> >    X.509 certificate or an RPK from the Broker against the 'rs_cnf'
> >    parameter in the token response.  The AS MAY include the thumbprint
> >    of the RS's X.509 certificate in the 'rs_cnf' (thumbprint as defined
> >    in [I-D.ietf-cose-x509]).  In this case, the client MUST validate the
> >    RS certificate against this thumbprint.
> >
> > Just to confirm: we consciously chose to not reference RFC 6125 for
> > "normal" X.509 server certificate validation procedures, in favor of the
> > "constrained environment" procedures we describe here?
> >
> 
> [CS: Yes.]
> 
> 
> >
> > Section 2.2.2
> >
> >    The Broker MUST verify the validity of the token (i.e. through local
> >    validation or introspection, if the token is a reference) as
> >    described in Section 2.2.5.  If the token is not valid, the Broker
> >    MUST discard the token.  Depending on the QoS level of the PUBLISH
> >
> > I see that this is covered in the framework (and we reference the
> > appropriate section already), but I'd consider reiterating that "not
> > valid" includes "the Broker is not an intended audience of the token".
> >
> 
> [CS: OK. Will add that to 2.2.5]
> 
> >
> >    It must be noted that when the RS sends the 'Not authorized'
> >    response, this corresponds to the token being invalid, and not that
> >    the actual PUBLISH message was not authorized.  Given that the
> >    "authz-info" is a public topic, this response is not expected to
> >    cause confusion.
> >
> > Thanks for including this note.  I assume that it would be challenging
> > to get OASIS to allocate a new reason code for us to use, and we didn't
> > attempt to pursue that path.
> >
> 
> [CS: Yes]
> 
> 
> >
> > Section 2.2.3
> >
> >    Similarly, the Broker MUST NOT process any packets before it has sent
> >    a CONNACK.  The only exceptions are DISCONNECT or an AUTH response
> >    from the Client.
> >
> > nit: in some pedantic sense, these two sentences are in conflict, since
> > the latter violates the "MUST NOT".  Who cares and how much has varied
> > over time, especially on the IESG, so it's not clear that we actually
> > need to change anything right now.
> >
> 
> [CS: I see - Would rewording like "Except a DISCONNECT and AUTH response,
> the broker MUST NOT process any other
> packets before it has sent a CONNACK" work?

Yes, that's the typical sort of phrasing we end up with in the cases where
someone decides to make a stink about it.

> 
> 
> >
> >           +------------------------------------------------------+
> >           |CPT=1 | Rsvd.|Remaining len.| Protocol name len. = 4  |
> >           +------------------------------------------------------+
> >
> > Figure 2 appears to show that the Remaining Length field of the fixed
> > header occupies a single octet, but IIUC it's encoded as a variable byte
> > integer.  Fixing that would also let us put the separator that appears
> > after it on a proper byte boundary.
> >
> 
> [CS: True. Will fix - it should have been clarified that this was an
> example and that the actual
> field could be of variable length, using a continuation algorithm for
> larger values.]
> 
> 
> 
> 
> >
> >           |                      'M' 'Q' 'T' 'T'                 |
> >           +------------------------------------------------------+
> >
> > This figure does not show the two-byte length prefix for the string.
> >
> 
> [CS: OK, will fix the example (indeed it is not precise). ]
> 
> 
> >
> >    The Will Flag indicates that a Will message needs to be sent if the
> >    [...]
> >    Interval in the Will Properties.  Section 5 explains how the Broker
> >    deals with the retained messages in further detail.
> >
> > We might be able to get away with leaving the description of Will
> > operation to the MQTT spec, and not have to say so much about it here.
> >
> 
> [CS: This part was extended because I've got a number of clarification
> requests for Will messages in MQTT.
> So, I will keep it if it is not an issue.]

Not an issue.
It's just always easier to add text than remove it, so I try to keep an eye
out for potential opportunities :)

> >
> >
> 
> > Section 2.2.4
> >
> >    To use AUTH, the Client MUST set the Authentication Method as a
> >    property of a CONNECT packet by using the property identifier 21
> >    (0x15).  This is followed by a UTF-8 Encoded String containing the
> >    name of the Authentication Method, which MUST be set to 'ace'.  [...]
> >
> > (I assume there's not an MQTT registry that we can register "ace" in as
> > an Authentication Method.)
> >
> 
> [CS: That's what I understand as well from prev. communication]
> 
> 
> >
> >
> > Section 2.2.5
> >
> >    To authenticate the Client, the RS validates the signature or the
> >    MAC, depending on how the PoP protocol is implemented.  HS256 (HMAC-
> >    SHA-256) [RFC6234] and Ed25519 [RFC8032] are mandatory to implement
> >    depending on the choice of symmetric or asymmetric validation.
> >
> > I think there is a decent argument (and that it's likely some other AD
> > will make it) that we need to make both HS256 and Ed25519 mandatory to
> > implement for the Broker, leaving only clients with the choice.
> > Otherwise we can get into scenarios where interop is impossible.
> >
> 
> [CS: Agreed.]
> 
> 
> >
> >
> > Section 2.2.6.2
> >
> >    On success, the reason code of the CONNACK is "0x00 (Success)".  The
> >    AS informs the client that selected profile is "mqtt_tls" using the
> >    "ace_profile" parameter in the token response.  If the Broker starts
> >
> > The line about the AS returning "mqtt_tls" as the selected profile feels
> > out of place here, where we're discussing successful MQTT
> > authorization.
> >
> 
> [CS: That's definitely misplaced and should be removed]
> 
> 
> 
> > Section 3
> >
> >    The scope field contains the publish and subscribe permissions for
> >    the Client.  The scope is a JSON array, each item following the
> >    Authorization Information Format (AIF) for ACE [I-D.ietf-ace-aif].
> >    Using the Concise Data Definition Language (CDDL) [RFC8610], the
> >    specific data model for MQTT is:
> >
> > This seems a little dicey, since we claim to allow JWT tokens as well as
> > CWT.  JWT "scope" is pretty tightly nailed down to be a space-separated
> > list (though CWT gives much greater freedom).  We probably need to have
> > some text about this situation, with a phrase about how "in order to be
> > compatible with the JWT scope format, we use a single scope value with
> > internal structure", that this structure is also compatible with the CWT
> > rules, and noting that our AIF usage prevents any internal spaces, so
> > the interpretation of the scope value is unambiguous even when
> > unmodified JWT libraries are used.
> >
> 
> [CS: OK will do]
> 
> 
> 
> 
> > Section 3.2
> >
> >    subscription to the particular topic).  The Broker sends a PUBLISH
> >    message with the Topic name to all the valid subscribers.
> >
> > (nit) It seems a little strange to mention specifically the Topic name
> > but say nothing about there being a payload along with it.
> >
> 
> [CS: I assumed Publish "message" captures the payload - what I am trying to
> say is it is the message corresponding to that topic.]
> 
> 
> >
> > Section 3.3
> >
> >    On receiving the SUBSCRIBE message, the Broker MUST use the type of
> >    message (i.e.  SUBSCRIBE) and the Topic Filter in the message header
> >    to match against the scope field of the stored token or introspection
> >    result.  The Topic Filters MUST be equal or a subset of at least one
> >    of the 'topic_filter' fields in the scope array found in the Client's
> >    token.
> >
> > I suggest being very explicit about whether the wildcards in the token
> > scope are expanded as part of matching the topic filter in the request,
> > or if the SUBSCRIBE message must use topic filters that match
> > byte-for-byte the permissions granted by the token.
> >
> 
> [CS: OK. It is the former. So subscribe has "a/b/*", token permits "a/*" -
> this is a valid subscription request, as "a/b/* a subset of "a/*"]

Sounds good, and thanks for making the update.  This would be a very
high-impact place to have implementation differences and failure to
interoperate, so putting in extra text seems merited.

> 
> >
> > Section 5
> >
> >    In the case of a Client DISCONNECT, the Broker deletes all the
> >    session state but MUST keep the retained messages.  By setting a
> >
> > As written, this seems to imply that all client DISCONNECT messages
> > result in the loss of session state.  I didn't (quickly) find a clear
> > statement one way or the other in the MQTT spec, but it does seem that
> > it's allowed to send a nonzero session expiry interval in a DISCONNECT,
> > which seems to imply that such DISCONNECT messages do not cause all
> > session state to be discarded.
> >
> 
> [CS: Will clarify the Session Expiry Interval, which may lead to keeping
> state.]
> 
> >
> >    Hence, the new subscribers can receive the last sent message from the
> >    publisher for that particular topic without waiting for the next
> >    PUBLISH message.  The Broker MUST continue publishing the retained
> >    messages as long as the associated tokens are valid.
> >
> > This MUST seems to be repeating a requirement from the MQTT spec, which
> > may not merit normative language from us.
> >
> 
> [CS: I did a MUST because it puts a condition on tokens being valid.]

Ok.

> 
> >
> >    In case of disconnections due to network errors or server
> >    disconnection due to a protocol error (which includes authorization
> >    errors), the Will message is sent if the Client supplied a Will in
> >    the CONNECT message.  The Client's token scope array MUST include the
> >
> > This "if the Client supplied a Will in the CONNECT message" implies that
> > authorization checks are performed at time of CONNECT...
> >
> 
> [CS: I've put a note on this in discussion. The client does supply the will
> in the message, but it's for a future publication;
> it may never be published. We can choose to reject that in CONNECT, or when
> the time comes.I  am fine either way to add a CONNECT failure
> due to Will being out of scope.]

Sorry to split the discussion between locations in the review/email.
I think the important question here is what the client expects to get from
the Will message, e.g., if it needs to be a sort of "dead-man's switch"
(not sure if there's a better expression than that idiom).  I have no
intersection with the MQTT ecosystem to know the answer myself.  I could
perhaps ask some people internally, though, if that would help.

> 
> >
> >    Will Topic.  The Will message MUST be published to the Will Topic
> >
> > ... but "scope array MUST include the Will Topic" suggests an
> > authorization check when the Will message is actually sent.  Is it one,
> > the other, or both checks that must pass?
> >
> 
> [CS: The token scope needs to include the Will Topic. In the draft, the
> broker checks if it can publish a message, just before a publish.
> This applies to Will as well. Can be changed as stated above.]
> 
> 
> 
> >
> > Section 6.1
> >
> > [My earlier comments about the MQTT header layout apply to Figure 11 as
> > well.]
> >
> 
> [CS: Noted. Will fix.]
> 
> >
> >    The Broker SHOULD NOT accept session continuation.  To this end, the
> >    Broker ignores how the Clean Session Flag is set, and on connection
> >    success, the Broker MUST set the Session Present flag to 0 in the
> >    CONNACK packet to indicate a clean session to the Client.  [...]
> >
> > As above, this seems to violate the MQTT normative requirements (but I
> > mostly only read about MQTT 5, not 3.1.1).
> >
> 
> [CS: Noted. Will fix based on agreement]
> 
> >
> >    The CONNECT in MQTT v3.1.1 does not have a field to indicate the
> >    authentication method.  To signal that the Username field contains an
> >    ACE token, this field MUST be prefixed with 'ace' keyword, which is
> >    followed by the access token.  [...]
> >
> > An example of what this looks like would be helpful.  It sounds like we
> > just take the first three bytes for the sentinel and then go directly to
> > the access token, vs having some kind of "separator" as part of the
> > sentinel value.
> >
> 
> [CS: Yes, will give an example.]
> 
> 
> >
> >    In MQTT v3.1.1, the MQTT Username is a UTF-8 encoded string (i.e.  is
> >    prefixed by a 2-byte length field followed by UTF-8 encoded character
> >    data) and may be up to 65535 bytes.  Therefore, an access token that
> >    is not a valid UTF-8 MUST be Base64 [RFC4648] encoded.  (The MQTT
> >    Password allows binary data up to 65535 bytes.)
> >
> > Data-dependent encoding transformation without explicit signaling is a
> > really bad idea.  I think we need to always base64-encode the token.
> > We should also specify a section reference to RFC 4648 (for plain base64
> > vs base64url) and probably make a statement about whether padding
> > characters are retained or omitted.
> >
> 
> [OK. Will do that]
> 
> >
> >
> >
> > Section 7.1
> >
> >    This document registers 'EXPORTER-ACE-MQTT-Sign-Challenge'
> >    (introduced in Section 2.2.4.1 in this document) in the TLS Exporter
> >    Label Registry [RFC8447].
> >
> > We need to specify the other columns in the registry.
> > I think we can have:
> > DTLS-OK: Y
> > Recommended: Y
> > Reference: [this document]
> >
> 
> [CS: Thank you, will add.]
> 
> 
> >
> > Section 7.2
> >
> >    This document registers the 'application/ace+json' media type for
> >    messages of the protocols defined in this document carrying
> >    parameters encoded in JSON.
> >
> > Thanks for sending this to the media-types list for review
> > (
> > https://mailarchive.ietf.org/arch/msg/media-types/85kGXBBKaWqIoCSU5k7GrE5FRWw/
> > ).
> > It's unfortunate that nobody replied to that thread, but I don't know
> > that there's more that we can do.
> >
> 
> [CS: Do I need to send follow-up e-mails?]

As far as I know, you don't need to do anything else.


> >
> > Section 7.3
> >
> >    The following registrations are done for the ACE OAuth Profile
> >    Registry following the procedure specified in
> >    [I-D.ietf-ace-oauth-authz].
> >    [...]
> >    o  CBOR Value:
> >
> > I think it's clearer for reviewers if we put "TBA" or "to be assigned by
> > IANA" here.
> >
> 
> [CS: OK]
> 
> >
> >    o  Description: Profile for delegating Client authentication and
> >       authorization using MQTT as the application protocol and TLS For
> >       transport layer security.
> >
> > It seems like it might be preferred to talk about using MQTT for the
> > C/RS interactions and HTTP for the interactions with the AS.
> > Separately, mention that TLS is used for confidentiality/integrity
> > protection and server authentication; also, client authentication can be
> > provided either via TLS or using in-band proof of possession at the MQTT
> > application layer.
> >
> 
> [CS: OK. Will revise]
> 
> 
> >
> > Section 8
> >
> >    revoked topics.  If the RS caches the token introspection responses,
> >    then the RS SHOULD use a reasonable cache timeout to introspect
> >
> > ("reasonable cache timeout" again)
> >
> [CS: Will fix it with the same guidance, once agreed.]
> 
> 
> 
> >
> >                                       If the RS supports the public
> >    "authz-info" topic, described in Section 2.2.2, then this may be
> >    vulnerable to a DDoS attack, where many Clients use the "authz-info"
> >    public topic to transport fictitious tokens, which RS may need to
> >    store indefinitely.
> >
> > We do say that the RS only stores "valid" tokens, which includes being
> > generated by a trusted AS and having RS as the audience.  So it's not
> > clear that this statement is accurate if the attack is to involve
> > "fictitious" tokens.  Similar attacks are possible, though.
> >
> 
> [CS: OK,  We accept it can still be DOS'ed? fictitious doesn't sound right.
> my intention was to say tokens not meant to be used.]

Yes, this is still vulnerable to DDoS, but the pool of potential attackers
is somewhat restricted.  In particular, if the AS only issues a small
number of tokens with the RS as the audience, that bounds the amount of
state that the RS has to maintain (but it could still be subject to
resource exhaustion of resources like CPU when processing a bunch of
invalid tokens).  So "may need to store indefinitely" might be the main
thing I was commenting about here.

Thanks for all the replies and updates!

-Ben

> >
> > Section 10.1
> >
> >
> > RFC 8447 probably does not need to be normative; it is just mentioned as
> > the thing that is the reference for the IANA registry of TLS exporter
> > values.
> >
> [CS: OK]
> 
> 
> >
> > Section 10.2
> >
> > I think the DTLS profile needs to be a normative reference, since we
> > defer to it for TLS session set-up.
> >
> 
> [CS: OK]
> 
> >
> > Likewise, RFC 6234 and RFC 8032 specify MTI algorithms for
> > authentication, which would make them normative.
> >
> [CS: OK]
> 
> >
> > Appendix A
> >
> > I think we need to update the checklist to match the current template
> > from
> >
> > https://datatracker.ietf.org/doc/html/draft-ietf-ace-oauth-authz-43#appendix-C
> 
> 
> [CS: OK]
> 
> >
> >
> > -Ben
> >