Re: [nfsv4] Benjamin Kaduk's Discuss on draft-ietf-nfsv4-rfc5661sesqui-msns-03: (with DISCUSS and COMMENT)

David Noveck <davenoveck@gmail.com> Fri, 20 December 2019 14:46 UTC
MIME-Version: 1.0
References: <157665795217.30033.16985899397047966102.idtracker@ietfa.amsl.com>
In-Reply-To: <157665795217.30033.16985899397047966102.idtracker@ietfa.amsl.com>
From: David Noveck <davenoveck@gmail.com>
Date: Fri, 20 Dec 2019 09:46:22 -0500
Message-ID: <CADaq8jegizL79V4yJf8=itMVUYDuHf=-pZgZEh-yqdT30ZdJ5w@mail.gmail.com>
To: Benjamin Kaduk <kaduk@mit.edu>
Cc: The IESG <iesg@ietf.org>, draft-ietf-nfsv4-rfc5661sesqui-msns@ietf.org, nfsv4-chairs@ietf.org, Magnus Westerlund <magnus.westerlund@ericsson.com>, NFSv4 <nfsv4@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/LP_KC4GIVkrY7aOk1FRoEXnXV-Q>
> ----------------------------------------------------------------------
> DISCUSS:
> ----------------------------------------------------------------------
>
> it's important to get
> these points clarified, and sooner rather than later.

Agree.

> I expect that the
> following few issues should be quickly resolvable.

I don't doubt it.   The DISCUSS part is quite small and I
will respond to it first.   The other COMMENTs will be addressed
in a mail next week.

> Section 11.10.1 includes a reference to "Section 11.7.2.1 of RFC5661",
> but this document is obsoleting that document.

As a general matter, you cannot avoid referring (informatively) to the
document you are obsoleting, if only to say, "This document obsoletes
RFC xxxx".   Similarly, I think you can say "This document says A about XYZ
while RFC xxxx (incorrectly) said B".

> It seems internally
> inconsistent to both obsolete and depend on the same source -- if we rely
> on that content, it should be included in this document.

The question is whether we are relying on the content or merely referring
to it as a best-eventually-forgotten piece of history. After looking at
this in detail, I have concluded that the reference to 11.7.2.1 is, as you
put it, "relying" on the material in RFC5661 and that deleting it was a
mistake.  Nice catch.

I think I slipped into this because the material in 11.7.2.1 is way outside
the current approach to these matters. I thought it could be deleted but
have now realized that it can't be; instead, it needs to be redescribed in
more contemporary terms.

So I have decided to recast this material in two small additional sections
dealing with trunking in place of replication and trunking of file systems
without explicit file system location entries.   Each section refers
(informatively) to RFC5661 to indicate to the reader that the underlying
reality has not changed, even though the form of description has.

The anticipated new sections follow:

11.5.4.1.  File System Trunking Presented as Replication

   In some situations, a file system location entry may indicate a file
   system access path to be used as an alternate location, where
   trunking, rather than replication, is to be used.  The situations in
   which this is appropriate are limited to those in which both of the
   following are true.

   o  The two file system locations (i.e., the one on which the location
      attribute is obtained and the one specified in the file system
      location entry) designate the same locations within their
      respective single-server namespaces.

   o  The two server network addresses (i.e., the one being used to obtain
      the location attribute and the one specified in the file system
      location entry) designate the same server (as indicated by the
      same value of the so_major_id field of the eir_server_owner field
      returned in response to EXCHANGE_ID).

   When these conditions hold, access using both access paths is
   generally trunked, although, when the attribute fs_locations_info is
   used, trunking may be disallowed:

   o  When the fs_locations_info attribute shows the two entries as not
      having the same simultaneous-use class, trunking is inhibited and
      the two access paths cannot be used together.
      In this case the two paths can be used serially with no transition
      activity required on the part of the client.  Any
      transition between access paths is transparent, and the client in
      transferring access from one to the other is acting as it would in
      the event that communication is interrupted, with a new connection
      and possibly a new session being established to continue access to
      the same file system.

   o  Note that for two such location entries, any information within
      the fs_locations_info attribute that indicates the need for
      special transition activity, i.e., the appearance of the two file
      system location entries with different handle, fileid, write-
      verifier, change, and readdir classes, indicates a serious
      problem.  The client, if it allows transition to the file system
      instance at all, must not treat any transition as a transparent
      one.  The server SHOULD NOT indicate that these entries belong to
      different handle, fileid, write-verifier, change, and readdir
      classes, whether or not the two entries are shown belonging to the
      same simultaneous-use class.

   This situation was recognized by [62], even though that document made
   no explicit mention of trunking.

   o  It treated the situation that we describe as trunking as one of
      simultaneous use of two distinct file system instances, even
      though, in the explanatory framework now used to describe the
      situation, the case is one in which a single file system is
      accessed by two different trunked addresses.

   o  It treated the situation in which the two paths are to be used
      serially as a special sort of "transparent transition", while in
      the descriptive framework now used to categorize transition
      situations, this is a case of a "network endpoint transition" (see
      Section 11.9).

11.6.  Trunking without File System Location Information

   In situations in which a file system is accessed using two server-
   trunkable addresses (as indicated by the same value of the so_major_id
   field of the eir_server_owner field returned in response to
   EXCHANGE_ID), trunked access is allowed even though there might not
   be any location entries specifically indicating the use of trunking
   for that file system.

   This situation was recognized by [62], even though that document made
   no explicit mention of trunking and treated the situation as one of
   simultaneous use of two distinct file system instances, even though,
   in the explanatory framework now used to describe the situation, the
   case is one in which a single file system is accessed by two
   different trunked addresses.
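
To make the intended decision procedure concrete, here is a minimal
sketch (not proposed draft text) of how a client might classify a file
system location entry against the path it is currently using.  The
AccessPath type, its field names, and the PathRelation labels are
hypothetical names chosen for illustration; only so_major_id and the
simultaneous-use class come from the text above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class PathRelation(Enum):
    ALTERNATE_LOCATION = "ordinary alternate location (replication)"
    TRUNKED_SIMULTANEOUS = "trunked, paths usable together"
    TRUNKED_SERIAL_ONLY = "same file system, paths usable serially only"


@dataclass(frozen=True)
class AccessPath:
    # so_major_id from eir_server_owner in the EXCHANGE_ID response.
    so_major_id: bytes
    # Location within the server's single-server namespace.
    fs_name: str
    # Simultaneous-use class from fs_locations_info; None when that
    # attribute is not in use (assumption made for this sketch).
    simultaneous_use_class: Optional[int] = None


def relate(current: AccessPath, entry: AccessPath) -> PathRelation:
    """Classify a location entry relative to the path currently in use."""
    same_server = current.so_major_id == entry.so_major_id
    same_location = current.fs_name == entry.fs_name
    if not (same_server and same_location):
        return PathRelation.ALTERNATE_LOCATION
    classes = (current.simultaneous_use_class, entry.simultaneous_use_class)
    # Differing simultaneous-use classes inhibit trunking; the paths
    # may still be used one after the other.
    if None not in classes and classes[0] != classes[1]:
        return PathRelation.TRUNKED_SERIAL_ONLY
    return PathRelation.TRUNKED_SIMULTANEOUS
```

The sketch deliberately ignores the handle, fileid, write-verifier,
change, and readdir classes; a real client would also have to treat a
mismatch in those as the "serious problem" case described above.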

In addition I anticipate rewriting the first paragraph of Section 11.10.1 to
read as follows:

   The fs_locations_info attribute (described in Section 11.17) may
   indicate that two replicas may be used simultaneously, although some
   situations in which such simultaneous access is permitted are more
   appropriately described as instances of trunking (see
   Section 11.5.4.1).  Although situations in which multiple replicas
   may be accessed simultaneously are somewhat similar to those in which
   a single replica is accessed by multiple network addresses, there are
   important differences, since locking state is not shared among
   multiple replicas.


> This is somewhat awkward since the limited nature of the update results
> in my not having the full context of the rest of the document; with that
> limitation in my understanding in mind, I'd like to confirm that we're
> comfortable with the use of "network address" in the context of
> trunking/migration, specifically the extent to which we do not discuss
> port numbers.

Now that you point this out, I guess I shouldn't have been as comfortable as
I had been.

> The relevant XDR types do allow for optional port numbers
> to be included, with a default to be used when not specified,

The XDR types you refer to are not relevant here since, in the interest
of allowing hostnames as well as IP addresses, we have specified that IP
addresses, when used, appear in the form of ASCII text.
Nevertheless, given that port numbers can be given, optionally after a
colon, the essence of the issue is as you described.

> but in
> this document we do have a new note that different ports may be used for
> different connection types to the same logical server, and also that
> different ports "is not the essence of the distinction between the two
> endpoints".  I think there might be cases where the port is relevant for
> a distinction, but the main ones I can think of are of questionable
> relevance (essentially, roughly equivalent to multiple userspace NFS
> servers on a single host but in different trust/privilege domains) --
> I'd like another opinion or several.

I agree that your major case is of limited use, but that doesn't mean that
it should be ignored.

What I'm proposing doing is adding the following as a new paragraph
associated with the second bullet in the second set of bullets in Section
11.1.2 (i.e. after <vspace> in xml2rfc v2), to read as follows:

      The network addresses used in file system location entries
      typically appear without port number indications and are used to
      designate a server at one of the standard ports for NFS access,
      i.e. 2049 or 20049 for use with RPC-over-RDMA.  Port numbers may
      be used in file system location entries to designate servers
      (typically user-level ones) accessed using other port numbers.  In
      the case of network addresses indicating trunking relationships,
      use of an explicit port number is inappropriate since trunking is a
      relationship between network addresses.  See Section 11.5.2 for
      details.
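
As an illustration of the port handling described in that paragraph,
here is a rough sketch of how a client might split a location entry's
address text into host and optional port, and fall back to the standard
ports.  The function names and the accepted syntaxes ("host",
"host:port", "[v6addr]:port") are assumptions made for this example,
not something the draft specifies.

```python
from typing import Optional, Tuple

NFS_PORT = 2049         # standard port for NFS access
NFS_RDMA_PORT = 20049   # standard port for use with RPC-over-RDMA


def split_location_address(text: str) -> Tuple[str, Optional[int]]:
    """Split address text into (host, port); port is None if absent."""
    if text.startswith("["):
        # Bracketed IPv6 literal, optionally followed by ":port".
        host, _, rest = text[1:].partition("]")
        port = rest[1:] if rest.startswith(":") else ""
        return host, int(port) if port else None
    if text.count(":") == 1:
        host, _, port = text.partition(":")
        return host, int(port)
    # Bare host name, IPv4 address, or unbracketed IPv6 address.
    return text, None


def effective_port(port: Optional[int], rdma: bool = False) -> int:
    """Use an explicit port if given, else the standard one."""
    if port is not None:
        return port
    return NFS_RDMA_PORT if rdma else NFS_PORT
```

For trunking purposes only the host part returned by
split_location_address would be considered; the port affects where the
connection is made, not whether two addresses are trunkable.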

I also anticipate revising the second bullet in Section 11.5.2 to read as
follows:

      The client may fetch the file system location attribute for the
      file system.  This will provide either the name of the server
      (which can be turned into a set of network addresses using DNS),
      or a set of server-trunkable location entries.  Using the latter
      alternative, the server can provide addresses it regards as
      desirable to use to access the file system in question.  Although
      these entries can contain port numbers, these port numbers are not
      used in determining trunking relationships.  Once the candidate
      addresses have been determined and EXCHANGE_ID done to the proper
      server, only the value of the so_major_id field returned by the
      servers in question determines whether a trunking relationship
      actually exists.
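
The rule in that bullet can be sketched as follows.  This is only an
illustration: the input mapping (candidate address, with any port
already stripped, to the so_major_id value returned by EXCHANGE_ID at
that address) and the function name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def trunking_groups(major_ids: Dict[str, bytes]) -> List[List[str]]:
    """Group candidate addresses that are server-trunkable.

    Addresses land in the same group only when EXCHANGE_ID returned
    the same so_major_id for them; nothing else is consulted.
    """
    groups: Dict[bytes, List[str]] = defaultdict(list)
    for address, so_major_id in major_ids.items():
        groups[so_major_id].append(address)
    return [sorted(addresses) for addresses in groups.values()]
```

Whether addresses within one group are also session-trunkable is a
separate question that this sketch does not attempt to answer.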


> In a similar "discuss discuss" vein, Section 11.10.8 describes a
> scenario that does not give much clarity, at a protocol level, into what
> degree of replication synchronization a client can expect from a given
> file system that advertises multiple replicas.  I recognize that this is
> de facto just stating the deployed reality, but it's also hard to feel
> good about having this level of ambiguity in a propsed standard,

It gets easier over time.  Sigh!  This would be the fourth Proposed
Standard with this issue :-(

Still, I think we can do a bit better by relying on the three special cases
listed at the end of that section and adding something like the following
after the list:

   When none of these special situations apply, there is no basis,
   within the protocol, for the client making assumptions about the
   contents of a replica file system and its relationship to previous
   file system instances.  This may mean that switching between file
   systems that are not read-only is not available, where either the
   client does not use or the server does not support the
   fs_locations_info attribute.

> and the (unchanged) text in Section 11.5.5 seems to impose a stricter
> consistency requirement, at least on potential migration targets.  (A
> bit more detail in the COMMENT section.)

That is a different matter and has been since the multi-server
namespace features were first added to NFSv4 decades ago.
It is inherently easier to synchronize A and B when you know that A
is over before B begins.

> Section 11.13.2 mentions that "[i]ssues connected with a client
> impersonating another by presenting another client's id string are
> discussed in Section 21", but I failed to find this discussion in
> Section 21.

It is there but it is not that easy to find.   This is referring to the
4.1 state protection features.

I suggest replacing the sentence in question by "Issues connected with a
client impersonating another by presenting another client's client id
string can be addressed using NFSv4.1 state protection features, as
described in Section 21."

> (The discuss-level issue is just the internal
> inconsistency; there's a decent argument that this is covered by
> Appendix C's "not written in accord with RFC3552".

True.   Note that this is within a list of "Security Issues that need to be
Addressed."

> Though if the text
> was already written for draft-ietf-nfsv4-mv1-msns-update, not including
> it here seems a little silly.)

There is no text but there has been some speculation that the client
authentication features of rpc-tls might be helpful here.

On Wed, Dec 18, 2019 at 3:32 AM Benjamin Kaduk via Datatracker <
noreply@ietf.org> wrote:

> Benjamin Kaduk has entered the following ballot position for
> draft-ietf-nfsv4-rfc5661sesqui-msns-03: Discuss
>
> When responding, please keep the subject line intact and reply to all
> email addresses included in the To and CC lines. (Feel free to cut this
> introductory paragraph, however.)
>
>
> Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
> for more information about IESG DISCUSS and COMMENT positions.
>
>
> The document, along with other ballot positions, can be found here:
> https://datatracker.ietf.org/doc/draft-ietf-nfsv4-rfc5661sesqui-msns/
>
>
>
> ----------------------------------------------------------------------
> DISCUSS:
> ----------------------------------------------------------------------
>
> Thank you for this document (and its predecessor); it's important to get
> these points clarified, and sooner rather than later.  I expect that the
> following few issues should be quickly resolvable.
>
> Section 11.10.1 includes a reference to "Section 11.7.2.1 of RFC5661",
> but this document is obsoleting that document.  It seems internally
> inconsistent to both obsolete and depend on the same source -- if we rely
> on that content, it should be included in this document.
>
> This is somewhat awkward since the limited nature of the update results
> in my not having the full context of the rest of the document; with that
> limitation in my understanding in mind, I'd like to confirm that we're
> comfortable with the use of "network address" in the context of
> trunking/migration, specifically the extent to which we do not discuss
> port numbers.  The relevant XDR types do allow for optional port numbers
> to be included, with a default to be used when not specified, but in
> this document we do have a new note that different ports may be used for
> different connection types to the same logical server, and also that
> different ports "is not the essence of the distinction between the two
> endpoints".  I think there might be cases where the port is relevant for
> a distinction, but the main ones I can think of are of questionable
> relevance (essentially, roughly equivalent to multiple userspace NFS
> servers on a single host but in different trust/privilege domains) --
> I'd like another opinion or several.
>
> In a similar "discuss discuss" vein, Section 11.10.8 describes a
> scenario that does not give much clarity, at a protocol level, into what
> degree of replication synchronization a client can expect from a given
> file system that advertises multiple replicas.  I recognize that this is
> de facto just stating the deployed reality, but it's also hard to feel
> good about having this level of ambiguity in a propsed standard, and the
> (unchanged) text in Section 11.5.5 seems to impose a stricter
> consistency requirement, at least on potential migration targets.  (A
> bit more detail in the COMMENT section.)
>
> Section 11.13.2 mentions that "[i]ssues connected with a client
> impersonating another by presenting another client's id string are
> discussed in Section 21", but I failed to find this discussion in
> Section 21.  (The discuss-level issue is just the internal
> inconsistency; there's a decent argument that this is covered by
> Appendix C's "not written in accord with RFC3552".  Though if the text
> was already written for draft-ietf-nfsv4-mv1-msns-update, not including
> it here seems a little silly.)
>
>
> ----------------------------------------------------------------------
> COMMENT:
> ----------------------------------------------------------------------
>
> I think I may have mistakenly commented on some sections that are
> actually just moved text, since my lookahead window in the diff was too
> small.  I expect it's most appropriate to defer those to for the full
> -bis, so sorry to have them lumped in here.
>
> Thank you for all the effort put in to get the diff against RFC 5661 to
> be minimal!  I know that the current default output formatting is rather
> different than what is done in RFC 5661, but this diff is pretty easy to
> read.
>
> Thank you also for the detailed discussion in Appendix C; I do not think
> I could add anything more!  While the security posture of the current
> deployed state of NFSv4 is not great (though, arguably, somewhat
> understandable given the path we took to get there), this is the right
> start to making any sort of improvement.
>
> Since the "Updates:" header is part of the immutable RFC text (though
> "Updated by:" is mutable), we should probably explicitly state that "the
> updates that RFCs 8178 and 8434 made to RFC 5661 apply equally to this
> document".
>
> I note inline (in what is probably too many places; please don't reply
> at all of them!) some question about how clear the text is that a file
> system migration is something done at a per-file-system granularity, and
> that migrating a client at a time is not possible.  As was the case for
> my Discuss point about addresses/port-numbers, I'm missing the context
> of the rest of the document, so perhaps this is a non-issue, but the
> consequences of getting it wrong seem severe enough that I wanted to
> check.
>
> Does a client have any way to know in advance that two addresses will be
> session-trunkable other than the one listed in Section 11.1.1 that "when
> two connections of different connection types are made to the same
> network address and are based on a single file system location entry
> they are always session-trunkable"?  It seems like mostly we're defining
> the property by saying that the client has to try it and see if it
> works; I'd love to be wrong about that.
>
> Section 1.1
>
>    The revised description of the NFS version 4 minor version 1
>    (NFSv4.1) protocol presented in this update is necessary to enable
>    full use of trunking in connection with multi-server namespace
>    features and to enable the use of transparent state migration in
>    connection with NFSv4.1.  [...]
>
> nit: do we expect all readers to know what is meant by "trunking" with
> no other lead-in?
>
>    This limited scope update is applied to the main NFSv4.1 RFC with the
>
> nit: hyphenate "limited-scope"
>
>    scope as could expected by a full update of the protocol.  Below are
>    some areas which are known to need addressing in a future update of
>    the protocol.
>    [...]
>
> side note: I'd be interested in better understanding the preference for
> the subjunctive verb tense for most of these points ("work would have to
> be done"); my naive expectation would be that since there are plans to
> undertake the work, just "work needs to be done" or "work will be done"
> might suffice.
>
>    o  Work would have to be done with regard to RFC8178 [63] which
>       establishes NFSv4-wide versioning rules.  As RFC5661 is curretly
>       inconsistent with this document, changes are needed in order to
>       arrive at a situation in which there would be no need for RFC8178
>       to update the NFSv4.1 specfication.
>
> nit: s/this document/that document/ -- "this document" is
> draft-ietf-nfsv4-rfc5661sesqui-msns.
>
>    o  Work would have to be done with regard to RFC8434 [66], which
>       establishes the requirements for pNFS layout types, which are not
>       clearly defined in RFC5661.  When that work is done and the
>       resulting documents approved, the new NFSv4.1 specfication
>       document will provide a clear set of requirements for layout types
>       and a description of the file layout type that conforms to those
>       requirements.  Other layout types will have their own specfication
>       documents that conforms to those requirements as well.
>
> It's not entirely clear to me that the other layout types need to get
> mentioned in this document; how do they relate to the formal status of
> the "current NFSv4.1 core protocol specification document"?
>
>    o  Work would have to be done to address many erratas relevant to RFC
>       5661, other than errata 2006 [60], which is addressed in this
>       document.  That errata was not deferrable because of the
>       interaction of the changes suggested in that errata and handling
>       of state and session migration.  The erratas that have been
>       deferred include changes originally suggested by a particular
>       errata, which change consensus decisions made in RFC 5661, which
>       need to be changed to ensure compatibility with existing
>       implementations that do not follow the handling delineated in RFC
>       5661.  Note that it is expected that such erratas will remain
>
> This sentence is pretty long and hard to follow; maybe it could be split
> after "change consensus decisions made in RFC 5661" and the second half
> start with a more declarative statement about existing implementations?
> (E.g., "Existing implementations did not perform handling as delineated in
> RFC
> 5661 since the procedures therein were not workable, and in order to
> have the specification accurately reflect the existing deployment base,
> changes are needed [...]")
>
>       relevant to implementers and the authors of an eventual
>       rfc5661bis, despite the fact that this document, when approved,
>       will obsolete RFC 5661.
>
> (I assume the RFC Editor can tweak this line to reflect what actually
> happens; my understanding is that the errata reports will get cloned to
> this-RFC.)
> [rant about "errata" vs. "erratum" elided]
>
> Section 2.10.4
>
>    Servers each specify a server scope value in the form of an opaque
>    string eir_server_scope returned as part of the results of an
>    EXCHANGE_ID operation.  The purpose of the server scope is to allow a
>    group of servers to indicate to clients that a set of servers sharing
>    the same server scope value has arranged to use compatible values of
>    otherwise opaque identifiers.  Thus, the identifiers generated by two
>    servers within that set can be assumed compatible so that, in some
>    cases, identifiers generated by one server in that set may be
>    presented to another server of the same scope.
>
> Is there more that we can say than "in some cases"?  The previous text
> implies a higher level of reliability than just "some cases", to me.
>
> Section 2.10.4
>
> I see the list of identifier types for which same-scope compatibility
> applies got reduced from RFC 5661 to this document, by removing session
> ID, client ID, and state ID values.  For at least one of those I can see
> this making sense as only being workable when the server really is "the
> same server", inline with the improved discussion of migration vs.
> trunking that is a main focus of this document.  Does that
> justification apply to all of them, or are there more reasons involved?
>
> We also remove the text about a client needing to compare server scope
> values during a potential migration event, to determine whether the
> migration preserved state or a reclaim is needed.  I thought this
> scenario would still be possible (and thus still need to be listed),
> though perhaps we are claiming that it is so under-specified so as to be
> never workable in practice?
>
> Section 2.10.5
>
>    o  When eir_server_scope changes, the client has no assurance that
>       any id's it obtained previously (e.g. file handles, state ids,
>       client ids) can be validly used on the new server, and, even if
>
> It's interesting to see file handles, state ids, and client ids listed
> together here (nit: also with lowercase "id"), when in the previous
> section we have removed state IDs and client IDs from a list that
> includes all three in RFC 5661.
>
>    o  When eir_server_scope remains the same and
>       eir_server_owner.so_major_id changes, the client can use the
>       filehandles it has, consider its locking state lost, and attempt
>       to reclaim or otherwise re-obtain its locks.  It may find that its
>       file handle IS now stale but if NFS4ERR_STALE is not received, it
>       can proceed to reclaim or otherwise re-obtain its open locking
>       state.
>
> nit(?): this bit about "It may find that its file handle IS now stale
> but if NFS4ERR_STALE is not received" seems to assume some familiarity
> by the reader as to what actions would be performed that would get
> NFS4ERR_STALE back.
>
> Section 2.10.5.1
>
>    When the server responds using two different connections claim
>    matching or partially matching eir_server_owner, eir_server_scope,
>
> nit: The grammar got wonky here; maybe s/claim/claiming/?
>
> Section 11.1.1
>
>       In the case of NFS version 4.1 and later minor versions, the means
>       of trunking detection are as described in this document and are
>       available to every client.  Two network addresses connected to the
>       same server are always server-trunkable but cannot necessarily be
>       used together to access a single session.
>
> nit: we haven't defined "server-trunkable" yet, so it may be worth a
> hint that the definition is coming soon.
>
>    The combination of a server network address and a particular
>    connection type to be used by a connection is referred to as a
>    "server endpoint".  Although using different connection types may
>    result in different ports being used, the use of different ports by
>    multiple connections to the same network address is not the essence
>    of the distinction between the two endpoints used.
>
> There's perhaps a fine line to walk here, as the port can still have
> significant relevance, in general, and we are frequently in the IETF
> told to make no assumption about what is behind specific port values at
> a given network address.  (Consider, for example, a hypothetical virtual
> hosting service that provides "DS-as-a-service" where customers run
> their own MDS that point to configured DSes for actual storage.
> Different ports on that cloud provider would represent entirely
> different customers/servers!)  [This became a discuss point but it
> didn't end up including all the discussion here, so I left it as an
> informational thing; discussion should happen in the Discuss section]
>
> Section 11.1.2
>
>    o  In some cases, a server will have a namespace more extensive than
>       its local namespace by using features associated with attributes
>       that provide file system location information.  These features,
>       which allow construction of a multi-server namespace are all
>
> nit: comma after "multi-server namespace".
>
>    o  A file system present in a server's pseudo-fs may have multiple
>       file system instances on different servers associated with it.
>       All such instances are considered replicas of one another.
>
> [Some readers might take this as requiring live read/write replication
> such that all writes to any instance are immediately visible on all
> other instances.  The rest of the document ought to disabuse them of
> that notion, and yet...]
>
>    o  File system location entries provide the individual file system
>       locations within the file system location attributes.  Each such
>       entry specifies a server, in the form of a host name or IP an
>       address, and an fs name, which designates the location of the file
>
> nit: s/IP an/an IP/.
>
>       client may establish connections.  There may be multiple endpoints
>       because a host name may map to multiple network addresses and
>       because multiple connection types may be used to communicate with
>       a single network address.  However, all such endpoints MUST
>       provide a way of connecting to a single server.  The exact form of
>
> nit: "MUST provide" feels strange here, since it implies in some sense
> an extra layer of indirection ("A lists X, and X among other things
> provides Y"); would a different word like "indicate" work?
>
>       element derives from a corresponding location entry.  When a
>       location entry specifies an IP address there is only a single
>       corresponding location element.  File system location entries that
>       contain a host name are resolved using DNS, and may result in one
>       or more location elements.  All location elements consist of a
>       location address which is the IP address of an interface to a
>       server and an fs name which is the location of the file system
>       within the server's local namespace.  The fs name can be empty if
>
> I can't decide whether both instances of "IP address" are pedantically
> correct, in the presence of the potential for port information to be
> included/available.  The former is probably okay, but the latter might
> need some clarification.
>
> Section 11.2
>
>    The fs_locations attribute defined in NFSv4.0 is also a part of
>    NFSv4.1.  This attribute only allows specification of the file system
>    locations where the data corresponding to a given file system may be
>    found.  Servers should make this attribute available whenever
>    fs_locations_info is supported, but client use of fs_locations_info
>    is preferable, as it provides more information.
>
> I think this was probably okay as "SHOULD make this attribute available"
> (as it was in 5661), but don't object to the lowercase version either.
>
> Section 11.5
>
>    Where a file system had been absent, specification of file system
>
> I guess I'm probably in the rough on this one (since 5661 had my
> more-preferred language), but it still feels like "had been absent"
> implies that it is no longer absent, i.e., that it is now present or has
> otherwise changed.  What's going on here with referrals is more like a
> "was never present" case, though using "never" is of course problematic
> as it's more absolute than is appropriate.
>
>
> If we're going to talk about "pure referral"s, do we want to make
> mention of or otherwise differentiate/characterize "non-pure"
> ("impure"?) referrals?
>
> Section 11.5.1
>
>    In order to simplify client handling and allow the best choice of
>    replicas to access, the server should adhere to the following
>    guidelines.
>
> Just to check: these are just informal "guidelines" and not something
> that a server SHOULD or even MUST adhere to?
>
> Section 11.5.2
>
>    Locations entries used to discover candidate addresses for use in
>
> nit(?): is this supposed to just be "Location" singular?
>
> Section 11.5.3
>
>    Irrespective of the particular attribute used, when there is no
>    indication that a step-up operation can be performed, a client
>    supporting RDMA operation can establish a new RDMA connection and it
>    can be bound to the session already established by the TCP
>    connection, allowing the TCP connection to be dropped and the session
>    converted to further use in RDMA node.
>
> Should we say something to make this contingent on the server also
> supporting RDMA?
>
> Section 11.5.5
>
>    will typically use the first one provided.  If that is inaccessible
>    for some reason, later ones can be used.  In such cases the client
>    might consider that the transition to the new replica as a migration
>    event, even though some of the servers involved might not be aware of
>    the use of the server which was inaccessible.  In such a case, a
>
> nit: the grammar here got wonky; maybe s/as a/is a/?
>
> Section ??
>
> The old (RFC 5661) Section 11.5 mentioned several things, and I'd like
> to check that we have either covered or disavowed all of them.
> My current understanding is that:
>
> The first paragraph basically talked about trunking detection, and is
> covered elsewhere.
>
> The second paragraph talks about something that I would call "implicit
> replication" with the 5661 definition of "replica", but in the new model
> is essentially definitionally true, since we consider all addresses for
> the same server to be ... part of the same server, so of course that
> server's namespaces match up.  Though perhaps the discussion about not
> all of the cartesian product of (addresses-for-server, local path) being
> listed is still worth having?
>
> The third paragraph basically talks about the need for trunking
> detection, and includes some guidance to clients about assuming server
> misconfiguration that seems of questionable merit.
>
> Section 11.5.7
>
>    o  Deletions from the list of network addresses for the current file
>       system instance need not be acted on immediately, although the
>       client might need to be prepared for a shift in access whenever
>       the server indicates that a network access path is not usable to
>       access the current file system, by returning NFS4ERR_MOVED.
>
> I think this should be wordsmithed a bit more, as (IIUC) the idea here
> is that if a client notices in a location response that the address the
> client is currently using for a filesystem has disappeared from the
> list, the client should be prepared for imminent changes in server
> behavior relating to the presumed-move.  Those imminent changes would
> most likely be reflected in the form of the server returning
> NFS4ERR_MOVED, but there is no NFS4ERR_MOVED involved in the actual
> deletion from the list of network instances of the current system
> instance.
>
> Section 11.6
>
>    corresponding attribute is interrogated subsequently.  In the case of
>    a multi-server namespace, that same promise applies even if server
>    boundaries have been crossed.  Similarly, when the owner attribute of
>    a file is derived from the securiy principal which created the file,
>    that attribute should have the same value even if the interrogation
>    occurs on a different server from the file creation.
>
> I can see how the interrogation would be on a different server from file
> creation for "simple" replication scenarios, but I'm not sure I'm seeing
> how non-replication cases would arise, particularly ones that cross server
> boundaries in a multi-server (hierarchical?) namespace.  Am I missing
> something obvious?
> nit: s/securiy/security/
>
>    o  All servers support a common set of domains which includes all of
>       the domains clients use and expect to see returned as the domain
>       portion of an owner or group in the form "id@domain".  Note that
>       although this set most ofen consists of a single domain, it is
>       possible for mutiple domains to be supported.
>
> I wonder a little bit whether the "most often" still holds when client
> principals come from an AD forest.
>
>    o  All servers recognize the same set of security principals, and
>       each principal, the same credential are required, independent of
>       the server being accessed.  In addition, the group membership for
>
> nit: I think there's a missing word here, maybe "and for each
> principal"?
>
>    Note that there is no requirment that the users corresponding to
>
> nit: "requirement"
>
>    o  The "local" representation of all owners and groups must be the
>       same on all servers.  The word "local" is used here since that is
>       the way that numeric user and group ids are described in
>       Section 5.9.  However, when AUTH_SYS or stringified owners or
>       group are used, these identifiers are not truly local, since they
>       are known tothe clients as well as the server.
>
> I am trying to find a way to note that the AUTH_SYS case mentioned here
> is precisely because of the requirement being imposed by this bullet
> point, while acknowledging that the "stringified owners or group" case
> is separate, but not having much luck.
> Also, nit: "to the"
>
> Section 11.9
>
>    o  When use of a particular address is to cease and there is also one
>       currently in use which is server-trunkable with it, requests that
>       would have been issued on the address whose use is to be
>       discontinued can be issued on the remaining address(es).  When an
>       address is not a session-trunkable one, the request might need to
>       be modified to reflect the fact that a different session will be
>       used.
>
> I suggest writing this as "when an address is server-trunkable but not
> session-trunkable,".
>
>    o  When use of a particular connection is to cease, as indicated by
>       receiving NFS4ERR_MOVED when using that connection but that
>       address is still indicated as accessible according to the
>       appropriate file system location entries, it is likely that
>       requests can be issued on a new connection of a different
>       connection type, once that connection is established.  Since any
>       two server endpoints that share a network address are inherently
>       session-trunkable, the client can use BIND_CONN_TO_SESSION to
>       access the existing session using the new connection and proceed
>       to access the file system using the new connection.
>
> I'm not entirely sure how "inherent" this is (in the vein of my Discuss
> point, and what we mean by "network address").
>
>    o  When there are no potential replacement addresses in use but there
>
> What is a "replacement address"?
>
>       are valid addresses session-trunkable with the one whose use is to
>       be discontinued, the client can use BIND_CONN_TO_SESSION to access
>       the existing session using the new address.  Although the target
>       session will generally be accessible, there may be cases in which
>       that session is no longer accessible.  In this case, the client
>       can create a new session to enable continued access to the
>       existing instance and provide for use of existing filehandles,
>       stateids, and client ids while providing continuity of locking
>       state.
>
> I'm not sure I understand this last sentence.  On its own, the "new
> session to enable continued access to the existing instance" sounds like
> the continued access would be on the address whose use is to cease, and
> thus the new session would be there.  But why make a new session when
> the old one is still good, especially when we just said in the previous
> sentence that the old session can't be moved to the new
> connection/address?
> Perhaps a forward reference down to Section 11.12.{4,5} for this and the
> next bullet point would help as well as rewording?
>
> Section 11.10.6
>
>    In a file system transition, the two file systems might be clustered
>    in the handling of unstably written data.  When this is the case, and
>
> What does "clustered in the handling of unstably written data" mean?
>
>    the two file systems belong to the same write-verifier class, write
>
> How is the client supposed to determine "when this is the case"?
>
> Section 11.10.7
>
>    In a file system transition, the two file systems might be consistent
>    in their handling of READDIR cookies and verifiers.  When this is the
>    case, and the two file systems belong to the same readdir class,
>
> As above, how is the client supposed to determine "when this is the
> case"?
>
>    READDIR cookies and verifiers from one system may be recognized by
>    the other and READDIR operations started on one server may be validly
>    continued on the other, simply by presenting the cookie and verifier
>    returned by a READDIR operation done on the first file system to the
>    second.
>
> Are these "may be"s supposed to admit the possibility that the
> destination server can just decide to not honor them arbitrarily?
>
> Section 11.10.8
>
>    the degree indicated by the fs_locations_info attribute).  However,
>    when multiple file systems are presented as replicas of one another,
>    the precise relationship between the data of one and the data of
>    another is not, as a general matter, specified by the NFSv4.1
>    protocol.  It is quite possible to present as replicas file systems
>    where the data of those file systems is sufficiently different that
>    some applications have problems dealing with the transition between
>    replicas.  The namespace will typically be constructed so that
>    applications can choose an appropriate level of support, so that in
>    one position in the namespace a varied set of replicas will be
>    listed, while in another only those that are up-to-date may be
>    considered replicas.  [...]
>
> This seems quite wishy-washy for a standards-track protocol!  We give no
> hard bounds on how "different" replicas may be, no protocol element to
> convey even a qualitative sense of where on the spectrum of replication
> fidelity a replica may lie, and no indication as to how the namespace
> might be constructed to indicate a level of support.
>
>                          The protocol does define three special cases of
>    the relationship among replicas to be specified by the server and
>    relied upon by clients:
>
> I'd like to hear from the rest of the IESG, but we may need to consider
> limiting "replication" to just these special cases until we can be more
> precise about the other cases.
>
>    o  When multiple replicas exist and are used simultaneously by a
>       client (see the FSLIB4_CLSIMUL definition within
>       fs_locations_info), they must designate the same data.  Where file
>       systems are writable, a change made on one instance must be
>       visible on all instances, immediately upon the earlier of the
>       return of the modifying requester or the visibility of that change
>       on any of the associated replicas.  This allows a client to use
>
> Hmm, how would this "earlier of [...]" work when there are three
> nominally equivalent machines?  Assume the RPC is made to A, and the
> other two are B and C.  If the update first goes visible on B, it must
> also be visible on C, instilling what is apparently a hard requirement
> for exact synchronization between B and C, perhaps by some sort of
> negotiated "make visible at timestamp X" mechanism.  But if the RPC
> returns from A first, then the change still has to be visible on B and C
> at the same time.  Does this phrasing give any weaker a requirement than
> "must be visible on all machines at the same time", in practice?  (There
> are, of course, various distributed-consensus protocols that can do
> this, as could a scenario where all NFS servers are connected to a
> common file store backend.)
>
> Section 11.10.9
>
>    When access is transferred between replicas, clients need to be
>    assured that the actions disallowed by holding these locks cannot
>
> To check my understanding: this "access is transferred" means *all*
> clients' access (not just one particular client)?  Otherwise I'm not
> sure how the destination would know to enforce the grace period.
>
> Section 11.11.1
>
> I think the last two paragraphs might be duplicating some things
> mentioned earlier in the section, but the repetition is probably not
> harmful.
>
> Section 11.12.1
>
>    Because of the absence of NFSV4ERR_LEASE_MOVED, it is possible for
>    file systems whose access path has not changed to be successfully
>
> It might be worth phrasing this as "SEQ4_STATUS_LEASE_MOVED is not an
> error condition".
>
> Section 11.12.2
>
>    o  No action needs to be taken for such indications received by the
>       those performing migration discovery, since continuation of that
>       work will address the issue.
>
> nit: "by the those" is not right, but the proper fix eludes me, as this
> bullet point needs to be more specific somehow than the next one.
>
>    o  If the fs_status attribute indicates that the file system is a
>       migrated one (i.e. fss_absent is true and fss_type !=
>       STATUS4_REFERRAL) and thus that it is likely that the fetch of the
>       file system location attribute has cleared one the file systems
>       contributing to the lease-migrated indication.
>
> This looks like a sentence fragment -- it's of the form "If X, and thus
> Y." with no concluding clause.
>
> Section 11.12.4
>
>    Once the client has determined the initial migration status, and
>    determined that there was a shift to a new server, it needs to re-
>    establish its locking state, if possible.  To enable this to happen
>    without loss of the guarantees normally provided by locking, the
>    destination server needs to implement a per-fs grace period in all
>    cases in which lock state was lost, including those in which
>    Transparent State Migration was not implemented.
>
> Similarly to above, does this imply that the migration has to happen for
> all clients concurrently, as opposed to clients getting migrated in
> sequence?
>
> Section 11.13.1
>
>    In this case, destination server need have no knowledge of the locks
>
> nit: singular/plural mismatch "destination server"/"need"
>
> Section 11.13.3
>
>    o  Not responding with NFS4ERR_SEQ_MISORDERED for the initial request
>       on a slot within a transferred session, since the destination
>
> Does this then translate to "process as usual in the absence of
> migration"?  "Don't return error X" tells me what not to do, but doesn't
> really tell me what to do instead.
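
To make the question above concrete, here is a minimal sketch of slot sequence checking, assuming the intended reading is "treat the first request on a transferred slot as a new request". This is not text from the draft: the Slot class, the transferred flag, and check_sequence are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

EXECUTE = "execute"
REPLAY = "replay_cached"
MISORDERED = "NFS4ERR_SEQ_MISORDERED"


@dataclass
class Slot:
    last_seq: int              # sequence id of the last request executed on this slot
    transferred: bool = False  # hypothetical: slot came from a transferred session
                               # and has not yet been used on the destination


def check_sequence(slot: Slot, seq: int) -> str:
    """Decide how a request with sequence id `seq` is handled on `slot`."""
    expected = (slot.last_seq + 1) & 0xFFFFFFFF  # sequence ids wrap at 2**32
    if seq == expected:
        slot.last_seq = seq
        slot.transferred = False
        return EXECUTE
    if seq == slot.last_seq:
        slot.transferred = False
        return REPLAY
    if slot.transferred:
        # Assumed reading of the bullet: the destination cannot know what the
        # client sent after the handoff, so it accepts this as a new request
        # instead of reporting a misordered sequence.
        slot.last_seq = seq
        slot.transferred = False
        return EXECUTE
    return MISORDERED
```

Under this reading, only the very first request on a transferred slot is exempt; later requests on the same slot are checked as usual.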
>
> Section 11.16.1
>
>    With the exception of the transport-flag field (at offset
>    FSLI4BX_TFLAGS with the fls_info array), all of this data applies to
>    the replica specified by the entry, rather that the specific network
>    path used to access it.
>
> Is it clear that this applies only to the fields defined by this
> specification (since, as mentioned later, future extensions must specify
> whether they apply to the replica or the entry)?
>
> Section 15.1.1.3
>
>    o  When NFS4ERR_DELAY is returned on an operation other than the
>       first within a request and there has been a non-idempotent
>       operation processed before the NFS4ERR_DELAY was returned, the
>       reissued request should avoid the non-idempotent operation.  The
>       request still must use a SEQUENCE operation with either a
>       different slot id or sequence value from the SEQUENCE in the
>       original request.  Because this is done, there is no way the
>       replier could avoid spuriously re-executing the non-idempotent
>       operation since the different SEQUENCE parameters prevent the
>       requester from recognizing that the non-idempotent operation is
>       being retried.
>
> I don't think that this is very clear about the counterfactual scenario
> in which the replier is trying to avoid spuriously re-executing the
> non-idempotent operation.  Is it supposed to be explaining why the
> client has to use a different slot or sequence value, because the
> replier would reexecute the non-idempotent operation otherwise?
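
A toy model may help frame the question above. It assumes a simplified reply cache keyed by (slot id, sequence id), which is not how the draft describes the cache; the Replier class and its fields are hypothetical. It shows only that a reissue with a new sequence value cannot be recognized as a retry, so the non-idempotent operation runs again unless the requester leaves it out.

```python
class Replier:
    """Toy replier with a reply cache keyed by (slot id, sequence id)."""

    def __init__(self):
        self.reply_cache = {}  # (slot_id, seq) -> cached reply
        self.executions = 0    # times the non-idempotent operation actually ran

    def handle(self, slot_id, seq, has_non_idempotent_op):
        key = (slot_id, seq)
        if key in self.reply_cache:
            # Same slot and sequence: recognized as a retry, nothing re-executed.
            return self.reply_cache[key]
        # New (slot, seq) pair: the replier has no way to tell this is a retry.
        if has_non_idempotent_op:
            self.executions += 1
        reply = f"reply-{slot_id}-{seq}"
        self.reply_cache[key] = reply
        return reply


# Reissue with a new sequence value that still carries the operation.
careless = Replier()
careless.handle(0, 1, True)   # original request; a later op returned NFS4ERR_DELAY
careless.handle(0, 2, True)   # reissued with the non-idempotent op included
print(careless.executions)

# Reissue with a new sequence value that omits the operation.
careful = Replier()
careful.handle(0, 1, True)
careful.handle(0, 2, False)
print(careful.executions)
```

In this model the first scenario executes the operation twice and the second executes it once, which seems to be the point the quoted bullet is trying to make.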
>
> Section 18.35.3
>
> I wonder a little bit whether we want to reaffirm that co_verifier remains
> fixed when the client is establishing multiple connections for trunking
> usage -- the "incarnation of the client" language here could make a
> reader wonder, though I think the discussion of its use elsewhere as
> relating to "client restart" is sufficiently clear.
>
>    The eia_clientowner field is composed of a co_verifier field and a
>    co_ownerid string.  As noted in s Section 2.4, the co_ownerid
>
> s/s //
>
> Section 18.51.4
>
>    o  When a server might become the destination for a file system being
>       migrated, inappropriate use of per-fs RECLAIM_COMPLETE is more
>       concerning.  In the case in which the file system designated is
>       not within a per-fs grace period, the per-fs RECLAIM_COMPLETE
>       SHOULD be ignored, with the negative consequences of accepting it
>       being limited, as in the case in which migration is not supported.
>       However, if the server encounters a file system undergoing
>       migration, the operation cannot be accepted as if it were a global
>       RECLAIM_COMPLETE without invalidating its intended use.
>
> This seems to be the only place where we acknowledge that the "misuse"
> in question was to "treat rca_one_fs of TRUE as if it was FALSE", which
> is probably not so great for clarity.
>
> Section 21
>
> Some other topics at least somewhat related to trunking and migration
> that we could potentially justify including in the current,
> limited-scope, update (as opposed to deferring for a full -bis) include:
>
> - clients that lie about reclaimed locks during a post-migration grace
>   period
> - how attacker capabilities compare by using a compromised server to
>   give bogus referrals/etc. as opposed to just giving bogus data/etc.
> - an attacker in the network trying to shift client traffic (in terms of
>   what endpoints/connections they use) to overload a server
> - how asynchronous replication can cause clients to repeat
>   non-idempotent actions
> - the potential for state skew and/or data loss if migration events
>   happen in close succession and the client "misses a notification"
> - cases where a filesystem moves and there's no longer anything running
>   at the old network endpoint to return NFS4ERR_MOVED
> - what can happen when non-idempotent requests are in a COMPOUND before
>   a request that gets NFS4ERR_MOVED
> - how bad it is if the client messes up at Transparent State Migration
>   discovery, most notably in the case when some lock state is lost
> - the interactions between cached replies and migration(-like) events,
>   though a lot of this is discussed in section 11.13.X and 15.1.1.3
>   already
>
> but I defer to the WG as to what to cover now vs. later.
>
> In light of the ongoing work on draft-ietf-nfsv4-rpc-tls, it might be
> reasonable to just talk about "integrity protection" as an abstract
> thing without the specific focus on RPCSEC_GSS's integrity protection
> (or authentication).
>
>       being returned.  These include cases in which the client is
>       directed a server under the control of an attacker, who might get
>
> nit: "directed to"
>
>    o  Despite the fact that it is a requirement that "implementations"
>       provide "support" for use of RPCSEC_GSS, it cannot be assumed that
>       use of RPCSEC_GSS is always available between any particular
>       client-server pair.
>
> side note: scare-quotes around "support" makes sense to me, but not
> around "implementations".
>
>    the destination.  Even when RPCSEC_GSS authentication is available on
>    the destination, the server might validly represent itself as the
>    server to which the client was erroneously directed.  Without a way
>
> Something about the wording here tickles me funny; at first I thought it
> was the "validly", but now I think it's "represent itself", perhaps
> because that phrasing can have connotations of "falsely represent".
> ("Valid" is fine -- the attack here is the misdirection, and the target
> of the misdirection doesn't have to misbehave at all for it to be a
> damaging attack.)  The best remedy I can come up with is a somewhat
> drastic change, and thus questionable: "Even when [...], the server
> might still properly authenticate as the server to which the client was
> erroneously directed."
>
>
> I'd also consider adding a third bullet point to the final list ("to
> summarize considerations regarding the use of RPCSEC_GSS"):
>
> % o The integrity protection afforded to results by RPCSEC_GSS protects
> %   only a given request/response transaction; RPCSEC_GSS does not
> %   protect the binding from one server to another as part of a referral
> %   or migration event.  The source server must be trusted to provide
> %   the correct information, based on whatever factors are available to
> %   the client.
>
> Section 22.1
>
> Thank you for thinking about how the IANA considerations should be
> presented in the post-update document.  (I think I've had to place at
> least two Discuss positions on bis documents that did not...)
>
> Section 23.2
>
> I'm not sure that all of the moves from Normative to Informative should
> stick; e.g., HMAC (which went from [11] to [59]) is needed for SSV
> calculation.  Hmm, actually, maybe that's the only one.
>
> Appendix B
>
> I have mixed feelings about whether to keep this content for the final
> RFC.  (Appendix A seems clearly useful; the specific details of the
> reorganization are less clear, as to some extent they can be deduced
> from the changes themselves.  But only to some extent...)
>
> Appendix B.1.2
>
>    o  The new Sections 11.8 and 11.9 have resulted in existing sections
>       wit these numbers to be renumbered.
>
> s/wit/with/
>
> Section B.2.1
>
>    The new treatment can be found in Section 18.35 below.  It is
>
> s/below/above/
>
>    intended to supersede the treatment in Section 18.35 of RFC5661 [62].
>    Publishing a complete replacement for Section 18.35 allows the
>    corrected definition to be read as a whole, in place of the one in
>    RFC5661 [62].
>
> This seems like it was more appropriate in the scope of
> draft-ietf-nfsv4-mv1-msns-update but could be obsolete here.
>
> Section B.4
>
>    o  The discussion of trunking which appeared in Section 2.10.5 of
>       RFC5661 [62] needed to be revised, to more clearly explain the
>       multiple types of trunking supporting and how the client can be
>       made aware of the existing trunking configuration.  In addition
>       the last paragraph (exclusive of sub-sections) of that section,
>       dealing with server_owner changes, is literally true, it has been
>       a source of confusion.  [...]
>
> nit: the grammar here is weird; I think there's a missing "while" or
> similar.
>
>
>