Re: [Gen-art] Gen-ART Telechat review of draft-ietf-anima-grasp-api-08

Paul Kyzivat <> Wed, 02 December 2020 17:29 UTC

To: Brian E Carpenter <>,
Cc: General Area Review Team <>


On 12/1/20 8:13 PM, Brian E Carpenter wrote:
> Hi Paul,
> Comments in line. There's one definite good catch in your
> review, and obviously more clarifications are needed.

I'll comment this one last time. Then I will be satisfied that I have 
communicated my thoughts, and I'm happy with whatever you decide to do.

> On 01-Dec-20 15:06, Paul Kyzivat wrote:
>> I am the assigned Gen-ART reviewer for this draft. The General Area
>> Review Team (Gen-ART) reviews all IETF documents being processed by the
>> IESG for the IETF Chair. Please wait for direction from your document
>> shepherd or AD before posting a new version of the draft. For more
>> information, please see the FAQ at <​
>> Document: draft-ietf-anima-grasp-api-08
>> Reviewer: Paul Kyzivat
>> Review Date: 2020-11-30
>> IETF LC End Date: 2020-10-28
>> IESG Telechat date: 2020-12-01
>> Summary:
>> This draft is on the right track but has open issues, described in the
>> review.
>> General:
>> This document has addressed some of the concerns I had during the last
>> call review. However some of my concerns remain and some new ones have
>> arisen in this version.
>> Issues:
>> Major: 3
>> Minor: 6
>> Nits:  1
>> 1) MAJOR: Negotiation
>> The text in section 2.3.5 now makes clear that the sequence of steps in
>> the negotiation is non-deterministic - both sides can call
>> negotiate_step and negotiate_wait. I believe this can result in the two
>> sides not agreeing on what values have been negotiated. (For instance,
>> what if one side calls negotiate_step concurrently with the other side
>> calling end_negotiate? Which value has been agreed upon?)
> The negotiate_step calls alternate between the two peers, until one of them
> calls end_negotiate (or a timeout kills the session). I hoped that
> was clear in the protocol diagram. We can make it explicit, for people
> who haven't fully digested the protocol spec. Since that's ~50 pages, it
> certainly takes some digestion.
> (negotiate_wait would be interjected by the peer who has the next go, simply
> indicating that the next step is delayed. That could happen, for example,
> if the ASA needed to negotiate something with a third party before
> continuing.)

The figure shows alternation, but IIUC that is only an example. I 
understood the text as permitting either side to perform negotiate_step 
or negotiate_wait at will. A requirement for alternation would solve the problem.

Reading it again, I see how it can be interpreted as you say is 
intended. But it isn't clear. To clarify, is the intent that a 
negotiate_step, negotiate_wait, or end_negotiate is required in response 
to receipt of request_negotiate or negotiate_step? And a negotiate_step 
or end_negotiate is also required following the sending of 
negotiate_wait. And those are the only permitted sequences?

A state machine would be helpful to show this in an unambiguous way.
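
A minimal sketch of such a state machine, assuming the strict alternation described above. The names `Call`, `other`, and `follows_alternation` are hypothetical and not part of the draft, and the rule that a negotiate_wait must be followed directly by a negotiate_step or end_negotiate from the same peer is only one reading of the text:

```python
from enum import Enum


class Call(Enum):
    REQUEST = "request_negotiate"
    STEP = "negotiate_step"
    WAIT = "negotiate_wait"
    END = "end_negotiate"


def other(peer):
    """Return the opposite peer label ('A' <-> 'B')."""
    return "B" if peer == "A" else "A"


def follows_alternation(events):
    """Check a list of (peer, Call) pairs against the strict-alternation reading."""
    if not events or events[0][1] is not Call.REQUEST:
        return False
    turn = other(events[0][0])  # the responder must act next
    waiting = False             # True once the peer whose turn it is has sent a wait
    for peer, call in events[1:]:
        # Nothing may follow the end of the session, only the peer whose
        # turn it is may act, and a session has exactly one request.
        if turn is None or peer != turn or call is Call.REQUEST:
            return False
        if call is Call.WAIT:
            if waiting:         # assumption: no second wait in the same turn
                return False
            waiting = True      # same peer still owes a step or an end
        elif call is Call.STEP:
            turn, waiting = other(peer), False
        else:                   # Call.END closes the session
            turn = None
    return True
```

Under this reading, a negotiate_step from one side can never be concurrent with an end_negotiate from the other, because only the peer holding the turn may make either call.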

>> The loop_count
>> adds to the confusion. Are the two sides intended to have independent
>> loop count values? It seems these too can become unsynchronized.
> The loop count also bounces backwards and forwards in alternate steps.
> Again, we can underline that in the text.

That would be helpful.

>> Also, the goal of negotiation isn't clear to me. I gather it must be for
>> the two sides to agree on a particular value for the objective. But for
>> that to work there must be some rules about how values can change in
>> each step so that the result stabilizes, rather than causing a battle
>> that ends with loop count exhaustion. This could be achieved by always
>> negotiating *down*, or always *up*. But that requires that the objective
>> value type have an ordering function. Given the general nature of the
>> objective I don't think that can be assumed.
> No, it explicitly is not defined in either the protocol or the API.
> The syntax and semantics of the objective value are defined per-objective,
> and the objective might or might not be ordered. So there is intentionally
> no answer to your question.
> In most cases I'd expect that there would be an ordering but we didn't want
> to constrain the use cases in that way. Also note that a failed negotiation
> (e.g. the loop count expires, or where one end simply rejects the other's
> offer) is not a protocol failure.
>> ISTM that more work is needed to define the negotiation process in a way
>> that ensures it ends with both sides agreeing on a single value for the
>> objective.
> As noted, that is per-objective. The most complicated case I've coded
> is IP prefix assignment, and it works fine, except that if there is
> no prefix available of the maximum desired length, the requester ends
> up unsatisfied - as intended. There should be no condition in which
> the negotiation loops indefinitely; it either succeeds or fails.

Without that the result is non-deterministic. The two sides may have 
conflicting goals, and then the result will only be determined by the 
loop count and timeout.

Alternately, implementors will establish side agreements that aren't 
governed by standards.

That seems like an undesirable state of affairs.

>> 2) MINOR: Dry Run Negotiation
>> Dry Run negotiation is very under-specified. Why would it be used? I
>> guess that an ASA might use dry run negotiation to inform future actual
>> negotiation. Can anything be inferred from a dry run negotiation about
>> how an actual negotiation will go? When participating in a dry run
>> negotiation, how should an ASA decide what response to make? Should it
>> take into account current resource availability? Or should it respond
>> based on best-case or worst-case resource availability? Or what?
>> This requires further clarification.
> Again, *all* these issues are specific to the objective in question,
> and they are intentionally not addressed in the protocol design or
> the API.

So this is a semantics free feature? Then all semantics are determined 
by side agreements? Again, IMO this is a bad thing.

>> 3) MAJOR: Confusing semantics of 'request_negotiate'
>> In section 2.3.5 I don't understand the following:
>>            1.  The 'session_nonce' parameter is null.  In this case the
>>                negotiation has succeeded in one step and the peer has
>>                accepted the request.  The returned 'proffered_objective'
>>                contains the value accepted by the peer, which is therefore
>>                equal to the value in the requested 'objective'.  For this
>>                reason, no session nonce is needed, since the session has
>>                ended.
>> IIUC this requires a network exchange with the peer. I don't see how
>> this can complete *immediately*. ISTM that this could only complete
>> immediately if it were satisfied from a local cache. That doesn't seem
>> appropriate for this function.
> I changed that at your request from "immediately" to "in one step":
> the request_negotiate() from A gets back an immediate end_negotiate()
> from B and we're done. If "in one step" still isn't clear, we could
> change to "in one message exchange".
>> Similarly, in bullet 2 I don't see how the proffered_objective would be
>> available in the initial call, before a response has been received from
>> the peer.
> It can't. The call doesn't complete until the first response has arrived
> (in a threaded implementation).
>> Does "immediately" here simply mean that the negotiation is completed in
>> one exchange between the two ends? If so, isn't a session nonce still
>> required in an event loop implementation in order to handle the one
>> response?
> I think that's a good catch. It's fixable, because the session nonce is
> generated anyway when the request message is generated, but if the
> return code is noReply, the nonce is needed. The same applies to
> synchronize(). The issue is implied by this bullet:
>   * Event loop implementation: An additional read/write 'session_nonce'
>     parameter is used.
> but that's incomplete.
> My bad. I missed this point because I only coded the threaded version,
> which is the natural approach in Python. We need to add an extra case
> that only applies in the event loop case:
> * If the 'errorcode' parameter has the value 2 ('noReply'), no response
> has been received so far. The 'session_nonce' parameter must be presented
> in subsequent calls.

IIUC, the session nonce isn't needed at all in threaded implementations. 
And in the event loop implementation it is needed, even in this case. So 
ISTM this need not be singled out as a special case.

>> Bullet 2 also says:
>>                ... The
>>                returned 'proffered_objective' contains the first value
>>                proffered by the negotiation peer.  The contents of this
>>                instance of the objective must be used to prepare the next
>>                negotiation step (see negotiate_step() below) because it
>>                contains the updated loop count, sent by the negotiation
>>                peer.  The GRASP code automatically decrements the loop
>>                count by 1 at each step, and returns an error if it becomes
>>                zero.
>> I guess that the 'proffered_objective' in the return parameters is the
>> counter-offer to the objective passed in the call. And that you expect
>> the objective value used in any subsequent negotiate_step to be derived
>> by modifying this value. So far this new wording has improved my
>> understanding.
> Yes, and it's probably worth adding the term "counter-offer" for clarity.
>> But the loop_count in the objective is especially confusing. It seems
>> that it is handled quite differently from the rest of the objective. You
>> specify (in that it has a default value of GRASP_DEF_LOOPCT.
>> But who is expected to initialize this? (Is it simply that the ASA
>> should use this value if it doesn't have any particular preference?)
> Yes, that's the intention (but the details depend on the
> implementation; in my Python code the class definition for
> 'objective' sets the default). The default is set quite low
> since we assumed that normal negotiations require few steps.
>> Then you say that the GRASP decrements this. Is this decrementing done
>> on the calling side before sending the message, the calling side after
>> receiving the response? Or by the peer, on receipt or when sending the
>> response? Is it permissible for the ASA to modify this value during
>> negotiation? Since this seems intended to prevent a loop, having clarity
>> about how this value is managed seems important.
> That is well specified in the protocol spec itself. The sender
> of each M_NEGOTIATE message decrements it, i.e. it happens
> inside negotiate_step(). It would do no harm to mention that.
> What happens if an ASA messes with the loop count? Bad things
> could happen, but just as one end can prolong the timeout with
> negotiate_wait(), it could prolong the negotiation with
> obj.loop_count += 1 if it's useful. That, for example, enables
> using GRASP for bulk transfer, if you don't like FTP (joke
> intended). (Not really a joke, though; I have running code for
> simple file transfer using GRASP, and I find it a handy way to
> move stuff from Windows to my Linux test machine.)
> A malicious ASA could do all sorts of damage, so I don't think
> that loop count manipulation is a serious concern. The other
> end, if paranoid, can always check if the loop count is behaving
> strangely.

IIUC the only involvement the application should have with the loop 
count is to override the default at the start of negotiation. After 
that, it is really up to the GRASP to manage it. If that is right, why 
expose it as part of the objective? Why not simply allow the override 
value in the request_negotiate call, and otherwise let the GRASP manage 
it for the duration of the negotiation?
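
A minimal sketch of that alternative, assuming the core owns the loop count after the initial override. The class and method names are hypothetical, and `DEFAULT_LOOP_COUNT` is a placeholder for the default the GRASP spec defines. It follows the rule quoted above that the sender of each negotiation message decrements the count and that reaching zero is an error:

```python
DEFAULT_LOOP_COUNT = 6  # placeholder; the real default comes from the GRASP spec


class LoopCountExhausted(Exception):
    """Raised when the loop count reaches zero."""


class NegotiationSession:
    """Hypothetical core-side record; the ASA never touches loop_count directly."""

    def __init__(self, loop_count=None):
        # The only point where the application can influence the count.
        self.loop_count = DEFAULT_LOOP_COUNT if loop_count is None else loop_count

    def on_send(self):
        """Decrement before sending a negotiation message; fail at zero."""
        self.loop_count -= 1
        if self.loop_count <= 0:
            raise LoopCountExhausted()
        return self.loop_count  # value carried in the outgoing message

    def on_receive(self, peer_loop_count):
        """Adopt the count carried in the peer's message."""
        self.loop_count = peer_loop_count
```

With this shape the count bounces between the two cores in alternate steps, and neither ASA can lengthen the negotiation by editing the objective.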

>> 4) MINOR: negotiate_wait
>> The negotiate_wait call allows one ASA to extend the timeout of another
>> ASA. This could, in perverse cases, cause an ASA to wait indefinitely.
>> ISTM that this is dangerous. I would think it better to make the other ASA
>> aware of the desire to extend the timeout and let it decide whether to
>> do so.
> That's more of a comment on the GRASP spec & implementation than
> on the API. But our trust model is that if a node can get access
> to the autonomic control plane, we trust the ASAs in that node.
> There is new text on that aspect following the Security Area review.

As you wish.

>> 5) MAJOR: Consistency of Objective definitions
>> In section and elsewhere, presumably all parties that use a
>> particular objective must agree on the values of synch, neg, dry, and
>> the size and structure of the value.
>> There is no communication of the size and structure in the abstract API.
>> Presumably the implementation of a language binding to the API is
>> required to at least communicate the size and alignment requirements to
>> the core.
> No, it's not needed. That's the great advantage of using CBOR;
> the CBOR object is self-defining and flows opaquely through the
> GRASP code (modulo the maximum message size, which should prevent
> buffer overflow issues).

Sorry. I was thinking that CBOR was just a possibility for the value. 
Now I see that the value must be representable in CBOR and in fact will be so 
represented on the wire.

Based on that I withdraw my concern.

>> The matching of definitions between nodes must be achieved
>> solely by the name,
> Yes.
>> the respective language bindings at the two ends,
> No, again CBOR bridges over this. In a C implementation,
> the value part of the objective is delivered to the ASA
> as a CBOR-encoded byte string, and the programmer uses
> a CBOR library to de-serialize it. In Python, Ruby, etc.
> the API can deserialize it automatically.
>> and out of band mutual agreement.
> Yes, that's why each objective needs a spec of its syntax and semantics
> and an IANA registration of its name. (Or it can use the name format
> for privately defined objectives, but it still needs a defined syntax
> and semantics).

Ah, got it! While I read draft-ietf-anima-grasp in preparation for my 
last call review I obviously didn't absorb it all, and missed this point.

>> Furthermore, different language
>> bindings may use different in-memory representations of the value. In
>> such cases, how is the on-wire format to be determined?
> CBOR (specified using CDDL). For the protocol itself, that is all
> you need to specify. The semantics needs English.

OK. Got it.

>> If the two ends disagree on size and structure then problems will occur.
>> Perhaps the core can identify size mismatches based on size communicated
>> on the wire vs the size defined by the language binding, but there are
>> no error codes defined for this situation. And of course differing
>> structures with the same size would not be detectable.
>> Furthermore, there is potential for different ASAs to (accidentally)
>> have incompatible definitions for the same objective. What happens in
>> this case? How can blame be ascribed so that the problem can be fixed?
> You have to find out which one doesn't meet the spec. It's app-level
> debugging. (It's also one of the reasons I made sure my prototype
> could run two separate instances in one machine: makes debugging
> a whole lot easier.)

The IANA registration of names/formats resolves my concern. I suppose 
that means that revisions to the format will require choosing a new name.

>> IMO more needs to be said about all of this. At the least a number of
>> disclaimers that put the burden on the ASAs to recognize the risk, take
>> these potential problems into account and avoid them. But there could be
>> some requirements placed on API language bindings and core
>> implementations to deal with some of these. And probably some added
>> error codes to report what problems can be detected.
> The only one I found necessary was CBORfail ("CBOR decode failure")
> and that's pretty fatal. It means the other party has sent a byte
> string that is not in fact valid CBOR. Beyond that, the value
> is passed up to the consumer (the ASA) which does indeed have to
> validate the contents. That's CBOR 101, but it should be stated.


>> 6) MINOR/MAJOR: Session State
>> I continue to find the lifetime and state of a session to be unclear.
>> The API calls that return session_nonce seem to signal creation of a new
>> session. The end_negotiate() call seems to terminate a negotiation
>> session. But what causes other sessions to end? This seems important
>> because there is state associated with a session that consumes resources
>> and can't be reclaimed until the session ends. So it should be important
>> for the ASA to end all sessions. Some clarification of this seems
>> important both for core implementors and for ASA developers that will be
>> using the API.
> I'm not sure this belongs in the API though. The ASA doesn't need
> to worry about this. A session ends either when it completes normally
> by end_negotiate(), or a synchronize() reply is received, or discovery
> terminates, or a session times out, or a loop count is exhausted, or
> something causes a socket error, etc. There are 29 calls to
> _disactivate_session() in my GRASP implementation.

So IIUC the GRASP is expected to discard all state either when it 
sees an end_negotiate, the loop count is exhausted, or the timeout 
expires. In that case my concerns are resolved.

>> (Or is this document only for implementors of core and those
>> instantiating a particular language binding of the API, with
>> documentation for end users left to others?)
> I'm pretty sure we'll need an O'Reilly book one of these days.
> But in fact draft-ietf-anima-asa-guidelines is on its way,
> and will be recommended reading for ASA implementers, along
> with the API.
>> 7) MINOR/MAJOR: Timeout
>> Section indicates that the API returns an error response to the
>> ASA if the timeout expires. But the other end is presumably still
>> working on the request and will eventually send a response. What does
>> the core do when it receives this? Must it retain state so that it can
>> detect the case and ignore the message? It seems that this could result
>> in the two peers disagreeing on some state.
> The timeout expiring is fatal to the session, so any further
> messages for that session will not be processed. Depending on
> the sequence of events, the actual error code returned at both
> ends might vary. The most likely case is that it will show up
> as a socket error at both ends - a read() timeout at the
> receiving end and a dead socket at the sending end.
> So the two ends will both see a failed negotiation.

As long as the GRASP discards incoming messages for sessions it has no 
state for then this seems fine.
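
A minimal sketch of that discard rule, assuming the core keys its session state on the session identifier tagged with the initiator's address, as in the GRASP text quoted later in this message. The `SessionTable` class and its method names are hypothetical:

```python
class SessionTable:
    """Hypothetical core-side table of live sessions."""

    def __init__(self):
        self._sessions = {}  # (session_id, initiator) -> list of received messages

    def open(self, session_id, initiator):
        self._sessions[(session_id, initiator)] = []

    def close(self, session_id, initiator):
        # Called on end_negotiate, loop count exhaustion, or timeout.
        self._sessions.pop((session_id, initiator), None)

    def deliver(self, session_id, initiator, message):
        """Queue a message for a live session; drop it if no state exists."""
        queue = self._sessions.get((session_id, initiator))
        if queue is None:
            return False  # late or stray message: discarded, no state created
        queue.append(message)
        return True
```

A response that arrives after the timeout has closed the session is then dropped without creating new state, so both ends are left with a failed negotiation.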

>> 8) MINOR: Text regarding "minimum_TTL"
>> There is a small problem with the following in section 2.3.4:
>>         -  If the parameter 'minimum_TTL' is greater than zero, any
>>            locally cached locators for the objective whose remaining time
>>            to live in milliseconds is less than or equal to 'minimum_TTL'
>>            are deleted first.  Thus 'minimum_TTL' = 0 will flush all
>>            entries.
>> The first sentence qualifies the paragraph to cases where minimum_TTL is
>> greater than zero. But the final sentence then infers the behavior when
>> minimum_TTL is equal to zero.
>> Also, minimum_TTL is typed as an integer, which permits negative values.
>> I gather that negative values are not allowed. I can suggest two ways to
>> fix this:
>>         -  The parameter 'minimum_TTL' MUST be greater than or equal to
>>            zero. Any locally cached locators for the objective whose
>>            remaining time to live in milliseconds is less than or equal to
>>            'minimum_TTL' are deleted first.  Thus 'minimum_TTL' = 0 will
>>            flush all entries.
> OK.
>> Or, change the type to unsigned integer.
> Not all languages have unsigned integers, so I'd prefer your version.
>> Then the statement can be
>> simplified by removing the first sentence:
>>         -  Any locally cached locators for the objective whose remaining
>>            time to live in milliseconds is less than or equal to
>>            'minimum_TTL' are deleted first.  Thus 'minimum_TTL' = 0 will
>>            flush all entries.
>> 9) MINOR: Terminology - Session nonce
>> The new first paragraph of section 2.2.3 talks about identifying the
>> session by a pseudo-random session identifier, and tagging it with an IP
>> for further uniqueness. The 2nd paragraph talks about a session_nonce.
>> It isn't clear at this point in the text if these are the same thing. Or is
>> the session id shared on the wire, the IP tag added by the core, and the
>> session_nonce an artifact of the API, shared only between the ASA and
>> the core?
> Yes, it's locally manufactured, but the natural implementation is
> specified in the GRASP spec:
>   However, there is a finite probability that two nodes might generate
>   the same Session ID value.  For that reason, when a Session ID is
>   communicated via GRASP, the receiving node MUST tag it with the
>   initiator's IP address to allow disambiguation.
> Will clarify.
>> Section seems to confirm that the nonce is just an identifier
>> used between the core and the ASA. But here it says that using the id
>> plus the IP is simply one possible implementation choice.
> Right. Because although GRASP says what I just quoted, an implementor
> might choose anything else that they preferred. Is there any reason
> to specify this more tightly?

No. No need. It's just a matter of being clear. My confusion arose from 
the text in 2.2.3:

    ... It is
    identified by a pseudo-random session identifier tagged with an IP
    address of the initiator of the session to guarantee uniqueness.
    On the first call in a new GRASP session, the API returns a
    'session_nonce' value based on the GRASP session identifier.

I took this as specifying the representation of the session nonce. 
(Though on rereading I see the "based on" does give some wiggle room.)

Perhaps a minor tweak in wording would help:

    On the first call in a new GRASP session, the API returns a
    'session_nonce' value used to identify the session in the API.

Then spells out implementation options for this.

>> Further, I question whether "nonce" is the best term to use here. ISTM
>> that "handle" (session_handle) would more clearly reflect the purpose of
>> this item.
> We could debate that at length, I think. To me 'nonce' suggests
> something that's intentionally meaningless and for a single session.
> 'Handle' suggests something with longer term significance.
> seems to be
> on my side. Anyway, we will await advice from the AD before
> changing the terminology.

Good naming is difficult. This was only a suggestion.

>> I think it would be helpful to be clearer in distinguishing what is
>> fundamental vs what is implementation choice. For instance, in section
>> 2.2.3:
>>      A GRASP session consists of a finite sequence of messages (for
>>      discovery, synchronization, or negotiation) between a pair of ASAs.
>>      The core identifies it on the wire by a pseudo-random session
>>      identifier. Further details are given in [I-D.ietf-anima-grasp].
>>      On the first call in a new GRASP session, the API returns a
>>      'session_handle' value used to identify the session. This
>>      value must be used in all subsequent calls for the same session, and
>>      will be provided as a parameter in the callback functions.  By this
>>      mechanism, multiple overlapping sessions can be distinguished, both
>>      in the ASA and in the GRASP core.  The value of the 'session_handle'
>>      is opaque to the ASA.
>> This establishes the role and relationship of the two terms, while
>> section gives a possible implementation without as much
>> confusion. (It will require some rewording to switch from session_nonce
>> to session_handle. It already uses "session handle" in passing.)
>> 10) NIT: Terminology - ASA nonce
>> For similar reasons to those above for session_nonce/session_handle, IMO
>> it would be clearer to use asa_handle rather than asa_nonce. But this is
>> only a suggestion.
> Roman suggested using different terms, e.g. ASA "nonce" and
> Session "handle".

Again, only a suggestion.

> Thanks again for all your work on this.

You are welcome.