Re: [Gen-art] Gen-ART review of draft-ietf-dime-overload-reqs-10

Ben Campbell <ben@nostrum.com> Thu, 22 August 2013 20:50 UTC

To: "Black, David" <david.black@emc.com>
Cc: "bclaise@cisco.com" <bclaise@cisco.com>, "General Area Review Team (gen-art@ietf.org)" <gen-art@ietf.org>, "dime@ietf.org" <dime@ietf.org>, "ietf@ietf.org" <ietf@ietf.org>, Eric McMurry <emcmurry@computer.org>

Hi David,

We agree on all your points, and will make the updates in the next version, pending shepherd instructions.

Thanks!

Ben.

On Aug 22, 2013, at 2:50 PM, "Black, David" <david.black@emc.com> wrote:

> Hi Eric,
> 
> This looks good - comments follow ...
> 
>>> a) I assume that overload control development work will derive more specific
>>> security requirements - e.g., as REQ 27 is stated at a rather high level.
>>> The discussion in security considerations section seems reasonable.
>> 
>> We agree with this.  The thinking here was that we didn't want to specify this
>> in a way that would be specific to a particular type of mechanism.  It might
>> not hurt to state that assumption, either as a note on Req 27 or in the sec
>> considerations.
> 
> That would be good to add as a note on REQ 27.
> 
>> The intent was very much as you say, where requirements on individual node
>> capabilities are hoped to result in better overall system behaviors. There are
>> also some requirements that are stated more at the system level (e.g. 7 and
>> 17.) Also the text in section 2.2 that discusses Figure 5 talks about how
>> insufficient server capacity at a cluster of servers behind a Diameter agent
>> can be treated as if the agent itself was overloaded.
>> 
>> On the other hand, any mechanism we design will have to focus on actions of
>> individual nodes, so the numbered requirements tend to focus on that. I'm not
>> sure where to change the balance here--do you have specific suggestions?
> 
> I noted this as editorial rather than a minor issue, as I was mostly concerned
> that the actual design work will be informed by a sufficient architectural "clue"
> that the goal is "better overall system behaviors", which your response indicates
> will definitely be the case ;-).
> 
> Rather than edit individual requirements, how about adding the following sentence
> immediately following the introductory sentence in Section 7?:
> 
> 	These requirements are stated primarily in terms of individual node
> 	behavior to inform the design of the improved mechanism;
> 	that design effort should keep in mind that the overall goal is
> 	improved overall system behavior across all the nodes involved, 
> 	not just improved behavior from specific individual nodes.
> 
>>> This inadequacy may, in turn, contribute to broader congestion collapse
>>> 
>>> "collapse" is not the right word here - I suggest "issues", "impacts",
>>> "effects" or "problems".
>> 
>> We are fine with any of those alternatives.  How about "impacts"?
> 
> That's fine.  FWIW, "congestion collapse" has a specific (rather severe)
> meaning over in the Transport Area, and that meaning was not intended here.
> 
>> 23.843 is the least stable reference.  I don't have any issue with pointing
>> that out.  The part of it we are referencing is historical front matter
>> though.
> 
> I'd note the reference as work in progress, and put the statement about stable
> front matter (historical is a bad word to use here) in the body of the draft
> that cites the reference.
> 
>> I tried the web and downloaded versions of 2.12.17 and was not able to get the
>> warnings you saw (about the references).  What did it say?
> 
> Sorry, I didn't mean to send you on a wild goose chase :-).  The idnits confusion
> manifested right at the top of the output, where everyone ignores it ...
> 
>   Attempted to download rfc272 state...
>   Failure fetching the file, proceeding without it.
> 
> You didn't reference RFC 272, so that output's apparently courtesy of idnits
> misinterpreting this reference:
> 
> 1195	   [TS29.272]
> 1196	              3GPP, "Evolved Packet System (EPS); Mobility Management
> 1197	              Entity (MME) and Serving GPRS Support Node (SGSN) related
> 1198	              interfaces based on Diameter protocol", TS 29.272 11.4.0,
> 1199	              September 2012.
> 
> I was amused :-).
> 
> Thanks,
> --David
> 
>> -----Original Message-----
>> From: Eric McMurry [mailto:emcmurry@computer.org]
>> Sent: Thursday, August 22, 2013 3:06 PM
>> To: Black, David
>> Cc: ben@nostrum.com; General Area Review Team (gen-art@ietf.org);
>> ietf@ietf.org; dime@ietf.org; bclaise@cisco.com
>> Subject: Re: Gen-ART review of draft-ietf-dime-overload-reqs-10
>> 
>> Hi David,
>> 
>> Thank you for the review.  Your time and comments are appreciated!
>> 
>> comments/questions inline.
>> 
>> 
>> Eric
>> 
>> 
>> 
>> On Aug 17, 2013, at 9:18 , "Black, David" <david.black@emc.com> wrote:
>> 
>>> 
>>> I am the assigned Gen-ART reviewer for this draft. For background on
>>> Gen-ART, please see the FAQ at
>>> 
>>> <http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.
>>> 
>>> Please resolve these comments along with any other Last Call comments
>>> you may receive.
>>> 
>>> Document: draft-ietf-dime-overload-reqs-10
>>> Reviewer: David L. Black
>>> Review Date: August 17, 2013
>>> IETF LC End Date: August 16, 2013
>>> IESG Telechat date: (if known)
>>> 
>>> Summary:
>>> This draft is basically ready for publication, but has nits that should be
>>> fixed before publication.
>>> 
>>> This draft describes scenarios in which Diameter overload can occur and provides
>>> requirements for development of new overload control functionality in Diameter.
>>> It is well written, and the inclusion of scenarios in which overload can occur,
>>> both in terms of the relationships among types of Diameter nodes and actual mobile
>>> network experience is very helpful.
>>> 
>>> I apologize for this review being a day late, as I've been on vacation for most
>>> of this draft's IETF Last Call period.
>>> 
>>> Major issues: (none)
>>> 
>>> Minor issues: (none)
>>> 
>>> Nits/editorial comments:
>>> 
>>> The following two comments could be minor issues, but I'm going to treat them
>>> as editorial, as I expect that they will be addressed in development of the
>>> actual overload functionality:
>>> 
>>> a) I assume that overload control development work will derive more specific
>>> security requirements - e.g., as REQ 27 is stated at a rather high level.
>>> The discussion in security considerations section seems reasonable.
>> 
>> We agree with this.  The thinking here was that we didn't want to specify this
>> in a way that would be specific to a particular type of mechanism.  It might
>> not hurt to state that assumption, either as a note on Req 27 or in the sec
>> considerations.
>> 
>>> 
>>> b) The draft, and especially its requirements in Section 7 are strongly
>>> focused on individual Diameter node overload.  That's necessary, but overload
>>> conditions can be broader, affecting an entire service or application, or
>>> multiple instances of either/both, even if not every individual Diameter node
>>> involved is overloaded.  A number of the requirements, starting with REQ 22
>>> could be generalized to cover broader overload conditions.
>>> 
>>> This (b) has implications for other requirements, e.g., REQ 13 should also be
>>> generalized beyond a single node to avoid increased traffic in an overload
>>> situation, even from a node that is not overloaded by itself.  There are limits
>>> on what is reasonable here, as the desired overload functionality is a
>>> TCP/SCTP-like reaction to congestion, where individual actions taken by nodes
>>> based on the information they have (which is not the complete state of the
>>> network) result in an overall reduction of load.
>> 
>> The intent was very much as you say, where requirements on individual node
>> capabilities are hoped to result in better overall system behaviors. There are
>> also some requirements that are stated more at the system level (e.g. 7 and
>> 17.) Also the text in section 2.2 that discusses Figure 5 talks about how
>> insufficient server capacity at a cluster of servers behind a Diameter agent
>> can be treated as if the agent itself was overloaded.
>> 
>> On the other hand, any mechanism we design will have to focus on actions of
>> individual nodes, so the numbered requirements tend to focus on that. I'm not
>> sure where to change the balance here--do you have specific suggestions?
>> 
>>> 
>>> Section 1.2, 2nd paragraph:
>>> 
>>>  as network congestion, network congestion can reduce a Diameter nodes
>>> 
>>> "nodes" -> "node's"
>> 
>> good catch.
>> 
>>> 
>>> Section 5, 1st paragraph:
>>> 
>>> This inadequacy may, in turn, contribute to broader congestion collapse
>>> 
>>> "collapse" is not the right word here - I suggest "issues", "impacts",
>>> "effects" or "problems".
>> 
>> We are fine with any of those alternatives.  How about "impacts"?
>> 
>>> 
>>> Section 7
>>> 
>>> The long enumerated list of requirements is not an easy read.  It would be
>>> better if these could somehow be grouped by functional category, e.g.,
>>> security, transport interactions, operational/administrative, etc.
>> 
>> Agree.  The list is actually divided into sections in the XML (denoted by
>> comments); we just did not promote those to visible sections in the txt.  I
>> recall there being some issue with xml2rfc and numbering, but now that the
>> numbers are set, this would not be hard to do.
>> 
>> 
>>> 
>>> idnits 2.12.17 noticed the non-standard RFC 2119 boilerplate - this is fine,
>>> as the boilerplate has been appropriately modified for this draft that
>>> expresses requirements (as opposed to a draft that specifies a protocol).
>>> 
>>> idnits 2.12.17 got confused by the 3GPP and GSMA Informative References.
>>> I assume that they're all sufficiently stable to be informative references.
>>> However, [TR23.843] is a work in progress, and should be noted as such in
>>> its reference - is this needed for any of the other 3GPP or GSMA references?
>> 
>> 23.843 is the least stable reference.  I don't have any issue with pointing
>> that out.  The part of it we are referencing is historical front matter
>> though.
>> 
>> 
>> I tried the web and downloaded versions of 2.12.17 and was not able to get the
>> warnings you saw (about the references).  What did it say?
>> 
>> 
>>> 
>>> Thanks,
>>> --David
>>> ----------------------------------------------------
>>> David L. Black, Distinguished Engineer
>>> EMC Corporation, 176 South St., Hopkinton, MA  01748
>>> +1 (508) 293-7953             FAX: +1 (508) 293-7786
>>> david.black@emc.com        Mobile: +1 (978) 394-7754
>>> ----------------------------------------------------
>>> 
>> 
>