Re: [Dime] Conclusion for Sequence Numbers - was Re: OVLI: comments to 4.3

"Nirav Salot (nsalot)" <nsalot@cisco.com> Thu, 12 December 2013 16:26 UTC

From: "Nirav Salot (nsalot)" <nsalot@cisco.com>
To: Steve Donovan <srdonovan@usdonovans.com>, Jouni Korhonen <jouni.nospam@gmail.com>
Thread-Topic: [Dime] Conclusion for Sequence Numbers - was Re: OVLI: comments to 4.3
Thread-Index: AQHO90VQs8ahl6t/j0WtaMBOVpqI85pQvUrg
Date: Thu, 12 Dec 2013 16:26:01 +0000
Message-ID: <A9CA33BB78081F478946E4F34BF9AAA014D31B3C@xmb-rcd-x10.cisco.com>
References: <5BCBA1FC2B7F0B4C9D935572D90006681519DB1B@DEMUMBX014.nsn-intra.net> <C66C8914-AA7A-47F5-8EA4-7B0ECEDA5368@gmail.com> <52A5E902.20605@usdonovans.com> <7475B713-1104-4791-96B1-CE97632A0D69@nostrum.com> <B81C3281-95F9-4F28-8662-2E20A6AE96A1@gmail.com> <5BCBA1FC2B7F0B4C9D935572D90006681519E476@DEMUMBX014.nsn-intra.net> <1CD20507-B0FE-4367-804A-B831734CF060@gmail.com> <5BCBA1FC2B7F0B4C9D935572D90006681519E6DC@DEMUMBX014.nsn-intra.net> <F60A8AF3-C853-4E4A-A023-13E7238066D7@gmail.com> <5BCBA1FC2B7F0B4C9D935572D90006681519E712@DEMUMBX014.nsn-intra.net> <4A151D70-0291-4238-85B1-03BB54B361E6@gmail.com> <52A864FF.10705@usdonovans.com> <F5B6CD52-1FE4-49C5-B827-C04290FA5FAB@gmail.com> <A9CA33BB78081F478946E4F34BF9AAA014D31883@xmb-rcd-x10.cisco.com> <52A9C62A.6010207@usdonovans.com>
In-Reply-To: <52A9C62A.6010207@usdonovans.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
x-originating-ip: [10.65.48.72]
Content-Type: multipart/alternative; boundary="_000_A9CA33BB78081F478946E4F34BF9AAA014D31B3Cxmbrcdx10ciscoc_"
MIME-Version: 1.0
Cc: Ben Campbell <ben@nostrum.com>, "dime@ietf.org list" <dime@ietf.org>
Subject: Re: [Dime] Conclusion for Sequence Numbers - was Re: OVLI: comments to 4.3
X-BeenThere: dime@ietf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: Diameter Maintanence and Extentions Working Group <dime.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/dime>, <mailto:dime-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/dime/>
List-Post: <mailto:dime@ietf.org>
List-Help: <mailto:dime-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/dime>, <mailto:dime-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 12 Dec 2013 16:26:18 -0000

Steve,

So, as I understand it, it is not common for different agents to provide different views of the same realm; this may happen only during a small window when synchronization has not yet taken place between the geographically distributed agents.
Right?

If so, I can understand the following part of your proposal.
One proposal for how we deal with the fact that different reports can have different values is to have the reacting node treat the first reporting node as the authority for reporting realm overload state for that overload instance.

That is, I can understand defining some behavior for the reacting node to handle the case (which is a rare case anyway) in which two agents provide different realm reports for the same realm.
The behavior could simply be to consider only the last report when two agents have sent two different reports for the same realm. (This will also work when the same agent has purposefully sent two different realm reports, e.g. due to a change in the realm overload.)
But this still does not require adding the agent's identity to the overload report.
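
A minimal sketch of the "last report wins" behavior, assuming a hypothetical, simplified realm report that carries only a realm name, a reduction percentage and a validity duration. The struct and function names below are illustrative and are not taken from the draft. The point it shows is that the reacting node can replace its stored state without knowing which agent generated the report:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified realm overload report. Field names are
   illustrative assumptions, not definitions from the DOIC draft. */
struct realm_olr {
    char     realm[64];       /* realm the report applies to */
    uint32_t reduction_pct;   /* requested traffic reduction, 0..100 */
    uint32_t validity_secs;   /* how long the report stays valid */
};

/* Reacting-node state for one realm. */
struct realm_state {
    bool             active;   /* true while a reduction is in force */
    struct realm_olr current;  /* most recently applied report */
};

/* "Last report wins": any report for the tracked realm replaces the
   stored one, regardless of which agent generated it. Returns false
   (and changes nothing) if the report is for a different realm. */
bool apply_last_report(struct realm_state *st, const char *tracked_realm,
                       const struct realm_olr *olr)
{
    if (strcmp(olr->realm, tracked_realm) != 0)
        return false;
    st->current = *olr;
    st->active = (olr->reduction_pct > 0);
    return true;
}
```

This needs no reporter identity, but if two agents keep disagreeing, the stored reduction value alternates between their two views with every report.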

Regards,
Nirav.

From: Steve Donovan [mailto:srdonovan@usdonovans.com]
Sent: Thursday, December 12, 2013 7:50 PM
To: Nirav Salot (nsalot); Jouni Korhonen
Cc: Ben Campbell; dime@ietf.org list
Subject: Re: [Dime] Conclusion for Sequence Numbers - was Re: OVLI: comments to 4.3

Nirav,

See inline.

Steve
On 12/12/13 6:40 AM, Nirav Salot (nsalot) wrote:

All,

I do not understand this discussion regarding different agents of the same realm having different views of the realm and providing different overload reports.

We can make the statement that all senders of realm reports should send the same report.  This does not guarantee that it will always happen.  If agents are sending the report, they are generally distributed elements.  In very large networks, this distribution can span continents.  There will be a lag in the "synchronization" of the realm overload information.

My concern is that we have well-defined behavior for when a reactor receives conflicting realm reports.  We need to avoid thrashing between different reduction levels, which could make the overload situation worse.

Additionally, I also do not understand the proposal of adding the identity of the agent generating the "realm report" to the report.

Adding the endpoint identity is needed to allow the reacting node to know that it is receiving two different views of realm overload from two different reporting endpoints.

What is the use of this identity at the reacting node when the report is a realm report? Why should the reacting node care who generated the realm report?

One proposal for how we deal with the fact that different reports can have different values is to have the reacting node treat the first reporting node as the authority for reporting realm overload state for that overload instance.  In this case, the reacting node would ignore reports received from other reporting nodes.  Ignoring reports from non-authoritative endpoints requires the reacting node to know which endpoints sent which reports.

Regards,
Nirav.



-----Original Message-----
From: DiME [mailto:dime-bounces@ietf.org] On Behalf Of Jouni Korhonen
Sent: Thursday, December 12, 2013 5:06 PM
To: Steve Donovan
Cc: Ben Campbell; dime@ietf.org list
Subject: Re: [Dime] Conclusion for Sequence Numbers - was Re: OVLI: comments to 4.3



Steve,



On Dec 11, 2013, at 3:13 PM, Steve Donovan <srdonovan@usdonovans.com> wrote:



Jouni,



We need the sequence number to be strictly increasing.  I don't see the need for it to increase in uniform amounts.  Using time does fit these requirements.  I'm ok with using time as long as we don't call the AVP timestamp.



Ulrich does bring up an interesting use case, where a client is receiving realm reports for the same realm from different agents.  We need to define the client's behavior in this case.



Any suggestions? I mean agents may have hugely different view of the realm if they are acting on their own.



Presumably the client needs to be able to determine who generated the realm report.  This cannot be determined based on the content of the message or the connection on which the message arrived.  It seems like we might need a "Report Generator Diameter ID" in the overload report specifically for realm reports.



Once the client is able to differentiate between realm reports sent by different agents (or servers) we need logic defining how the client deals with a new overload report.



I now need to check one of the basic assumptions of DOIC so that we have the same understanding. I went back to the endpoint text in Section 5.1. There, for example in Figures 4 and 5, the DOIC association and the endpoint assumption do not work IMHO, because we have no endpoint identity in the OLR. For the endpoint assumption to work (as I drew it on the whiteboard in Porto), it would require as many Diameter-level sessions as there are DOIC associations.



So.. have the assumptions shifted in the meantime and I have just not paid attention?



I see a couple of options (others will probably see options I am missing):



- Use the last received realm report - This introduces the possibility of thrashing between two different reduction values and different durations.  Note that this approach does not require the source of the report to be included in the report.



- Only listen to one source of realm overload - The approach would be to remember who sent the first overload report from the realm and ignore realm overload reports from other sources.  This behavior would likely be constrained to a single occurrence of realm overload.  Meaning that the "memory" of the report source would only last as long as that overload event persists.  Once the overload event goes away, the report source would be forgotten and a new source could be used for the next occurrence.



On the surface, the second approach looks better to me.
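
A minimal sketch of the second option ("only listen to one source"), written under several assumptions: the report carries a reporter identity (the `reporter_id` field stands in for the suggested "Report Generator Diameter ID"), a reduction of 0 marks the end of the overload event, and expiry of the report duration is left out. None of these names come from the draft.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical realm report that carries its generator's identity. */
struct identified_olr {
    char     reporter_id[128]; /* assumed "Report Generator Diameter ID" */
    uint32_t reduction_pct;    /* assumption: 0 means the event has ended */
};

/* Reacting-node state for one realm. */
struct authority_state {
    bool     have_authority;    /* true while an overload event lasts */
    char     authority_id[128]; /* first reporter seen for this event */
    uint32_t reduction_pct;
};

/* Returns true if the report was applied, false if it was ignored. */
bool apply_first_source(struct authority_state *st,
                        const struct identified_olr *olr)
{
    if (!st->have_authority) {
        if (olr->reduction_pct == 0)
            return false;  /* no event in progress, nothing to start */
        /* First report of a new overload event: remember its source. */
        strncpy(st->authority_id, olr->reporter_id,
                sizeof st->authority_id - 1);
        st->authority_id[sizeof st->authority_id - 1] = '\0';
        st->have_authority = true;
        st->reduction_pct = olr->reduction_pct;
        return true;
    }
    if (strcmp(st->authority_id, olr->reporter_id) != 0)
        return false;      /* report from a non-authoritative source */
    st->reduction_pct = olr->reduction_pct;
    if (olr->reduction_pct == 0)
        st->have_authority = false;  /* event over: forget the source */
    return true;
}
```

Unlike "use the last received report", this cannot be implemented without some way to tell reporters apart, which is where the reporter identity comes in.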



Or add the identity of the OLR originator explicitly if it cannot be determined implicitly (i.e. from the Diameter message's Origin-Host/Realm).



Or assume the endpoint really is the endpoint in DOIC and Diameter session sense.



- Jouni





Steve



On 12/11/13 2:15 AM, Jouni wrote:

Ulrich,



I might be slow but.. Section 4.4 says



   control endpoints.  The sequence number is only required to be unique
   between two overload control endpoints and does not need to be



Unique between two endpoints..



Section 5.1 talks about endpoints:



   of an arbitrary Diameter network.  The overload control information
   is exchanged over on a "DOIC association" between two communication
   endpoints.  The endpoints, namely the "reacting node" and the
   "reporting node" do not need to be adjacent Diameter peer nodes, nor



So if your agents inject realm reports, they need to be endpoints to
the client. Similar to Figure 5. Therefore the sequence number spaces
between C-A1 and C-A2 are separate.
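
A rough sketch of what separate sequence number spaces could look like on the client side, assuming the client can somehow key its state by reporting endpoint (the `endpoint_id` string is a hypothetical identity, not something the OLR carries today). The comparison is the same half-window check as in the `verify_seqnum()` proposal quoted later in this thread:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_ENDPOINTS 8
#define SEQ_WINDOW 0x8000000000000000ULL

/* One sequence-number space per reporting endpoint, as seen by a client. */
struct endpoint_seq {
    bool     used;
    char     endpoint_id[128];  /* hypothetical endpoint identity */
    uint64_t last_seq;
};

struct seq_table {
    struct endpoint_seq slots[MAX_ENDPOINTS];
};

/* Accepts a report if its sequence number is not older than the last one
   seen from the SAME endpoint. Returns false for stale reports or when
   the table has no free slot for a new endpoint. */
bool accept_report(struct seq_table *t, const char *endpoint_id, uint64_t seq)
{
    struct endpoint_seq *free_slot = NULL;
    for (int i = 0; i < MAX_ENDPOINTS; i++) {
        struct endpoint_seq *e = &t->slots[i];
        if (e->used && strcmp(e->endpoint_id, endpoint_id) == 0) {
            if (seq - e->last_seq > SEQ_WINDOW)
                return false;   /* older than what we already have */
            e->last_seq = seq;
            return true;
        }
        if (!e->used && free_slot == NULL)
            free_slot = e;
    }
    if (free_slot == NULL)
        return false;
    free_slot->used = true;
    strncpy(free_slot->endpoint_id, endpoint_id,
            sizeof free_slot->endpoint_id - 1);
    free_slot->endpoint_id[sizeof free_slot->endpoint_id - 1] = '\0';
    free_slot->last_seq = seq;
    return true;
}
```

With this layout a low sequence number from A2 is never compared against a high one from A1, so s1 and s2 do not need to be in sequence with each other.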



Now it is not clear to me whether, in your reasoning, C would see
the server identity (as the endpoint) when there is an active "DEP
agent" on the path. That would clearly not work and would not be
aligned with the endpoint assumption.



Note that at some point of time we had (at least on a discussion
level in f2f meeting) report originator identity in the OLR. That
would make endpoint identification trivial. Now a "DEP agent" needs
to act as a "server" for its clients in order to appear as an endpoint.



- Jouni



ps: still think the use of Time is simpler..





On Dec 11, 2013, at 9:43 AM, Wiehe, Ulrich (NSN - DE/Munich) wrote:





That's not predictable. It may be the same server in some cases, and different servers in other cases.



-----Original Message-----
From: ext Jouni [mailto:jouni.nospam@gmail.com]
Sent: Wednesday, December 11, 2013 8:38 AM
To: Wiehe, Ulrich (NSN - DE/Munich)
Cc: Ben Campbell; dime@ietf.org list; Steve Donovan
Subject: Re: [Dime] Conclusion for Sequence Numbers - was Re: OVLI: comments to 4.3





Ulrich,



On Dec 11, 2013, at 9:21 AM, Wiehe, Ulrich (NSN - DE/Munich) wrote:





Jouni,



ad 1. "monotonically" does not express your intention. What we are looking for may be "stepwise with fixed step".



Ad 2. It is not necessarily a mistake that results in out-of-sequence sequence numbers. When a client C sends a realm-type request towards any server in the realm, an agent A1 that selects the server would send back the realm-type OLR with sequence number s1. The next realm-type request sent by C (that survived the throttling) may take a path that does not include A1 but A2. A2 then selects the server and sends back a sequence number s2. Nothing ensures that s1 and s2 are in sequence.



Would the server in both cases (via A1 and A2) be the same?



- Jouni







Ulrich





-----Original Message-----
From: ext Jouni Korhonen [mailto:jouni.nospam@gmail.com]
Sent: Tuesday, December 10, 2013 10:31 PM
To: Wiehe, Ulrich (NSN - DE/Munich)
Cc: Ben Campbell; dime@ietf.org list; Steve Donovan
Subject: Re: [Dime] Conclusion for Sequence Numbers - was Re: OVLI: comments to 4.3



Ulrich,



On Dec 10, 2013, at 4:31 PM, "Wiehe, Ulrich (NSN - DE/Munich)" <ulrich.wiehe@nsn.com> wrote:





Jouni,



1. I find the texts
a) "The sequence number ... does not need to be monotonically increasing"
and

It means the delta from old-seqno to new-seqno can be any non-negative
integer (within the given limits), not a fixed step/delta
(like +1). As long as "new-seqno >= old-seqno" holds we are fine.

b) "...the new sequence number MUST be greater or equal than the old sequence number..."
contradicting.
Can you please clarify.

See above. (mind the overflow case)





2. The expected behaviour when receiving an out-of-sequence sequence number within OC-OLR is described in 4.3:

"The receiver SHOULD discard an OC-OLR AVP with a sequence number that is less than previously received one."

I don't find this very robust. Once a higher sequence number (received erroneously by mistake) is accepted you cannot (easily) recover.



I find it more robust in the sense that I should not care about stale old information.
However, since we are piggybacking (by popular demand) we have
little room for seqno re-sync negotiation.

What is the mistake you refer to here? A misbehaving implementation?
In that case, it deserves manual intervention once figured
out by admins checking alarms and logs. If the mistake is due to other
things, like endpoints being out of sync, we currently have no written-down mechanism to survive that.





3. The expected behaviour when receiving an out-of-sequence sequence number within the OC-Supported-Features AVP is not described. What is the intention here?



No intention. Just a sloppy specification. You are right that
something needs to be done & clarified here. (again the semantics
of Time would be nice..)



I'll propose something. Others should too ;)



- Jouni





Ulrich



-----Original Message-----
From: DiME [mailto:dime-bounces@ietf.org] On Behalf Of ext Jouni Korhonen
Sent: Tuesday, December 10, 2013 8:28 AM
To: Ben Campbell; dime@ietf.org list; Steve Donovan
Subject: Re: [Dime] Conclusion for Sequence Numbers - was Re: OVLI: comments to 4.3





Fine.. let's then define the sequence number semantics. Basic
unsigned integer math. The text proposal is the following:



4.4.  OC-Sequence-Number AVP

The OC-Sequence-Number AVP (AVP code TBD3) is of type Unsigned64.
Its usage in the context of the overload control is described in
Sections 4.1 and 4.3.

From the functionality point of view, the OC-Sequence-Number AVP
MUST be used as a non-volatile increasing counter between two
overload control endpoints.  The sequence number is only required
to be unique between two overload control endpoints and does not
need to be monotonically increasing.

When comparing two sequence numbers, the new sequence number MUST
be greater or equal than the old sequence number within a window
that is half of the size of the maximum sequence number. This
allows a simple handling of the sequence number overflow using
unsigned integer arithmetic:



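  #include <stdbool.h>
  #include <stdint.h>

  #define WINDOW 0x8000000000000000ULL

  /* Unsigned subtraction wraps modulo 2^64, so the difference stays
     small when newsn is at or ahead of oldsn, even across overflow. */
  bool verify_seqnum( uint64_t newsn, uint64_t oldsn ) {
      if (newsn - oldsn <= WINDOW) {
          // newsn >= oldsn
          return true;
      } else {
          // outside window or newsn < oldsn
          return false;
      }
  }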







The above should even work if someone shovels NTP times into
sequence numbers with a blind typecasting.
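
A few worked cases for the window comparison, as a hedged sketch: `seq_not_older()` is just the same test written as a single expression, and `seq_from_seconds()` is a hypothetical helper standing in for the "blind typecasting" of a 32-bit seconds value into the Unsigned64 sequence number.

```c
#include <stdbool.h>
#include <stdint.h>

#define SEQ_WINDOW 0x8000000000000000ULL

/* Same comparison as the proposal: true when newsn is at or ahead of
   oldsn by at most half of the 64-bit sequence space. */
bool seq_not_older(uint64_t newsn, uint64_t oldsn)
{
    return newsn - oldsn <= SEQ_WINDOW;
}

/* Hypothetical helper: widen a 32-bit seconds value (for example the
   seconds part of an NTP timestamp) into a 64-bit sequence number. */
uint64_t seq_from_seconds(uint32_t seconds)
{
    return (uint64_t)seconds;
}

/* Worked cases:
     seq_not_older(5, 3)               -> true   (difference is 2)
     seq_not_older(5, 5)               -> true   (difference is 0)
     seq_not_older(3, 5)               -> false  (difference wraps to 2^64 - 2)
     seq_not_older(2, UINT64_MAX - 1)  -> true   (64-bit overflow, difference 4)
*/
```

One caveat with the typecast: a 32-bit seconds value that rolls over does not wrap in the 64-bit space, so a value just after the 32-bit rollover compares as older than a value just before it. Increasing times within one 32-bit era compare as expected.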



- Jouni



On Dec 10, 2013, at 12:34 AM, Ben Campbell <ben@nostrum.com> wrote:





On Dec 9, 2013, at 10:00 AM, Steve Donovan <srdonovan@usdonovans.com> wrote:





Jouni,



I propose that we keep the name OC-Sequence-Number but that we use the Time type for OC-Sequence-Number.  It is misleading and potentially confusing to call it OC-Time-Stamp.





I could live with that, although I would rather just define the expected properties of the sequence number, and leave the implementation up to the implementor. I assume your reasoning for not calling it a timestamp is that you do not want people to try to use it as a time base reference. If so, then we don't require any connection to a clock. We just need it to be monotonically increasing.





We might consider expanding on the format of the AVP to make it something like Session-ID, where it is a concatenation of the Diameter-ID of the generating node and a timestamp.  This might help the reacting node keep track of which sequence number it has received.





Do we need a uniqueness across multiple nodes property? If so, why?





Steve



On 12/9/13 5:37 AM, Jouni Korhonen wrote:



Folks



Could we conclude on the sequence number vs. time stamp vs. something else?
We have more important places to spend our energy than this ;)



My proposal is the following (based on the original pre-00 design):



o We change the OC-Sequence-Number to OC-Time-Stamp in all occurrences
  in the -01.
o We use the RFC 6733 Time type for the OC-Time-Stamp. RFC 6733 already
  gives us an exact definition of how to handle the AVP.
o Define that the OC-Time-Stamp is the time of the creation of the
  "original" AVP within whose context the time stamp is present.
o The OC-Time-Stamp AVP uniqueness is still considered to be in scope
  of the communicating endpoints.
o The time stamp can be used to quickly determine if the content of
  the encapsulating AVP context has changed (among other properties).
  This would be useful specifically in the future when the encapsulating
  grouped AVPs grow in size and functionality.





- Jouni



_______________________________________________
DiME mailing list
DiME@ietf.org
https://www.ietf.org/mailman/listinfo/dime