Re: [Dime] Conclusion for Sequence Numbers

Nirav,

The issue with considering only the last report is that it introduces
the potential for thrashing between different values.  This could make
the overload event even worse.

I agree that this should be a rare case.  I still think, however, that
we need to have defined behavior.

One other argument for including the sender of the overload report in
the report itself is security.  The ability of a bad actor to insert a
malicious overload report can make for a very effective DoS attack.  I
know we have said we aren't addressing security yet, but this seems
pretty short-sighted.  Being able to establish the identity of the
sender of an overload report will be an important part of the security
solution.  We should take this step in that direction.
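
To make the "first reporter is the authority" rule from earlier in this
thread concrete, here is a minimal sketch in C.  Everything in it is an
assumption for illustration and not text from the draft: the struct,
the function name, the use of a plain string for the reporter identity
(standing in for the proposed "Report Generator Diameter ID"), and the
convention that a reduction of 0 means the overload event has ended.
Report expiry by duration is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAX_ID_LEN 128

/* Hypothetical per-realm state kept by a reacting node. */
struct realm_overload_state {
    bool active;                 /* an overload event is in progress */
    char authority[MAX_ID_LEN];  /* identity of the first reporter */
    unsigned reduction_percent;  /* reduction currently applied */
};

/*
 * Apply a realm report.  Returns true if the report was accepted.
 * Assumption: reduction_percent == 0 signals the end of the event.
 */
bool apply_realm_report(struct realm_overload_state *st,
                        const char *reporter_id,
                        unsigned reduction_percent)
{
    assert(st != NULL && reporter_id != NULL);

    if (!st->active) {
        if (reduction_percent == 0)
            return false;  /* no event in progress, nothing to end */
        /* First report of a new overload event: remember its sender. */
        st->active = true;
        snprintf(st->authority, sizeof st->authority, "%s", reporter_id);
        st->reduction_percent = reduction_percent;
        return true;
    }

    /* Ignore reports from anyone other than the remembered sender. */
    if (strcmp(st->authority, reporter_id) != 0)
        return false;

    if (reduction_percent == 0) {
        /* Event is over: forget the authority for the next event. */
        st->active = false;
        st->authority[0] = '\0';
    }
    st->reduction_percent = reduction_percent;
    return true;
}
```

With this rule a conflicting report from a second agent is dropped
rather than applied, so the reduction level cannot flip back and forth.
Note that the comparison is only possible if the report carries the
sender's identity.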

Steve

On 12/12/13 10:26 AM, Nirav Salot (nsalot) wrote:
>
> Steve,
>
>  
>
> So as I understand it, it is not a common case for different agents
> to provide different views of the same realm, and this may happen
> only during a small window when synchronization has not yet taken
> place between the geographically distributed agents.
>
> Right?
>
>  
>
> If so, I can understand the following part of your proposal.
>
> One proposal for how we deal with the fact that different reports can
> have different values is to have the reacting node treat the first
> reporting node as the authority for reporting realm overload state for
> that overload instance.
>
>  
>
> I.e., I can understand defining some behavior for the reacting node
> to handle the case (which is a rare case anyway) when two agents
> provide different realm-reports for the same realm.
>
> The behavior could be simply to consider only the last report when two
> agents have sent two different reports of the same realm. (And this
> will also work when the same agent has sent two different
> realm-reports, purposefully -- e.g. due to the change in the realm
> overload).
>
> But this still does not require adding the agent's identity to the
> overload-report.
>
>  
>
> Regards,
>
> Nirav.
>
>  
>
> *From:*Steve Donovan [mailto:srdonovan@usdonovans.com]
> *Sent:* Thursday, December 12, 2013 7:50 PM
> *To:* Nirav Salot (nsalot); Jouni Korhonen
> *Cc:* Ben Campbell; dime@ietf.org list
> *Subject:* Re: [Dime] Conclusion for Sequence Numbers - was Re: OVLI:
> comments to 4.3
>
>  
>
> Nirav,
>
> See inline.
>
> Steve
>
> On 12/12/13 6:40 AM, Nirav Salot (nsalot) wrote:
>
>     All,
>
>      
>
>     I do not understand this discussion regarding different agents of the same realm having different views of the realm and providing different overload reports.
>
> We can make the statement that all senders of realm reports should
> send the same report.  This does not guarantee that it will always
> happen.  If agents are sending the report, they are generally
> distributed elements.  In very large networks, this distribution can
> span continents.  There will be a lag in the "synchronization" of the
> realm overload information.
>
> My concern is that we have well-defined behavior for when a reactor
> receives conflicting realm reports.  We need to avoid thrashing
> between different reduction levels, which could make the overload
> situation worse.
>
>  
> Additionally, I also do not understand the proposal of adding the identity of the agent generating the "realm report" into the report.
>
> Adding the endpoint identity is needed to allow the reacting node to
> know that it is receiving two different views of Realm overload from
> two different reporting end-points.
>
>  
> What is the use of this identity at the reacting node when the report is realm report? Why should the reacting node care who generated the realm report?
>
> One proposal for how we deal with the fact that different reports can
> have different values is to have the reacting node treat the first
> reporting node as the authority for reporting realm overload state for
> that overload instance.  In this case, the reacting node would ignore
> reports received from other reporting nodes.  Ignoring reports from
> non-authoritative endpoints requires the reacting node to know which
> endpoints sent which reports.
>
>  
>  
> Regards,
> Nirav.
>  
> -----Original Message-----
> From: DiME [mailto:dime-bounces@ietf.org] On Behalf Of Jouni Korhonen
> Sent: Thursday, December 12, 2013 5:06 PM
> To: Steve Donovan
> Cc: Ben Campbell; dime@ietf.org list
> Subject: Re: [Dime] Conclusion for Sequence Numbers - was Re: OVLI: comments to 4.3
>  
> Steve,
>  
> On Dec 11, 2013, at 3:13 PM, Steve Donovan <srdonovan@usdonovans.com> wrote:
>  
>
>     Jouni,
>
>      
>
>     We need the sequence number to be strictly increasing.  I don't see the need for it to increase in uniform amounts.  Using time does fit these requirements.  I'm ok with using time as long as we don't call the AVP timestamp.
>
>      
>
>     Ulrich does bring up an interesting use case, where a client is receiving realm reports for the same realm from different agents.  We need to define the client's behavior in this case.  
>
>  
> Any suggestions? I mean agents may have hugely different views of the realm if they are acting on their own.
>  
>
>     Presumably the client needs to be able to determine who generated the realm report.  This cannot be determined based on the content of the message or the connection on which the message arrived.  It seems like we might need "Report Generator Diameter ID" in the overload report specifically for Realm reports.  
>
>      
>
>     Once the client is able to differentiate between realm reports sent by different agents (or servers) we need logic defining how the client deals with a new overload report.  
>
>  
> I now need to check one of the basic assumptions on DOIC so that we have the same understanding. I went back to the endpoint text in Section 5.1. There, for example in Figures
> 4 and 5, the DOIC association and the endpoint assumption do not work IMHO because we have no endpoint identity in the OLR. For the endpoint assumption to work (as I drew it on the white board in Porto), it would require as many Diameter level sessions as there are DOIC associations.
>  
> So.. have assumptions shifted in the meantime and I have just not paid attention?
>  
>
>     I see a couple of options (others will probably see options I am missing):
>
>      
>
>     - Use the last received realm report - This introduces the possibility of thrashing between two different reduction values and different durations.  Note that this approach does not require the source of the report to be included in the report.
>
>      
>
>     - Only listen to one source of realm overload - The approach would be to remember who sent the first overload report from the realm and ignore realm overload reports from other sources.  This behavior would likely be constrained to a single occurrence of realm overload.  Meaning that the "memory" of the report source would only last as long as that overload event persists.  Once the overload event goes away, the report source would be forgotten and a new source could be used for the next occurrence.
>
>      
>
>     On the surface, the second approach looks better to me.
>
>  
> Or add the identity of the OLR originator explicitly if it cannot be determined implicitly (i.e. from the Diameter message's Origin-Host/Realm).
>  
> Or assume the endpoint really is the endpoint in DOIC and Diameter session sense.
>  
> - Jouni
>  
>
>      
>
>     Steve
>
>      
>
>     On 12/11/13 2:15 AM, Jouni wrote:
>
>         Ulrich,
>
>          
>
>         I might be slow but.. Section 4.4 says
>
>          
>
>            control endpoints.  The sequence number is only required to be unique
>
>            between two overload control endpoints and does not need to be
>
>          
>
>         Unique between two endpoints..
>
>          
>
>         Section 5.1 talks about endpoints:
>
>          
>
>            of an arbitrary Diameter network.  The overload control information
>
>            is exchanged over on a "DOIC association" between two communication
>
>            endpoints.  The endpoints, namely the "reacting node" and the
>
>            "reporting node" do not need to be adjacent Diameter peer nodes, 
>
>         nor
>
>          
>
>         So if your agents inject realm reports, they need to be endpoints to 
>
>         the client. Similar to Figure 5. Therefore the sequence number spaces 
>
>         between
>
>         C-A1 and C-A2 are separate.
>
>          
>
>         Now it is not clear to me, whether in your reasoning the C would see 
>
>         the server identity (as the endpoint) when there is an active "DEP 
>
>         agent" on the path. That would clearly not work and would not be aligned with 
>
>         the endpoint assumption.
>
>          
>
>         Note that at some point of time we had (at least on a discussion 
>
>         level in f2f meeting) report originator identity in the OLR. That 
>
>         would make endpoint identification trivial. Now a "DEP agent" needs 
>
>         to act as a "server" for its clients in order to appear as an endpoint.
>
>          
>
>         - Jouni
>
>          
>
>         ps: still think the use of Time is simpler..
>
>          
>
>          
>
>         On Dec 11, 2013, at 9:43 AM, Wiehe, Ulrich (NSN - DE/Munich) wrote:
>
>          
>
>          
>
>             That's not predictable. It may be the same server in some cases, and different servers in other cases.
>
>              
>
>             -----Original Message-----
>
>             From: ext Jouni [mailto:jouni.nospam@gmail.com]
>
>             Sent: Wednesday, December 11, 2013 8:38 AM
>
>             To: Wiehe, Ulrich (NSN - DE/Munich)
>
>             Cc: Ben Campbell; dime@ietf.org list; Steve Donovan
>
>             Subject: Re: [Dime] Conclusion for Sequence Numbers - was Re: OVLI: 
>
>             comments to 4.3
>
>              
>
>              
>
>             Ulrich,
>
>              
>
>             On Dec 11, 2013, at 9:21 AM, Wiehe, Ulrich (NSN - DE/Munich) wrote:
>
>              
>
>              
>
>                 Jouni,
>
>                  
>
>                 ad 1. "monotonically" does not express your intention. What we are looking for may be "stepwise with fixed step".
>
>                  
>
>                 Ad 2. It is not necessarily a mistake that could result in out-of-sequence sequence numbers. When a client C sends a realm-type request towards any server in the realm, an agent A1 that selects the server would send back the realm-type OLR with sequence number s1. The next realm-type request sent by C (that survived the throttling) may take a path that does not include A1 but A2. A2 then selects the server and sends back a sequence number s2. Nothing ensures that s1 and s2 are in sequence.
>
>                  
>
>             Would the server in both cases (via A1 and A2) be the same?
>
>              
>
>             - Jouni
>
>              
>
>              
>
>              
>
>                 Ulrich
>
>                  
>
>                  
>
>                 -----Original Message-----
>
>                 From: ext Jouni Korhonen [mailto:jouni.nospam@gmail.com]
>
>                 Sent: Tuesday, December 10, 2013 10:31 PM
>
>                 To: Wiehe, Ulrich (NSN - DE/Munich)
>
>                 Cc: Ben Campbell; dime@ietf.org list; Steve Donovan
>
>                 Subject: Re: [Dime] Conclusion for Sequence Numbers - was Re: OVLI: 
>
>                 comments to 4.3
>
>                  
>
>                 Ulrich,
>
>                  
>
>                 On Dec 10, 2013, at 4:31 PM, "Wiehe, Ulrich (NSN - DE/Munich)" <ulrich.wiehe@nsn.com> wrote:
>
>                  
>
>                  
>
>                     Jouni,
>
>                      
>
>                     1. I find the texts
>
>                     a) "The sequence number ... does not need to be monotonically increasing"
>
>                     and
>
>                      
>
>                 Means the delta from old-seqno to new-seqno can be any non-negative 
>
>                 integer (within the given limits) not something fixed step/delta 
>
>                 (like +1). As long as "new-seqno >= old-seqno" holds we are fine.
>
>                  
>
>                  
>
>                     b) "...the new sequence number MUST be greater than or equal to the old sequence number..."
>
>                     contradicting.
>
>                     Can you please clarify.
>
>                      
>
>                 See above. (mind the overflow case)
>
>                  
>
>                  
>
>                     2. The expected behaviour when receiving an out-of-sequence sequence number within OC-OLR is described in 4.3:
>
>                     "The receiver SHOULD discard an OC-OLR AVP with a sequence number that is less than previously received one."
>
>                     I don't find this very robust. Once a higher sequence number (received erroneously by mistake) is accepted you cannot (easily) recover.
>
>                      
>
>                 I find it more robust in a sense that I should not care about stale old information.
>
>                 However, since we are piggybacking (by popular demand) we have 
>
>                 little room for seqno re-sync negotiation.
>
>                  
>
>                 What is the mistake you refer here? A misbehaving implementation? 
>
>                 In that case, it deserves to get a manual intervention once figured 
>
>                 out by admins checking alarms and logs. If the mistake is due to other 
>
>                 things, like endpoints being out of sync, we currently have no written down mechanism to survive that.
>
>                  
>
>                  
>
>                     3. The expected behaviour when receiving an out-of-sequence sequence number within the OC-Supported-Features AVP is not described. What is the intention here?
>
>                      
>
>                 No intention. Just a sloppy specification. You are right that 
>
>                 something needs to be done & clarified here. (again the semantics 
>
>                 of Time would be nice..)
>
>                  
>
>                 I'll propose something. Others should too ;)
>
>                  
>
>                 - Jouni
>
>                  
>
>                  
>
>                     Ulrich
>
>                      
>
>                     -----Original Message-----
>
>                     From: DiME [mailto:dime-bounces@ietf.org] On Behalf Of ext Jouni Korhonen
>
>                     Sent: Tuesday, December 10, 2013 8:28 AM
>
>                     To: Ben Campbell; dime@ietf.org list; Steve Donovan
>
>                     Subject: Re: [Dime] Conclusion for Sequence Numbers - was Re: 
>
>                     OVLI: comments to 4.3
>
>                      
>
>                      
>
>                     Fine.. let's define then the sequence number semantics. Basic 
>
>                     unsigned integer math. The text proposal is the following:
>
>                      
>
>                     4.4.  OC-Sequence-Number AVP
>
>                      
>
>                     The OC-Sequence-Number AVP (AVP code TBD3) is of type Unsigned64.
>
>                     Its usage in the context of the overload control is described in 
>
>                     Sections 4.1 and 4.3.
>
>                      
>
>                     From the functionality point of view, the OC-Sequence-Number AVP 
>
>                     MUST be used as a non-volatile increasing counter between two 
>
>                     overload control endpoints.  The sequence number is only required 
>
>                     to be unique between two overload control endpoints and does not 
>
>                     need to be monotonically increasing.
>
>                      
>
>                     When comparing two sequence numbers, the new sequence number MUST 
>
>                     be greater than or equal to the old sequence number within a window 
>
>                     that is half of the size of the maximum sequence number. This 
>
>                     allows a simple handling of the sequence number overflow using 
>
>                     unsigned integer arithmetic:
>
>                      
>
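>                       #include <stdbool.h>
>                       #include <stdint.h>
>
>                       /* Half of the 64-bit sequence number space. */
>                       #define WINDOW 0x8000000000000000ULL
>
>                       bool verify_seqnum( uint64_t newsn, uint64_t oldsn ) {
>                           /* Unsigned subtraction wraps modulo 2^64, so overflow is handled. */
>                           if (newsn - oldsn <= WINDOW) {
>                               // newsn >= oldsn
>                               return true;
>                           } else {
>                               // outside window or newsn < oldsn
>                               return false;
>                           }
>                       }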
>
>                      
>
>                      
>
>                      
>
>                     The above should even work if someone shovels NTP times into 
>
>                     sequence numbers with a blind typecasting.
>
>                      
>
>                     - Jouni
>
>                      
>
>                     On Dec 10, 2013, at 12:34 AM, Ben Campbell <ben@nostrum.com> wrote:
>
>                      
>
>                      
>
>                         On Dec 9, 2013, at 10:00 AM, Steve Donovan <srdonovan@usdonovans.com> wrote:
>
>                          
>
>                          
>
>                             Jouni,
>
>                              
>
>                             I propose that we keep the name OC-Sequence-Number but that we use the Time type for OC-Sequence-Number.  It is misleading and potentially confusing to call it OC-Time-Stamp.  
>
>                              
>
>                              
>
>                         I could live with that, although I would rather just define the expected properties of the sequence number, and leave the implementation up to the implementor. I assume your reasoning for not calling it a timestamp is that you do not want people to try to use it as a time base reference. If so, then we don't require any connection to a clock. We just need it to be monotonically increasing.
>
>                          
>
>                          
>
>                             We might consider expanding on the format of the AVP to make it something like Session-ID, where it is a concatenation of the Diameter-ID of the generating node and a timestamp.  This might help the reacting node keep track of which sequence number it has received.
>
>                              
>
>                              
>
>                         Do we need a uniqueness across multiple nodes property? If so, why?
>
>                          
>
>                          
>
>                             Steve
>
>                              
>
>                             On 12/9/13 5:37 AM, Jouni Korhonen wrote:
>
>                              
>
>                                 Folks
>
>                                  
>
>                                 Could we conclude on the sequence number vs. time stamp vs. something else?
>
>                                 We got more important places to spend our energy than this ;)
>
>                                  
>
>                                 My proposal is the following (based on the original pre-00 design):
>
>                                  
>
>                                 o We change the OC-Sequence-Number to OC-Time-Stamp in all occurrences
>
>                                 in the -01.
>
>                                 o We use RFC6733 Time type for the OC-Time-Stamp. RFC6733 gives us
>
>                                 already exact definition how to handle the AVP.
>
>                                 o Define that the OC-Time-Stamp is the time of the creation of the 
>
>                                 "original" AVP within whose context the time stamp is present.
>
>                                 o The OC-Time-Stamp AVP uniqueness is still considered to be in scope
>
>                                 of the communicating endpoints.
>
>                                 o The time stamp can be used to quickly determine if the content of
>
>                                 the encapsulating AVP context has changed (among other properties).
>
>                                 This would be useful specifically in the future when the encapsulating
>
>                                 grouped AVPs  grow in size and functionality.
>
>                                  
>
>                                  
>
>                                 - Jouni
>
>                                  
>
>                                 _______________________________________________
>
>                                 DiME mailing list
>
>                                  
>
>                                  
>
>                                 DiME@ietf.org <mailto:DiME@ietf.org>
>
>                                 https://www.ietf.org/mailman/listinfo/dime
>
>                                  
>
>                                  
>
>                                  
>
>                                  
>
>                                  
>