[Dime] Mail regarding draft-ietf-dime-rfc3588bis - failover (handling not delivered/non-acknowledged STR) and session information

Wojciech Szczypta <wojciech.szczypta@motorolasolutions.com> Thu, 17 January 2019 16:57 UTC

From: Wojciech Szczypta <wojciech.szczypta@motorolasolutions.com>
Date: Thu, 17 Jan 2019 17:57:20 +0100
Message-ID: <CACP0BpouvmXnjzysGoqKzQxOughvtyRrQdU5+EmF1-jrOrjF8g@mail.gmail.com>
To: dime@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/dime/379ooJyr96PiXM7pct19YgIUSqA>

Hello,

I have a question about interpreting the failover procedure described in
RFC 6733.
We believe that a Diameter stack implementation that removes session
information after sending an STR, but before receiving the acknowledgment,
does not follow RFC 6733.

Scenario: a peer established a connection with a Diameter server (using
SCTP) and activated a user session on that server.
After some time the peer has to terminate the session, so it sends an STR
to the Diameter server. At the moment the peer sends the message, a network
failure occurs, so the STR never reaches the server (and the server never
acknowledges it).

Our application is built on top of the Diameter stack.
When the application sends the message to the Diameter stack, the connection
is up.
According to the Diameter stack, the link is still operational at the moment
the message is passed down to the SCTP stack.
However, the STR never reaches the destination Diameter server because of
the network failure. In fact, we cannot even see the STR on the network
interface.
We suspect a race condition: when the SCTP stack receives the STR from the
Diameter stack, SCTP already knows that there is a problem with the
connection.
In the network capture we see the peer retrying to send a DWR, and the
Diameter server retrying both the SACK for the previous DWR and the DWA
answering it.

The main problem is that, because the Diameter stack "thinks" the message
was sent successfully to the server, it removes the session information
without waiting for the answer.
Since the session information has been removed by the Diameter stack, the
application has no way to request retransmission of the STR after the
connection is re-established.
Any further AAR or STR requests are ignored by the Diameter stack.

The vendor providing the Diameter stack claims that their RFC 6733
implementation is correct and that it is up to the SCTP stack to take care
of retransmission.
However, the problem is not the non-delivery of the STR itself but the
inability to request retransmission of the STR, which makes this
communication unreliable.
We believe the Diameter stack implementation does not follow the RFC;
however, we also find that RFC 6733 is not clear about what should happen
when the STR is not delivered or acknowledged.

Section 5.5.4, "Failover and Failback Procedures", of RFC 6733 focuses on
using an alternate path (if possible), but it reads more like a
recommendation than a hard requirement.
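
To make the expectation concrete, here is a minimal sketch of the kind of
pending-request handling we have in mind. It is an illustration only, not
the vendor's code and not text from the RFC: the class and method names
(Request, PendingQueue, on_sent, on_answer, on_failover) are hypothetical.
The only protocol details assumed are the command flag bits, R = 0x80 and
T = 0x10 (potentially retransmitted message).

```python
from collections import OrderedDict
from dataclasses import dataclass

FLAG_R = 0x80  # command flag: message is a request
FLAG_T = 0x10  # command flag: potentially retransmitted message


@dataclass
class Request:
    hop_by_hop_id: int
    command: str       # e.g. "STR"
    session_id: str
    flags: int = FLAG_R


class PendingQueue:
    """Hypothetical per-peer queue of requests sent but not yet answered."""

    def __init__(self):
        self._pending = OrderedDict()

    def on_sent(self, req):
        # Handing a message to the transport is not delivery: keep a copy.
        self._pending[req.hop_by_hop_id] = req

    def on_answer(self, hop_by_hop_id):
        # Only the matching answer removes a request from the queue.
        return self._pending.pop(hop_by_hop_id, None)

    def on_failover(self, send):
        # Resend every unanswered request, oldest first, with the T flag set.
        for req in list(self._pending.values()):
            req.flags |= FLAG_T
            send(req)
```

With such a queue, an STR that was passed to SCTP but never answered would
still be available for retransmission once a connection is available again.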

There is a section in RFC 6733 which says:

   Session state (associated with a Session-Id) MUST be freed upon
   receipt of the Session-Termination-Request, Session-Termination-
   Answer, expiration of authorized service time in the Session-Timeout
   AVP, and according to rules established in a particular Diameter
   application.

In our scenario the session state was freed before the STA was received, so
our interpretation is that the Diameter stack implementation does not follow
this part of the RFC: it should not remove this information if it has not
received the STA.
Note: we are not using the Session-Timeout AVP, so we need to make sure that
the STR was delivered to and acknowledged by the Diameter server; otherwise
the dedicated bearer will not be removed.
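
As a rough illustration of the behaviour we would expect, the sketch below
keeps the session entry after the STR is sent and frees it only when the STA
arrives. It is written under our own assumptions: SessionTable and its
methods are hypothetical names, not the vendor's API, and the two states are
only loosely modelled on the Open and Discon states of the client session
state machine in RFC 6733.

```python
from enum import Enum


class State(Enum):
    OPEN = "Open"
    DISCON = "Discon"  # STR sent, STA not yet received


class SessionTable:
    """Hypothetical client-side session store; not any vendor's API."""

    def __init__(self, send):
        self._send = send      # callable that hands a message to the transport
        self._sessions = {}    # Session-Id -> State

    def open(self, session_id):
        self._sessions[session_id] = State.OPEN

    def terminate(self, session_id):
        if session_id not in self._sessions:
            raise KeyError(session_id)
        # Sending the STR only moves the session to Discon; state is kept.
        self._sessions[session_id] = State.DISCON
        self._send(("STR", session_id))

    def on_sta(self, session_id):
        # The STA is the event that frees the session state.
        self._sessions.pop(session_id, None)

    def on_reconnect(self):
        # A session still in Discon never got its STA, so send the STR again.
        for session_id, state in list(self._sessions.items()):
            if state is State.DISCON:
                self._send(("STR", session_id))

    def has_session(self, session_id):
        return session_id in self._sessions
```

In our failure scenario, terminate() would be called just before the link
drops, the entry would stay in Discon, and on_reconnect() would resend the
STR so that the server can release the dedicated bearer.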

Regards.

Wojciech Szczypta
Motorola Solutions Systems Polska
motorolasolutions.com