Re: [ippm] Discussion on extending TWAMP to monitor service KPIs and detect liveliness of an application

Srivathsa Sarangapani <srivathsas@juniper.net> Tue, 30 June 2015 04:45 UTC

From: Srivathsa Sarangapani <srivathsas@juniper.net>
To: "MORTON, ALFRED C (AL)" <acmorton@att.com>, Dave Taht <dave.taht@gmail.com>
Thread-Topic: [ippm] Discussion on extending TWAMP to monitor service KPIs and detect liveliness of an application
Thread-Index: AQHQogaXUDBkupL6lE+Yj7SC7dIQT52izRCAgAAeDYCAAZslgIAfp92AgAAuyvCAAJpEgA==
Date: Tue, 30 Jun 2015 04:44:42 +0000
Message-ID: <D1B814D9.313B9%srivathsas@juniper.net>
References: <D19BBBC4.2DBC4%srivathsas@juniper.net> <CAA93jw7UAn1=kz4U=7ZJsgWn3YupaY5=U+f3gTFwLwVxUfjR=Q@mail.gmail.com> <4AF73AA205019A4C8A1DDD32C034631D02EB7B2B2F@NJFPSRVEXG0.research.att.com> <D19CE2A0.2DEF4%srivathsas@juniper.net> <D1B773D6.312E8%srivathsas@juniper.net> <4AF73AA205019A4C8A1DDD32C034631D0662C6E367@NJFPSRVEXG0.research.att.com>
In-Reply-To: <4AF73AA205019A4C8A1DDD32C034631D0662C6E367@NJFPSRVEXG0.research.att.com>
Archived-At: <http://mailarchive.ietf.org/arch/msg/ippm/on2i6eRzlsVVCQim7b1OVzzv4CA>
Cc: Peyush Gupta <peyushg@juniper.net>, Srivathsa Sarangapani <srivathsas@juniper.net>, "ippm@ietf.org" <ippm@ietf.org>
Subject: Re: [ippm] Discussion on extending TWAMP to monitor service KPIs and detect liveliness of an application

Hi Al,

Thanks a lot for reading our draft.
Sorry if the explanation in the draft is not clear. Please see my answers
inline:

-- 
Regards,
Vathsa




-----Original Message-----
From: "MORTON, ALFRED C (AL)" <acmorton@att.com>
Date: Tuesday, June 30, 2015 at 6:49 AM
To: Srivathsa Sarangapani <srivathsas@juniper.net>, Dave Taht
<dave.taht@gmail.com>
Cc: "ippm@ietf.org" <ippm@ietf.org>, Peyush Gupta <peyushg@juniper.net>
Subject: RE: [ippm] Discussion on extending TWAMP to monitor service KPIs
and detect liveliness of an application

Hi Vathsa,

I had a look at your draft tonight, but I clearly need to
spend more time to try to figure out how these new features
work - take the keep-alive for example in section 5.1:

5.1.  Services Keepalive Monitoring

   The Session-Sender MAY send the Service PDU as part of the TWAMP-Test
   Packet Padding.  When Session-Reflector receives the TWAMP-Test
   packet, it SHALL extract the Service PDU.  Then Session-Reflector
   SHALL inject the Service PDU to the Service Block for service
   processing.

   Based on whether the Session-Reflector received the response, the
   Session-Reflector SHALL decide whether the Service is alive or not.

Are the test packets sent between Sender and Reflector the basis for
evaluating Service liveliness?
Vathsa>>>No.
If that alone served the purpose, TWAMP already provides this
functionality today.

If so, it seems that *any* successful
test packet received at the Reflector and then received back at the Sender
is enough to assess bi-directional service liveliness. I'm missing the
need for the Reflector to decide anything - it just has to re-use payload
in the packet it reflects?
Vathsa>>>We mean that the TWAMP reflector module should extract the
Service PDU from the TWAMP-Test packet.
Note that the Session-Sender fills in this Service PDU when it sends the
test packet.
The Session-Reflector should inject the Service PDU into the Service Block.
Once the Session-Reflector receives the packet back from the Service Block,
it replies to the Session-Sender.
This assures the Session-Sender of two things:
   - The box where the Session-Reflector is running is alive.
   - The Service Block is also working fine.

For example, the HTTP server host might be alive while the HTTP server
daemon is not working.
Our proposed extension can catch this kind of problem.
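
As a rough illustration of the flow described above, here is a minimal
Python sketch. It is not the draft's encoding: the TestPacket class, the
ServiceBlock callable and both daemon functions are hypothetical names made
up for this example. It also assumes that the Session-Reflector sends no
reply when the Service Block does not answer, which is one possible reading
of the behaviour rather than something the draft is quoted as saying.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TestPacket:
    """Hypothetical stand-in for a TWAMP-Test packet (not the wire format)."""
    seq: int
    padding: bytes  # assumed to carry the Service PDU


# A Service Block takes a Service PDU and returns a response, or None if the
# service does not answer (hypothetical interface).
ServiceBlock = Callable[[bytes], Optional[bytes]]


def reflect(packet: TestPacket, service_block: ServiceBlock) -> Optional[TestPacket]:
    """Session-Reflector side: extract the PDU, inject it, reply on success."""
    service_pdu = packet.padding           # extract the Service PDU
    response = service_block(service_pdu)  # inject it into the Service Block
    if response is None:
        return None                        # assumption: no reply is sent
    return TestPacket(seq=packet.seq, padding=response)


def service_alive(packet: TestPacket, service_block: ServiceBlock) -> bool:
    """Session-Sender side: a reply implies the box and the service are up."""
    return reflect(packet, service_block) is not None


def healthy_daemon(pdu: bytes) -> Optional[bytes]:
    return b"HTTP/1.1 200 OK"


def dead_daemon(pdu: bytes) -> Optional[bytes]:
    return None  # the host is up, but the daemon never answers


probe = TestPacket(seq=1, padding=b"GET / HTTP/1.1")
print(service_alive(probe, healthy_daemon))
print(service_alive(probe, dead_daemon))
```

In a real deployment the Session-Sender would treat a missing reply as a
timeout; the sketch collapses that into a None return value.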

As another example, say a router runs some services like CGNAT (JFLOW,
DPI) outside the box as a VNF (say, on an x86 COTS server).
This router can run a TWAMP Session-Reflector and, using our extension, the
Session-Reflector SHALL inject a test Service PDU into the CGNAT VNF.
Based on the reply (possibly by comparing the reply packet with the expected
packet), the Session-Sender can know whether the CGNAT module is working
as expected.
Beyond liveliness, other service parameters can be monitored, such as:
 - How many sessions are currently running as part of this VNF (to know
the current load on the VNF).
 - How much extra latency is introduced by enabling the CGNAT service
in a VNF.
This information can be the basis for the network operator to spawn more
VNFs, or to use a Service Block inside the router for some gold customers
and a VNF for other customers, etc.

Basically, we want to give the network administrator live data on service
load, service latency, etc., so that they can plan the network for best
results.
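
To make the planning idea concrete, here is a small Python sketch of how an
operator-side tool might use the two parameters mentioned above (session
count and extra latency). Everything in it is an assumption for
illustration only: the function names, the idea of deriving extra latency
as the difference between an RTT measured with the service enabled and a
baseline RTT, and the threshold values are not taken from the draft.

```python
def added_latency_ms(rtt_with_service_ms: float, rtt_baseline_ms: float) -> float:
    """Extra latency attributed to the service (assumed: simple RTT difference)."""
    return max(0.0, rtt_with_service_ms - rtt_baseline_ms)


def scaling_hint(active_sessions: int,
                 extra_latency_ms: float,
                 max_sessions: int = 10_000,       # hypothetical threshold
                 max_extra_latency_ms: float = 5.0  # hypothetical threshold
                 ) -> str:
    """Return a coarse hint for the operator based on reported service KPIs."""
    if active_sessions > max_sessions or extra_latency_ms > max_extra_latency_ms:
        return "scale-out"  # e.g. spawn another VNF instance
    return "hold"


extra = added_latency_ms(rtt_with_service_ms=7.5, rtt_baseline_ms=5.0)
print(extra, scaling_hint(active_sessions=12_000, extra_latency_ms=extra))
print(scaling_hint(active_sessions=100, extra_latency_ms=extra))
```

The hint is deliberately coarse: it only flags that a threshold was
crossed and leaves the actual decision to the operator.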

Please let me know if you need more information; we will be glad to
provide it.

Al

> -----Original Message-----
> From: Srivathsa Sarangapani [mailto:srivathsas@juniper.net]
> Sent: Monday, June 29, 2015 12:45 PM
> To: MORTON, ALFRED C (AL); Dave Taht
> Cc: ippm@ietf.org; Peyush Gupta
> Subject: Re: [ippm] Discussion on extending TWAMP to monitor service
> KPIs and detect liveliness of an application
> 
> Hi Al,
> 
> I hope you agree with our comments below.
> In continuation of the same idea, we have submitted a draft:
> https://datatracker.ietf.org/doc/draft-sp-ippm-monitor-services-kpi/
> 
> Please go through it and share your comments/suggestions.
> 
> 
> --
> Regards,
> Vathsa
> 
> 
> 
> 
> -----Original Message-----
> From: Srivathsa Sarangapani <srivathsas@juniper.net>
> Date: Tuesday, June 9, 2015 at 6:50 PM
> To: "MORTON, ALFRED C (AL)" <acmorton@att.com>, Dave Taht
> <dave.taht@gmail.com>
> Cc: "ippm@ietf.org" <ippm@ietf.org>, Peyush Gupta <peyushg@juniper.net>
> Subject: Re: [ippm] Discussion on extending TWAMP to monitor service
> KPIs and detect liveliness of an application
> 
> Hi Al,
> 
> Thanks for your reply. Please see my answers inline:
> 
> --
> Regards,
> Vathsa
> 
> 
> 
> 
> -----Original Message-----
> From: <MORTON>, "ALFRED C   (AL)" <acmorton@att.com>
> Date: Monday, June 8, 2015 at 11:48 PM
> To: Dave Taht <dave.taht@gmail.com>, Srivathsa Sarangapani
> <srivathsas@juniper.net>
> Cc: "ippm@ietf.org" <ippm@ietf.org>
> Subject: RE: [ippm] Discussion on extending TWAMP to monitor service
> KPIs and detect liveliness of an application
> 
> Hi Srivathsa and Dave,
> 
> At the moment, some visibility of service KPIs are intended
> to be measured with PDM:
> https://tools.ietf.org/html/draft-elkins-ippm-6man-pdm-option-00
> (and there is/was a call for interest
> http://www.ietf.org/mail-archive/web/ippm/current/msg03732.html
> probably not too late to respond usefully on this wg call)
> 
> Vathsa>>>Good work.
> We feel that service KPIs such as load and latency for router services
> (like DPI, CGNAT, IPsec) may not be in the scope of this extension.
> It is a good extension for a host-to-host architecture.
> 
> regarding:
> > > Similarly they cannot figure out the liveliness of an application
> > > on a server even though they can figure out that the server is
> > > alive.
> >
> > Well said. liveliness as a default benchmark type would be good for
> > just about everything. :)
> 
> By liveliness, what degree of response complexity is considered alive?
> Possibilities beyond ICMP Echo include:
>  - Opens connection on well-known port
>  - sends expected greeting message
>  - ...
>  - completes entire transaction within time limit
> Vathsa>>>
> For some TCP applications, opening a connection on a well-known port
> counts as liveliness.
> For some UDP applications, receiving the greeting message counts as
> liveliness.
> For some applications, such as HTTP and DNS, it can be one transaction
> completed within the time limit.
> 
> regards,
> Al
> 
> > -----Original Message-----
> > From: ippm [mailto:ippm-bounces@ietf.org] On Behalf Of Dave Taht
> > Sent: Monday, June 08, 2015 12:31 PM
> > To: Srivathsa Sarangapani
> > Cc: ippm@ietf.org
> > Subject: Re: [ippm] Discussion on extending TWAMP to monitor service
> > KPIs and detect liveliness of an application
> >
> > On Mon, Jun 8, 2015 at 9:17 AM, Srivathsa Sarangapani
> > <srivathsas@juniper.net> wrote:
> > > Dear IPPM,
> > >
> > > I would like to share something with you today.
> > >
> > > In the existing as well as next generation network architectures,
> > > there are lot of new services getting added in the service plane
> > > with in the network. services here include subscriber aware
> > > services, flow based traffic load balancing, content delivery
> > > servers, real time streaming applications and similar. The
> > > performance of these services are monitored using set of
> > > attributes. some of the critical attributes are latency introduced
> > > in the packet path, impact on network capacity and throughput.
> > > some other attributes are to check whether a service node is alive
> > > or not.
> > >
> > > To Address some of these challenges, how about extending TWAMP
> > > protocol (RFC 5357) to monitor service KPIs and monitor the
> > > liveliness of the service or application.
> >
> > I would like to see smokeping more widely used, and bandwidth
> > presented at the same time as loss, ECN CE, and latency in more mtrg
> > and cacti (and other widely used network management system) graphs.
> >
> > While I would like to see more twamp deployments, it is kind of a
> > headache to deploy (ntp time sensitivity for starters - that said, I
> > would also like to see better timekeeping across the internet also).
> >
> > >
> > > Today TWAMP is used to measure just RTT between 2 Network Elements,
> > > like routers, servers etc.
> > > Since Routers are no more just forwarding packets but running lot
> > > more services like CGNAT, DPI, IPSec, TDF and like.
> > > Even Servers are used to run applications like DNS, HTTP over it.
> > >
> > > Existing standard protocols cannot measure the impact of enabling
> > > service on packets that get routed via a router in terms of latency
> > > and the throughput.
> > > Similarly they cannot figure out the liveliness of an application
> > > on a server even though they can figure out that the server is
> > > alive.
> >
> > Well said. liveliness as a default benchmark type would be good for
> > just about everything. :)
> >
> > > With  the advent of SDN and VNFs, the latency of a VNF would really
> > > make lot of sense for the network operator for optimal network
> > > planning and deployment.
> >
> > I agree that monitoring the effectiveness of these new technologies
> > is a goodness.
> >
> > > Based on the real time latency, the network operator can possibly
> > > spawn more  VMs  for a services VNF when required and can shut down
> > > some of them when not required.
> >
> > Well, more VMs does != less latency.
> >
> > >
> > > We therefore think that adding this new dimension to TWAMP protocol
> > > would really be helpful for network monitoring and analysis.
> > > We request you all to please share your thoughts on the same.
> >
> >
> > >
> > >
> > > --
> > > Thanks and Regards,
> > > Vathsa
> > >
> > > _______________________________________________
> > > ippm mailing list
> > > ippm@ietf.org
> > > https://www.ietf.org/mailman/listinfo/ippm
> >
> >
> >
> > --
> > Dave Täht
> > What will it take to vastly improve wifi for everyone?
> > https://plus.google.com/u/0/explore/makewififast
> >
> > _______________________________________________
> > ippm mailing list
> > ippm@ietf.org
> > https://www.ietf.org/mailman/listinfo/ippm
>