[pim] PIM refresh reduction discussion

"Mike McBride \(mmcbride\)" <mmcbride@cisco.com> Wed, 25 July 2007 16:28 UTC

For a complete view of the discussion that took place on the pim-state
list, please go here:
http://www3.ietf.org/proceedings/07jul/slides/pim-2.txt

Suresh has provided a summary below.

Refresh reduction or refresh elimination

* The consensus seemed to be to keep refreshes, and to let anyone who
wants to eliminate them do so by setting the holdtime in the JP PDU to
its maximum value. Otherwise, state would be refreshed every 30 minutes
(or some similarly large interval).
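
A minimal sketch of the holdtime rule above, assuming the holdtime is the
16-bit JP field and that its maximum value (0xFFFF) means "hold until
explicitly cancelled". The names state_expired, HOLDTIME_INFINITE and
LONG_REFRESH_SECONDS are illustrative only, and the 3.5x multiplier is an
assumption carried over from today's default timers, not something the
discussion settled:

```python
HOLDTIME_INFINITE = 0xFFFF          # all ones in the 16-bit holdtime field
LONG_REFRESH_SECONDS = 30 * 60      # "every 30 minutes (or some such large value)"

# Assumption: holdtime stays at 3.5x the refresh interval, as with current defaults.
LONG_HOLDTIME = int(3.5 * LONG_REFRESH_SECONDS)   # 6300 seconds, fits in 16 bits


def state_expired(holdtime: int, seconds_since_refresh: float) -> bool:
    """Return True if upstream state should time out for lack of a refresh."""
    if not 0 <= holdtime <= 0xFFFF:
        raise ValueError("holdtime must fit in 16 bits")
    if holdtime == HOLDTIME_INFINITE:
        # Refresh elimination: state lives until an explicit Prune removes it.
        return False
    return seconds_since_refresh >= holdtime
```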

Reliability

* Every JP PDU gets acked. A Hello option will help determine whether
all routers on the LAN support JP ACK. If not, we revert to existing
behavior.
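
The capability check could look roughly like this. It is only a sketch:
no option code has been assigned, so JP_ACK_OPTION is a placeholder, and
Neighbor and lan_supports_jp_ack are hypothetical names:

```python
from dataclasses import dataclass, field
from typing import Iterable, Set

# Placeholder value: no Hello option code has been assigned for JP ACK.
JP_ACK_OPTION = 0xFFFF


@dataclass
class Neighbor:
    address: str
    hello_options: Set[int] = field(default_factory=set)  # option codes seen in its Hello


def lan_supports_jp_ack(neighbors: Iterable[Neighbor]) -> bool:
    """True only if every PIM neighbor on the LAN advertised the JP ACK option."""
    neighbors = list(neighbors)
    # One neighbor without the option is enough to revert to existing behavior.
    # With no neighbors at all there is nobody to ack, so also report False.
    return bool(neighbors) and all(JP_ACK_OPTION in n.hello_options for n in neighbors)
```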

How to achieve refresh reduction

* One of the goals was to change PIM as little as possible. Dino's
suggestion was to use a new message type, say Join Ack, which would look
exactly like the JP PDU. The upstream router that receives a JP reflects
the same PDU back to the downstream router, changing only the PDU type.
* To differentiate between successive state changes, we need a sequence
number. Where we differed:

* Dino wanted to overload one of the reserved fields in the source
address field to use it as a sequence number per route.
* Bob and I prefer that there be a seq num per PDU. The idea here was to
have a seq# TLV and just ack that in the JP ACK PDU.

I think if we combine the two approaches, we get the following. See if
this is agreeable to you guys (I will use an example here):

o Let's say there are entries E1, E2, E3, E4 that have Joins scheduled to
be sent.

o The JP PDU will contain the entries E1 thru E4 as Joins.

o A seq num TLV gets assigned to the PDU. Let's say the seq num is 1. All
entries get the seq number of the PDU they go in so that effectively,
each entry is associated with a seq num (what Dino wants).

o The JP PDU is sent out. A retransmission timer is started for the PDU
that starts off at a low value and backs off exponentially.

o The ACK is just a reflection of the JP PDU with the PDU type changed.
When the ACK is received, the router that sent the JP PDU considers all
entries associated with that PDU ACKed. For any other downstream router,
the seq number is meaningless: it processes the ACK just as it processes
a "See Join" or "See Prune" event. This gives the other routers one more
chance to see the PDU in case they missed the Join.

o If the state of an entry changes before the ACK is received, say E1
changes from a Join to a Prune, E1 is added to a new PDU with seq #2. If
the state changes again before PDU 2 gets sent out, E1 simply gets
updated without having to change the seq number. So E1's seq number is
now 2.

o If an ACK is received for seq #1, E2 thru E4 are considered ACKed, but
not E1, since E1 is no longer associated with seq #1. So the router that
processes the ACK does not really care about the contents of the JP PDU,
only about the seq number and its own notion of which entries correspond
to that seq number. Another way to look at it is to say that it goes
through each entry in the PDU and considers acked those entries whose
seq number matches that of the PDU.
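
The example above can be sketched from the sending router's side as
follows. This is an illustration of the combined proposal only, not an
agreed design: JoinPruneSender and its methods are made-up names, entries
are plain strings, and the retransmission timer is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class JoinPruneSender:
    next_seq: int = 1
    open_seq: Optional[int] = None                 # PDU being built, not yet sent
    entry_seq: Dict[str, int] = field(default_factory=dict)  # entry -> seq of PDU carrying its latest state
    acked: Set[str] = field(default_factory=set)

    def queue(self, entry: str) -> int:
        """Schedule an entry's current state; it joins the PDU being built."""
        if self.open_seq is None:
            self.open_seq = self.next_seq
            self.next_seq += 1
        # Further changes before this PDU is sent reuse the same seq number.
        self.entry_seq[entry] = self.open_seq
        self.acked.discard(entry)
        return self.open_seq

    def send(self) -> Tuple[int, List[str]]:
        """Close the PDU being built and return (seq, entries it carries)."""
        if self.open_seq is None:
            raise RuntimeError("nothing queued")
        seq, self.open_seq = self.open_seq, None
        return seq, sorted(e for e, s in self.entry_seq.items() if s == seq)

    def on_ack(self, seq: int, entries: List[str]) -> List[str]:
        """Mark as acked only the entries still associated with this seq."""
        newly = [e for e in entries
                 if self.entry_seq.get(e) == seq and e not in self.acked]
        self.acked.update(newly)
        return newly


sender = JoinPruneSender()
for e in ("E1", "E2", "E3", "E4"):
    sender.queue(e)                    # all four go into PDU seq 1
seq1, pdu1 = sender.send()
sender.queue("E1")                     # E1 flips to Prune: new PDU, seq 2
sender.queue("E1")                     # changes again before PDU 2 is sent: still seq 2
print(sender.on_ack(seq1, pdu1))       # ['E2', 'E3', 'E4']
```

Note that on_ack never looks at Join or Prune state inside the reflected
PDU; it only compares seq numbers, which is the point made in the last
step.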

Other issues

* Both JPs and JP ACKs are multicast. This preserves Join Suppression.
There is still a problem when one downstream router Prunes a state, the
upstream router Acks it back, and this exchange (both the JP and the JP
ACK) is not seen by another downstream router that wants the state. No
traffic then flows on the LAN until that router refreshes its state
after a long period of time, if it decides to do so. The only solution
to this is to have explicit tracking.

* There was also mention of ensuring bidirectional connectivity for
adjacencies. I think this should be addressed as well, as part of
reliability. Once a JP exchange occurs, if an error on the link leaves
only one-way connectivity, where the Hellos sent by the upstream router
are seen by the downstream router but not vice versa, data forwarding
simply stops: the upstream router will have cleaned up all downstream
state, while the downstream router assumes the upstream router still has
it. This can be fixed by having a new 2Way Hello option wherein each
router lists the IP addresses of its nbrs in the TLV (OSPF style). This
is especially important when we are running PIM on overlay networks such
as L3VPN.
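
A rough sketch of the OSPF-style check, assuming each Hello carries the
list of neighbor addresses its sender currently hears. The function name
and the dictionary layout are made up for illustration:

```python
from typing import Dict, Iterable, Set


def two_way_neighbors(my_address: str,
                      heard_hellos: Dict[str, Iterable[str]]) -> Set[str]:
    """Return the neighbors with confirmed bidirectional connectivity.

    heard_hellos maps each neighbor we hear Hellos from to the neighbor
    list carried in its most recent Hello (the hypothetical 2Way option).
    """
    # A neighbor is two-way only if it lists us, i.e. it hears our Hellos too.
    return {nbr for nbr, seen in heard_hellos.items() if my_address in set(seen)}
```

A downstream router that hears the upstream router's Hellos but does not
find its own address in them would know not to rely on upstream state.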

* Lastly, the need for L2 snooping devices to be able to request state,
so that they can build snooping state, was mentioned. Modifying Hellos
with GenId 0 to force routers to refresh their state is one possible
solution.
