Comments on draft-spagnolo-manet-ospf-design-00.txt

Richard Ogier <ogier@EARTHLINK.NET> Mon, 26 July 2004 16:17 UTC

Received: from cherry.ease.lsoft.com (cherry.ease.lsoft.com [209.119.0.109]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id MAA25773 for <ospf-archive@LISTS.IETF.ORG>; Mon, 26 Jul 2004 12:17:30 -0400 (EDT)
Received: from vms.dc.lsoft.com (209.119.0.2) by cherry.ease.lsoft.com (LSMTP for Digital Unix v1.1b) with SMTP id <9.00E284BB@cherry.ease.lsoft.com>; Mon, 26 Jul 2004 12:17:28 -0400
Received: from PEACH.EASE.LSOFT.COM by PEACH.EASE.LSOFT.COM (LISTSERV-TCP/IP release 1.8e) with spool id 27659228 for OSPF@PEACH.EASE.LSOFT.COM; Mon, 26 Jul 2004 12:17:18 -0400
Received: from 207.217.120.232 by WALNUT.EASE.LSOFT.COM (SMTPL release 1.0i) with TCP; Mon, 26 Jul 2004 12:07:18 -0400
Received: from sdn-ap-004castocp0008.dialsprint.net ([63.187.32.8] helo=earthlink.net) by flamingo.mail.pas.earthlink.net with esmtp (Exim 3.33 #1) id 1Bp80P-0003C1-00 for OSPF@PEACH.EASE.LSOFT.COM; Mon, 26 Jul 2004 09:07:14 -0700
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:0.9.4) Gecko/20011128 Netscape6/6.2.1 (emach0202)
X-Accept-Language: en-us
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format="flowed"
Content-Transfer-Encoding: 7bit
Message-ID: <41052C2A.80901@earthlink.net>
Date: Mon, 26 Jul 2004 09:07:06 -0700
Reply-To: Mailing List <OSPF@PEACH.EASE.LSOFT.COM>
Sender: Mailing List <OSPF@PEACH.EASE.LSOFT.COM>
From: Richard Ogier <ogier@EARTHLINK.NET>
Subject: Comments on draft-spagnolo-manet-ospf-design-00.txt
To: OSPF@PEACH.EASE.LSOFT.COM
Precedence: list
Content-Transfer-Encoding: 7bit

Please consider my comments below for the draft
http://www.ietf.org/internet-drafts/draft-spagnolo-manet-ospf-design-00.txt
to be presented next week in San Diego.

Overall, the draft presents some nice analysis, simulation results,
and conclusions, and I was happy to see that ideas from my draft
http://www.ietf.org/internet-drafts/draft-ogier-manet-ospf-extension-01.txt
(recently updated) were considered. However, I disagree with some of
the conclusions as noted below.

My comments are divided into the following 5 items:

1. Simulation Model
2. Originator-based LSA or ACK supression
3. Simulation results for reliable vs. unreliable flooding
4. Reliable flooding without ACKs, using periodic DDESC packets
5. Differential Hellos and differential LSAs

Items 2 and 3 are a bit long, so I first summarize them, and later
provide more details in items 2a and 3a.

1. Simulation Model

Although the analysis considered up to 100 nodes, the simulation
results considered only 20 nodes, which is quite limited.
The high mobility case (for 20 nodes) had a neighbor change rate of
0.14, or one neighbor change every 7.14 sec on average.
If there are 100 nodes and thus 5 times as many neighbors, then the
neighbor change rate should increase to about 0.70, or one neighbor
change every 1.43 sec, so that an LSA would be generated almost every
MinLSInterval (5 sec), which would trigger the "periodic update" mode
(possibly with non-ackable LSAs) in any of the hybrid flooding schemes
we have considered (whether or not prediction of neighbor changes is
used). Thus, your results are highly dependent on your choice of
number of nodes and mobility level.

In addition, a larger number of nodes would result in larger Hellos,
thus making differential Hellos more beneficial in a highly mobile
network (in which a smaller Hello interval would be used). As indicated
in your analysis (Figure 1), using 100 nodes and a 2 second Hello
interval results in about 62 times as much Hello overhead as using 20
nodes and a 10 second Hello interval (when full Hellos are used).

Also, the only measure you used for data packets was delivery ratio,
which was about .78 in your high mobility scenario.
MANET simulations often assume that link-layer failure detection
is used, which typically results in a delivery ratio close to 0.99
for even faster mobility than you considered. It would be good
to run simulations that assume link-layer failure detection is used,
and that consider other performance measures such as delay.


2. Originator-based LSA or ACK supression (summary)

Although I mentioned receiver-based ACK suppression in version 00
of my draft, I later stated that I preferred originator-based
methods (described in version 01 of my draft), so I will focus on
such methods. However, I will also mention that your simulation
result in Figure 10, which shows increased overhead for my ACK
suppression method, does not use the parameters that I would use.
(I would have used a threshold parameter that suppresses ACKs only
when a new LSA is originated almost every MinLSInterval - not true
in your case - which should not increase overhead. However, this is
not important since we both prefer originator-based methods.)

Regarding originator-based methods, we seem to agree that using
an option bit to indicate a non-ackable LSA can be useful and
should be investigated.

However, I disagree with you that using an exponential moving average
(EMA) to measure the rate of LSA changes is not a good approach.
The following bullet from your draft refers to this approach:

   "The approach adapts to the link changes passively by predicting
    the future LSA change based upon the LSA's history.  In the MANET
    environment, it is difficult to predict future link changes based
    on the past.  Looking at both fixed and exponentially weighted
    windows with different window sizes and weights, we could not
    predict when future LSAs would be created.  In fact the
    distribution of interarrival times looked almost as if the
    interarrivals were exponentially distributed (modulo the
    MinLSInterval)."

My goal was to measure the long-term or medium-term average
rate of LSA changes, and to base the decision of using reliable
or periodic flooding on such a measurement.  (The threshold should
be chosen so that periodic flooding is used only if it generates
less overhead than reliable flooding.)  Thus, I was interested in
a quasi-static scheme based on an average rate, rather than a scheme
that reacts dynamically to short-term variations in the (exponentially
distributed) interarrival times. Your prediction scheme is of the
latter type, which tries to do better than the quasi-static approach
by making a short-term prediction of the interarrival times.

Although using prediction of LSA changes can result in better
performance when the LSA change rate happens to be near the threshold
as in your case (so that your predicted measurement frequently
crosses the threshold), using an EMA based on LSA history can also
be quite effective in reducing overhead (when used by the originator
to decide whether to flood LSAs periodically vs. reliably) if
done properly, and could be a good choice if it is decided that
using a prediction technique is too complex.

Simulations could be conducted to compare these two approaches
for the hybrid reliable/unreliable LSA flooding method described
in item 2a below (and in my updated draft), but we need to
consider a large range of scenarios, and avoid drawing conclusions
based only on a few scenarios.


3. Simulation results for reliable vs. unreliable flooding (summary)

Your simulation results in Figure 17 compare the overhead
for reliable vs. unreliable flooding, where the reliable
flooding incorporates several of the proposed optimization
techniques (source-independent CDS, source LSA suppression,
multicast ACKS, and retransmit timer backoff.)
Your results show that, for the high mobility case, unreliable
flooding generates much less overhead than reliable flooding,
with no significant difference in delivery ratio.

As I explain below (item 3a), a hybrid reliable/unreliable flooding
method can be used that performs the same as unreliable (periodic)
flooding when mobility is high, and never performs significantly worse
than unreliable flooding.  Also, reliable flooding (or a hybrid
solution) will generate much less overhead than periodic flooding
in *non-mobile* wireless networks, which is an important
advantage in low-bandwidth sensor networks, for example.
(Thus, the fact that unreliable flooding overhead is independent
of mobility can be a *bad* thing, for non-mobile networks.)

Also, I discuss below (item 3a) other possible ways to reduce overhead
in reliable flooding. For example, the CDS algorithm you are using may
not be the best one. I have been running simulations to compare
different CDS algorithms, and may announce my results soon. Maybe you
can run simulations to evaluate the algorithm that I found to be the
most promising.  Other people may also suggest CDS algorithms, so
it would be good to compare several of them.
It is known that a good CDS will not result in more transmissions
of a given LSA than MPRs, i.e., a source-independent CDS
will be no larger than a source-dependent CDS (based on MPRs).


4. Reliable flooding without ACKs, using periodic DDESC packets

In Section 6.5 of your draft, you mentioned the methods of INRIA [12]
and Ogier [5] to achieve reliable flooding without ACKs, based on
periodic sequence numbers, similar to IS-IS.
However, in Sections 7 and 9.2 you mentioned only [12] for
adjacency forming.  What I suggested in my draft is a simple
application of the IS-IS method of periodic Sequence Numbers packets,
which translates to *periodic* DDESC packets in OSPF that are
*broadcast* to all neighbors. This method is equivalent to using
LSA NACKs instead of LSA ACKs.  (The NACKs would actually be the LSR
packets sent in response to DDESC packets.)
If a CDS is used, then only CDS nodes need to send full DDESC
packets. (If MPRs are used, then each MPR only needs to include
a subset of LSA headers in the DDESC packets it sends, as described
in my draft.)  I agree with you that [12] is a big departure from
OSPF, but what I propose is less of a departure from OSPF since
it does not require any new message types. (An option bit could be
used to distinguish a periodic DDESC packet from a regular one.)

Also, you propose using INRIA's method [12] only for external LSAs,
while using ACKs or unreliable flooding for internal LSAs, the
rationale being that there will be more external LSAs than internal
LSAs. (Although the manet could be configured as a stub area, you do
not mention this in your draft.)
However, it would be nice to have a single unified solution that
is as scalable for internal LSAs (for very large manets) as for
external LSAs. The method of periodic DDESC packets that I propose
is one possible way to achieve such a unified solution.
Simulations should be conducted to compare these alternatives.


5. Differential Hellos and differential LSAs

As discussed above, Hellos can generate a significant amount of
overhead in dense, highly mobile networks where a relatively
small Hello interval is desirable. (That is why we used differential
Hellos in TBRPF.)  Therefore, I think that the design should include
differential Hellos.

For the same reason, I think that differential LSAs should be
given serious consideration, since this can also reduce overhead
significantly in dense, highly mobile networks.
(That is why we used differental LSAs in TBRPF.)
Of course, simulations are needed to evaluate this.

----------------

I will now provide more details for items 2 and 3 above.

2a. Originator-based LSA or ACK supression (details)

As I said above, using prediction of neighbor changes can be helpful,
but simply using an EMA to measure the rate of neighbor changes
can also be quite effective if done properly. In fact, I believe
it can achieve the same reduction in overhead as using prediction
(if done properly). Using prediction may help to reduce LSA latency,
by allowing an LSA to be originated earlier if it is predicted that
no further link change is likely to occur soon.
(However, your draft concludes that LSA latency is not a problem
even if unreliable, periodic flooding is used.)

In the hybrid reliable/unreliable flooding method that I suggest
(described in my updated draft) the originator decides
when to generate periodic non-ackable LSAs (unreliable/periodic mode)
vs. event-driven ackable LSAs (reliable/event-driven mode), using
an EMA that estimates the probability that a link change will
occur within the next MinLSInterval (which I think is the same
as your LSA_WAIT_TIME, which I assume is 5 seconds).
When the EMA goes above a certain threshold (say 0.9), the
originator goes to unreliable mode and generates a non-ackable
LSA periodically every PERIODIC_WAIT_TIME (say 10 seconds).
Since the LSAs are non-ackable, no ACKs or retransmitted LSAs
are sent in this case, so going to unreliable mode can only
decrease overhead. The originator goes to reliable mode when
the EMA goes below the threshold (possibly a different threshold
for hysteresis). The last LSA sent in periodic mode must be
ackable, so that subsequent LSAs need to be originated only
when there is a link change.

Frankly, I was assuming that PERIODIC_WAIT_TIME = MinLSInterval,
since I don't think it is a good idea to increase LSA latency
when link changes are frequent.  If this causes too much
overhead, then the period used for unreliable mode can be
adjusted based on the measured overhead of originated LSAs,
using an EMA. (If the main goal is to control overhead, then
I don't think it helps much to use prediction.)


3a. Simulation results for reliable vs. unreliable flooding (details)

In item 3 above, I claimed that a hybrid reliable/unreliable flooding
method can be used that performs the same as unreliable flooding when
mobility is high, and never performs significantly worse than
unreliable flooding. This can be achieved using something like the
hybrid method described in item 2a above, except for the DDESC and LSR
overhead, which I discuss below.  The key is for the originator-based
method to go to reliable/event-driven mode ONLY when doing so will
generate less overhead than unreliable/periodic mode.

The LSR overhead was negligible in your results of Figure 17, but
the DDESC overhead (which was a significant 7.81 kbps) can
be reduced by using one or more of the following techniques:

 - Only CDS nodes need to send a full DDESC (as described in my draft).
   Since a CDS typically consists of less than 10% of the
   nodes, this could reduce DDESC overhead by 90% or more.

 - A DDESC packe only needs to contain headers of ACKable LSAs, and
   thus could be EMPTY if all LSAs are non-ackable (in unreliable mode).

 - If the method of sending periodic DD packets is used (similar to
   IS-IS, as suggested in my draft), then only CDS nodes would send
   periodic DDESC packets.

Regards,
Richard Ogier