EARP proposal

bagnall_d@apollo.com Fri, 30 November 1990 15:51 UTC

Message-Id: <9011301531.AA01782@xuucp.ch.apollo.com>
From: bagnall_d@apollo.com
Date: Fri, 30 Nov 1990 03:54:23 -0500
Subject: EARP proposal
To: fddi@merit.edu
Status: O

I am submitting the following document for review before the meeting in Boulder. Two
of the authors, myself and Caralyn Brown, will be at the meeting to answer any questions.
Thanks.

--Doug Bagnall
bagnall_d@apollo.hp.com

----------------------------------- Cut here -------------------------------------------

FDDI Working Group                                          D. Bagnall
Preliminary Document                                          C. Brown
                                                               D. Hunt
                                                          M. J. Strohl
                                                         November 1990

Extended Address Resolution Protocol

1. STATUS OF THIS MEMO

The purpose of this document is to offer for review a new elective
standard. The distribution of this document is unlimited.

2. ABSTRACT

The following memo proposes a new form of the address resolution
protocol for use over Local Area Networks (LANs) where a one-to-many
mapping of Internet Protocol (IP) [4] to link layer addresses is
desirable. The one-to-many mapping may be implicit in the
relationship between IP and an underlying, multi-rail link layer, or
may be useful in providing more reliable delivery in the face of
congestion or for improving error recovery procedures. The new
protocol is not meant to supersede that specified in RFC 826 [3] but
to supplement it.

3. ACKNOWLEDGMENTS

The April 1988 memo of J. Lekashman [2] first explored the
advantages of providing multiple link layer paths between pairs of
IP addresses. The seminal idea of writing a new protocol, however,
came about in discussions in which Carol Iturralde, now at Digital
Equipment Corporation, played a major role. Comments from many
others have also been incorporated into the document, especially
those from Vernon Schryver of Silicon Graphics, Dave Katz of Merit,
and J. Noel Chiappa.

4. MOTIVATION

Network services are no longer just another feature tacked on to an
already complete operating system. With the advent of distributed
file systems and network computational servers, the network has
moved from the periphery of the operating system to become one of
its core services. In the traditional IP networking model, a host
computer has one IP address assigned to every physical device.
When a device fails, its associated IP address becomes unreachable,
and all reliable transport connections channeled through it die.

[Page 1]

Preliminary EARP November 1990

This one-to-one mapping of IP to link layer or hardware addresses is
simple to understand, and hosts using it are easy to build, but as
reliable network services have become more critical, the limitations
of the mapping have become harder to tolerate.

With a one-to-many mapping of IP to hardware addresses, the offered
resources of a network server are no longer held hostage to the
state of a single network device. Instead, several network devices
attached to a single LAN can function as a single, logical link
layer service for IP and its attendant upper layer protocols.
Transmission Control Protocol (TCP) connections, in particular, are
not tied either to a single transmitting device on the local host or
to a single receiving device on the remote host. As long as there is
at least one functioning link path between sender and receiver, the
TCP connection can continue to transfer data.

5. INTRODUCTION

Since not every Internet host will support the Extended Address
Resolution Protocol (EARP), the new protocol will be used in
conjunction with standard ARP. (The term host here refers to either
an Internet host or to a gateway when the gateway uses that portion
of a host's functionality devoted to the establishment of link layer
paths between stations on a common LAN.) Hosts which implement EARP
will prefer to use that protocol when possible but will be able to
both send and receive standard ARP packets when their messages would
not otherwise be understood. The purpose of using either ARP or
EARP is to yield a mapping between a remote IP address and one or
more link layer addresses. EARP will simply provide more complete
information than ARP, thus allowing a host to make a better decision
as to how to direct a frame to its remote peer.

When a host has determined that a data packet is to be transmitted
to another host on the same LAN, it looks in a table to find a
mapping between the packet's destination IP address and a link layer
or hardware address. With standard ARP, the table will list one IP
address and one hardware address. With EARP, the table will list
one IP address and possibly several hardware addresses. When it
finds a single address, the transmitting host uses that address as
the destination address of the unicast frame which encapsulates the
packet to be sent to its IP peer at the remote host. If, however,
the sending host finds several hardware addresses in its table, it
must choose one to specify for the current frame. If the host, for
instance, wants to distribute the packet load among the several
network devices at the receiving host, it might choose destination
hardware addresses from its table in round-robin order, always
selecting the least recently used address for the current packet.
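
The round-robin choice described above can be sketched as follows.
This is only an illustration, not part of the proposal: the class
name, the method name, and the address labels B0, B1, and B2 are
hypothetical, and a real implementation would store link layer
addresses rather than strings.

```python
from collections import deque
from typing import Deque, Iterable


class RoundRobinAddresses:
    """Hardware addresses known for one remote IP address (hypothetical helper)."""

    def __init__(self, addresses: Iterable[str]) -> None:
        self._queue: Deque[str] = deque(addresses)
        if not self._queue:
            raise ValueError("at least one hardware address is required")

    def next_destination(self) -> str:
        # The head of the queue is always the least recently used address.
        address = self._queue[0]
        # Rotate it to the tail so that it becomes the most recently used.
        self._queue.rotate(-1)
        return address


table_entry = RoundRobinAddresses(["B0", "B1", "B2"])
chosen = [table_entry.next_destination() for _ in range(4)]
```

Because the head of the queue is by construction the address used
longest ago, no per-address timestamp is needed in this sketch.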

A busy Internet server may become congested with received packets
on one network device while another device is idle. So that a host
can assume some control over the distribution of incoming traffic,
EARP includes a special ranking field with each source hardware
address in a request or response packet. The ranking field allows
the host at a minimum to designate a primary interface to its remote
peer; at its most complex, the field allows the host to specify a
hierarchy among its interfaces. The intention is that a host with
a busy server can balance the input packet load from many clients
by assigning each a different primary interface. The other
hardware addresses at the server host would of course still be
available for backup.

Some LANs can be conveniently divided into several distinct rails or
link layer paths with each rail a separate physical conduit for
transmitting frames. Individual hosts at network initialization
attach separately to each of the rails with a distinct link layer
address assigned to each attachment. Frames transmitted on any
given rail must use the correct destination address for the target
host on that rail. EARP supports the concept of a link layer path
by associating with each source link layer address in an EARP packet
a path number. Path numbers run from 0 to one less than the total
number of paths, and they are stored in the address resolution table
along with their associated addresses.

To give a concrete example, the Fiber Distributed Data Interface
(FDDI) uses two separate rings in normal operation, each of which
can be viewed as a separate rail. On each of these rings, a given
Internet host will have one and only one hardware address. If both
of the rings are included in one IP subnet, an EARP host A will
probably have an entry in its ARP table for any other host B which
includes that host's IP address and both of its link layer addresses
(assuming, of course, that host B has two addresses). The table
entry on host A for host B will thus logically be,

     IP address of B     hardware address of B on ring 0, path 0
                         hardware address of B on ring 1, path 1

When sending a packet to the remote host, the local host will choose
a ring or rail and use the hardware address for that rail as the
destination address in the frame.

As another example, say Ethernet host B has two network boards on
the same Ethernet segment which share one IP address. Here the
concept of separate rails does not hold. Each of host B's addresses
is just as valid a destination address in any frame as any other of
its addresses. This host can thus be represented in the ARP table
of host A, as,

     IP address of B     hardware address B0, no path
                         hardware address B1, no path

When host A wishes to send a packet to host B, it can choose either
link address B0 or address B1 as the destination address in the
Ethernet frame it transmits on the segment.
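
The two kinds of table entry discussed in this section might be
modelled as in the sketch below. The type names, field names, and
address labels are assumptions made for illustration only; in
particular, the absence of a rail is represented here by None rather
than by any on-the-wire value.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LinkAddress:
    hw_addr: str
    path: Optional[int] = None   # rail number, or None when the LAN has no rails


@dataclass
class TableEntry:
    ip_addr: str
    addresses: List[LinkAddress]

    def candidates(self, rail: Optional[int] = None) -> List[str]:
        """Addresses usable as a frame destination, optionally on one rail."""
        if rail is None:
            return [a.hw_addr for a in self.addresses]
        return [a.hw_addr for a in self.addresses if a.path == rail]


# FDDI: host B has one address on each ring, so the rail fixes the address.
fddi_b = TableEntry("IP-B", [LinkAddress("B-ring0", 0), LinkAddress("B-ring1", 1)])
# Ethernet: both addresses of host B are equally valid destinations.
ether_b = TableEntry("IP-B", [LinkAddress("B0"), LinkAddress("B1")])
```

On FDDI the sender first picks a rail and then has exactly one
candidate address; on Ethernet every listed address is a candidate.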

6. PROTOCOL OVERVIEW

To establish the one-to-many mapping of IP to link addresses, an
EARP host must send EARP request packets demonstrating its desire
to receive EARP response packets in return. If the underlying
LAN is based on a structure of multiple rails, then one EARP message
is sent on each rail with the local host's link layer address for
that rail as the single source address. The host will in turn
expect a single response on each rail with the target host's link
layer address on that rail as the source address in the response
packet.

On LANs which do not support the concept of multiple rails, a
single EARP request packet is sent with a listing of the host's m
link layer addresses as the source hardware addresses. The
requesting host will expect in return a response packet with a list
of the n link layer addresses at the target host as the source
hardware addresses. Otherwise, the request and response packets
will look very similar to those in standard ARP.

To return to the above examples, on an FDDI LAN made up of two
rings using a single IP subnet, an EARP host will send a request
message on ring 0 with its ring 0 address as the source address and
with a path encoding of 0. The single, empty target address will be
sent without a path indication. On ring 1, the host will send a
packet with its ring 1 address and a path of 1. On ring 0, the
sending host will expect to receive a response packet with its own
ring 0 address as the target and with the ring 0 address of the
remote host as the source address. The response on ring 1 will be
similar.

For an Ethernet EARP host, the request packet will include m source
addresses and a single, empty target address. The response will
include the remote host's n Ethernet addresses as the source and the
first of the m addresses of the original host as the target. One
Ethernet address in either the request or the response packet may be
designated as the primary link layer address for the sending host.
Thus, host A when it sends a request EARP message to host B may
designate one Ethernet address as the one B should use to send it
data packets. B, in its turn, may in the response EARP message
specify one of its own Ethernet addresses as the primary address for
host A. This mechanism will allow busy servers to suggest how their
clients could receive better and faster service.

7. PACKET FORMAT

An EARP request is essentially a standard ARP request with some
additional information specifying the number of link layer addresses
associated with the source protocol address and, optionally, the
path number and ranking associated with each link address. The path
number is specified if the underlying link layer is made up of
several rails. On such a LAN, the path number ensures that the
packet has been received on the appropriate rail by the target host.
On LANs which support only a single rail, the path number is decimal
255. In addition to the path indication, there is a field for each
source address specifying the ranking of hardware addresses. The
field is probably most useful if only one of the source hardware
addresses is given a rank, but the designation of a ranking
hierarchy is permissible. Ranks are assigned from a high of 0 to a
low of 254. A value of 255 indicates that no rank has been
assigned.
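
One way a receiver might act on the ranking field is sketched below.
The function names are hypothetical, and treating the best-ranked
address as the primary interface is an interpretation of the text
rather than something the memo mandates.

```python
from typing import List, Optional, Tuple

NO_RANK = 255  # value meaning "no rank has been assigned"


def order_by_rank(addresses: List[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Sort (address, rank) pairs from highest rank (0) to lowest.

    Unranked addresses carry 255, so they naturally sort last; the sort
    is stable, so equally ranked addresses keep their original order.
    """
    return sorted(addresses, key=lambda item: item[1])


def primary(addresses: List[Tuple[str, int]]) -> Optional[str]:
    """Return the best-ranked address, or None if nothing was ranked."""
    ranked = [item for item in addresses if item[1] != NO_RANK]
    if not ranked:
        return None
    return min(ranked, key=lambda item: item[1])[0]


learned = [("B0", NO_RANK), ("B1", 0), ("B2", 3)]
```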

EARP is essentially a simple request/response protocol just as is
standard ARP; the opcode field of the packet indicates whether
the sender is requesting an IP to link layer address mapping or is
returning one. The same field in an EARP packet, however, also
indicates the mode of the request or response. Two different modes
are possible. In normal mode, a request or response packet
contains a complete and valid address mapping for the sender. In
advisory mode, the packet contains a subset of the normally
complete address mapping of the sender. The best method for
demonstrating the difference is again with two examples.

On an FDDI LAN of two rings sharing a single IP subnet, a host
wishing to discover the IP address mapping for a remote host would
want to send a separate EARP request packet on each of its two rings
or logical rails. If both of its ring attachments were operational,
both request packets would be sent in normal mode. If, however, one
of the two ring attachments of the sender were disabled, then it
would send a single EARP request packet in advisory mode out over
the still functioning ring attachment. The receiving host would
then understand both that the sender is normally capable of sending
on both rings and that it cannot do so now. An obvious corollary
is that hosts which normally can send out on only a single ring
will never send an EARP packet in advisory mode. Since these hosts
never have more than one path on which to send a packet, they can
never send a warning on one path that their other path is non-
operational.

On Ethernet, a host wishing to establish an address mapping for a
remote host would send a single EARP request packet in advisory
mode when one of its several network devices was down. Only the
addresses of those devices currently capable of transmitting and
receiving would be included. If all of its network devices were
functional, the host would send out a single packet in normal mode.
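
The choice between normal and advisory mode in the two examples can
be summarized in a short sketch. The Interface type and the
plan_requests function are hypothetical; the opcode values 1 and 3
are those listed later in this section, and ranking is left out for
brevity.

```python
from typing import List, NamedTuple, Tuple

REQUEST_NORMAL = 1
REQUEST_ADVISORY = 3
NO_PATH = 255


class Interface(NamedTuple):
    hw_addr: str
    path: int    # rail number, or 255 on a single-rail LAN
    up: bool     # whether the device can currently send and receive


def plan_requests(interfaces: List[Interface],
                  multi_rail: bool) -> List[Tuple[int, List[Tuple[str, int]]]]:
    """Return one (opcode, [(hw_addr, path), ...]) item per packet to send."""
    working = [i for i in interfaces if i.up]
    if not working:
        return []   # nothing can be transmitted at all
    # Advisory mode signals that only a subset of the mapping is present.
    all_up = len(working) == len(interfaces)
    opcode = REQUEST_NORMAL if all_up else REQUEST_ADVISORY
    if multi_rail:
        # One packet per working rail, each with a single source address.
        return [(opcode, [(i.hw_addr, i.path)]) for i in working]
    # Single rail: one packet listing every working address.
    return [(opcode, [(i.hw_addr, NO_PATH) for i in working])]
```

A host with only one interface always has all of its interfaces up
whenever it can send, so this sketch never produces an advisory
packet for it, which matches the corollary stated above.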

The general packet format is thus,

     16 bits    Protocol version number
     16 bits    Hardware type code
     16 bits    Protocol type code
      8 bits    Number of octets in each hardware address (j)
      8 bits    Number of octets in each protocol address (k)
     16 bits    Opcode
     k octets   Protocol address of sender
     16 bits    Count of the sender's link layer addresses which follow

     For each link layer address associated with the given source IP
     address on a given path,
        j octets   Hardware address of sender
         8 bits    Corresponding path number or 255 decimal
         8 bits    A rank from 0 to 254 or 255 for no ranking.

     k octets   Protocol address of target
     j octets   Hardware address of target. For request packets, all
                zeros. For response packets, the first of the source
                hardware addresses in the original request packet.

The protocol version number allows for later enhancements of the
protocol. Its current value is 1.

The hardware and protocol type fields are encoded exactly as they
are in standard ARP with the single exception that single subnet IP
FDDI LANs use hardware type code 256. See RFC-1010 [6] for the
other hardware type codes.

The number of octets in a hardware address depends on the type of
device. For FDDI or Ethernet, the value is 6. The number of octets
in an IP address is 4.

The opcodes are
     1    Request in normal mode
     2    Response in normal mode
     3    Request in advisory mode
     4    Response in advisory mode
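
Since the opcode carries both the direction and the mode of a packet,
a receiver has to decode the two properties separately. A minimal
sketch follows; the enumeration and helper names are hypothetical,
while the numeric values are the ones listed above.

```python
from enum import IntEnum


class Opcode(IntEnum):
    REQUEST_NORMAL = 1
    RESPONSE_NORMAL = 2
    REQUEST_ADVISORY = 3
    RESPONSE_ADVISORY = 4


def is_request(opcode: Opcode) -> bool:
    """True for either request opcode, regardless of mode."""
    return opcode in (Opcode.REQUEST_NORMAL, Opcode.REQUEST_ADVISORY)


def is_advisory(opcode: Opcode) -> bool:
    """True when the packet carries only a subset of the sender's mapping."""
    return opcode in (Opcode.REQUEST_ADVISORY, Opcode.RESPONSE_ADVISORY)
```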

For IP, the protocol address of the sender is its four-octet
Internet address.

The count field is used to indicate how many <link layer address>
<path number><ranking> triplets are included for the sender. The
first triplet will correspond to the interface that is sending this
message. If a path is specified, then the count should be 1.
Otherwise, the count can be any positive integer value.
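
A receiver could decode the general format along the lines of the
sketch below. Two points are assumptions not stated in this memo:
multi-octet fields are taken to be in network byte order, as in
standard ARP, and trailing octets (for example link layer padding)
are tolerated. The names EarpPacket and parse_earp are hypothetical.

```python
import struct
from typing import List, NamedTuple, Tuple

NO_PATH = 255  # path value used on LANs with a single rail


class EarpPacket(NamedTuple):
    version: int
    hw_type: int
    proto_type: int
    opcode: int
    sender_ip: bytes
    sender_hw: List[Tuple[bytes, int, int]]  # (address, path, rank) triplets
    target_ip: bytes
    target_hw: bytes


def parse_earp(data: bytes) -> EarpPacket:
    fixed = struct.Struct("!HHHBBH")  # assumed big-endian, as in ARP
    if len(data) < fixed.size:
        raise ValueError("truncated header")
    version, hw_type, proto_type, hlen, plen, opcode = fixed.unpack_from(data, 0)
    offset = fixed.size
    if len(data) < offset + plen + 2:
        raise ValueError("truncated sender fields")
    sender_ip = data[offset:offset + plen]
    offset += plen
    (count,) = struct.unpack_from("!H", data, offset)
    offset += 2
    if count < 1:
        raise ValueError("count must be a positive integer")
    if len(data) < offset + count * (hlen + 2) + plen + hlen:
        raise ValueError("truncated address list or target fields")
    sender_hw = []
    for _ in range(count):
        address = data[offset:offset + hlen]
        path, rank = data[offset + hlen], data[offset + hlen + 1]
        sender_hw.append((address, path, rank))
        offset += hlen + 2
    # If a path is specified, the count should be 1.
    if count > 1 and any(path != NO_PATH for _, path, _ in sender_hw):
        raise ValueError("a packet that names a path carries one address")
    target_ip = data[offset:offset + plen]
    target_hw = data[offset + plen:offset + plen + hlen]
    return EarpPacket(version, hw_type, proto_type, opcode,
                      sender_ip, sender_hw, target_ip, target_hw)
```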

To return to the examples, the EARP host on the FDDI LAN would send
the following packet on ring 0 to specify normal mode,

     1          Protocol version number
     256        Hardware type code
     2048       Protocol type code
     6          Number of octets in each hardware address
     4          Number of octets in each protocol address
     1          Opcode
     4 octets   IP address for the single subnet including rings 0 and 1
     1          Count of the sender's link layer addresses which follow
     6 octets   Hardware address of the host on ring 0
     0          Path number
     255        No ranking specified
     4 octets   IP address of target
     6 octets   6 zeroed octets for the target hardware address

The EARP request packet on ring 1 would differ only in the single
hardware address included and in the path number. Neither ring
address would be given a rank if both paths were to be considered
as of equal weight.

For the above Ethernet example, the request packet might look as
follows.

     1          Protocol version number
     1          Hardware type code
     2048       Protocol type code
     6          Number of octets in each hardware address
     4          Number of octets in each protocol address
     1          Opcode
     4 octets   IP address for this Ethernet interface
     2          Count of the sender's link layer addresses which follow
     6 octets   First hardware address
     255        No path number specified
     0          This is the primary hardware address
     6 octets   Second hardware address
     255        No path number specified
     255        No ranking designated
     4 octets   IP address of target
     6 octets   6 zeroed octets for the target hardware address
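
The two example requests can be serialized with a few lines of code.
The sketch assumes network byte order for multi-octet fields, as in
standard ARP; the function name and all of the IP and hardware
addresses below are hypothetical values chosen for illustration.

```python
import struct
from typing import List, Tuple

NO_PATH = 255   # no path number specified
NO_RANK = 255   # no ranking designated


def pack_earp(hw_type: int, opcode: int, sender_ip: bytes,
              sender_hw: List[Tuple[bytes, int, int]],
              target_ip: bytes, target_hw: bytes,
              version: int = 1, proto_type: int = 2048) -> bytes:
    """Serialize one EARP packet; sender_hw holds (address, path, rank)."""
    hlen, plen = len(target_hw), len(sender_ip)
    # Fixed part: version, hardware type, protocol type, j, k, opcode.
    out = struct.pack("!HHHBBH", version, hw_type, proto_type, hlen, plen, opcode)
    out += sender_ip + struct.pack("!H", len(sender_hw))
    for address, path, rank in sender_hw:
        if len(address) != hlen:
            raise ValueError("hardware address length mismatch")
        out += address + struct.pack("!BB", path, rank)
    return out + target_ip + target_hw


my_ip, peer_ip = bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2])

# FDDI request on ring 0: one address, path 0, no rank, normal mode.
fddi_request = pack_earp(256, 1, my_ip,
                         [(bytes([2, 0, 0, 0, 0, 1]), 0, NO_RANK)],
                         peer_ip, bytes(6))

# Ethernet request: two addresses, no paths, the first marked primary.
ether_request = pack_earp(1, 1, my_ip,
                          [(bytes([2, 0, 0, 0, 0, 1]), NO_PATH, 0),
                           (bytes([2, 0, 0, 0, 0, 2]), NO_PATH, NO_RANK)],
                          peer_ip, bytes(6))
```

With six-octet hardware addresses and four-octet IP addresses, each
additional source address adds eight octets to the packet.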

Three assumptions about the underlying link layer services are made
in the final format of an EARP packet. First, it is assumed that
the underlying link or physical layer includes some sort of data
integrity check. For FDDI and Ethernet, the frame check sequence
allows a corrupted EARP frame to be detected and discarded.
Second, the underlying medium must support broadcast frames. All
request EARP packets are, in fact, sent to the broadcast address.
Response packets are encapsulated in unicast frames.

The third assumption is that the structure of the physical frame
or link layer encapsulation includes a two-octet type field which
can be used to de-multiplex received frames. On Ethernet, the
type field is part of the frame. For FDDI and IEEE 802.X LANs,
the type is part of the SNAP header included in the link layer
header. See RFC-1103 [1] and RFC-1042 [5] for more information.
The two-octet Ethertype value for EARP is TBD.

8. INPUT PROCESSING

If an EARP host receives an ARP request packet in which its IP
address is the target, it should return a standard ARP response with
the address of one of its network devices as the single hardware
source address. If it receives an EARP request directed to it,
then an EARP response should be returned.

There are two exceptions to this rule. An EARP host sometimes
broadcasts EARP request packets in order to warn other EARP hosts
that the status of one or more of its network devices has changed;
a device may have either just failed or just come back on-line.
The target IP address in such a request is the same as the source,
and the mode is advisory in the case of a failing device and normal
in the case of a recovering device. The reason for sending a request
with a host's own IP address as the target is that no other host
will then try to respond. The sending host will, of course, just
drop the packet when it is received.

In the original examples, when one of an FDDI station's two ring
attachments is no longer available, it sends an advisory request
packet out of its still functioning attachment with that
attachment's ring address to inform other EARP hosts that their
table mappings are no longer valid. When one or more of the
Ethernet host's network devices fails, it should send out an EARP
advisory request listing its still available hardware addresses.
This action alerts other clients for which the now unavailable
devices were designated as preferred that the devices are now
inoperative. When either the FDDI ring attachment or the failed
Ethernet device comes back on-line, normal request packets should
be sent, one on each ring for FDDI and one including all active
devices for Ethernet. Again, the source and target IP addresses
should both be that of the local interface, and the frame
destination address should be the broadcast address.

Extreme caution should be used in sending out packets to inform
other hosts of a change in network interface status. If a device
is cycling between on and off-line states, the effect on the LAN
can be disastrous. It is always best to be conservative when
transmitting broadcast frames, but since EARP packets tend to be
more complex to parse than ordinary ARP packets, a conservative
transmission policy is warranted even more than usual. One
method of guaranteeing a conservative approach is to use a deadman
timer. When one of its network devices fails, a host should set the
timer rather than sending out an advisory request packet. If when
the timer expires the device is still down, then the host has no
choice but to send the packet.
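
The deadman timer might be realized as in the following sketch. The
class, its method names, and the hold time are all assumptions; the
memo does not specify a timer value. Time is passed in explicitly so
that the logic does not depend on any particular clock.

```python
from typing import Optional


class DeadmanTimer:
    """Delays the advisory broadcast until a failure has persisted."""

    def __init__(self, hold_time: float) -> None:
        self.hold_time = hold_time          # hypothetical delay, in seconds
        self._deadline: Optional[float] = None

    def device_failed(self, now: float) -> None:
        # Arm the timer instead of broadcasting immediately.
        if self._deadline is None:
            self._deadline = now + self.hold_time

    def device_recovered(self) -> None:
        # The device came back before anything was announced: stay silent.
        self._deadline = None

    def should_announce(self, now: float) -> bool:
        """True exactly once, when the device is still down at expiry."""
        if self._deadline is not None and now >= self._deadline:
            self._deadline = None
            return True
        return False
```

A device that flaps faster than the hold time never triggers a
broadcast in this sketch, which is the conservative behavior the
paragraph above argues for.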

One other remark is in order before the exposition of the input
processing algorithm. Although RFC 826 specifies that the response
message should include the original source IP address as the target
protocol address, it would be better to use instead the IP address
of the receiving interface. In this way the receiving host can
guarantee that the original requestor receives a correct IP mapping
in return.

The algorithm is then,

Merge_flag = false.
If an entry for this IP address already exists, then
    If this is an EARP request, then
        If the host is not marked in the table as an EARP host, then
            Change the entry to show that this is an EARP host.
        If this is an advisory request message, then
            Invalidate all previous hardware addresses, paths, and
            address ranking indications.
        Update the table with the new hardware address(es).
        If a path number has been included, then
            Record the path number with the address in the table.
        Merge_flag = true
    Otherwise,
        If the host is not already marked as understanding EARP, then
            Update the table entry with the new hardware address.
        Merge_flag = true
If the target IP address is mine, then
    If the source IP address is also mine, then
        Stop. My own packets should be ignored.
    Otherwise,
        If this is an EARP request, then
            If Merge_flag is not set to true, then
                Add an EARP entry to the table.
                If a path number has been included, then
                    Record the path number with the address in the table.
            If a ranking has been specified, then
                Record the rank with the address in the table.
            Format and send an EARP response.
        Otherwise,
            If Merge_flag is not set to true, then
                Add a standard ARP entry.
            Format and send a standard ARP response.

It should be noted in the algorithm that the indication of an
address ranking is recorded only when taken from a packet addressed
explicitly to the local host. In this way, a large server host can
specify different preferred addresses to different hosts.
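
The input processing algorithm of this section can be sketched in
code as follows. The nesting of the steps is an interpretation of
the algorithm's wording, and the Packet and Entry types, the function
name, and the string return values are hypothetical. Addresses are
plain strings and the table is a dictionary keyed by IP address.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

NO_PATH = NO_RANK = 255


@dataclass
class Packet:
    is_earp: bool
    advisory: bool
    source_ip: str
    target_ip: str
    source_hw: List[Tuple[str, int, int]]   # (address, path, rank)


@dataclass
class Entry:
    is_earp: bool
    addrs: Dict[str, Tuple[int, int]] = field(default_factory=dict)  # hw -> (path, rank)


def process_request(table: Dict[str, Entry], my_ip: str,
                    pkt: Packet) -> Optional[str]:
    """Update the table; return "EARP" or "ARP" if a response is due."""
    merged = False
    entry = table.get(pkt.source_ip)
    if entry is not None:
        if pkt.is_earp:
            entry.is_earp = True
            if pkt.advisory:
                entry.addrs.clear()   # drop old addresses, paths, and ranks
            for hw, path, _rank in pkt.source_hw:
                old_rank = entry.addrs.get(hw, (NO_PATH, NO_RANK))[1]
                entry.addrs[hw] = (path, old_rank)
        elif not entry.is_earp:
            entry.addrs = {pkt.source_hw[0][0]: (NO_PATH, NO_RANK)}
        merged = True
    if pkt.target_ip != my_ip or pkt.source_ip == my_ip:
        return None   # not addressed to me, or my own status broadcast
    if pkt.is_earp:
        if not merged:
            entry = Entry(True, {hw: (path, NO_RANK)
                                 for hw, path, _ in pkt.source_hw})
            table[pkt.source_ip] = entry
        # Ranks are recorded only from packets addressed to this host.
        for hw, _path, rank in pkt.source_hw:
            if rank != NO_RANK:
                entry.addrs[hw] = (entry.addrs[hw][0], rank)
        return "EARP"
    if not merged:
        table[pkt.source_ip] = Entry(False, {pkt.source_hw[0][0]: (NO_PATH, NO_RANK)})
    return "ARP"
```

Note that a status broadcast, whose target IP address equals its
source, still updates the tables of listening hosts in the first half
of the function but never produces a response.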

9. SENDING ADDRESS RESOLUTION REQUESTS

A host which understands only standard ARP needs to distinguish
between only two types of remote host. One type will never respond
to an IP address mapping request because the host is either off-line
or non-existent. The other type of host can respond to an ARP
request but did not respond to the last request because it never
received the packet, because it had to drop the request due to
internal congestion, or because the response was lost in the network.
EARP hosts, however, also have to distinguish a third type of remote
host, one which drops EARP request packets because it does not
understand the new protocol.

The most important distinction to be made is that between hosts
which can participate in an EARP dialogue but did not respond to
the last EARP request packet and those hosts which cannot
participate at all. Both types of host will respond to an EARP
request in the same way, with silence, but with non-participatory
hosts that silence will be persistent. In most situations an EARP
host could feel reasonably certain that it would never receive an
EARP response from a remote host if after sending two EARP request
packets and after waiting twice for a response it still had not
received a packet. At that point, the host could try initiating a
standard ARP dialogue.

EARP hosts, however, will have to work in an environment in which
many other hosts will never learn to understand the new protocol,
and any time spent in trying to initiate an EARP dialogue with one of
them will be wasted. Since the time is wasted, it is better kept to
a minimum. EARP hosts should therefore send an EARP request only
once, and when the response timer expires, they should immediately
switch to an ARP dialogue.

Assuming then that either the IP layer has requested that a packet
be sent to a remote host for which no entry exists in the address
mapping table, or that the response timer for an incomplete entry
in the table has expired, an EARP host should act as follows.

If no request packets have yet been sent to the remote host, then
    Send the EARP request to the remote host.
    Set a flag in the entry that an EARP request has been sent.
    Start the response timer.
Otherwise,
    If a standard ARP request packet has not yet been sent, then
        Send an ARP request to the remote host.
        Set a flag in the entry that an ARP request has been sent.
        Re-start the response timer.
    Otherwise,
        Delete address resolution table entry for the remote host.
        Delete the timer for the entry.
        If the trigger was an IP request and not a timer event, then
            Return an error to IP.
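
The fallback from EARP to ARP amounts to a small state machine per
incomplete table entry. In the sketch below the Pending type, the
function name, and the action strings are hypothetical; starting and
re-starting the response timer is left to the caller.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Pending:
    earp_sent: bool = False
    arp_sent: bool = False


def resolve_step(pending: Dict[str, Pending], remote_ip: str,
                 from_timer: bool) -> str:
    """Return the next action for an unresolved remote IP address."""
    state = pending.setdefault(remote_ip, Pending())
    if not state.earp_sent and not state.arp_sent:
        state.earp_sent = True
        return "send-earp"        # caller starts the response timer
    if not state.arp_sent:
        state.arp_sent = True
        return "send-arp"         # caller re-starts the response timer
    # Both protocols have been tried once: give up on this entry.
    del pending[remote_ip]
    return "give-up" if from_timer else "error-to-ip"


pending: Dict[str, Pending] = {}
steps = [resolve_step(pending, "B", False),   # IP asks to send a packet
         resolve_step(pending, "B", True),    # EARP response timer expires
         resolve_step(pending, "B", True)]    # ARP response timer expires
```

Each protocol is tried exactly once, so a host that understands only
standard ARP costs the requester a single response timeout.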

The use of a new timer for timing out EARP and ARP responses is not
absolutely necessary. Transport layer re-transmission timers may
serve just as well if the IP and link layer interfaces are modified
to allow the transport layer protocol to flag whenever it is
re-transmitting. A request for a packet to be re-transmitted
would signal the need to request an address mapping again, while a
standard request to send a packet would not.

The design of the output algorithm for EARP includes the assumption
that sometimes an EARP host will be incorrectly designated as a
standard ARP host in a requesting host's address resolution table.
The assumption must be made because no network is 100% reliable:
some frames will inevitably be lost, and transmission delays due to
congestion can temporarily exceed the limits of even the most
scrupulously designed response timers. When an EARP host fails to
respond to an EARP request but does respond to a standard ARP
request, then the advantages of link layer path multiplexing and
demultiplexing will be lost in any communication between the
requesting and the responding host. Communication, however, will
still be possible, and it is quite probable that the error will be
detected since every EARP host hears every other EARP host's address
mapping requests.

10. SUMMARY

Communication between hosts which view the IP to link layer address
mapping as one to many is more reliable although more complex than
that between hosts which are limited to a simple one to one
mapping. In the future, Internet hosts may be known on all their
several link layer interfaces by a single IP address, but until
that time, EARP provides support for a better interconnection model
than that supported by standard ARP. Since the concepts of path
indication and address ranking are general, new conventions can be
instituted for the interpretation of the fields on new LANs and
older conventions can be dropped as current LANs evolve. The
re-interpretation of these two fields will allow for the
accommodation within EARP of new LANs with very different structures.

11. REFERENCES

[1] Katz, D., "A Proposed Standard for the Transmission of IP
Datagrams over FDDI Networks", RFC-1103, Merit/NSFNET,
June 1989.

[2] Lekashman, J., "Multi-Homed Hosts in an IP Network", NASA Ames
GE, April 1988.

[3] Plummer, David C., "An Ethernet Address Resolution Protocol",
RFC-826, MIT, November 1982.

[4] Postel, J., "Internet Protocol", RFC-791, USC/Information
Sciences Institute, September 1981.

[5] Postel, J., and Reynolds, J., "A Standard for the Transmission
of IP Datagrams over IEEE 802 Networks", RFC-1042,
USC/Information Sciences Institute, February 1988.

[6] Reynolds, J.K., and J. Postel, "Assigned Numbers", RFC-1010,
USC/Information Sciences Institute, May 1987.

12. AUTHORS' ADDRESSES

Douglas Bagnall                      Caralyn Brown
HP/Apollo Computer                   Prime Computer
300 Apollo Drive                     500 Old Connecticut Path
Chelmsford, MA 01824                 Framingham, MA 01701
Phone: (508) 256-6600 x4414          Phone: (508) 620-2800 x4237
Email: bagnall_d@apollo.hp.com       Email: cbrown@enr.prime.com

Mary Jane Strohl                     Douglas Hunt
HP/Apollo Computer                   Prime Computer
300 Apollo Drive                     500 Old Connecticut Path
Chelmsford, MA 01824                 Framingham, MA 01701
Phone: (508) 256-6600 x4421          Phone: (508) 620-2800 x????
Email: strohl@apollo.hp.com          Email: dhunt@enr.prime.com
