Counterpane comments, ASCII version

Stephen Kent <kent@bbn.com> Wed, 26 January 2000 17:17 UTC

Received: from lists.tislabs.com (portal.gw.tislabs.com [192.94.214.101]) by ns.secondary.com (8.9.3/8.9.3) with ESMTP id JAA27943; Wed, 26 Jan 2000 09:17:32 -0800 (PST)
Received: by lists.tislabs.com (8.9.1/8.9.1) id JAA28919 Wed, 26 Jan 2000 09:16:51 -0500 (EST)
Mime-Version: 1.0
X-Sender: kent@po1.bbn.com
Message-Id: <v04220800b4b34f753f15@[128.33.238.94]>
In-Reply-To: <200001241844.NAA16433@tonga.xedia.com>
References: <Pine.BSI.3.91.1000119203938.18717S-100000@spsystems.net> <4.2.1.20000120112510.00bfa600@mail.vpnc.org> <4.2.1.20000120173822.00b476f0@mail.vpnc.org> <38886529.E3D38477@bbn.com> <200001241844.NAA16433@tonga.xedia.com>
Date: Tue, 25 Jan 2000 08:04:01 -0500
To: ipsec@lists.tislabs.com
From: Stephen Kent <kent@bbn.com>
Subject: Counterpane comments, ASCII version
Content-Type: multipart/alternative; boundary="============_-1263317047==_ma============"
Sender: owner-ipsec@lists.tislabs.com
Precedence: bulk

My annotations are in brackets.

Steve
----------

\chapter*{Executive summary}
IPsec is a set of protocols that provides communication security for computers
using IP-based communication networks. It provides authentication and
confidentiality services on a packet level. To support the IPsec 
security, a key
management protocol called ISAKMP is used. ISAKMP uses public-key cryptographic
techniques to set up keys between the different parties to be used with IPsec.

Both IPsec and ISAKMP are too complex. [a protocol is too complex only relative
to a specified set of requirements that are satisfied by a simpler protocol. To
substantiate this observation, one ought to define the requirements that one
believes the protocol is trying to satisfy, and then offer a simpler
protocol.] This high complexity leads to errors. We have found 
security flaws in
both IPsec and ISAKMP, and expect that there are many more. We expect 
any actual
implementation to contain many more errors, some of which will cause security
weaknesses. These protocols give the impression of having been designed by a
committee: they try to be everything for everybody at the cost of complexity.
For normal standards, that is bad enough; for security systems, it is
catastrophic. In our opinion, the complexity of both IPsec and ISAKMP can be
reduced by a large factor without a significant loss of functionality.

IPsec is in better shape than ISAKMP. The description and definitions are
reasonably clear. A careful implementation of IPsec can achieve a good level of
security. Unfortunately, IPsec by itself is not a very useful 
protocol. Use on a
large scale requires the key management functions of ISAKMP. [while I 
would tend
to agree with this observation, I should note that a non-trivial 
number of IPsec
implementations, used in constrained contexts, are manually keyed.]

ISAKMP is currently not in a suitable state for implementation. Major work will
be required to get it to that point. There are many security-critical 
errors, as
well as many unnecessary cross-dependencies within the protocol. These should
all be eliminated before a new evaluation is done.

Based on our analysis, we recommend that IPsec and ISAKMP not be used for
confidential information. At the moment we cannot recommend a direct
alternative. Some applications might be able to use SSL 
\cite{SSLv3Nov96}, which
in our opinion is a much better protocol that provides a much higher level of
security when used appropriately.

\tableofcontents

\chapter{Introduction}

At the request of NSA, Counterpane has conducted a security review of the IPsec
and ISAKMP security protocols.

This evaluation is based on RFCs 2401--2411 and RFC 2451
\cite{RFC2401,RFC2402,RFC2403,RFC2404,RFC2405,RFC2406,RFC2407,RFC2408,RFC2409,RFC2410,RFC2411,RFC2451}.
The Oakley protocol \cite{RFC2412} is only an informational RFC; it is not part
of the standard and is not used in ISAKMP. RFC documents are available from
{\tt ftp:\slash\slash ftp.isi.edu\slash in-notes\slash rfc<n>.txt}.

As \cite{RFC2401} states: ``The suite of IPsec protocols and associated default
algorithms are designed to provide high quality security for Internet traffic.
However, the security offered by use of these protocols ultimately depends on
the quality of their implementation, which is outside the scope of this set
of standards.  Moreover, the security of a computer system or network is a
function of many factors, including personnel, physical, procedural,
compromising emanations, and computer security practices.  Thus IPsec is only
one part of an overall system security architecture.'' This evaluation only
deals with the IPsec and ISAKMP specifications and is not directly concerned
with any of the other factors. However, we do comment on aspects of the
specifications that affect other security factors.

IPsec and ISAKMP are highly complex systems. Unfortunately, we cannot give a
sufficiently detailed description of these systems in this document 
to allow the
reader to understand our comments without being familiar with IPsec and ISAKMP.
Our comments frequently refer to specific places in the RFC documents for ease
of reference.

The rest of this report is structured as follows. Chapter~\ref{chap:general}
gives some general comments. Chapter~\ref{chap:bulk} discusses the IPsec
protocols that handle bulk data. Chapter~\ref{chap:ISAKMP} discusses the ISAKMP
generic definitions. Chapter~\ref{chap:IPsecDOI} talks about the 
IPsec Domain of
Interpretation which gives more details on how the generic ISAKMP structure
applies to the IPsec protocols. Finally, chapter~\ref{chap:IKE} discusses the
IKE protocol that is the default key management protocol used with ISAKMP.

\chapter{General comments}\label{chap:general}

\section{Complexity}

Complexity is the biggest enemy of security. This might seem an odd 
statement in
the light of the many fielded systems that exhibit critical security failures
for very simple reasons. It is true nonetheless. The simple failures are simple
to avoid, and often simple to fix. The problem is not that we do not 
know how to
solve them; it is that this knowledge is often not applied. 
Complexity, however,
is a different beast because we do not really know how to handle it.

Designing any software system is always a matter of weighing various
requirements. These include functionality, efficiency, political acceptability,
security, backward compatibility, deadlines, flexibility, ease of use, and many
more. The unspoken requirement is often the complexity. If the system gets too
complex, it becomes too difficult, and therefore too expensive, to make. As
fulfilling more of the requirements usually involves a more complex 
design, many
systems end up with a design that is as complex as the designers and
implementors can reasonably handle.

Virtually all software is developed using a try-and-fix methodology. Small
pieces are implemented, tested, fixed, and tested again.\footnote{Usually
several iterations are required.} Several of these small pieces are combined
into a larger module, and this module is tested, fixed, and tested again. The
end result is software that more or less functions as expected, although we are
all familiar with the high frequency of functional failures of 
software systems.

This process of making fairly complex systems and implementing them with a
try-and-fix methodology has a devastating effect on the security. The
central reason
is that you cannot test for security. Therefore, security bugs are not detected
during the development process in the same way that functional bugs 
are. Suppose
a reasonably sized program is developed without any testing at all during
development and quality control. We feel confident in stating that the result
will be a completely useless program; most likely it will not perform 
any of the
desired functions correctly. Yet this is exactly what we get from the
try-and-fix methodology when we look at security.

The only reasonable way to ``test'' the security of a security product is to
perform security reviews on it.\footnote{A cracking contest can be seen as a
cheap way of getting other people to do a security analysis. The big problem is
interpreting the results. If the prize is not claimed, it does not imply that
any competent analysis was done and came up empty.} A security review is a
manual process; it is relatively expensive in terms of time and effort and it
will never be able to show that the product is in fact secure. [this seems to
ignore the approaches usually employed for high assurance system design and
implementation, i.e., careful design and review coupled with rigid development
procedures, all prior to testing.]

The more complex the system is, the harder a security evaluation 
becomes. A more
complex system will have more security-related errors in the specification,
design, and implementation. We claim that the number of errors and 
difficulty of
the evaluation are not linear functions of the complexity, but in 
fact grow much
faster.

For the sake of simplicity, let us assume the system has $n$ different options,
each with two possible choices.\footnote{We use $n$ as the measure of the
complexity. This seems reasonable, as the length of the system 
specification and
the implementation is proportional to $n$.} Then there are $n(n-1)/2 = O(n^2)$
different pairs of options that could interact in unexpected ways, and $2^n$
different configurations altogether. Each possible interaction can lead to a
security weakness, and the number of possible complex interactions that involve
several options is huge. As each interaction can produce a security 
weakness, we
expect that the number of actual security weaknesses grows very rapidly with
increasing complexity.
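
As a concrete check of these growth rates (the numbers are ours and purely
illustrative): with $n = 20$ binary options there are $20 \cdot 19 / 2 = 190$
pairs of options to examine and $2^{20} \approx 10^6$ possible configurations;
doubling to $n = 40$ gives $780$ pairs and $2^{40} \approx 10^{12}$
configurations. The space of configurations thus becomes unsearchable long
before the specification itself becomes unmanageably long.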

The same holds for the security evaluation. For a system with a moderate number
of options, checking all the interactions becomes a huge amount of work.
Checking every possible configuration is effectively impossible. Thus the
difficulty of performing security evaluations also grows very rapidly with
increasing complexity. The combination of additional (potential) weaknesses and
a more difficult security analysis unavoidably results in insecure systems.

In actual systems, the situation is not quite so bad; there are often options
that are ``orthogonal'' in that they have no relation or interaction with each
other. This occurs, for example, if the options are on different layers in the
communication system, and the layers are separated by a well-defined interface
that does not ``show'' the options on either side. For this very reason, such a
separation of a system into relatively independent modules with clearly defined
interfaces is a hallmark of good design. Good modularization can dramatically
reduce the ``effective'' complexity of a system without the need to eliminate
important features. Options within a single module can of course still have
interactions that need to be analyzed, so the number of options per module
should be minimized. Modularization works well when used properly, but most
actual systems still include cross-dependencies where options in different
modules do affect each other.

A more complex system loses on all fronts. It contains more weaknesses to start
with, it is much harder to analyze, and it is much harder to implement without
introducing security-critical errors in the implementation.

Complexity not only makes it virtually impossible to create a secure
implementation, it also makes the system extremely hard to manage. The people
running the actual system typically do not have a thorough understanding of the
security issues involved. Configuration options should therefore be kept to a
minimum, and the options should provide a very simple model to the 
user. Complex
combinations of options are very likely to be configured erroneously, which
results in a loss of security. The stories in \cite{TheCodebreakers} and
\cite{A:WhyFail} illustrate how management of complex systems is often the
weakest link.

Both IPsec and ISAKMP are too complex to be secure. The design obviously tries
to support many different situations with different options. We feel very
strongly that the resulting system is well beyond the level of complexity that
can be implemented securely with current methodologies.


\section{Stating what is achieved}
A security analysis evaluates the security aspects of a system. To be able to
give any sensible answer, it should be clear what properties the system claims
to have. That is, the system documentation should clearly state what security
properties are achieved. This can be seen as the functional 
specification of the
security properties. This applies not only to the entire system, but 
also to the
individual modules. At each module or function, the security properties should
be specified.

A good comparison is the testing of a product. The testing verifies that the
product performs according to the functional specifications. Without
specifications, the testers might have some interesting comments, but they can
never give a real answer.

Without security specifications, the first task of the security analysis is to
create descriptions of the security properties achieved, based on the perceived
intentions of the system designer. The subsequent evaluation might then turn up
problems of the form ``this function does not achieve the properties that we
think it should have.'' The obvious answer will be: ``but that is not the
properties that I designed it to have.'' Very quickly the discussion moves away
from the actual security into what was meant. The overall result is a security
evaluation that might point out some potential weaknesses, but that will hardly
help in improving the security.

The IPsec and ISAKMP protocols do not specify clearly which security properties
they claim to achieve. [RFCs 2401, 2402, and 2406 clearly state the security
services offered by the AH and ESP protocols.] The same holds for the modules
and functions. [modules are not specified by these standards; they are
implementation artifacts.] We recommend that each function, module, 
and protocol
be extended to include clear specifications regarding the security-related
functionality they achieve. We feel that unless this is done, it will not be
possible to perform an adequate security evaluation on a system of this
complexity.


\chapter{Bulk data handling}\label{chap:bulk}

In this chapter we discuss the methods used to handle the encryption and
authentication of the bulk data, as specified in
\cite{RFC2401,RFC2402,RFC2403,RFC2404,RFC2405,RFC2406,RFC2451,RFC2410,RFC2411}.
Together these documents specify the IPsec protocol. They specify the actual
encryption and authentication of packets, assuming that symmetric keys have
already been exchanged. We refer the reader to \cite{RFC2401} sections 1--4.2
for an overview of this part of IPsec and the relevant terminology.


\section{Functionality}
IPsec is capable of providing authentication and confidentiality services on a
packet level. The security configuration of an IPsec implementation is done
centrally, presumably by the system administrator. [In some environments, a
single administrator might control the configuration of each IPsec
implementation, or each user might have some control over it.  The latter would
tend to be characterized as a distributed management paradigm, not a central
one.  Also, two IPsec peers communicate ONLY if both agree on the security
parameters for the SA, i.e., there is suitable overlap in the SPDs.  In that
sense too, security configuration is distributed.]

IPsec is very suitable for creating a VPN over the Internet, improved security
for dial-in connections to portables, restricting access to parts of a network,
etc. These are very much network-level functions. IPsec by itself does not
supply application-level security. Authentication links the packet to the
security gateway of the originating network, the originating host, or possibly
the originating user, but not to the application in question or the data the
application was handling when it sent the packet. [true, but for many
applications, application layer security is not needed, and its implementation
might well be accorded less assurance than the network layer security provided
by IPsec. This paragraph seems to suggest that there is some important benefit
to linking data to an application, through an application-specific security
mechanism.  There are good examples of where this is true, e.g., e-mail and
directories. However, unless there are application-specific security semantics
that cannot be captured by use of an application security protocol, your own
arguments about simplicity, as well as a number of arguments re 
assurance, argue
against proliferation of application security protocols.]

The IPsec functionality can significantly increase the security of the network.
It is not a panacea for all security problems, and applications that require
security services will typically have to use other security systems in addition
to IPsec. [I might disagree with the term "typically" here. A lot 
depends on the
application, where IPsec is implemented, etc.]


\section{Complexity}\label{sec:complexity}
Our biggest criticism is that IPsec is too complex. There are too many options
that achieve the same or similar properties. [if they were completely 
equivalent
this would be a good basis for simplifying IPsec. However, there are subtle
differences that have resulted in the proliferation of options you address
below.]

\subsection{Options}

IPsec suffers from an abundance of options. For example, two hosts that want to
authenticate IP packets can use four different modes: transport/AH, tunnel/AH,
transport/ESP with NULL encryption, and tunnel/ESP with NULL encryption. The
differences between these options, both in functionality and performance, are
minor.

In particular, the following options seem to create a great deal of needless
complexity:

\begin{enumerate}
\item There are two modes that can be used: transport mode and tunnel mode. In
transport mode, the IP header of the packet is left untouched. AH authenticates
both the IP header and the packet payload. ESP encrypts and authenticates the
payload, but not the header. The lack of header authentication in transport/ESP
is a real weakness, as it allows various manipulations to be performed. In
tunnel mode, the full original IP packet (including headers) is used as the
payload in a new IP packet with new headers. The same AH or ESP functions are
used. As the original header is now included in the ESP authentication, the
transport/ESP authentication weakness no longer exists.

Transport mode provides a subset of the functionality of tunnel mode. The only
advantage that we can see to transport mode is that it uses a somewhat smaller
bandwidth. However, the tunnel mode could be extended in a straightforward way
with a specialized header-compression scheme that we will explain shortly. This
would achieve virtually the same performance as transport mode without
introducing an entirely new mode. We therefore recommend that the 
transport mode
be eliminated. [transport mode and tunnel mode address fundamentally different
requirements, from a networking point of view. When security gateways are
involved, the use of tunnel mode is an absolute requirement, whereas it is a
minor (and rarely used) feature for communications between end systems. A
proposal to make all traffic tunnel mode, and to try to offset the added
overhead through compression, seems to ignore the IPCOMP facility that is
already available to IPsec implementations. Today, transport mode is used
primarily to carry L2TP traffic, although this is primarily an efficiency
issue.]

\item There are two protocols: AH and ESP. AH provides authentication, and ESP
provides encryption, authentication, or both. In transport mode, AH provides a
stronger authentication than ESP can provide, as it also authenticates the IP
header. One of the standard modes of operation would seem to be to use both AH
and ESP in transport mode. [although this mode is required to be supported, it
seems to be rarely used today. A plausible, near-term use for AH is to provide
integrity and authenticity for IPsec traffic between an end system and a
first-hop intermediary. For example, AH can be used between a host inside an enclave
and a security gateway at the perimeter, to allow the SG to control 
what traffic
leaves the enclave, without granting the SG access to plaintext traffic. This,
and similar concatenated SA examples, motivate retention of AH. One could
achieve a similar effect with (authentication-only) ESP tunnels, but with
increased bandwidth and processing overhead.] In tunnel mode, the 
authentication
that ESP provides is good enough (it includes the IP header), and AH is
typically not combined with ESP \cite[section 4.5]{RFC2401}. [the example above
shows why one might wish to use AH for the outer header, but most likely with
ESP in transport mode.] (Implementations are not required to support nested
tunnels that would allow ESP and AH to both be used.)

The AH protocol \cite{RFC2402} authenticates the IP headers of the 
lower layers.
[AH authenticates the IP header at the SAME layer, in many respects. AH was
originally described as an IP (v4) option. In IPv6, AH is viewed as an
extension header, and may appear before other header extensions (see section
4.1 of RFC
2401). I agree that AH represents ugly layering, but it's not as bad as you
suggest here.] This creates all kind of problems, as some header fields change
in transit. As a result, the AH protocol needs to be aware of all data formats
used at lower layers so that these mutable fields can be avoided. [this is an
inaccurate characterization, especially given the status of AH re IPv6. Don't
think of AH as a transport protocol. It isn't.] This is a very ugly
construction, and one that will create more problems when future extensions to
the IP protocol are made that create new fields that the AH protocol is not
aware of. [RFC 2402 explains how to deal with new IP header fields in v6 (see
section 3.3.3.1.2.2). The existence of a mutability flag in such extensions
makes processing relatively straightforward.] Also, as some header fields are
not authenticated, the receiving application still cannot rely on the entire
packet. To fully understand the authentication provided by AH, an application
needs to take into account the same complex IP header parsing rules that AH
uses. The complex definition of the functionality that AH provides can easily
lead to security-relevant errors.

The tunnel/ESP authentication avoids this problem, but uses more 
bandwidth. [but
it does not provide exactly the same features, as noted above, so the
alternative is not quite equivalent.] The extra bandwidth requirement can be
reduced by a simple specialized compression scheme: for some suitably 
chosen set
of IP header fields $X$, a single bit in the ESP header indicates whether the
$X$ fields in the inner IP header are identical to the corresponding fields in
the outer header.\footnote{A trivial generalization is to have several flag
bits, each controlling a set of IP header fields.} The fields in question are
then removed to reduce the payload size. This compression should be applied
after computing the authentication but before any encryption. The 
authentication
is thus still computed on the entire original packet. The receiver 
reconstitutes
the original packet using the outer header fields, and verifies the
authentication. A suitable choice of the set of header fields $X$ allows
tunnel/ESP to achieve virtually the same low message expansion as transport/AH.
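(A sketch of this compression scheme appears in code form after this list.)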

We conclude that eliminating transport mode allows the elimination of the AH
protocol as well, without loss of functionality.  [counter examples provided
above suggest that this claim is a bit overstated.]

\item The standard defines two categories of machines: hosts and security
gateways. Hosts can use transport mode, but security gateways must always use
tunnel mode. Eliminating transport mode would also allow this distinction to be
eliminated. Various computers could of course still function as hosts or
security gateways, but these different uses would no longer affect 
the protocol.

\item The ESP protocol allows the payload to be encrypted without being
authenticated. In virtually all cases, encryption without authentication is not
useful. The only situation in which it makes sense not to use authentication in
the ESP protocol is when the authentication is provided by a subsequent
application of the AH protocol (as is done in transport mode because ESP
authentication in transport mode is not strong enough). [this is one example of
when one might not need authentication with ESP, but it is not the only one. In
general, if there is a higher layer integrity and/or authentication function in
place, providing integrity/authentication in IPsec is redundant, both in terms
of space and processing. The authentication field for ESP or AH is 12 
bytes. For
applications where packet sizes are quite small, and for some 
environments where
packet size is of critical importance, e.g., packet voice in a wireless
environment, ESP w/o authentication may be appropriate. This is especially true
if the application protocol embodies an authentication mechanism. This might
happen if the application protocol wants to offer uniform protection
irrespective of the lower layers.  Admittedly, this might also cause the
application to offer confidentiality as well, but depending on the application,
the choices of what security services are being offered may vary.] Without the
transport mode to worry about, ESP should always provide its own 
authentication.
We recommend that ESP authentication always be used, and only 
encryption be made
optional. [the question of authentication as an intrinsic part of ESP is
independent of mode, i.e., whether one choose to provide authentication as a
part of ESP is not determined by the choice of transport vs. tunnel mode.]

\end{enumerate}
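
The following is a minimal sketch, in Python, of the header-compression scheme
proposed in item 1 above. It is illustrative only: the field set $X$, the
dictionary representation of headers, and the function names are our own
assumptions; real code would operate on raw IP headers, applying the
compression after computing the authentication and before any encryption.

\begin{verbatim}
X = ("src", "dst", "tos")   # assumed set of elidable header fields

def compress(inner_hdr, outer_hdr):
    # Sender: set the flag bit and drop the X fields if they duplicate
    # the outer header (authentication was already computed over the
    # full inner packet).
    if all(inner_hdr[f] == outer_hdr[f] for f in X):
        kept = {f: v for f, v in inner_hdr.items() if f not in X}
        return {"x_elided": True, "hdr": kept}
    return {"x_elided": False, "hdr": dict(inner_hdr)}

def decompress(payload, outer_hdr):
    # Receiver: reconstitute the inner header from the outer header,
    # then verify the authentication on the reconstituted packet.
    hdr = dict(payload["hdr"])
    if payload["x_elided"]:
        for f in X:
            hdr[f] = outer_hdr[f]
    return hdr

inner = {"src": "10.1.2.3", "dst": "10.4.5.6", "tos": 0, "ttl": 64}
outer = {"src": "10.1.2.3", "dst": "10.4.5.6", "tos": 0, "ttl": 255}
assert decompress(compress(inner, outer), outer) == inner
\end{verbatim}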

We can thus remove three of the four operational modes without any significant
loss of functionality. [sorry, can't agree, given the counter examples above.]

\subsection{Undesirable options}
There are existing combinations of options that are undesirable. These pose a
problem when non-experts have to configure an IPsec installation. 
Given the fact
that experts are rare and usually have better things to do, most IPsec
installations will be configured by non-experts. [yes, we were aware of this
concern. However, there is always a tradeoff between adopting the "we know
what's best for you" approach, vs. the "you can screw it up if you want to"
approach. We opted for a point somewhere along this spectrum, but
not at either
end.]

\begin{enumerate}
\item In transport mode, use of ESP provides authentication of the 
payload only.
The authentication excludes the IP headers of the packet. The result is a data
stream that is advertised as ``authenticated'' for which critical pieces of
information (such as the source and destination IP number) are not
authenticated. Unless the system administrator is intimately familiar with the
different forms of authentication used by IPsec, it is quite likely that the
administrator will assume that the authentication protects the entire packet.
The combination of transport mode and the ESP protocol (without the 
AH protocol)
should therefore not be allowed. [The IP source and destination address are
covered by the TCP checksum, which is covered by the ESP integrity check, so
this does limit (a tiny bit) the ability to change these values without
detection. A more significant observation is that transport mode IPsec SAs will
probably always use source and/or destination IP addresses as part of the
selector set. In such cases, tampering with either address will result in a
failed authentication check.]

\item The standard allows ESP to be used with the NULL encryption, such that it
provides only authentication. The authentication provided by ESP in transport
mode is less functional than the authentication provided by AH, at a similar
cost. If transport mode is retained, either the ESP
authentication should be
extended or the use of ESP with only authentication should be forbidden and
replaced by the use of AH. [ESP authentication is more efficient to 
compute than
AH, because of the selective IP header coverage provided by AH.  Thus there is
good reason to allow authentication-only ESP as an alternative to AH. 
This point
was debated by the group and, with implementation experience, vendors came to
agree that this is true.]

\item The ESP protocol can provide encryption without authentication. This does
not make much sense in an application. It protects the application against
passive eavesdroppers, but provides no protection against active attacks that
are often far more devastating. Again, this mode can lure non-expert users into
using an unsafe configuration that they think is secure. Encryption without
authentication should be forbidden. [as noted above, there are examples where
this feature set for ESP is attractive.]

\end{enumerate}

\subsection{Orthogonality}
IPsec also suffers from a lack of orthogonality. The AH and ESP 
protocols can be
used together, but should only be used in one particular order. In transport
mode, ESP by itself provides only partial authentication of the IP packet, and
using AH too is advisable. [not in most cases, as noted above.] In tunnel mode
the ESP protocol authenticates the inner headers, so use of AH is no longer
required. These interdependencies between the choices demonstrate that these
options are not independent of each other. [true, but who says that this is a
critical criterion? TCP and IP are not orthogonal either, e.g., note the TCP
checksum covering parts of the IP header.]

\subsection{Compatibility}
The IPsec protocols are also hampered by the compatibility requirements. A
simple problem is the TOS field in the IP header \cite[p.\ 10]{RFC2402}.
Although this is supposed to be unchanged during the transmission of a packet
(according to the IP specifications), some routers are known to change this
field. IPsec chose to exclude the TOS field from the authentication provided by
the AH protocol to avoid errors introduced by such rogue routers. The result is
that, in transport/AH packets that have an authenticated header, the TOS field
is not authenticated. This is clearly unexpected from the application point of
view, which might want to rely on the correct value of the TOS field. This
problem does not occur in tunnel mode. [it is unfortunate that cisco chose to
not follow the specs here, and in several other places. I agree that an
unenlightened system administrator might be surprised in this case. But, in
practice, the effect is minimal.  Your example cites transport mode, 
which means
that the TOS bits are being acted upon by the end system. If end systems really
paid attention to these bits in the first place, cisco would not have been able
to corrupt them with impunity! The reason that these bits are being re-used by
the ECN folks is because hosts have never made use of them.  Still, going
forward, one should pay attention to this vulnerability.]

A more complex compatibility problem is the interaction between fragmentation
and IPsec \cite[appendix B]{RFC2401}. This is a complex area, but a typical
IPsec implementation has to perform specialized processing to facilitate the
proper behavior of higher-level protocols in relation to 
fragmentation. Strictly
speaking, fragmentation is part of the communication layer below the IPsec
layer, and in an ideal world it would be transparent to IPsec. Compatibility
requirements with existing protocols (such as TCP) force IPsec to explicitly
handle fragmentation issues, which adds significantly to the overall 
complexity.
Unfortunately, there does not seem to be an elegant solution to this problem.
[The requirement here is the same one that arises whenever an intermediate system
adds info to a packet, or when a smaller MTU intermediate system is traversed.
IPsec in an SG is doing what a router along a path would do if the "other side"
network were smaller. IPsec in a host is doing what the NIC would do if the LAN
MTU changed. The real complexity arises when we wish to do this optimally, at a
security gateway or a BITS or BITW implementation, in cases where different SAs
use different combinations of AH and ESP, or different algorithms, etc.]

\subsection{Conclusion}
The overall result is that IPsec bulk data handling is overly complex. In our
opinion it is possible to define an equivalent system that is far less complex.


\section{Order of operations}

\subsection{Introduction}
When both encryption and authentication are provided, IPsec performs the
encryption first, and authenticates the ciphertext. In our opinion, this is the
wrong order. Going by the ``Horton principle'' \cite{WS:SSL30}, the protocol
should authenticate what was meant, not what was said. The ``meaning'' of the
ciphertext still depends on the decryption key used. Authentication should thus
be applied to the plaintext (as it is in SSL \cite{SSLv3Nov96}), and not to the
ciphertext. [The order of processing is intentional. It is explicitly
designed to
allow a receiver to discard a packet as quickly as possible, in the 
event of DoS
attacks, as you acknowledge below. The suggestion that this concern 
be addressed
by the addition of a secondary MAC seems to violate the spirit of simplicity
that this document espouses so strongly, and the specific proposed fix is not
strong enough to warrant its incorporation. Moreover, this ordering allows
parallel processing at a receiver, as a means of increasing throughput and
reducing delay.]
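
To make the two orderings concrete, here is a minimal sketch in Python. AES in
counter mode and HMAC-SHA1 truncated to 96 bits (the truncation used by
\cite{RFC2404}) serve as stand-ins; the function names and packet framing are
our own, not those of any RFC.

\begin{verbatim}
import hmac, hashlib, os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def encrypt(key, nonce, data):
    return Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor().update(data)

def ipsec_order(enc_key, auth_key, nonce, plaintext):
    # ESP as specified: encrypt first, then authenticate the ciphertext.
    ct = encrypt(enc_key, nonce, plaintext)
    tag = hmac.new(auth_key, ct, hashlib.sha1).digest()[:12]  # 96-bit tag
    return ct, tag

def horton_order(enc_key, auth_key, nonce, plaintext):
    # Our recommendation: authenticate the plaintext, then encrypt both.
    tag = hmac.new(auth_key, plaintext, hashlib.sha1).digest()[:12]
    return encrypt(enc_key, nonce, plaintext + tag)

enc_key, auth_key, nonce = os.urandom(16), os.urandom(16), os.urandom(16)
ct, tag = ipsec_order(enc_key, auth_key, nonce, b"example payload")
\end{verbatim}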

This does not always lead to a direct security problem. In the case of the ESP
protocol, the encryption key and authentication key are part of a 
single ESP key
in the SA. A successful authentication shows that the packet was sent 
by someone
who knew the authentication key. The recipient trusts the sender to 
encrypt that
packet with the other half of the ESP key, so that the decrypted data 
is in fact
the same as the original data that was sent. The exact argument why this is
secure gets to be very complicated, and requires special assumptions about the
key agreement protocol. For example, suppose an attacker can manipulate the key
agreement protocol used to set up the SA in such a way that the two parties get
an agreement on the authentication key but a disagreement on the 
encryption key.
When this is done, the data transmitted will be authenticated successfully, but
decryption takes place with a different key than encryption, and all the
plaintext data is still garbled. [The fundamental assumption is that an ESP SA
that employs both encryption and an HMAC will have the keys bound together,
irrespective of the means by which they are generated. This assumption probably
could be better stated in the RFCs.]

In other situations, the wrong order does lead to direct security weaknesses.

\subsection{An attack on IPsec}
Suppose two hosts have a manually keyed transport-mode AH-protocol SA, which we
will call SAah. Due to the manual keying, the AH protocol does not provide any
replay protection. These two hosts now negotiate a transport-mode
encryption-only ESP SA (which we will call SAesp1) and use this to send information using
both SAesp1 and SAah. The application can expect to get confidentiality and
authentication on this channel, but no replay protection. When the immediate
interaction is finished, SAesp1 is deleted. A few hours later, the two hosts
again negotiate a transport-mode encryption-only ESP SA (SAesp2), and the
receiver chooses the same SPI value for SAesp2 as was used for SAesp1. Again,
data is transmitted using both SAesp2 and SAah. The attacker now introduces one
of the packets from the first exchange. This packet was encrypted using SAesp1
and authenticated using SAah. The receiver checks the authentication and finds
it valid. (As replay protection is not enabled, the sequence number field is
ignored.) The receiver then proceeds to decrypt the packet using SAesp2, which
presumably has a different decryption key than SAesp1. The end result is that
the receiver accepts the packet as valid, decrypts it with the wrong key, and
presents the garbled data to the application. Clearly, the authentication
property has been violated. [this attack is not a criticism of the 
choice of ESP
operation ordering, but rather the notion of applying AH and ESP (encryption
only) in a particular order, as allowed by RFC 2401. The specific 
combination of
keying operations described here, though not prohibited by 2401, does not seem
likely to occur in practice. Specifically, if an IPsec implementation supports
automated key management, as described above for the ESP SAs, then it is highly
unlikely that the AH SA would be manually keyed. The push to retain manual
keying as a base facility for IPsec is waning, and most 
implementations have IKE
available.  Under these circumstances, this vulnerability is unlikely to be
realized.]

\subsection{Other considerations}
Doing the encryption first and authentication later allows the recipient to
discard packets with erroneous authentication faster, without the overhead of
the decryption. This helps the computer cope with denial-of-service attacks in
which a large number of fake packets eat up a lot of CPU time. We question
whether this would be the preferred mode of attack against a TCP/IP-enabled
computer. If this property is really important, a 1- or 2-byte MAC (Message
Authentication Code) on the ciphertext could be added. The MAC code allows the
recipient to rapidly discard virtually all bogus packets at the cost of an
additional MAC computation per packet. [a one or two byte MAC 
provides so little
protection that this does not seem to be an attractive counter-proposal. Also,
as noted above, it adds complexity \ldots]
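
For concreteness, a minimal sketch of the proposed pre-check; the 2-byte
truncation, the key separation, and the packet framing are our own assumptions.

\begin{verbatim}
import hmac, hashlib

def quick_tag(tag_key, ciphertext):
    # A short MAC over the ciphertext. It is not the packet's real
    # integrity check; it only lets the receiver drop forged packets
    # before doing any expensive work.
    return hmac.new(tag_key, ciphertext, hashlib.sha1).digest()[:2]

def receive(tag_key, packet):
    body, tag = packet[:-2], packet[-2:]
    if not hmac.compare_digest(quick_tag(tag_key, body), tag):
        return None  # cheap rejection: no decryption, no full MAC check
    # ... full authentication and decryption would follow here ...
\end{verbatim}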

\subsection{Conclusion}
The ordering of encryption and authentication in IPsec is wrong. Authentication
should be applied to the plaintext of the payload, and encryption should be
applied after that.


\section{Security Associations}
A Security Association (SA) is a simplex ``connection'' that affords security
services to the traffic carried by it \cite[section 4]{RFC2401}. The two
computers on either side of the SA store the mode, protocol, algorithms, and
keys used in the SA. Each SA is used only in one direction; for bidirectional
communications two SAs are required. Each SA implements a single mode and
protocol; if two protocols (such as AH and ESP) are to be applied to a single
packet, two SAs are required.
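
For concreteness, a sketch of the per-SA state and of the SAD lookup by the
(SPI, destination, protocol) triple described in \cite[section 4.4]{RFC2401};
the field names and types are our own.

\begin{verbatim}
from dataclasses import dataclass

@dataclass
class SA:
    spi: int          # Security Parameters Index, chosen by the receiver
    dst: str          # destination address
    protocol: str     # "AH" or "ESP"
    mode: str         # "transport" or "tunnel"
    algorithm: str    # negotiated cipher and/or MAC
    keys: bytes       # symmetric key material

# The SAD, indexed by the triple that uniquely identifies an SA. Because
# each SA is simplex, bidirectional traffic needs one SA per direction,
# and applying AH and ESP to one packet needs two SAs per direction.
sad = {}

def add_sa(sa):
    sad[(sa.spi, sa.dst, sa.protocol)] = sa

def lookup_sa(spi, dst, protocol):
    return sad.get((spi, dst, protocol))
\end{verbatim}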

Most of our aforementioned comments also affect the SA system; the use of two
modes and two protocols make the SA system more complex than necessary.

There are very few (if any) situations in which a computer sends an 
IP packet to
a host, but no reply is ever sent. [we have a growing number of apps where this
functionality may be appropriate. For example, broadcast packet video feeds and
secure time feeds are unidirectional.] There are also very few situations in
which the traffic in one direction needs to be secured, but the traffic in the
other direction does not need to be secured. It therefore seems that in
virtually all practical situations, SAs occur in pairs to allow bidirectional
secured communications. In fact, the IKE protocol negotiates SAs in pairs. [IKE
has not always been well coordinated with IPsec, unfortunately. This is why we
have to have null encryption and null authentication algorithms. So, I don't
think one should cite IKE behavior as a basis for making SAs bi-directional. I
agree that the vast majority of examples that we see now are full 
duplex, but we
have example where this may not apply, as noted above.]

This would suggest that it is more logical to make an SA a bidirectional
``connection'' between two machines. This would halve the number of SAs in the
overall system. It would also avoid asymmetric security 
configurations, which we
think are undesirable (see section~\ref{sec:SPD}). [The SPI, which is used as a
primary de-multiplexing value, must be chosen locally, by the receiver, so
having bi-directional SAs probably won't change the size of the SAD
substantially. Specifically, how do you envision that a switch to
bi-directionality would simplify implementations?]

\section{Security policies}\label{sec:SPD}
The security policies are stored in the SPD (Security Policy Database). For
every packet that is to be sent out, the SPD is checked to find how the packet
is to be processed. The SPD can specify three actions: discard the packet, let
the packet bypass IPsec processing, or apply IPsec processing. In the 
last case,
the SPD also specifies which SAs should be used (if suitable SAs have already
been set up) or specifies with what parameters new SAs should be set up to be
used.
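
A minimal sketch of this outbound processing model; the selector names and the
rule representation are our own, and a real SPD matches on many more selectors.
In the PROTECT case a real rule would additionally name the SA (or the
parameters for negotiating one), which we omit here.

\begin{verbatim}
# Ordered rule list: the first matching entry determines the action.
SPD = [
    {"sel": {"dst_port": 23},  "action": "DISCARD"},  # drop telnet
    {"sel": {"proto": "ICMP"}, "action": "BYPASS"},   # skip IPsec
    {"sel": {},                "action": "PROTECT"},  # default: apply IPsec
]

def spd_lookup(pkt):
    for rule in SPD:
        if all(pkt.get(k) == v for k, v in rule["sel"].items()):
            return rule["action"]
    return "DISCARD"  # no matching policy: drop the packet

assert spd_lookup({"proto": "TCP", "dst_port": 23}) == "DISCARD"
assert spd_lookup({"proto": "TCP", "dst_port": 80}) == "PROTECT"
\end{verbatim}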

The SPD seems to be a very flexible control mechanism that allows a very
fine-grained control over the security processing of each packet. Packets are
classified according to a large number of selectors, and the SPD can match some
or all selectors to determine the appropriate action. Depending on 
the SPD, this
can result in either all traffic between two computers being carried 
on a single
SA, or a separate SA being used for each application, or even each TCP
connection. Such a very fine granularity has disadvantages. There is a
significantly increased overhead in setting up the required SAs, and more
traffic analysis information is made available to the attacker. At 
the same time
we do not see any need for such a fine-grained control. [a lot of customers for
IPsec products disagree!] The SPD should specify whether a packet should be
discarded, should bypass any IPsec processing, requires authentication, or
requires authentication and encryption. Whether several packets are combined on
the same SA is not important. [yes it is. By allowing an administrator the
ability to select the granularity of protection, one can control the level of
partial traffic flow confidentiality offered between security gateways. Also,
fine-grained access control allows an admin to allow some forms of connections
through the gateway, while rejecting others. Access control is often the
primary, underlying motivation for using IPsec. A number of attacks become
possible if one cannot tightly bind the authentication provided by IPsec to the
access control decision. Also, given the computational costs of SA 
establishment
via IKE, it is important to allow an administrator to select the granularity of
SAs.] The same holds for the exact choice of cryptographic algorithm: any good
algorithm will do. There are two reasons for this. First of all, nobody ever
attacks a system by cryptanalysis. Instead, attacks are made on the users,
implementation, management, etc. Any reasonable cryptographic algorithm will
provide adequate protection. The second reason is that there are very efficient
and secure algorithms available. Two machines should negotiate the strongest
algorithm that they are allowed. There is no reason to select individual
algorithms on an application-by-application basis. [if one were to employ ESP
without authentication, because a specific higher layer protocol provided its
own authentication, and maybe because the application employed FEC, then one
might well imagine using different encryption algorithms, or different modes
(e.g., block vs. stream) for different SAs. while I agree that the focus on
algorithm agility may be overstated, it does allow communicating parties to
select a higher quality algorithm, relative to the mandated default, if they
both support that algorithm.]

In our opinion, management of the IPsec protocols can be simplified by letting
the SPD contain policies formulated at such a higher level. As we argued in
section~\ref{sec:complexity}, simplification will strengthen the actual system.
[examples provided above illustrate why fine-grained access control is
important.]

It would be nice if the same high-level approach could be done in relation to
the choice of SA end-points. As there currently does not seem to be a reliable
automatic method of detecting IPsec-enabled security gateways, we do not see a
practical alternative to manual configuration of these parameters. It is
questionable whether automatic detection of IPsec-enabled gateways is possible
at all. Without some initial knowledge of the other side, any detection and
negotiation algorithm can be subverted by an active attacker. [the authors
identify a good problem, but it is hardly an unsolvable one. A proposal was put
forth (by Bob Moskowitz, over a year ago) to include records in the DNS
analogous to MX records. When one tried to establish an SA to a host 
"behind" an
SG, fetching this record would direct the initiator to an appropriate SG.  This
solves the SG discovery problem. Other approaches have been put forth in the
more recent BBN work on security policy management, which forms the basis for a
new IETF WG, chaired by Luis Sanchez. The fact that none of the approaches has
been deployed says more about the priorities of IPsec vendors and 
early adopters
than about the intractability of the problem. The other part of the problem is
verifying that an SG is authorized to represent the SA target. Here 
too, various
approaches have been described on the IPsec mailing list.]

\section{General comments}
This section contains general comments that came up during our evaluation of
IPsec.

\begin{enumerate}
\item In \cite[p.\ 22]{RFC2401}, several fields in the SAD are required for all
implementations, but only used in some of them. It does not make sense to
require the presence of fields within an implementation. Only the external
behavior of the system should be standardized. [the SAD defined in 2401 is
nominal, as the text explains. An implementation is not required to implement
these fields, but must exhibit behavior consistent with the presence of these
fields. We were unable to specify external behavior without reference to a
construct of this sort. The SPD has the same property.]

\item According to \cite[p.\ 23]{RFC2401}, an SA can be either for transport
mode, tunnel mode, or ``wildcard,'' in which case the sending application can
choose the mode on a packet-by-packet basis. Much of the rest of the text does
not seem to take this possibility into account. It also appears to us to be
needless complexity that will hardly ever be used, and is never a
necessity. We
have already argued that transport mode should be eliminated, which 
implies that
this option is removed too. If transport mode is to be retained, we would
certainly get rid of this option. [I agree, but at least one knowledgeable WG
member was quite adamant about this. So, chalk it up to the committee process!]

\item IPsec does not allow replay protection on an SA that was 
established using
manual key management techniques. This is a strange requirement. We 
realize that
the replay protection limits the number of packets that can be transmitted with
the SA to $2^{32}-1$. Still, there are applications that have a low data rate
where replay protection is important and manual keying is the easiest solution.
[elsewhere this critique argues for not presenting options in a standard that
can be misconfigured. Yet here, the authors make an argument for just such an
option! The WG decided that there was too great a chance that a manually keyed
SA would fail to maintain counter state across key lifetime and thus made a
value judgement to ban anti-replay in this context.]

\item \cite[section 5.2.1, point 3]{RFC2401} suggests that an 
implementation can
find the matching SPD entry for a packet using back-pointers from the SAD
entries. In general this will not work correctly. Suppose the SPD contains two
rules: the first one outlaws all packets to port $X$, and the second one allows
all incoming packets that have been authenticated. An SA is set up for this
second rule. The sender now sends a packet on this SA addressed to port $X$.
This packet should be refused as it matches the first SPD rule. However, the
backpointer from the SA points to the second rule in the SPD, which allows the
packet. This shows that back-pointers from the SA do not always point to the
appropriate rule, and that this is not a proper method of finding the relevant
SPD entry. [this is point #3 and is applied only after points #1 and #2. Since
point #1 calls for a linear search of the SPD, the packet would be rejected, as
required. Thus point #3 is not in error.]
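
To make the counterexample concrete (the rule representation is ours):

\begin{verbatim}
PORT_X = 23
spd = [
    ("rule1", lambda p: p["dst_port"] == PORT_X, "DISCARD"),
    ("rule2", lambda p: p["authenticated"],      "ALLOW"),
]

pkt = {"dst_port": PORT_X, "authenticated": True}

# Walking the SPD in order: rule 1 fires first and the packet is dropped.
_, _, in_order = next(r for r in spd if r[1](pkt))
# Back-pointer shortcut: the SA was created for rule 2, so the pointer
# skips rule 1 entirely and wrongly allows the packet.
via_pointer = spd[1][2]

assert in_order == "DISCARD" and via_pointer == "ALLOW"
\end{verbatim}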

\item The handling of ICMP messages as described in \cite[section 
6]{RFC2401} is
unclear to us. It states that an ICMP message generated by a router must not be
forwarded over a transport-mode SA, but transport mode SAs can only occur in
hosts. By definition, hosts do not forward packets, and a router never has
access to a transport-mode SA. [the text in the beginning of section 6 is
emphasizing that an SA from a router to a host or security gateway, must be a
tunnel mode SA, vs. a transport mode SA. If we didn't make this clear, someone
might choose to establish a transport mode SA from an intermediate system, and
this would cause the source address checks to fail under certain circumstances,
as noted by the text.]

The text further suggests that unauthenticated ICMP messages should be
disregarded. This creates problems. Let us envision two machines that are
geographically far apart and have a tunnel-mode SA set up. There are probably a
dozen routers between these two machines that forward the packets. 
None of these
routers knows about the existence of the SA. Any ICMP messages relating to the
packets that are sent will be unauthenticated and unencrypted. Simply 
discarding
these ICMP messages results in a loss of IP functionality. This problem is
mentioned, but the text claims this is due to the routers not implementing
IPsec. Even if the routers implement IPsec, they still cannot send 
authenticated
ICMP messages about the tunnel unless they themselves set up an SA with the
tunnel end-point for the purpose of sending the ICMP packet. The tunnel
end-point in turn wants to be sure the source is a real router. This requires a
generic public-key infrastructure, which does not exist. [RFC 2401 clearly
states the dangers associated with blindly accepting unauthenticated ICMP
messages, and the functionality problems associated with discarding such
messages. System administrators are provided with the ability to make this
tradeoff locally. The first step to addressing this problem is the addition of
IPsec into routers, as stated in the RFC. Only then does one face the need to
have a PKI that identifies routers. Yes, this second PKI does not exist, but a
subset of it (at BGP routers) might be established if the S-BGP technology is
deployed. These are the routers most likely to issue ICMP PMTU 
messages. So, the
answer here is that the specifications allow site administrators to make
security/functionality tradeoffs, locally. The longer term solution described
would require routers to implement IPsec, so that they can send authenticated
ICMP messages. Yes, this would require a PKI, but such a PKI may 
arise for other
reasons.]

As far as we understand this problem, this is a fundamental compatibility
problem with the existing IP protocol that does not have a good solution.

\item \cite[section 6.1.2.1]{RFC2401} lists a number of possible ways of
handling ICMP PMTU messages. An option that is not mentioned is to keep a
limited history of packets that were sent, and to match the header inside the
PMTU packet to the history list. This can identify the host where the packet
that was too large originated. [the approach suggested by the authors was
rejected as imposing too much of a burden on an SG. section 6.1.2.1 offers
options (not suggestions) for an SG to respond to ICMP PMTU messages, including
heuristics to employ when not enough information is present in the returned
header. These options may not be as responsive as a strategy that caches
traffic on
each SA, but they are modest in the overhead imposed. Also, an SA 
that carries a
wide range of traffic (not fine-grained) might not benefit from a limited
traffic history, as the traffic that caused the ICMP might well be from a host
whose traffic has been flushed from the "limited history."]

\item \cite[section 7]{RFC2401} mentions that each auditable event in 
the AH and
ESP specifications lists a minimum set of information that should be 
included in
the audit-log entry. Not all auditable events defined in \cite{RFC2406} include
that information. [you're right. Exactly one auditable event in 2406 does not
specify the list of data that SHOULD be audited. We'll fix that in the next
pass.] Furthermore, auditable events in \cite{RFC2401} do not specify such a
minimum list of information. [there are exactly 3 events defined as 
auditable in
2401, one of which overlaps with 2406. So, to be more precise, the other 2
auditable events defined in 2401 ought to have the minimum data requirements
defined.  Another good point that we will fix in the next pass.] The
documentation should be reviewed to ensure that a minimum list of audit-log
information is specified with each auditable event.

\item Various algorithm specifications require the implementation to reject
known weak keys. For example, the DES-CBC encryption algorithm specification
\cite{RFC2405} requires that DES weak keys be rejected. It is questionable
whether this actually increases security. It might very well be that the extra
code that this requires creates more security problems due to bugs than are
solved by rejecting weak keys.

Weak keys are not really a problem in most situations. For DES, it is far less
work for an attacker to do an exhaustive search over all possible keys than to
wait for an SA that happens to use a weak key. After all, the easiest way for
the attacker to detect the weak keys is to try them all. Weak-key rejection is
only required for algorithms where detecting the weak key class by the weak
cipher properties is significantly less work than trying all the weak keys in
question.

We recommend that the weak-key elimination requirement be removed. Encryption
algorithms that have large classes of weak keys that introduce security
weaknesses should simply not be used. [I tend to agree with this analysis. The
argument for weak key checking was made by folks who don't understand the
cryptographic issues involved, but who are persistent and loud, e.g., Bill
Simpson. Ted Ts'o (co-chair of the WG) and I discussed this problem, and tried
to explain it to the list, but were unsuccessful. Another flaw in the committee
process.]

\item The only mandatory encryption algorithm in ESP is DES-CBC. Due 
to the very
limited key length of DES, this cannot be considered to be very secure. We
strongly urge that this algorithm not be standardized but be replaced by a
stronger alternative. The most obvious candidate is triple-DES. Blowfish could
be used as an interim high-speed solution.\footnote{On a Pentium CPU, Blowfish
is about six to seven times faster than triple-DES.} The upcoming AES standard
will presumably gain quick acceptance and probably become the default 
encryption
method for most systems. [DES as a default was mandated because of 
pressure from
vendors who, at the time, could not get export permission for 3DES. Triple DES
or AES will certainly augment DES as additional, mandatory defaults, and may
replace it in the future. ]

\item The insistence on randomly selected IV values in \cite{RFC2405} seems to
be overkill. It is true that a counter would provide known low Hamming-weight
input differentials to the block cipher. All reasonable block ciphers 
are secure
enough against this type of attack. Use of a random generator results in an
increased risk of an implementation error that will lead to low-entropy or
constant IV values; such an error would typically not be found during testing.
[In practice the IV is usually acquired from previous ciphertext output, as
suggested in the text for CBC mode ciphers, which is easy to acquire and not
likely to result in significant complexity. In a hardware-assisted environment, an
RNG is usually available anyway. In a high assurance hardware implementation,
the crypto chip would generate the IV.]

\item Use of a block cipher with a 64-bit block size should in general be
limited to at most $2^{32}$ block encryptions per key. This is due to the
birthday paradox. After $2^{32}$ blocks we can expect one 
collision.\footnote{To
get a $10^{-6}$ probability of a collision it should be limited to about
$2^{22}$ blocks.} In CBC mode, two equal ciphertext blocks give the attacker
the XOR of the corresponding two plaintext blocks. The specifications for the
DES-CBC encryption
algorithm \cite{RFC2405} should mention this, and require that any SA 
using such
an algorithm limit the total amount of data encrypted by a single key to a
suitable value.
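
Spelling out the birthday bound behind these numbers: for $n$ blocks encrypted
under one key with a 64-bit block cipher, the collision probability is
approximately
\[
p \approx \binom{n}{2} \cdot 2^{-64} \approx \frac{n^2}{2^{65}},
\]
so $n = 2^{32}$ gives $p \approx 1/2$, while $p = 10^{-6}$ requires
$n \approx \sqrt{2^{65} \cdot 10^{-6}} \approx 2^{22.5}$, matching the
footnote above.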

\item The preferred mode for using a block cipher in ESP seems to be CBC mode
\cite{RFC2451}. This is probably the most widely used cipher mode, but it has
some disadvantages. As mentioned earlier, a collision gives direct information
about the relation between two plaintext blocks. Furthermore, in hardware
implementations each block encryption must wait for the previous one. This
limits parallelism, which hinders high-speed hardware implementations. [first,
this is not an intrinsic part of the architecture; one can define different
modes for use with existing or different algorithms if the WG is so motivated.
Second, current hardware is available at speeds higher than the associated
packet processing capability of current IPsec devices, so this does not appear
to be a problem for the near term. Transition to AES will decrease the
processing burden (relative to 3DES), which may render this concern less
serious.]

Although not used very often, the counter mode seems to be preferable. The
ciphertext of block $i$ is formed as $C_i = P_i \oplus E_K(i)$, where $i$ is
the block number; the starting $i$-value must be sent at the start of the
packet.\footnote{If
replay protection is always in use, then the starting $i$-value could be formed
as $2^{32}$ times the sequence number. This saves eight bytes per 
packet.} After
more than $2^{32}$ blocks counter mode also reveals some information about the
plaintext, but this is less than what occurs in CBC. The big advantage of
counter mode is that hardware implementations can parallelize the 
encryption and
decryption process, thus achieving a much higher throughput. [earlier the
authors criticize IPsec for a lack of orthogonality, but introducing
interdependence between the anti-replay counter and encryption would certainly
violate the spirit of the earlier criticism! Counter mode versions of 
algorithms
can be added to the list easily if there is sufficient vendor support.]
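
To make the parallelism argument concrete, here is a minimal counter-mode
sketch (Python; the toy PRF stands in for the negotiated 64-bit block cipher
$E_K$ and is not any IPsec-specified transform):

\begin{verbatim}
import hashlib
import struct

def toy_E(key, i):
    # Stand-in for a 64-bit block cipher E_K; NOT a real cipher,
    # just a placeholder so the sketch is self-contained.
    return hashlib.sha256(key + struct.pack(">Q", i)).digest()[:8]

def ctr_crypt(key, start, data):
    # C_i = P_i XOR E_K(i). Each keystream block depends only on i,
    # so all E_K(i) values could be computed in parallel; contrast
    # CBC encryption, where block i needs ciphertext block i-1.
    out = bytearray()
    for n in range(0, len(data), 8):
        ks = toy_E(key, start + n // 8)
        out.extend(b ^ k for b, k in zip(data[n:n + 8], ks))
    return bytes(out)

# Encryption and decryption are the same XOR operation.
msg = b"counter mode example"
assert ctr_crypt(b"k" * 8, 0, ctr_crypt(b"k" * 8, 0, msg)) == msg
\end{verbatim}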

\item \cite[section 2.3]{RFC2451} states that Blowfish has weak keys, but that
the likelihood of generating one is very small. We disagree with these
statements. The likelihood of getting two equal 32-bit values in any one 256-
entry S-box is about ${256 \choose 2} \cdot 2^{-32} \approx 2^{-17}$. Given
the number of keys generated in practice, such keys will certainly occur.
However, the Blowfish weak keys
only lead to detectable weaknesses in reduced-round versions of the cipher.
There are no known weak keys for the full Blowfish cipher.
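
For concreteness: $\binom{256}{2} = 32640 \approx 2^{15}$, so the chance of a
repeated entry in one S-box is about $2^{15} \cdot 2^{-32} = 2^{-17}$, roughly
one key in $131{,}000$; across Blowfish's four S-boxes the chance is about
four times higher.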

\item In \cite[section 2.5]{RFC2451}, it is suggested that the number of
rounds of a cipher be negotiable. We consider this to be a very bad idea. The
number of rounds
is integral to the cipher specifications and should not be changed at 
will. Even
for ciphers that are specified with a variable number of rounds, the
determination of the number of rounds should not be left up to the individual
system administrators. The IPsec standard should specify the number of rounds
for those ciphers. [I agree that this algorithm spec ought not encourage
negotiation of the number of rounds without at least specifying a minimum for
each cipher, although this gets us into the crypto-strength value-judgement
arena again. Also, the inclusion of 3DES in this table is inappropriate, as it
is a 48-round algorithm, period. So, yes, there is definite room for improvement in
this RFC.]

\item \cite[section 2.5]{RFC2451} proposes the use of RC5. We urge caution in
the use of this cipher. It uses some new ideas that have not been 
fully analyzed
or understood by the cryptographic community. The original RC5 as 
proposed (with
12 rounds) was broken, and in response to that the recommended number of rounds
was increased to 16. We feel that further research into the use of data-
dependent rotations is required before RC5 is used in fielded systems. [RC5 is
not required by IPsec implementations. In the IETF spirit of flexible
parameterization, vendors are free to offer additional algorithms beyond the
required default. In general, the IETF is not
prepared to make value judgements about these algorithms and so one 
may see RFCs
that specify a variety of additional algorithms.]

\item \cite[section 2.4]{RFC2406} specifies that the ESP padding should pad the
plaintext to a length so that the overall ciphertext length is both a multiple
of the block size and a multiple of 4. If a block cipher of unusual block size
is used (e.g., 15 bytes), then this can require up to 59 bytes of padding. This
padding rule works best for block sizes that are a multiple of 4, which
fortunately is the case for most block ciphers. [this padding rule is based
primarily on IP packet alignment considerations, not on common block cipher
sizes! This is stated in the text.]
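
A minimal sketch of this alignment rule as we read it (Python; the 15-byte
block size is hypothetical, and the helper name is ours):

\begin{verbatim}
from math import gcd

def esp_pad_len(payload_len, block_size):
    # Payload plus padding plus the two trailer bytes (Pad Length,
    # Next Header) must be a multiple of both the cipher block size
    # and 4, i.e., of their least common multiple.
    align = block_size * 4 // gcd(block_size, 4)
    return (-(payload_len + 2)) % align

# Worst case for a hypothetical 15-byte block: lcm(15, 4) = 60,
# so up to 59 bytes of padding may be needed.
assert max(esp_pad_len(n, 15) for n in range(60)) == 59
# For a common 8-byte block, at most 7 bytes are needed.
assert max(esp_pad_len(n, 8) for n in range(8)) == 7
\end{verbatim}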

\item \cite[p.\ 6, point a]{RFC2406} states that the padding 
computations of the
ESP payload with regard to the block size of the cipher apply to the payload
data, excluding the IV (if present), Pad Length, and Next Header fields. This
would imply that the Pad Length and Next Header fields are not being encrypted.
Yet the rest of the specification is clear that the Pad Length and Next Header
field are to be encrypted, which is what should happen. The text of point a
should be made consistent with the rest of the text. [The text says "...the
padding computation applies to the Payload Data exclusive of the IV, the Pad
Length, and Next Header fields." The comma after "IV" is meant to terminate the
scope of the word "exclusive," and thus the intent is to include the pad length
and next header fields. The term "payload" in ESP applies to a set of data not
including the latter two fields, so the sentence is, technically, unambiguous,
and it is consistent with the terms employed in the figure in section 
2.  But, I
admit the wording could be improved.]

\item There is a document that defines the NULL encryption algorithm 
used in ESP
\cite{RFC2410}, but no document that defines the NULL authentication algorithm,
which is also used by ESP \cite[section 5]{RFC2406}. [good point. Another RFC
publication opportunity!]

\item The NULL cipher specifies an IV length of zero \cite{RFC2410}. This would
seem to imply that the NULL cipher is used in CBC mode, which is 
clearly not the
case. The NULL cipher is in fact used in ECB mode, which does not 
require an IV.
Therefore, no IV length should be specified. [use of the NULL cipher 
in ECB mode
would be inconsistent with the guidance in FIPS 82, and thus CBC mode is
intended, to preserve the confidentiality characteristics inherent in this
cipher :-).]

\end{enumerate}

\section{Conclusions}
The IPsec system should be simplified significantly. This can be done without
loss of functionality or performance. There are also some security weaknesses
that should be fixed. [the extensive comments above illustrate that 
the proposed
changes to IPsec would change the functionality, contrary to the claim made
here. One might argue about the importance of some of this functionality, but
several examples have been provided to illustrate application contexts that the
authors of this report did not consider in their analysis. Several
misunderstandings of the RFCs were also noted.]

Due to its high complexity, we have not been able to analyze IPsec as 
thoroughly
as we would have liked. After simplification, a new security analysis should be
performed.

[I have not reviewed the ISAKMP/IKE comments. However, I agree that this
protocol is very complex. Much of the complexity results from incremental
enhancement and a reluctance on the part of developers to discard older
versions of code.]