[IPsec] Draft IETF-79 minutes

Yaron Sheffer <yaronf.ietf@gmail.com> Thu, 18 November 2010 06:49 UTC

To: IPsecme WG <ipsec@ietf.org>

Below are draft minutes of the Beijing meeting, for your review. I'd 
like to thank Brian Weis for taking them. I have meddled with the text, 
so all remaining errors are mine.

Thanks,
	Yaron

---

WG Status
- Paul Hoffman (PH)

Slides: http://www.ietf.org/proceedings/79/slides/ipsecme-0.pdf

PH: The two active documents in the WG will be presented today. We 
assume that people have read the docs and are aware of the open issues 
(http://trac.tools.ietf.org/wg/ipsecme/trac/report/1).

Failure Detection
- Frederick Detienne (FD)

Slides: http://www.ietf.org/proceedings/79/slides/ipsecme-2.pdf

FD: By taking parameters on the wire and the QCD Secret (not shared), a 
QCD token is created. The Token is sent in the IKE_AUTH exchange. The 
peer also returns a QCD token that it has similarly generated.

If one of the VPN gateways crashes, it loses its information except for 
its QCD Secret. It will receive ESP packets from the still-live peer 
and return an INVALID_SPI as well as a token sent in the clear. The 
non-crashed peer sends an IKE message back containing just a header 
(liveness test), which allows the crashed peer to re-create its token.
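
The token handling FD describes above can be sketched roughly as follows. 
This is only an illustration: the minutes say the token is built from 
"parameters on the wire" and the QCD Secret, so the choice of the two IKE 
SPIs as those parameters, the use of HMAC-SHA256, and the names 
make_qcd_token and TokenTaker are all assumptions, not taken from the draft.

```python
import hashlib
import hmac


def make_qcd_token(qcd_secret, spi_i, spi_r):
    """Derive a token from the local QCD secret and values seen on the wire.

    Assumption: the wire parameters are the two 8-byte IKE SPIs and the
    derivation is HMAC-SHA256. The minutes do not specify either.
    """
    if len(spi_i) != 8 or len(spi_r) != 8:
        raise ValueError("IKE SPIs are 8 bytes each")
    return hmac.new(qcd_secret, spi_i + spi_r, hashlib.sha256).digest()


class TokenTaker:
    """The still-live peer: remembers the token it received in IKE_AUTH."""

    def __init__(self):
        self._tokens = {}

    def store(self, spi_i, spi_r, token):
        self._tokens[(spi_i, spi_r)] = token

    def matches(self, spi_i, spi_r, token):
        """True if a token received in the clear equals the stored one."""
        expected = self._tokens.get((spi_i, spi_r))
        return expected is not None and hmac.compare_digest(expected, token)
```

Because the token depends only on the secret and on values carried in the 
IKE header, a gateway that kept nothing but its QCD Secret can recompute 
the same token from the header of the liveness test, and the peer can 
compare it with the copy it stored during IKE_AUTH.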

Q: Header isn't protected?
A: This is the gist of the issue we'll be covering later.

Issue 198: Do we really need the QCD token for the initiator too?

Tero Kivinen (TK): I don't believe that to be relevant to this case. QCD 
is useful for the asymmetric case. Otherwise, the gateway that restarts 
can recreate the SA.
Paul: we included the gw-gw case in the problem statement.
TK: Making it symmetrical adds lots of complication to implementations, 
in particular when there's mobility involved.
FD: On the contrary. Going asymmetric forces users to configure which 
side is doing QCD. In practice, this is a real problem ... I have seen 
many customer cases over the years and this is a big problem. Failure 
detection turns a high severity problem into a low-severity or a 
non-problem.
Paul: Tero, what you're saying is that because this doesn't happen much, 
and it's a fair amount of implementation overhead, you'd like to remove 
this case?
TK: Yes. In most of our gw-gw cases there are better and faster recovery 
methods.
PH: But Fred, you're saying that if you don't cover this, then you have 
to configure more.
FD: Yes, you have to make a judgment call re: which side initiates.
TK: In those cases where it matters, only one side can do QCD anyway.
PH: You're way oversimplifying.
TK: It's usually the branch office that drops off, and the headend 
doesn't care. You can use other methods that are faster to fix this 
problem. Most problems are a result of bad implementations.
Paul: Are we at the right scenarios?
FD: no vendor has a perfect implementation. Failure detection is a 
safety net.

Yoav Nir (relayed by Jabber scribe): My view may be colored by my own 
implementation, but we use the same implementation for site-to-site or 
remote access GW. It's easier to implement both sides as both maker and 
taker. In any case, the original initiator isn't necessarily the token maker.
PH: Tero, for this item, say "which scenarios you feel are important" 
rather than it's not good for some cases.
Tero: I have sent 100s of lines of text earlier, and I have explained 
that. My implementation can do this using existing IKEv2 methods and it 
might be one half-trip faster.
FD: we agree either side can rebuild the tunnel. But when you're traffic 
triggered (which is common), the traffic pattern may not be symmetric.
TK: and I have explained how this can be dealt with using standard IKEv2 
methods.
PH: Tero, when you repost, specify which scenarios you are dealing with.
FD: we should say in the doc that we want a “one size fits all” scenario, 
where end users don't need to be involved in configuring it. The issue 
remains open. We need to analyze it per use case.

Issue 199: Section 7.4 is mostly wrong.

TK: I don't think we need the whole section 7 any more.
PH: It would be important to show people 5 years down the road why we 
did this.

Issue 200: Section 8 ignores IKEv2 text.
FD: the picture is also incorrect.

<no discussion>

Issue 201: Interdomain Gateways do not need QCD at all
FD: this should be merged with 198. We should have a “one size fits all” 
solution, independent of the traffic going inside the tunnel.

<no discussion>

Issue 202: Token makers generating the same tokens w/out sync DB

TK: (Slide 15) MITM is a much more powerful attack. This only requires a 
passive listener. The attacker doesn't even need to capture the 
gateway's reply.
FD: MITM is the wrong term, I agree. But even this type of attacker can do 
more dangerous things (TK disagrees, FD promises to expand on the 
mailing list).
Dan Harkins: I don't see why this is a problem. It's a case with a 
cluster of nodes in standby. This is a stateful protocol, they're 
sharing state, there's no need to do QCD? And if you're not sharing 
state, why share the QCD token?
FD: You're right, if they share state there's no need for QCD. The thing 
we see more and more is that synchronizing the SAs between gateways in 
hw appliances is expensive.
DH: My point is that if you're not sharing state, don't share the QCD 
token. It's causing problems by sharing it.
TK: They assume that the client will cache them and they can recover, 
using session resumption.
Pratima Sethi: Sharing the QCD token helps you recover faster without 
synchronization. Faster than rebooting.
PH: I'll restate (slide 14): Active/Standby are only sharing QCD secrets 
because it's easier than sharing the whole state. This is not traditional 
HA but a QCD sharing scenario.
FD: we offer a tradeoff compared to the universal failure detection 
provided by the liveness test. Speed of recovery vs. false positives.

Yoav: It makes sense for a hot standby case without sharing, where you want 
the failure to look like a really fast reboot. We should prevent the 
standby gateway from replying. It's dangerous for load-sharing.
Gregory Lebowitz: It's not just dangerous, the load sharing case simply 
won't work. They'd have separate IP addresses.
TK: the document already says that all cluster members should know 
whether an IKE SA is active. Other solutions don't work.
FD: gateways don't even know if another gateway is up.
Ahmad Muhanna: does a gateway know if it's in standby mode?
FD: not in all scenarios. Not in anycast scenarios.
PH: this contradicts the current text in the document.

There was some discussion about what the definition of active-standby 
really is.

Yaron Sheffer (relayed by Jabber scribe): Should not reopen the IKE 
threat model. IKE is resistant to active MITM.
Steve Kent: If it's really a standby it doesn't get traffic. Let's use 
the correct terminology.
FD: the device is “standby” with respect to that particular peer. Maybe 
change the terminology. Devices may not know if they are active for a 
specific peer at any given time. Proposes a solution: the client should 
not accept a QCD token where its state machine indicates none should be 
coming.
TK: an attacker can modify an (unauthenticated) ESP packet to cause the 
gateway to eventually respond with a token.
FD: this requires a liveness test that will take care of the situation.
PH: we should decide first whether we want to expand the doc's coverage. 
We should split #202 into 2-3 issues.
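
FD's proposal in the discussion above (the client ignores a QCD token 
unless its state machine says one could be coming, with the liveness test 
covering the rest) might look roughly like the sketch below. The state 
names, the ClientSa class, and the rule that a token is only expected while 
a liveness test is outstanding are assumptions made for illustration; the 
minutes do not define them.

```python
import hmac
from enum import Enum


class SaState(Enum):
    ESTABLISHED = "established"  # normal operation, no token expected
    PROBING = "probing"          # liveness test sent, reply pending
    DEAD = "dead"                # peer confirmed to have lost its state


class ClientSa:
    """Hypothetical client-side view of one IKE SA."""

    def __init__(self, stored_token):
        self.stored_token = stored_token
        self.state = SaState.ESTABLISHED

    def send_liveness_test(self):
        # Only now does a QCD token become a plausible reply.
        if self.state is SaState.ESTABLISHED:
            self.state = SaState.PROBING

    def on_qcd_token(self, token):
        """Return True if the token is accepted and the SA is torn down."""
        if self.state is not SaState.PROBING:
            return False  # unsolicited token: ignore it
        if not hmac.compare_digest(token, self.stored_token):
            return False  # wrong token: keep waiting for the liveness reply
        self.state = SaState.DEAD
        return True
```

In this sketch a token that arrives outside the probing state is simply 
dropped, so an attacker replaying a captured token gains nothing unless the 
client itself has already started a liveness test.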

High availability protocol open issues - 45 min
	<http://tools.ietf.org/html/draft-ietf-ipsecme-ipsecha-protocol>
- Dacheng Zhang

After an HA event, the new active node might not have the most recent 
information (e.g., IPsec replay counters).

PH: This HA proposal covers only "tight" IPsec clusters, unlike the 
discussion about QCD. (slide 4)

One solution is for the new active standby to request the newest 
information from the peer using an IKE notify.

Delta in replay counter is sent, not the new value.
Steve Kent: Pick the largest one, and send that in response?
Paul: Yes, although it might need to be more explicit. [Later note: this 
part was misunderstood by several people during the discussion – YS].
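
As a rough illustration of "a delta is sent, not the new value": the 
receiver advances the counters it holds by the delta instead of overwriting 
them. The sketch assumes one delta is applied to every per-SA counter, uses 
32-bit sequence numbers without ESN, and invents the name 
apply_replay_delta; which counters are affected and who chooses the delta 
are not settled by these minutes.

```python
SEQ_MAX = 2**32 - 1  # 32-bit ESP sequence numbers; ESN is not modeled


def apply_replay_delta(counters, delta):
    """Return a copy of the per-SA counters, each advanced by delta.

    counters maps a (hypothetical) SPI to the last sequence number used.
    """
    if delta < 0:
        raise ValueError("delta must not move a counter backwards")
    advanced = {}
    for spi, seq in counters.items():
        if seq + delta > SEQ_MAX:
            # A sequence number must not wrap; such an SA would need rekeying.
            raise ValueError("counter for SPI %#x would overflow" % spi)
        advanced[spi] = seq + delta
    return advanced
```

Sending a delta means the sender does not need to know the current value 
of any counter, which is the information the new active member may be 
missing after the failover.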

IKEv2 peers also need to negotiate the ability.

Issue 1: Multiple failover

<no discussion>

Issue 2: How to synchronize the failover counter amongst different 
cluster members

PH: These issues are about when there are 3 or more, and 2 go down.

TK: I noticed that on slide 8 there is a problem, the ESN bit is 
overlapping the Critical bit. Needs to move elsewhere.

Paul: Those who read the draft, did we close out the known issues from 
the last drafts? Does it match the requirements document? Does anyone feel 
that we're not? (no response) Does anyone have any open issues? (None)

We might go to WGLC on this sooner than QCD.

FD: make sure ESP packets are not accepted while there is a replay 
“window of opportunity”.
PH: this should at least go into the security considerations.

IKEv2 Re-authentication

PH: Keith Welter wanted this to be discussed and he's not here. There 
are some reauth issues with IKEv2bis, and there are some possible 
solutions. We might look at different reauth methods in the future, in a 
faster way.

TK: I think we talked about this, we have an RFC out there talking about 
reauth. Are we coming back to that?
PH: We said last time we don't want to over-complicate the base IKEv2 spec.
TK: Adds complexity even if a separate RFC.
PH: Now he's asking, is the problem big enough and we can do something 
less complicated? I'm not a big believer in this problem: if you do 
reauth, it's going to take some time. You don't want to over-optimize. 
But other people differ in opinion.

Yoav: the idea is to keep the SAs up during reauth.

IKEv2-- (IKEv2 "minus minus")
Tero Kivinen

Defining the minimal set of features that an IKE implementation needs to 
have. Lots of people are saying that IKEv2 is too complicated, and other 
IETF groups (e.g. CORE) want to use IPsec but not the full IKEv2 
functionality. They have constrained devices. There's lots of optional 
stuff in IKEv2. Explain that you only need 4 packets if you only need 
one SA. I started to take the IKEv2 docs and cut out stuff, and I ended 
up with 6-7 pages base spec plus 20 pages of payload description. All 
MUSTs except the one requiring support of certificates.
PH: If you're willing to push together a rush draft, there's a couple of 
groups that might be interested in that. This will probably not be a WG 
document.

Sean Turner: If you copy-and-paste the stuff out of the IKEv2 draft it 
might end up as a pain due to errata having to be duplicated. You want to 
keep a pointer to the original. Also there are potential copyright issues.
TK: I also added some new text.
FD: even in IKEv1 we are seeing implementations that are too minimal. So this is 
useful as an introduction. The doc should be seen as a stepping stone 
towards a full implementation.

Yaron: if this goes standards track, we will have a hard time 
synchronizing with the base spec.
PH: informational.

There was a discussion that this should be a profile kind of document, 
Informational, describing what would minimally be needed to implement 
IKEv2. The only issue is certificate support being a MUST.

Yoav: this finally documents the mythical RFC 5996 “minimal implementation”.


IKEv2 with CGA
Jean-Michel Combes

Slides: http://www.ietf.org/proceedings/79/slides/ipsecme-3.pdf

PH: Are any of these drawbacks different than CGA in general?
J-M: Some are specific to IPsec.
PH: What about the hard-coded algorithms?
Steve Kent: The CSI WG talked about hard-coded algorithms but never did 
anything about them. The threat model is different. CGAs were intended 
to be "very local" -- used between a host and 1st hop gateway. When you 
move it here there's a different set of concerns. CGAs represent a 
"here's another way of doing autoconfig. and have a certain continuity" 
but it's for local uses. One of the features of IPv6 addresses was 
address generation for privacy reasons, and using multiple addresses 
concurrently. If you talk about CGAs as a static thing and beyond that 
local net, we're violating that other feature's trust model and that's a 
bad thing.
J-M: there is a MIP6 RFC (RFC 4866) on using CGAs to secure MIP6 
signaling. (Ahmad: route optimization, not general MIP6).
PH: This is for information now, and you'll let us know if you plan to 
progress it?
J-M: Yes.


Sean: apologizes for having held the PAKE proposals. We have 3 
proposals. Will have an independent reviewer go through them in the next 
few weeks and will then decide how to proceed.

Paul: please re-read the active drafts. We are still making protocol 
changes.
