Re: [Sidrops] Reason for Outage report (was: Re: ARIN RPKI Service Impact - 12 August 2020 - manifest issue - resolved)

Job Snijders <job@ntt.net> Wed, 26 August 2020 16:00 UTC

Date: Wed, 26 Aug 2020 16:00:01 +0000
From: Job Snijders <job@ntt.net>
To: John Curran <jcurran@arin.net>
Cc: "sidrops@ietf.org" <sidrops@ietf.org>
Message-ID: <20200826160001.GF95612@bench.sobornost.net>
References: <DE33EFAE-FBD2-478F-92A9-1FBD81CCC43F@arin.net> <727F6FBD-F73C-4F58-AE2D-0276B2A183A3@arin.net>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <727F6FBD-F73C-4F58-AE2D-0276B2A183A3@arin.net>
X-Clacks-Overhead: GNU Terry Pratchett
Archived-At: <https://mailarchive.ietf.org/arch/msg/sidrops/Q5bjFdDGFFBDWoaDT_wyCU67J7k>
Subject: Re: [Sidrops] Reason for Outage report (was: Re: ARIN RPKI Service Impact - 12 August 2020 - manifest issue - resolved)

Dear John,

I'd like to offer some contributions to this document's outline of the
background of some implementations. Mostly just nitpicking.

On Wed, Aug 26, 2020 at 02:54:17PM +0000, John Curran wrote:
> 12-13 August RPKI Outage: Update
> Posted: Wednesday, 26 August 2020
> Service Update
> 
> 12 August 2020 at 12:21 PM to 13 August at 11:36 AM
> 
> On 12 August at 12:21 PM, ARIN deployed a new version of its RPKI
> system. ARIN’s repository showed no errors in both RIPE’s Validator
> and NLNetLab’s Routinator systems. 

The current versions of routinator and RIPE NCC's validator have weak
(lacking) support for manifest handling, and both have other issues
where they fail to yield manifest-related errors that they should
yield. Neither implementation handles manifests correctly at the
moment, so neither can currently be used to confirm the correct
publication of manifest-related data. :-(

> At 12:46 PM on that same day we received a service issue notice that
> ARIN’s repository was not working with rpki-client. ARIN Engineering
> worked closely with the OpenBSD software developers to pinpoint the
> error within the RPKI system. Both ARIN engineering and the OpenBSD
> developers independently found the error within ARIN’s repository. The
> fix was developed and deployed on 13 August at 11:36 AM.

It was not just rpki-client (linked against LibreSSL); the FORT
validator (linked against OpenSSL) also flagged the same issue and
emitted an error. LibreSSL and OpenSSL's RPKI extensions do share some
public APIs (and have some code in common), but they are distinct
implementations from separate development communities.

> Here is a detailed analysis of the error:
> 
> During RPKI repository generation, ARIN creates “manifests.” A
> manifest is cryptographic object specific to the RPKI which is used to
> help guarantee the integrity of the repository. One manifest is
> associated with each resource certificate in the repository. The
> manifest, flagged by the OpenSSL-based validators, had a subtle
> encoding issue. 
>
> The manifest in question essentially contains two
> copies of an AlgorithmIdentifier variable in different locations (and
> used for different purposes). Per RFC 5280, Section 4.1.1.2, these two
> instances must match completely. In ARIN’s manifest, one contained an
> empty string (“”) as a parameter and the other contained a NULL
> (pointer to nothing). The empty string parameter was incorrect and the
> OpenSSL-based validators were flagging this because the two
> definitions of AlgorithmIdentifier did not match.

[ off topic note: from what I understand, the technical reason these
identifiers MUST match is to make certain X.509 fingerprinting
operations possible. If both identifiers had the (required) parameter
field omitted the situation would've been different. ]
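
[ illustration: a minimal sketch of the byte-for-byte comparison that a
strict library effectively performs on the two AlgorithmIdentifier
copies. Assumptions, not taken from ARIN's report: the algorithm is
sha256WithRSAEncryption, and the "empty string" parameter is modelled as
an empty OCTET STRING; the actual encoding ARIN emitted is not stated
above. The helper names are hypothetical. ]

```python
# OID 1.2.840.113549.1.1.11 (sha256WithRSAEncryption), already DER-encoded.
# Assumption: this is the algorithm used in the affected manifest.
SHA256_WITH_RSA_OID = bytes.fromhex("06092a864886f70d01010b")

DER_NULL = bytes.fromhex("0500")                # ASN.1 NULL parameter
DER_EMPTY_OCTET_STRING = bytes.fromhex("0400")  # assumed "empty string" form


def der_sequence(content: bytes) -> bytes:
    """Wrap content in a DER SEQUENCE (short-form length only)."""
    if len(content) >= 128:
        raise ValueError("long-form DER lengths are not handled in this sketch")
    return bytes([0x30, len(content)]) + content


def algorithm_identifier(parameters: bytes) -> bytes:
    """Build AlgorithmIdentifier ::= SEQUENCE { algorithm, parameters }."""
    return der_sequence(SHA256_WITH_RSA_OID + parameters)


def identifiers_match(outer: bytes, inner: bytes) -> bool:
    """The two copies must be identical, parameters included."""
    return outer == inner


outer = algorithm_identifier(DER_NULL)
inner = algorithm_identifier(DER_EMPTY_OCTET_STRING)

print(outer.hex())
print(inner.hex())
print(identifiers_match(outer, inner))
```

The two encodings differ only in the parameter tag, which is enough for
the comparison to fail.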

> Planned Corrective actions:
> 
> As a corrective action, ARIN will be broadening its testing strategy.
> In future releases, we will be validating not only LibreSSL-based
> validators (RIPE’s Validator and NLNetlab’s Routinator) but also
> OpenSSL-based validators such as rpki-client and Fort. The list of
> validators we do test against the ARIN repository will be noted within
> the RPKI section of ARIN’s website.

This is a good and sensible plan.

Note: Neither RIPE's Validator nor routinator is based on LibreSSL or
OpenSSL. RIPE uses a cryptographic library called "Bouncy Castle", and
routinator appears to rely (in part?) on library bindings from the rust
ecosystem (which under the hood may link LibreSSL or OpenSSL), but also
seems in part to attempt to build its own crypto. LibreSSL and OpenSSL
both reject X.509 data with this type of encoding issue, and
consequently any RPKI validator implementation linked against these
libraries will automatically consider the AlgorithmIdentifier mismatch
an error.

It is key to ensure that not only a diverse set of validator
implementations is used for testing, but also that a diverse set of
cryptographic libraries is put through the test. It was not rpki-client
or FORT rejecting the data; it was LibreSSL and OpenSSL rejecting the
data.

FORT and rpki-client have committed to a strict (safe) interpretation of
how cryptographic data should be handled and what those design choices
mean for internet operations in the global routing table. If either of
these implementations is used to confirm the correct operation of the
ARIN Trust Anchor (in conjunction with less strict implementations), one
will not only find the lowest common denominator, but in doing so also
adhere to the highest standards.
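
[ illustration: a rough sketch of how a pre-release check could combine
the two points above, namely that every cryptographic library of
interest is exercised and that the strictest validator decides the
outcome. The class and function names are hypothetical, and this does
not describe ARIN's actual test tooling. The sample results mirror the
12 August outcome as described in this thread; routinator is left out
because its underlying crypto library is unclear. ]

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidatorRun:
    validator: str       # name of the relying-party software
    crypto_library: str  # library that parsed the X.509 data
    accepted: bool       # True if no error was emitted


def libraries_covered(runs):
    """Return the set of crypto libraries exercised by the runs."""
    return {run.crypto_library for run in runs}


def publication_ok(runs, required_libraries):
    """Accept only if every required library was tested and none rejected."""
    missing = set(required_libraries) - libraries_covered(runs)
    if missing:
        raise ValueError(f"no validator exercised: {sorted(missing)}")
    # A single rejection fails the check: the strictest implementation wins.
    return all(run.accepted for run in runs)


runs = [
    ValidatorRun("RIPE NCC validator", "Bouncy Castle", True),
    ValidatorRun("rpki-client", "LibreSSL", False),
    ValidatorRun("FORT", "OpenSSL", False),
]

print(publication_ok(runs, {"Bouncy Castle", "LibreSSL", "OpenSSL"}))
```

Testing with only the first entry would report success for the
validator run, yet the coverage check would raise because LibreSSL and
OpenSSL were never exercised.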
 
Kind regards,

Job