Re: [ire] Extended verification process of the escrow deposit files

Christopher Browne <cbbrowne@afilias.info> Fri, 14 December 2012 16:07 UTC

In-Reply-To: <CCEFB2A4.6A69%gustavo.lozano@icann.org>
References: <CCEFB2A4.6A69%gustavo.lozano@icann.org>
Date: Fri, 14 Dec 2012 11:07:40 -0500
Message-ID: <CANfbgbbZ7aA89XcFXfSMh2XpjVoGUyfkx_K9y0+3kDZGqnzLiw@mail.gmail.com>
From: Christopher Browne <cbbrowne@afilias.info>
To: Gustavo Lozano <gustavo.lozano@icann.org>
Content-Type: text/plain; charset="ISO-8859-1"
Cc: "ire@ietf.org" <ire@ietf.org>
Subject: Re: [ire] Extended verification process of the escrow deposit files
List-Id: "Internet Registration Escrow discussion list." <ire.ietf.org>

On Thu, Dec 13, 2012 at 7:28 PM, Gustavo Lozano
<gustavo.lozano@icann.org> wrote:
> In case of differentials deposits, the escrow agent shall use the last full
> and all the differentials deposits to perform the extended verification
> process.
>
>
> I am interested in your feedback regarding:
>
>  1. Other tests that you consider should be present in the draft.

In keeping with James Mitchell's comments about "the quick brown fox
jumps over the lazy dog", which could be set as a domain name and
would be "structurally valid" but semantically nonsense, I would
suggest that mere comparison of escrow deposits is nowhere near
sufficient.

We could have a near-infinite number of policies, for each bit of the
escrow format, and someone with a Markov Chain generator could make up
a conforming escrow file out of random, but still policy-conforming,
junk.  Verifying that it's not just junk requires looking to other
sources such as WHOIS or DNS.

I daresay I have seen all the cases he mentions as problems, and more.
The most typical, and the most painful to resolve, is receiving phone
numbers that do not comply with the ITU-T E.164 standard.  My
"favourite" was receiving <contact:id> values that did not conform to
RFC 5733, so that I had to remap them to new values in multiple places
before proceeding.
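
As an illustration of the format-level checks involved, here is a
minimal Python sketch.  The pattern and length limits are my
recollection of the EPP contact schema in RFC 5733 (the E.164 string
type and the client identifier type) and should be verified against
the RFC before use; the function names are made up for this example.

```python
import re

# Assumed from the RFC 5733 schema: "+" country code "." subscriber
# number, at most 17 characters in total.  Verify against the RFC.
E164_EPP = re.compile(r"\+[0-9]{1,3}\.[0-9]{1,14}")


def valid_epp_phone(value: str) -> bool:
    """True if value looks like an EPP-style E.164 number.

    This sketch rejects the empty string, even though the schema
    pattern itself permits it.
    """
    return len(value) <= 17 and E164_EPP.fullmatch(value) is not None


def valid_contact_id(value: str) -> bool:
    """True if value is a whitespace-normalized token of 3..16 characters.

    The comparison rejects leading/trailing whitespace, tabs, newlines,
    and runs of more than one space.
    """
    return 3 <= len(value) <= 16 and value == " ".join(value.split())
```

A check like this only shows that a value is well-formed; it says
nothing about whether the value is real.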

A more legitimate test would be more "end-to-end"; it would compare
escrow with other notionally authoritative sources.

The approach that comes to mind is to select data from four sources
and make sure that they match:

The data sources:

1.  Escrow is the one that is obviously at hand;

2.  Zone files tend to be not difficult to obtain, and they indicate
the state of the resolving portions of the 'portfolio' of domains in a
registry;

3.  WHOIS reports information on the state of the registry, albeit on
a one-by-one per-request basis;

4.  Submitting DNS queries to the authoritative name servers for the
registry returns authoritative resolution data for active domains.

#1 and #2 (perhaps just #1) are available "en masse", for the entire registry.

One test that escrow matches the registry would be to pick a sample
of domain names from sources 1 (escrow) and 2 (zone file), and
reconcile them against WHOIS (#3) and the authoritative DNS servers
(#4).
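
A rough Python sketch of that comparison follows.  It assumes the
escrow (or zone file) data has already been parsed into a mapping of
domain name to name server host names, and that the WHOIS and DNS
lookups are supplied by the caller as plain functions.  Both the data
layout and every name below are hypothetical; no real WHOIS or DNS
library is being described.

```python
import random
from typing import Callable, Dict, FrozenSet, Iterable, List, Tuple


def normalize(hosts: Iterable[str]) -> FrozenSet[str]:
    """Compare host names case-insensitively, ignoring a trailing dot."""
    return frozenset(h.rstrip(".").lower() for h in hosts)


def reconcile(
    reference: Dict[str, List[str]],
    lookups: Dict[str, Callable[[str], List[str]]],
    sample_size: int,
    rng: random.Random,
) -> List[Tuple[str, str, List[str], List[str]]]:
    """Sample names from the reference data and check each live source.

    reference: domain -> name servers, as parsed from escrow or a zone file.
    lookups:   source label -> function returning name servers for a domain.
    Returns one (domain, source, expected, observed) tuple per mismatch.
    """
    names = sorted(reference)
    sample = rng.sample(names, min(sample_size, len(names)))
    mismatches = []
    for name in sample:
        expected = normalize(reference[name])
        for source, lookup in lookups.items():
            observed = normalize(lookup(name))
            if observed != expected:
                mismatches.append(
                    (name, source, sorted(expected), sorted(observed))
                )
    return mismatches
```

Passing the lookups in as functions keeps the comparison logic
independent of however WHOIS and DNS actually get queried.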

It would not be reasonable to try to do this with all names in the
registry; a reasonably small random sample should, if collected
regularly, provide confidence that samples taken from escrow match
against the "real data" in the registry.  Using stratification would
be appropriate to make sure that different sorts of cases get examined
(or perhaps avoided: it's desirable not to have many "false
mismatches", as would happen when objects legitimately change between
the time escrow is dumped and the time validation tests are done).
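
Stratified selection might be sketched in Python as below.  The record
layout (a dict with an "updated" timestamp), the idea of skipping
anything modified within a "quiet period" before the dump as a proxy
for volatile objects, and all of the names are assumptions made for
illustration only.

```python
import random
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Callable, Dict, Hashable, List


def stratified_sample(
    records: List[Dict],
    stratum_of: Callable[[Dict], Hashable],
    per_stratum: int,
    dumped_at: datetime,
    quiet_period: timedelta,
    rng: random.Random,
) -> List[Dict]:
    """Pick up to per_stratum records from each stratum.

    Records updated less than quiet_period before the dump are skipped,
    on the assumption that they are the likeliest to change again and
    produce false mismatches.
    """
    strata = defaultdict(list)
    for rec in records:
        if dumped_at - rec["updated"] < quiet_period:
            continue
        strata[stratum_of(rec)].append(rec)

    sample = []
    for key in sorted(strata, key=str):
        group = strata[key]
        sample.extend(rng.sample(group, min(per_stratum, len(group))))
    return sample
```

The stratum function could be domain status, registrar, or presence of
DNSSEC data, depending on which sorts of cases need coverage.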

Over time, reconciling domain data for even as small a sample as a
few dozen names per day should provide useful assurance that the
sources are not vastly out of sync, particularly if discrepancies get
escalated into larger targeted samples.  Note that this is the
standard sort of thing you'll find in literature on auditing
procedures.
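
To put a rough number on "useful assurance": if n independently and
randomly sampled names all reconcile cleanly, the true mismatch rate p
is bounded at a given confidence level by solving (1 - p)^n = 1 -
confidence.  The Python sketch below computes that bound and shows one
hypothetical escalation rule.  The independence assumption, the
tenfold factor, and the function names are mine.

```python
def mismatch_rate_upper_bound(clean_samples: int, confidence: float = 0.95) -> float:
    """Upper bound on the mismatch rate after clean_samples clean checks.

    Assumes independent random draws from a large population and zero
    observed mismatches.  For 95% confidence this is close to the
    familiar "rule of three" approximation, 3 / n.
    """
    if clean_samples <= 0:
        return 1.0
    return 1.0 - (1.0 - confidence) ** (1.0 / clean_samples)


def next_sample_size(base: int, mismatches: int, factor: int = 10) -> int:
    """Hypothetical escalation: enlarge the next sample after any mismatch."""
    return base * factor if mismatches > 0 else base


# 36 names per day for 30 days, all clean
bound = mismatch_rate_upper_bound(36 * 30)
```

Under those assumptions, a month of three dozen clean names per day
bounds the mismatch rate at a little under 0.3%.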