Re: [perpass] draft-davin-eesst

Chuck Davin <chuck@eesst.org> Fri, 03 January 2014 22:45 UTC

From: Chuck Davin <chuck@eesst.org>
To: "Fred Baker (fred)" <fred@cisco.com>
Date: Fri, 03 Jan 2014 17:43:09 -0500
Cc: perpass <perpass@ietf.org>
Subject: Re: [perpass] draft-davin-eesst

Hi Fred,

Nice to hear from a once-familiar face.

Your questions are very much to the point.

Your scope question is "why so narrow a description?"  A brief answer
is that the specified format targets a particular application.  At
some point, the generality of onion-layered infrastructure must give
way to the particularity of doing some actual real-world task.  In
this case, the OpenPGP standard is applied to provide exactly the
services for which it is designed.  While OpenPGP services could
certainly also be applied in other application domains (e.g., your
example of electronic health records (EHR)), the two applications are
really quite different in terms of audience, requirements, etc.  My
guess is that the superficial commonality between them is not enough
that distilling it would greatly simplify the solution to either
problem.

Your protocol question is "why spend so much time on the structure of
the email message?"  The answer is that we want interoperable
applications.  If a student is applying for a job at a small
commercial enterprise, then his transcript will likely be received as
a normal email message, validated using a normal email reader, and the
PDF component will be reviewed by a human.  However, if our student
has applied to Big University, then the defined format supports
automated validation of the received transcript and automated
extraction of the transcript content (XML) into an Admissions
Department database.  Automation is important because the Internet,
the Common Application, and other factors have dramatically changed
college admissions, so that even smaller institutions can receive
surprisingly large numbers of applications.  Naturally, we want
transcript recipients to be as tolerant as possible of how students
convey their transcript information, and the spec is perhaps not so
restrictive as it sounds, but application implementors also
need guidance about how hard they should try to unpack a received
message before kicking it to a human.
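
To make the automated path concrete, here is a minimal Python sketch
of the "unpack, else kick to a human" decision.  It is an illustration
under assumptions, not the draft's procedure: the function name
unpack_transcript, the content types it looks for, and the rule that a
missing XML part means human review are all hypothetical.  It also
omits the step that matters most, verifying the originator's OpenPGP
signature, which needs an OpenPGP tool rather than the standard
library.

```python
from email import message_from_bytes, policy


def unpack_transcript(raw: bytes) -> dict:
    """Pull the XML and PDF parts out of a received message.

    Returns a dict with "xml" and "pdf" (bytes or None) and
    "needs_human", which is True when no XML part was found.
    Assumed layout: ordinary MIME attachments, not the draft's format.
    """
    msg = message_from_bytes(raw, policy=policy.default)
    found = {"xml": None, "pdf": None}
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no payload of their own
        ctype = part.get_content_type()
        if ctype in ("application/xml", "text/xml") and found["xml"] is None:
            found["xml"] = part.get_payload(decode=True)
        elif ctype == "application/pdf" and found["pdf"] is None:
            found["pdf"] = part.get_payload(decode=True)
    # Without machine-readable content there is nothing to load into a
    # database, so the message goes to a person instead.
    found["needs_human"] = found["xml"] is None
    return found
```

A tolerant recipient could extend the loop to accept other packagings
before giving up; where exactly to stop is the guidance implementors
need.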

Your first security concern relates to the optional use of encryption
by students in sending their transcripts to recipients.  For me, you
have certainly struck at the "heart of the matter."  I confess that,
in earlier versions, the encryption was mandatory.  The arguments for
relaxing the encryption requirement are, first, that international
students may not always have easy access to encryption, and, second,
that encryption may sometimes be beyond the technical reach of
Mom-N-Pop enterprises.  But either of these limitations could be
overcome by, for example, burning a transcript onto a CD-ROM and
sending it via postal mail.  Thus, I vacillate frequently on the
question of mandatory
encryption.  However, I do not fully share your concern about students
who decline to encrypt owing to the poor usability of email encryption
tools.  One can imagine a trivial, very application-specific script
that students could use to send their transcripts to chosen
recipients.
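
As a rough idea of how small such a script could be, here is a Python
sketch.  It assumes GnuPG is installed as "gpg" and that the
recipient's public key is already in the student's keyring.  The
function names, the subject line, and the attachment name are
hypothetical, and the message built here is a plain attachment, not
the message structure the draft defines.

```python
import subprocess
from email.message import EmailMessage
from pathlib import Path


def gpg_encrypt_command(transcript: Path, recipient_key: str, out: Path) -> list:
    """Build the GnuPG command line that encrypts one file to one recipient."""
    return [
        "gpg", "--batch", "--yes", "--armor",
        "--recipient", recipient_key,
        "--output", str(out),
        "--encrypt", str(transcript),
    ]


def encrypt_transcript(transcript: Path, recipient_key: str) -> bytes:
    """Run gpg and return the ASCII-armored ciphertext."""
    out = transcript.with_suffix(transcript.suffix + ".asc")
    subprocess.run(gpg_encrypt_command(transcript, recipient_key, out), check=True)
    return out.read_bytes()


def build_message(sender: str, recipient: str, ciphertext: bytes) -> EmailMessage:
    """Wrap the ciphertext in an ordinary email message (illustrative layout)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Encrypted transcript"
    msg.set_content("My encrypted transcript is attached.")
    msg.add_attachment(ciphertext, maintype="application",
                       subtype="octet-stream", filename="transcript.asc")
    return msg
```

The student would supply only the transcript file and the recipient's
address; everything else is fixed by the application, which is what
keeps the usability burden low.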

I may not have completely understood your second security concern.  It
seems to relate to the vulnerability of student information either at
rest or in transit across MTAs.  To me, these seem like different
cases.

The information is "at rest" only when held by the originator, the
student, or the recipient, but they are all authorized users.  What
remains are attacks upon their hosts.  Thus, the "at rest" question
seems close to the question of whether or not one should encrypt one's
files when not logged in, a question with an easy answer (yes!) but a
bit beyond the scope of the draft.  Of course, the "at rest" threat is
yet another reason to require encryption when the student transmits his
transcript, so perhaps the scale should be tilting much more that way.

The "MTA transit" problem also argues for mandatory encryption of
student transmissions.  But there are alternate approaches as well.
The Internet is supposed to be end-to-end.  Thus, the MTA transit risk
could be mitigated by enabling end-to-end email connections between
student and recipient, who could operate a specialized server that
welcomes arbitrary SMTP connections.  Insofar as the balkanization of
our email infrastructure has been largely a response to spam, individual
sites could, if they wish, choose to prefer built-in end-to-end security
over the convenience of reduced spam.
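
A minimal sketch of the sending side of such a direct connection, in
Python, might look like the following.  The function name
deliver_direct and its smtp_factory parameter are hypothetical, and
the recipient's host name is passed in by hand because the standard
library does not look up MX records.  STARTTLS here protects only the
single hop; it complements, and does not replace, OpenPGP encryption
of the transcript itself.

```python
import smtplib
import ssl
from email.message import EmailMessage


def deliver_direct(msg: EmailMessage, recipient_host: str, port: int = 25,
                   smtp_factory=smtplib.SMTP) -> None:
    """Hand a message straight to the recipient's own SMTP server over TLS.

    smtp_factory exists only so the function can be exercised without a
    network connection; by default it is smtplib.SMTP.
    """
    with smtp_factory(recipient_host, port) as server:
        # Fails if the server does not offer STARTTLS or if its
        # certificate does not validate, rather than sending in the clear.
        server.starttls(context=ssl.create_default_context())
        server.send_message(msg)
```

With no intermediate MTA, the message is never stored on a relay
between student and recipient.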

Your closing paragraph seems to sketch an alternative approach to the
MTA transit problem (??) or other "casual access."  I took its gist to
be that, absent student encryption, the originator could control
access to the student data using the originator's encryption keys.  In
common with other centralized approaches, it is, as you say, "not
perfect security."  In particular, the student is not really in control
of her own data.

I undertook this work less to innovate and more to address a real
world problem.  Although, to this community, the posted draft should
seem pedestrian, to a broader audience, it clearly demonstrates that
large centralized databases are not the only technical option for
addressing school transcript distribution.  The centralized approach
permits students little real privacy and no real control over
distribution of their own personal information.  In contrast, the
common format and distributed approach in the posting assign protocol
roles to the various players that are appropriate to their proper
rights and responsibilities.  Review, support, and implementation
within this community would counter the claim that a more
paternalistic, centralized approach is the only one that is
technically feasible.

Thanks for your very thoughtful review and comments.

Best,
Chuck

On Thu, 2014-01-02 at 18:06 +0000, Fred Baker (fred) wrote:
> I scanned quickly through draft-davin-eesst with a few basic puzzlements in mind. The mechanism makes sense. I have two questions on security/privacy considerations, one on the protocol involved, and one on scope. I am copying perpass, because I suspect that my considerations have bearing on the broader topic of pervasive monitoring of traffic.
> 
> If I understand correctly, the EESST specification is intended to facilitate the secure transmission of data private to one party but originated by a second party to a set of third parties with legitimate and authorized interest. The specific case is a secondary school transcript, being sent by a student, originated by an institution, and received by a set of other institutions; the data is, of course, private to the student, who is presumably applying to the receiving institutions.
> 
> My question on scope is: "why so narrow a description?" I can think of other cases in which data private to one party and originated by a second party is sent to a set of third parties with legitimate and authorized interest; medical records come quickly to mind. I could imagine the basic mechanism being described in generic terms and the transmission of transcripts described separately as a use case for the mechanism. 
> 
> My protocol question is "why spend so much time on the structure of the email message?" If the student has the files, they could be communicated via http upload, IM, USB key, twitter, or any other application. Even if email is used, the student is likely to simply attach them to an email message using his or her favorite email tool without regard to the specifics of the EESST specification. In any case, the transmission will need context: unless s/he is uploading them to an application web page at the receiving institution (which is its own context), the student will likely include either text or another file as a cover letter. Why not simply say that the files are exchanged without reference to the protocol in question?
> 
> But now to what I consider the heart of the matter. 
> 
> The specification calls for a pair of files (XML and PDF) to be signed by the originating institution, for purposes of authenticity and integrity checking, and for the MIME object containing it to be optionally encrypted, so that it cannot be inspected in an MTA. I understand encryption being optional, in that few mail tools make S/MIME or OpenPGP easy enough for a non-expert to use - even in the IETF, few have PGP keys, and since our lists don't have keys, even this email is being sent in the clear. So the call for encryption is largely vacuous, and the very real possibility remains for inspection of data while at rest in an MTA or captured in flight on a traffic analyzer. Due to our failure to provide simple key creation and management tools, this private and important data is accessible to a casual eye in flight.
> 
> However, in the use case, the transcript is likely to survive as a record for years, and perhaps forever, but only be in transit for a period of seconds to minutes in the usual case. The specification leaves the data at rest unencrypted, available for casual inspection. One could argue that this is not an issue with a transcript; nobody is going to lose their job over having gotten a bad grade decades earlier. In the medical record use case, it can have dramatic effects. I think the primary threat is unauthorized or inappropriate access/use when the data is at rest. 
> 
> Why not have the originating institution encrypt the data using its private key and make the corresponding public key available to other institutions? In this or another specification, you could give them a naming guideline that would identify the file as pertaining to an individual and an institution, and therefore a specified public key. This is not perfect security, of course; an unauthorized party that obtained the public key of the originating institution could read the file. But it at least forces them to do so, as opposed to leaving the data open to hack or casual access.