[Pearg] Comments on draft-rao-pitfol-01

Joseph Salowey <joe@salowey.net> Tue, 26 May 2020 05:04 UTC

MIME-Version: 1.0
From: Joseph Salowey <joe@salowey.net>
Date: Mon, 25 May 2020 22:04:07 -0700
Message-ID: <CAOgPGoB8KzQFs3x8wnaE4_Qa_wvmsCJz7=2j5vfstYDBw4vXbA@mail.gmail.com>
To: pearg@irtf.org
Content-Type: multipart/alternative; boundary="00000000000012253405a6860737"
Archived-At: <https://mailarchive.ietf.org/arch/msg/pearg/Mh0J62uK41-rVjSjeKMWSFRCvP0>
Subject: [Pearg] Comments on draft-rao-pitfol-01
Precedence: list

I had a chance to review draft-rao-pitfol-01.  Sensitive information in
logs is a constant problem for many security and privacy organizations so I
am very happy to see this draft.  Below are some thoughts on the draft.

1.  While PII is super important, the problem extends to all
types of sensitive information in logs.  One common example is service
credentials written to logs or data that is confidential to an organization
and not an individual.  I don't see anything that would limit this approach
to just PII, but it may extend the use cases a bit.

2.  Log data is incredibly useful for troubleshooting, analytics and other
purposes.  To meet these use cases the log data may be propagated,
extracted, transformed and stored in multiple places.  This creates some
challenges:

A.  The data may be transformed into other data models and formats that may
not preserve any privacy marking.
B.  If the privacy marking changes on a field it's unclear what should be
done to historical data.

3.  Often when a sensitive value is written to a log it has gone undetected
for a period of time.  The values have now propagated to multiple systems
and the task of the team responding to the incident is to:

   - stop the logging or mis-identification of the data
   - determine where the sensitive data has propagated to
   - purge (or take other action on) the sensitive data wherever it now lives
   - perform remedial actions and notifications (for example to rotate
   credentials)

I would like a system that helps to automate this.  I think the draft
provides part of the solution, but it seems there should be more to the
solution:

   1. Information and data models for sensitivity/privacy tagging of data
   2. Ways of attaching that sensitivity/privacy tagging to the data itself
   (similar to current draft)
   3. Interface to systems to control
      - actions based on tagging
      - changes to tagging/classification
      - communication of tagging/classification "out-of-band"

I'm not really up on the state of the art here.  I know there exist
proprietary protocols that do something similar, but I'm not aware of any standards
work in this area.

As an illustration, a JSON message could contain a field descriptor to match
on a field and an action to take when you match on the field.

{
    "dataFieldDescriptor": {
        "LogStream": "AuthenticationService",
        "eventType": "Login",
        "jsonField": "sessionID",
        "startDate": "None",
        "endDate": "Now"
    },
    "action": {
        "SensitivityLevelTag": "4",
        "NotificationType": "SummaryReport",
        "Action": "FullAnonymization"
    }
}

This message would be sent to one or more systems.  It may result in the
inline tagging described in the draft, but the message could be propagated
in other ways.   ETL systems that transform the data could transform the
request for the systems that consume their data.  The actions can be
optional and derived from the default for the sensitivity level.
Notification and reporting are an essential piece of the puzzle.  The
interface could be provided by log aggregators, endpoints, and ETL
systems.
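
To make the idea a bit more concrete, here is a rough Python sketch of how a
system receiving such a message might act on it.  Everything beyond the JSON
field names above is an assumption on my part: the default-action table, the
"REDACTED" placeholder used for "FullAnonymization", the shape of the summary
report, and the function names are all hypothetical, and the date range is
ignored for brevity.  The request below deliberately omits "Action" to show
the action being derived from the sensitivity level.

```python
import json

# Hypothetical defaults: the action implied by a sensitivity level when the
# request carries no explicit "Action". Not defined by the draft.
DEFAULT_ACTION_BY_LEVEL = {"4": "FullAnonymization", "1": "None"}

REQUEST = json.loads("""
{
    "dataFieldDescriptor": {
        "LogStream": "AuthenticationService",
        "eventType": "Login",
        "jsonField": "sessionID",
        "startDate": "None",
        "endDate": "Now"
    },
    "action": {
        "SensitivityLevelTag": "4",
        "NotificationType": "SummaryReport"
    }
}
""")


def matches(descriptor, record):
    """True if the record is from the described stream/event and has the field.

    startDate/endDate are not evaluated in this sketch.
    """
    return (
        record.get("LogStream") == descriptor["LogStream"]
        and record.get("eventType") == descriptor["eventType"]
        and descriptor["jsonField"] in record
    )


def resolve_action(action):
    """Use the explicit Action if present, else the default for the level."""
    explicit = action.get("Action")
    if explicit:
        return explicit
    return DEFAULT_ACTION_BY_LEVEL.get(action["SensitivityLevelTag"], "None")


def apply_request(request, records):
    """Return (rewritten copies of the records, summary report)."""
    descriptor = request["dataFieldDescriptor"]
    field = descriptor["jsonField"]
    action = resolve_action(request["action"])
    rewritten, changed = [], 0
    for record in records:
        record = dict(record)  # leave the caller's record untouched
        if action == "FullAnonymization" and matches(descriptor, record):
            record[field] = "REDACTED"  # assumed meaning of the action
            changed += 1
        rewritten.append(record)
    summary = {"field": field, "action": action, "recordsChanged": changed}
    return rewritten, summary


def translate_for_downstream(request, field_renames):
    """Rewrite the request for a consumer that stores the field under another name."""
    translated = json.loads(json.dumps(request))  # deep copy via JSON round trip
    descriptor = translated["dataFieldDescriptor"]
    name = descriptor["jsonField"]
    descriptor["jsonField"] = field_renames.get(name, name)
    return translated
```

An ETL system that renames "sessionID" to, say, "session_id" could call
translate_for_downstream(REQUEST, {"sessionID": "session_id"}) and forward
the result to the systems that consume its output.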

This provides a standard interface to different places that store data
derived from logs to remediate problems in their existing data.  It could
potentially be used to service deletion requests and maybe information
requests.  This obviously needs refinement.  A higher-level abstraction
could decouple the request from the logging format and allow the request to
go through stages.  For example, the request says "delete user x@y.com" and
then intermediate systems fill out how to fulfil the request.
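
A rough Python sketch of that staging, purely as an illustration: an
intermediate system that knows where user identifiers live expands the
abstract request into field-level requests of the kind shown earlier.  The
field map, the "operation", "subject" and "matchValue" keys, the "Delete"
action, and the function name are all hypothetical and not part of the draft.

```python
# Hypothetical knowledge held by an intermediate system: which fields in
# which log streams carry a user identifier.
USER_FIELDS = [
    {"LogStream": "AuthenticationService", "eventType": "Login",
     "jsonField": "userEmail"},
    {"LogStream": "BillingService", "eventType": "Invoice",
     "jsonField": "customerEmail"},
]


def expand_request(abstract_request, field_map):
    """Turn a format-independent request into one field-level request per entry."""
    if abstract_request["operation"] != "delete":
        raise ValueError("only 'delete' is handled in this sketch")
    return [
        {
            "dataFieldDescriptor": {
                **entry,
                "matchValue": abstract_request["subject"],  # assumed extension
                "startDate": "None",
                "endDate": "Now",
            },
            "action": {"Action": "Delete", "NotificationType": "SummaryReport"},
        }
        for entry in field_map
    ]


requests = expand_request(
    {"operation": "delete", "subject": "x@y.com"}, USER_FIELDS
)
```

Each resulting request could then be sent on to the system that owns the
corresponding log stream.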

Cheers,

Joe
