Re: [sacm] SWID and CPE discussion

David,

It is a bit premature to decide what specifications we can use for what purpose under the SACM architecture.  We still have work to do to get to a final list of requirements and a stable architecture based on consensus.  We have a good start, but more work is needed once we can get a working group established.  It’s important to not get too far ahead of this initial step.  Individual SCAP specifications like CPE need to be part of the future discussion once we work through the requirements and architecture. There has been a lot of good work outside the SCAP efforts, including other work in the IETF that we need to consider when we get to the right point.

In my mind SWID is one area that has significant merit, and your characterization of SWID (ISO/IEC 19770-2) as a license management specification is not accurate. It does not directly address license management, also known as entitlement management. It provides data elements that can point to license details, but entitlement management is intended to be addressed in ISO/IEC 19770-3.  SWID in general supports software asset management, including footprinting using hashes, patch management, and software identification. It also contains many of the metadata elements supported by CPE. Through work that TagVault has been pursuing, it is possible to automatically generate CPEs from the data elements provided by a SWID.  This means that the data elements that SWID addresses are a superset of what CPE addresses, so SWID may be used to support functional capabilities similar to those addressed by CPE.
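
To illustrate the idea of deriving a CPE from SWID data, here is a minimal Python sketch. It is not TagVault's actual algorithm. The three input fields (creator, title, version) are stand-ins for SWID data elements rather than exact tag names, and the normalization is deliberately simplified: a real binding would escape special characters instead of dropping them. The only thing taken from the CPE 2.3 specification is the layout of the formatted string, which has eleven attributes after the "cpe:2.3" prefix.

```python
import re


def swid_to_cpe(creator: str, title: str, version: str, part: str = "a") -> str:
    """Build a CPE 2.3 formatted string from three SWID-like fields (simplified)."""

    def norm(value: str) -> str:
        # Lowercase and turn runs of whitespace into underscores.
        value = re.sub(r"\s+", "_", value.strip().lower())
        # Assumption: drop characters that would need escaping in a real binding.
        value = re.sub(r"[^a-z0-9._\-]", "", value)
        return value or "*"

    fields = [part, norm(creator), norm(title), norm(version)]
    # The remaining seven attributes (update through other) are left as ANY ("*").
    return "cpe:2.3:" + ":".join(fields + ["*"] * 7)


# Hypothetical product used only for illustration.
example_cpe = swid_to_cpe("Example Corp", "Widget Server", "2.1.0")
print(example_cpe)
```

Going the other direction (CPE to SWID) would not work in general, because the CPE name does not carry the footprint or patch data that a SWID tag can hold.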

When thinking about security automation in the area of software asset management, there are three primary questions that we need to answer.

1)      What software (operating systems, applications, and patches) is present on an endpoint?

2)      What software is authorized to be on an endpoint?

3)      What is left that is not identified and authorized?

The capabilities in SWID help us get to these answers in a more robust way than CPE ever did. We need software identifiers to address #1 and #2. Footprinting capabilities in SWID help us to identify what is actually installed. Software identifiers provided by SWID enable us to address software in a fine-grained way to develop authorization policies. Through the use of footprinting capabilities, we can rule out what should be on an endpoint and identify what is left, which addresses #3. This narrows the amount of software that a human or tool needs to analyze to determine the presence of malicious or unauthorized software.
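
The three questions above can be sketched as simple set operations. This is only an illustration under stated assumptions: the tag identifiers, file paths, and hash values are made up, and a footprint is modeled as nothing more than the set of file hashes that a piece of software claims.

```python
def triage(files: dict[str, str],
           footprints: dict[str, set[str]],
           authorized: set[str]) -> dict[str, list[str]]:
    """files maps path -> hash; footprints maps software id -> claimed hashes."""
    present = set(footprints)                 # Question 1: what is present?
    approved = present & authorized           # Question 2: what is authorized?
    unauthorized = present - authorized
    # Hashes accounted for by software that is both identified and authorized.
    known = set().union(*(footprints[tag] for tag in approved))
    # Question 3: what is left over for a human or tool to analyze?
    leftover = sorted(path for path, digest in files.items() if digest not in known)
    return {
        "approved": sorted(approved),
        "unauthorized": sorted(unauthorized),
        "leftover": leftover,
    }


# Hypothetical endpoint data.
footprints = {"example.com/editor-1.0": {"h1", "h2"}, "example.com/game-3.2": {"h3"}}
authorized = {"example.com/editor-1.0"}
files = {"/opt/editor/bin": "h1", "/opt/editor/lib": "h2",
         "/opt/game/bin": "h3", "/tmp/x": "h9"}
result = triage(files, footprints, authorized)
print(result)
```

In this example only two of the four files remain for analysis, which is the narrowing effect described above.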

A side benefit of collecting SWID information from endpoints is that it can be used for both security and broader IT management capabilities. I believe this represents a core tenet of this work: using data collected from sensors to support a variety of IT-related processes using a single point of data collection. This adds value to the ecosystem of tools supported by security automation standards.

Sincerely,
Dave

From: sacm-bounces@ietf.org [mailto:sacm-bounces@ietf.org] On Behalf Of david.oliva@verizon.net
Sent: Wednesday, March 06, 2013 6:41 PM
To: amontville@tripwire.com; sacm@ietf.org
Subject: [sacm] SWID and CPE discussion

Adam and all:

This is what I understand.

SWID appears to be a good software licensing mechanism and not a security automation specification (or format).
SWID does not appear to talk to any of the current SCAP specifications.
If individual instances of a piece of software (operating system or application) are present in an IT asset, CPE will not detect which one is licensed and which one is not.
SWID would be of use with SCAP if a security scan also aims to detect unlicensed software.
Neither CPE 2.2 nor 2.3 (nor Asset Identification, AI) includes a field for software licensing.
If CPE were to have an additional field for software licensing, then it would nicely fit a SWID function.
None of the current SCAP specifications does anything for software integrity, but SWID appears to do this.

If my understanding of SWID is incorrect, please help me.

David Oliva

On 03/06/13, Adam Montville <amontville@tripwire.com> wrote:

Sean, thanks for taking the time to review the draft, and I apologize for
taking so long to provide responses - hopefully what I'm saying inline
won't be a rehash.

It's clear there are a lot of issues to work through for this draft.

I'm also assuming that the AI draft is still part of SACM's proposed work
- someone let me know if this isn't true (the fourth line in the proposed
charter can be interpreted as quite inclusive).

Adam

On 1/28/13 7:20 PM, "Sean Turner" <turners@ieca.com> wrote:

>Hi all (these aren't directed just at Adam),
>
>I've got some comments/questions that are I guess pretty basic but also
>some questions about how the model works/fits with the information
>collected about assets that's been previously specified (or is currently
>being specified elsewhere) in the IETF. In no particular order:
>
>1) CPE: I guess it's an open question as to whether we should go with
>CPE', SWID, or do we need both because SWID is just software and CPE
>also includes hardware?

We need ways to identify the assets we want to assess. Sometimes, we'll
identify assets by class, and other times we'll identify assets very
specifically. There are three asset identification schemes at play:

1. CPE
2. SWID
3. Asset Identification (AI)

CPE can be used to identify classes of hardware, operating systems (for
endpoints and network devices), and application-level software.

SWID provides instance-level identification for software assets, but can
probably be used to identify classes of software (I'm not up-to-speed on
SWID).

AI is used for two reasons: 1) it applies to anything of value to an
organization, and 2) it provides relationships between assets. So, assets
can be information, people, facilities, IT things, and so on, and they can
be related to each other: asset A (a person) is the owner of asset B (a
critical system) and asset B is composed of the following set of assets...

I think all have their place and we should be open to leveraging what
works for the problem we're solving. Scoping assets is an important piece
of UC1 and I'd be inclined to say that we can leverage CPE and SWID in AI.
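
A rough sketch of what leveraging CPE and SWID in AI could look like: each asset carries whatever identifiers apply to it, and relationships are kept separately. The class names, relationship names ("owner-of", "composed-of"), and identifier values below are hypothetical and are not taken from the AI specification.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    asset_id: str                       # synthetic, locally assigned ID
    kind: str                           # e.g. "person", "system", "software"
    identifiers: dict[str, str] = field(default_factory=dict)  # scheme -> value


@dataclass(frozen=True)
class Relationship:
    subject: str                        # asset_id of the source asset
    predicate: str                      # hypothetical relationship name
    obj: str                            # asset_id of the target asset


def related(relationships: list[Relationship], subject: str, predicate: str) -> list[str]:
    """Return the asset IDs that `subject` points to via `predicate`."""
    return [r.obj for r in relationships if r.subject == subject and r.predicate == predicate]


assets = {
    "A": Asset("A", "person"),
    "B": Asset("B", "system"),
    "C": Asset("C", "software", {"swid": "example.com/widget-2.1.0"}),
    "D": Asset("D", "computing-device", {"cpe": "cpe:2.3:h:example:router:*:*:*:*:*:*:*:*"}),
}
relationships = [
    Relationship("A", "owner-of", "B"),
    Relationship("B", "composed-of", "C"),
    Relationship("B", "composed-of", "D"),
]
print(related(relationships, "B", "composed-of"))
```

Here CPE and SWID are just two of possibly many identifier schemes attached to an asset, while the relationships express what neither scheme covers on its own.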

>
>a) My initial concern was that open source folks would have to pay to
>get a SWID, but it looks like tagvault.org has free memberships. I'd
>like to get that confirmed though.

I'll see what I can find out unless someone else has already done this.

>
>b) I used CPE' purposely because I think we'd need an international
>multistakeholder organization to maintain the registries because like it
>or not not everybody is going to be willing or be able to use a USG site
>to register their software/hardware. The registry could be set up as
>expert required and a pool of volunteer experts would review the
>requests. I'm just saying...

100% agree that not everyone will want to use USG. If you're suggesting
that the draft should leverage IANA for the registry, then I would not be
opposed to that. However, it's the operations that matter and I'd like to
understand more about who is willing and able to do the necessary work of
maintaining the registry for free.

>
>2) Possibly related cross area work: There's some work that's being done
>in other areas, but I asked around and there isn't one model that pulls
>it all together. But, there are data models being developed by vendors
>in other areas and I think we owe it to ourselves to have a look at them
>and see what if anything is missing. It's nice if we as security
>practitioners say we want to know 1..n pieces of data about an asset,
>but if that asset isn't going to give that data up we're not going to be
>getting anywhere.

I have some additional comments inline for the options presented below,
but I also want to add that alignment is great, but at the end of the day
we need to get something working in the field. I can see a phased
approach to Asset Identification. Not suggesting that anything you've
said here precludes that option, just explicitly saying that I'd prefer to
get stuff working and add to it as things become available.

For example, Asset Identification can be largely informed by AD or other
CMDB sources. I'm not sure how many of these sources are going to be
NETMOD-/SCIM-/GEOPRIV-compliant by the time we need them to be.

>
>a) NETMOD (https://datatracker.ietf.org/wg/netmod/): NETMOD is a wg in
>the Operations and Management area that's working on data models for
>system management (draft-ietf-netmod-system-mgmt), ip management
>(draft-ietf-netmod-ip-cfg), routing management
>(draft-ietf-netmod-routing-cfg), and interfaces
>(draft-ietf-netmod-interfaces-cfg). If this proposed WG thinks it's
>going to also provide a data model for some of these same things - the
>models better be aligned. And yeah I know NETMOD uses YANG, but they've
>got data node figures and that means you don't have to read the YANG ;)

I will try to take some time to review this information before Orlando so
we can at least have a face-to-face discussion about it.

>
>b) SCIM (https://datatracker.ietf.org/wg/scim/): SCIM is a wg in the
>Applications area that's working on cross-domain identity management.
>Seems like if we're going to be using user identity data that we're
>aligned and maybe even use their schema.

Same as above, I'll take a look at this, too.

>
>c) GEOPRIV (https://datatracker.ietf.org/wg/geopriv/): GEOPRIV is a wg
>in Real-time Applications and Infrastructure Area that produced RFCs on
>both civic and geospatial data which we and NETMOD ought to be using.
>For civic addresses, the GEOPRIV WG produced the Civic Location Format
>for Presence Information Data Format Location Object (PIDF-LO)
>(RFC5139), which builds on DHCPv4/v6 Option for Civic Addresses
>Configuration Information (RFC4776) and is extended by RFC 6848*,
>there's a whole lot more in those RFCs for civic locations.** For
>geospatial (or geodetic) coordinates, there's DHCPv4/v6 Options for
>Coordinate-Based Location Configuration Information (RFC6225) and
>there's got to be some XML schema for it somewhere.

Not sure why, but this one seems more daunting - probably because I know
next to nothing about geo-anything :-) Nevertheless, will try to review
before Orlando.

>
>* Because you may need to know the pole # to which your router is
>attached.
>
>** Because you may need to know the cubicle # where your router is.
>Note: the format for civic and postal information differs from country
>to country and RFC 4776 captures some of that.
>
>3) s5.3: Maybe do the data model like this to save some trees:
>
> +-- Asset
> +-- Person
> +-- Organization
> +-- IT Asset
> +-- System
> +-- Software
> +-- Database
> +-- Network
> +-- Service
> +-- Data
> +-- Computing Device
> +-- Circuit
> +-- Website

I hate trees.

Kidding.

>
>4) Global Ids: I'm curious why there's not one identifier to rule them
>all? I get that assets can have more than one identifier depending on
>the management domain they're in (like I've got an SSN, DL, and Employee
>#), but doesn't the enterprise need one way to uniquely identify an asset?
> If we're looking for one might I suggest Universally Unique IDentifier
>(UUID) URN Namespace (RFC 4122), but there might be others out there.

This is likely to be important especially for trusted information sharing
efforts, if this information ever gets there. I'm not opposed to global
Ids, and in fact, I'd suspect that's how most implementations will end up
leveraging the AI specification anyway - each implementation will have
its particular ID...
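
For what it's worth, one way an implementation could map its particular ID onto a global one is a name-based UUID (RFC 4122, version 5). The sketch below uses Python's standard uuid module. The management domain and local ID are made-up values, and deriving the namespace from a DNS name is an assumption for illustration, not something the AI specification requires.

```python
import uuid


def asset_urn(management_domain: str, local_id: str) -> str:
    """Derive a stable UUID URN from a management domain and a local asset ID."""
    # Assumption: each management domain gets its own namespace UUID,
    # derived from its DNS name.
    namespace = uuid.uuid5(uuid.NAMESPACE_DNS, management_domain)
    # The same (domain, local_id) pair always yields the same URN.
    return uuid.uuid5(namespace, local_id).urn


urn = asset_urn("assets.example.com", "router-0042")
print(urn)
```

Because the result is deterministic, two tools that agree on the domain and local ID arrive at the same global ID without a shared registry. A random (version 4) UUID would avoid leaking structure but would need to be stored and distributed.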

>
>5) Circuit/ai:circuit: I'm having visions of circuit-switched networks
>;) Is this describing the physical connections between two entities?

I *believe* that is the case - I'm not the original author, however, so
I'll need to rely on either others on this list who have the appropriate
history or reach back to the original author, if I can. I have been
working under the assumption that this means there's some network
connection between the two entities.

>
>6) ai:computing-device: Whole lotta questions here:
>
>a) distinguished-name/fqdn/hostname: If I were to do this, I'd have an
>element name that had subelements of the three here and the other names
>used in the document.

In person for this one is going to be best, I think. I don't want to
assume I know what you're trying to convey.

>
>b) connections: Drilling down in the subelements: I'd add references for
>the IPv4/v6, MAC, and URI formats and maybe even subnet (RFC922 maybe)*.
>Do we need to know about other types of connections usb, firewire,
>thunderbolt, etc. Finally, hasn't somebody already done the IPv4/v6/MAC
>addresses in XML with the proper constraints?

Good advice. I don't know the answer to the last question above. Anyone
else?
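
Until someone points to an existing schema, the kind of constraint being asked for in 6b can at least be sketched in code. This is a minimal example using Python's standard ipaddress module for IPv4/IPv6. The MAC pattern is my own simplification (six hex octets separated by ":" or "-"), not a definition taken from IEEE 802.

```python
import ipaddress
import re

# Assumption: accept six hex octets separated by ":" or "-".
MAC_RE = re.compile(r"^[0-9A-Fa-f]{2}([:-][0-9A-Fa-f]{2}){5}$")


def classify_address(value: str) -> str:
    """Return "ipv4", "ipv6", "mac", or "invalid" for a connection address."""
    try:
        # ip_address() raises ValueError for anything that is not valid IPv4/IPv6.
        return f"ipv{ipaddress.ip_address(value).version}"
    except ValueError:
        pass
    if MAC_RE.match(value):
        return "mac"
    return "invalid"


for candidate in ["192.0.2.1", "2001:db8::1", "00:1A:2B:3C:4D:5E", "999.1.1.1"]:
    print(candidate, classify_address(candidate))
```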

>
>* Something like: The syntax for IPv4 addresses described in this
>document MUST conform to [RFC791]. The syntax for IPv6 addresses
>described in this document MUST conform to [RFC3513]. The syntax for
>MAC addresses described in this document MUST conform to [802]. The
>syntax for the URIs described in this document MUST conform to
>[RFC3986]. etc.
>
>c) Additional info: After clicking on about my mac, don't you want to
>know a little more like bios/firmware version, memory, storage capacity,
>processor speed, cache size, etc.? Or do you think the information
>listed there is enough to uniquely identify the computing-device?

You raise an interesting point, but I believe the philosophy behind the
spec is to rely on synthetic Ids and a non-technical Asset Management
process to keep those synthetic Ids up-to-date.

Also, I don't think the original authors knew how to uniquely identify an
asset, given they did not offer any reconciliation algorithms (there could
be other reasons for this, of course).

>
>d) For hostname you could informatively refer to RFC 1178 for choosing
>good hostnames.

Thanks.

>
>7) ai:network: ip-net-range is there a reason you can't have more than
>one block of addresses? Right now it's just one or none. Don't we want
>to know what other networks it connects to (wait that's done later never
>mind)?
>
>8) ai:organization: also a couple here:
>
>a) There's got to be XML for email and telephone numbers with the proper
>constraints from W3C? mailto URI [RFC6068] and tel URI [RFC3966]?

Good.

>
>b) Why only website? What about my orgs twitter feed? :)

Shoot, why not just go for FOAF and call it a day? (Half serious.)

>
>c) Should stuff like DUNS # be listed here or is that synthetic-id?

DUNno...

>
>9) ai:person: a couple here too:
>
>a) email-address/telephone-number: see earlier comments. And, do we
>need an element for jabber id?

See above comment about FOAF :-)

>
>b) Way back in the day PKIX produced RFC 3739, Qualified Certificates
>Profile, which describes "a certificate whose primary purpose is to
>identify a person with a high level of assurance, where the certificate
>meets some qualification requirements defined by an applicable legal
>framework," ... blah, blah EU Directive. Basically, it's stuff you'd put
>in a certificate to better identify the subject instead of being stuck
>with only the RDNs specified in RFC 5280. Stuff like placeOfBirth,
>gender, countryOfCitizenship, and countryOfResidence and not to be
>outdone it also allows for biometric data. Some of these you might want
>to use to identify a person and I'm not sure they're in SCIM or
>inetOrgPerson.

Cool. That is way back. Who uses these?

>
>10) ai:service: I think for the port # and port name you should be
>pointing to the port registry:
>https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.xml
>I think you might also need to include protocol because not all services
>available on a port are available on all protocols (tcp, udp, sctp, dccp).

Good.
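
A small sketch of why protocol belongs next to port: the same port number can refer to different services (or none) depending on the transport. The class and field names are hypothetical and are not elements of ai:service. The transport list is the one from the comment above.

```python
from dataclasses import dataclass

TRANSPORTS = {"tcp", "udp", "sctp", "dccp"}


@dataclass(frozen=True)
class ServiceEndpoint:
    name: str       # service name, e.g. as listed in the IANA registry
    port: int
    protocol: str   # transport protocol, lowercase

    def __post_init__(self) -> None:
        if self.protocol not in TRANSPORTS:
            raise ValueError(f"unknown transport protocol: {self.protocol!r}")
        if not 0 <= self.port <= 65535:
            raise ValueError(f"port out of range: {self.port}")


dns_udp = ServiceEndpoint("domain", 53, "udp")
dns_tcp = ServiceEndpoint("domain", 53, "tcp")
# Same name and port, different transport: these are distinct endpoints.
print(dns_udp == dns_tcp)
```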

>
>11) ai-system/ai-website: Where's parent defined? Isn't it asset?

I have to check.

>
>12) Energy/Power: I was surprised to not see any elements addressing
>energy/power. I'm thinking this data would be collected and if the
>temperature of my router is really, really hot I'd love to know because
>it might be on fire ;)

Does this information help ID the asset or would that be operational
information?

>
>That's it for now.

Only that much?

>
>spt
>
>On 9/23/12 2:49 PM, Adam Montville wrote:
>> All:
>>
>> I'm pleased to announce that I have transformed the NIST IR 7693 into an
>> RFC format and submitted it for our consideration (see post notification
>> e-mail below). Please note that this transformation is just that: a
>> transformation. I have taken liberties where necessary (mainly for
>> formatting purposes), but the major points, content, and style of the
>> original document remain intact. There are sure to be issues with the
>> document as posted. For example, it is unclear to me what can and cannot
>> be normatively referenced with respect to this transformation. There are
>> also several areas for IANA considerations beyond registering the asset
>> identification namespace (extensions to asset types and relationships,
>> for example).
>>
>> I look forward to working these, and other, issues on this list.
>>
>> Finally, a world of thanks goes to John Wunder, Adam Halbardier, and David
>> Waltermire for their effort in getting IR 7693 finalized; this is really
>> their work.
>>
>> Regards,
>>
>> Adam
>>
>>
>>
>> On 9/23/12 11:39 AM, "internet-drafts@ietf.org" <internet-drafts@ietf.org>
>> wrote:
>>
>>>
>>> A new version of I-D, draft-montville-sacm-asset-identification-00.txt
>>> has been successfully submitted by Adam W. Montville and posted to the
>>> IETF repository.
>>>
>>> Filename: draft-montville-sacm-asset-identification
>>> Revision: 00
>>> Title: Asset Identification
>>> Creation date: 2012-09-23
>>> WG ID: Individual Submission
>>> Number of pages: 73
>>> URL:
>>> http://www.ietf.org/internet-drafts/draft-montville-sacm-asset-identification-00.txt
>>> Status:
>>> http://datatracker.ietf.org/doc/draft-montville-sacm-asset-identification
>>> Htmlized:
>>> http://tools.ietf.org/html/draft-montville-sacm-asset-identification-00
>>>
>>>
>>> Abstract:
>>> Asset identification plays an important role in an organization's
>>> ability to quickly correlate different sets of information about
>>> assets. This document provides the necessary constructs to uniquely
>>> identify assets based on known identifiers and/or known information
>>> about the assets. This document describes the purpose of asset
>>> identification, a data model for identifying assets, methods for
>>> identifying assets, and guidance on how to use asset identification.
>>> It also identifies a number of known use cases for asset
>>> identification.
>>>
>>>
>>>
>>>
>>>
>>> The IETF Secretariat
>>>
>>>
>>>
>>
>>
>>
>>
>> _______________________________________________
>> sacm mailing list
>> sacm@ietf.org
>> https://www.ietf.org/mailman/listinfo/sacm
>>