Re: [sip-clf] I-D Action: draft-ietf-sipclf-format-07.txt
Adam Roach <adam@nostrum.com> Tue, 23 October 2012 22:27 UTC
Message-ID: <508719B1.4090108@nostrum.com>
Date: Tue, 23 Oct 2012 17:26:57 -0500
From: Adam Roach <adam@nostrum.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:14.0) Gecko/20120713 Thunderbird/14.0
MIME-Version: 1.0
To: Gonzalo Salgueiro <gsalguei@cisco.com>
References: <20121005015620.22856.1399.idtracker@ietfa.amsl.com> <869FCF91-1032-4411-A7D5-85CEE6F120E5@cisco.com> <50870CB8.40908@nostrum.com> <5A63A1D1-5D2A-4EA8-9E7A-CDA3C9668DE5@cisco.com>
In-Reply-To: <5A63A1D1-5D2A-4EA8-9E7A-CDA3C9668DE5@cisco.com>
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
Content-Transfer-Encoding: 7bit
Received-SPF: pass (nostrum.com: 99.152.144.32 is authenticated by a trusted mechanism)
Cc: "sip-clf@ietf.org Mailing" <sip-clf@ietf.org>
Subject: Re: [sip-clf] I-D Action: draft-ietf-sipclf-format-07.txt
On 10/23/12 16:56, Oct 23, Gonzalo Salgueiro wrote:
> "For the purposes of this document, we define 'unprintable' to mean a string of octets that: (a) contains an octet with a value in the range of 0 to 31, inclusive; (b) contains an octet with a value of 127, (c) contains any octet greater than or equal to 128 which is a formatting or control character (such as 128 to 159) within the UTF-8 character set; or (d) falls outside the UTF-8 character range, as specified by [UNICODE]."
>
> Does that sound ok?

I think we're still talking past each other here. "Outside the UTF-8 character range" simply isn't a sensible thing to say.

What we're talking about putting into a log record is a series of *octets*, not a series of *characters*. UTF-8 is an encoding that defines how octets are put together to make characters. Once you start talking about the octets as if they *are* characters, you're conflating two very different things.

So, for example, you can't talk about "a string of octets that... falls outside the UTF-8 character range." You can talk about a string of bytes that does not form a valid UTF-8 sequence, and that's almost certainly what you want to say here.

I'm also getting a bit lost in what you mean when you say "which is a formatting or control character (such as 128 to 159)." Keep in mind that we're still talking about *octets* here, not characters. In UTF-8, there's nothing special about an octet with a value of 128. There's nothing special about an octet with a value of 159. Both can appear as the second octet in a two-octet character. Or the second or third octet in a three-octet character. And so on. The same goes for everything between 128 and 191.

Now, octet values of 192, 193, and 245-255 won't appear in valid UTF-8. If we wanted to be abundantly careful, we could call those out as being invalid. But I think we catch those just fine if we talk about octets that form valid UTF-8 sequences.
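The octet-level claims above can be checked with a short Python sketch. It is only an illustration, not text from the draft: the helper names `is_valid_utf8` and `octets_never_used` are invented for this example, and it relies on Python's built-in strict UTF-8 codec as the definition of "valid".

```python
def is_valid_utf8(octets: bytes) -> bool:
    """Return True if the octets form a well-formed UTF-8 sequence.

    Hypothetical helper; uses Python's strict UTF-8 decoder as the judge.
    """
    try:
        octets.decode("utf-8")  # error handling defaults to "strict"
        return True
    except UnicodeDecodeError:
        return False


def octets_never_used() -> list:
    """Encode every Unicode scalar value and list the octet values that never occur."""
    seen = set()
    for cp in range(0x110000):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogates are not scalar values and cannot be encoded
        seen.update(chr(cp).encode("utf-8"))
    return [value for value in range(256) if value not in seen]


# 128 and 159 are ordinary trailing octets inside multi-octet characters.
print(is_valid_utf8(bytes([194, 128])))       # two-octet character
print(is_valid_utf8(bytes([226, 128, 159])))  # three-octet character
# On their own they do not form a valid sequence.
print(is_valid_utf8(bytes([128])))
# Octet values that appear in no valid UTF-8 encoding at all.
print(octets_never_used())
```

The brute-force loop visits roughly 1.1 million code points, so it takes a moment to run. It is there to make the point empirically rather than to be efficient.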
Or are you meaning to call out Unicode code points like U+0080 (the Latin-1 padding character)? Because that has nothing to do with an *octet* with a value of 128. It would be encoded as a two-octet sequence starting with 194.

However, if we're intending to go down the rabbit hole of making decisions about whether to Base64-encode based on which Unicode code points we want to consider "printable," then we've got years of draft refinement ahead of us (I can already imagine the right-to-left mark arguments). That way lies madness.

All of which is a very long-winded way to say: octets are not characters and characters are not octets; and you need to write the text in a way that does not mix the two.

/a
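As an illustration of the U+0080 point and of an octets-only rule, here is a minimal Python sketch. It is an assumption about how the definition could be phrased, not the draft's wording: `needs_base64` is a hypothetical name, and the rule it applies is conditions (a) and (b) from the quoted text plus "does not form a valid UTF-8 sequence" in place of (c) and (d).

```python
def needs_base64(octets: bytes) -> bool:
    """Hypothetical octets-only test for 'unprintable' (not the draft's text).

    True if the field contains an octet in 0..31, contains octet 127,
    or does not form a valid UTF-8 sequence.
    """
    if any(octet <= 31 or octet == 127 for octet in octets):
        return True
    try:
        octets.decode("utf-8")  # strict decoding; raises on malformed input
    except UnicodeDecodeError:
        return True
    return False


# The code point U+0080 is one character but two octets: 194 then 128.
print(list("\u0080".encode("utf-8")))

print(needs_base64(b"sip:alice@example.com"))  # plain ASCII field
print(needs_base64(b"a\tb"))                   # contains octet 9
print(needs_base64(bytes([128])))              # lone trailing octet
print(needs_base64("\u0080".encode("utf-8")))  # valid UTF-8, no octet below 32
```

Under this sketch the encoded form of U+0080 is not flagged, because the rule looks only at octets and sequence validity and makes no judgement about which code points count as printable.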