Re: [DNSOP] New draft on representing DNS messages in JSON

Jay Daley <jay@nzrs.net.nz> Thu, 21 August 2014 19:54 UTC


I agree with all of Shane's points below, namely better encoding of data, not making all fields optional, handling of unknown rrtypes and having a schema for the data. These points, and many more, were thoroughly addressed in the following draft on representing DNS in XML:

	http://tools.ietf.org/html/draft-daley-dnsxml-00

If it helps, I can produce XSLT that takes data in the XML format specified in that draft, validated against that schema, and turns it into whatever flavour of JSON you require:

	http://controlfreak.net/xml-to-json-in-xslt-a-toolkit/

*stands back and waits for the "but json is sooo much easier than xml" brickbats*

Jay


On 22/08/2014, at 3:27 am, Shane Kerr <shane@time-travellers.org> wrote:

> Paul,
> 
> On Thu, 21 Aug 2014 07:01:01 -0700
> Paul Hoffman <paul.hoffman@vpnc.org> wrote:
> 
>> On Aug 21, 2014, at 1:22 AM, Shane Kerr <shane@time-travellers.org>
>> wrote:
>> 
>>> * I don't like the treatment of QNAME*/hostQNAME, NAME*/hostNAME,
>>> and so on. Since JSON includes encoded strings, wouldn't it make
>>> more sense just to always put the QNAME in there? (Especially since
>>> you'll end up with SRV queries always being encoded as they have
>>> underscore characters...)
>> 
>> JSON requires its strings to be encoded in a particular character
>> set. Given that the labels in a QNAME/NAME can be any binary cruft,
>> you can't assume that every QNAME will be representable.
> 
> I think you're making it too hard. Control characters, ", and \ are
> already required to be escaped. Just specify a similar requirement that
> octets 127 to 255 also be escaped, and we're done.
> 
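A minimal sketch of the escaping Shane describes, in Python. It assumes each label octet maps to the code point of the same value (U+0000 to U+00FF) and that the escape form is JSON's own \u00XX; neither choice comes from the draft, and the function names are hypothetical.

```python
import json


def label_octets_to_json(octets: bytes) -> str:
    """Return a JSON string literal carrying the raw label octets.

    Assumption: octet N is written as code point U+00NN. Quote and
    backslash get a backslash escape; control characters and octets
    127 to 255 are written as \\u00XX escapes.
    """
    out = ['"']
    for b in octets:
        ch = chr(b)
        if ch in '"\\':
            out.append("\\" + ch)
        elif b < 0x20 or b >= 0x7F:
            out.append("\\u%04x" % b)
        else:
            out.append(ch)
    out.append('"')
    return "".join(out)


def json_to_label_octets(literal: str) -> bytes:
    """Reverse the mapping: parse the JSON literal, then map code points to octets."""
    return json.loads(literal).encode("latin-1")
```

Under these assumptions an SRV-style label such as _sip passes through unchanged, and any octet sequence survives a round trip through a standard JSON parser.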
>>> * In general I'm not super enthusiastic about the mixing of binary
>>> and formatted data - I tend to think an application will want one
>>> or the other. Perhaps it makes more sense to define two formats,
>>> one binary and one formatted? Or...
>> 
>> All fields are optional, so a profile could say "don't include these"
>> or "always include those". Further, and more importantly, most RDATA
>> are binary. I did not want to force implementations to use the
>> presentation format for RDATA.
> 
> The problem with an "all fields are optional" approach is that it puts
> all the burden on the consumer of the data, right? You literally have
> no idea what to expect. (That's kind of why I proposed some sort of
> schema below.)
> 
> I understand not wanting to force implementations to use the
> presentation format for RDATA... OTOH it seems likely that the reason
> people are putting data in JSON is so they can see what it is. We could
> always try the RFC 3597 approach for an unknown RTYPE?
> 
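For reference, the RFC 3597 generic notation is small enough to sketch. This Python fragment produces the TYPEnnn mnemonic and the "\# length hexdata" RDATA form; the function names are hypothetical, and how the draft would carry these strings in JSON is not specified here.

```python
def rfc3597_type(rrtype: int) -> str:
    """Generic mnemonic for an RR type with no known name, e.g. TYPE731."""
    return "TYPE%d" % rrtype


def rfc3597_rdata(rdata: bytes) -> str:
    """Generic RDATA text: backslash-hash, decimal length, then hex octets."""
    if not rdata:
        return "\\# 0"
    return "\\# %d %s" % (len(rdata), rdata.hex())
```

A consumer that does not recognise the type still gets a readable, lossless string, and one that does recognise it can decode the hex.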
>>> * Maybe it makes sense to define a meta-record so consumers can know
>>> what to expect? Something that lists which names will (or may)
>>> appear.
>> 
>> That would be a JSON schema. Just using that phrase will cause
>> screaming in the Apps Area. Having said that, it's perfectly
>> reasonable for a profile to insist that each record have a profile
>> indicator such as "Profile": "Private DNS interchange v3.1".
> 
> Screaming aside, applications will either have an implicit schema or an
> explicit one. Defining the problem to be out of scope may be necessary
> to get something published, but that's a symptom of IETF brokenness
> IMHO, since it reduces the usefulness of any such RFC. :(
> 
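To make the implicit-versus-explicit point concrete, here is a rough Python sketch of a consumer that keys its expectations off the "Profile" member Paul mentions. The profile table and the required field names (QNAME, QTYPE) are assumptions for illustration, not taken from the draft.

```python
# Hypothetical table: profile name -> fields that profile insists on.
PROFILES = {
    "Private DNS interchange v3.1": frozenset({"QNAME", "QTYPE"}),
}


def check_record(record: dict) -> list:
    """Return a list of problems; an empty list means the record fits its profile."""
    profile = record.get("Profile")
    if profile not in PROFILES:
        return ["unknown or missing Profile: %r" % (profile,)]
    missing = sorted(PROFILES[profile] - set(record))
    return ["missing field: " + name for name in missing]
```

Without some table like this, written down or not, the consumer has to probe every record for every field it might need.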
>>> I'd be mildly curious to see a comparison of the compressed sizes of
>>> JSON-formatted data (without data duplicated as binary stuff) versus
>>> non-JSON-formatted data. My intuition is that compression will
>>> remove most of the horrible redundancy that is involved in JSON,
>>> but there's only one way to be sure. ;)
>> 
>> Sure. It's pretty trivial to do, for example, a CBOR format that
>> follows this; there are now CBOR libraries for most popular modern
>> languages (see http://cbor.io/). If folks here want that, I can add
>> it as an appendix. To be clear, however, I haven't heard anyone
>> saying they want compression so badly they are willing to lose
>> readability of the data.
> 
> Oh, I meant with gzip or the like, not some JSON crafted format.
> 
> So the idea is:
> 
>   $ tcpdump -w somefile.pcap
>   $ pcap2dnsjson somefile.pcap somefile.json
>   $ gzip somefile.pcap
>   $ gzip somefile.json
>   $ ls -l somefile.{pcap,json}.gz
> 
> Then compare the sizes of the compressed files.
> 
> The idea being that when moving files around via scp or rsync or
> whatever they'd probably be compressed like this, and probably also for
> medium-term storage. My hope is that a compressed JSON file is roughly
> the same size as a compressed raw pcap file, since basically they have
> the same entropy.
> 
> The reason I bring this up is to give a feel for the size cost of a
> bloated text format in practice. :)
> 
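The same comparison can be sketched in Python with the standard gzip module, which may be handy when scripting it over many captures. The function names are hypothetical, and the sketch only measures sizes; it says nothing about what the numbers will turn out to be.

```python
import gzip


def compressed_size(data: bytes, level: int = 9) -> int:
    """Size in bytes of the gzip-compressed form of data."""
    return len(gzip.compress(data, compresslevel=level))


def compare(raw_capture: bytes, json_text: bytes) -> dict:
    """Report raw and compressed sizes for a capture and its JSON rendering."""
    raw_gz = compressed_size(raw_capture)
    json_gz = compressed_size(json_text)
    return {
        "raw": len(raw_capture),
        "raw_gz": raw_gz,
        "json": len(json_text),
        "json_gz": json_gz,
        # Above 1.0 means the compressed JSON is the larger of the two.
        "json_gz_over_raw_gz": json_gz / raw_gz,
    }
```

With the files from the shell steps above, the inputs would be the bytes of somefile.pcap and somefile.json, for example via pathlib.Path("somefile.pcap").read_bytes().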
> Cheers,
> 
> --
> Shane


-- 
Jay Daley
Chief Executive
.nz Registry Services (New Zealand Domain Name Registry Limited)
desk: +64 4 931 6977
mobile: +64 21 678840
linkedin: www.linkedin.com/in/jaydaley