Re: [Json] Nudging the English-language vs. formalisms discussion forward

Tatu Saloranta <tsaloranta@gmail.com> Wed, 19 February 2014 18:48 UTC

In-Reply-To: <CAMm+LwjSrpRufOZE+7CW-ppSOW0btC3KgwOx-0YpDiRrewu2VA@mail.gmail.com>
References: <C87F9B96-E028-4F0E-A950-B39D3F68FFE7@vpnc.org> <CAMm+LwhUh_yN-hzaoDWfrO_H2iGvYvj99BCE4EcYmgqCPqXoVQ@mail.gmail.com> <CAHBU6itpttXBfVQGKw=u==k_XSdrht81+m_YDNZP6RM+=9CNow@mail.gmail.com> <CAMm+LwibiSDmymjt544kykhoXdMyR49uhMDLzzvwcBAaw_7oSw@mail.gmail.com> <CAHBU6isHwnMst1g6DM+6ZOG=uOsBTAjk-gVQuZimnFRB475F0g@mail.gmail.com> <20140219172511.GA8132@mercury.ccil.org> <CAMm+LwjSrpRufOZE+7CW-ppSOW0btC3KgwOx-0YpDiRrewu2VA@mail.gmail.com>
Date: Wed, 19 Feb 2014 10:47:57 -0800
Message-ID: <CAGrxA27Eh9mfUWK4A7eyO4=by2tAWt2qGAxbubYTci_fJHB-tQ@mail.gmail.com>
From: Tatu Saloranta <tsaloranta@gmail.com>
To: Phillip Hallam-Baker <hallam@gmail.com>
Archived-At: http://mailarchive.ietf.org/arch/msg/json/GVlWl4-cVgBQDq5YJEBQbQcHwmk
Cc: Tim Bray <tbray@textuality.com>, John Cowan <cowan@mercury.ccil.org>, Paul Hoffman <paul.hoffman@vpnc.org>, JSON WG <json@ietf.org>

On Wed, Feb 19, 2014 at 10:43 AM, Phillip Hallam-Baker <hallam@gmail.com> wrote:

>
>
> On Wed, Feb 19, 2014 at 12:25 PM, John Cowan <cowan@mercury.ccil.org> wrote:
>
>> Tim Bray scripsit:
>>
>> > > The point is to focus the discussion on the data going over the wire
>> > > rather than the syntax.
>> >
>> > Here is where we disagree absolutely. The point is to specify the syntax
>> > clearly and unambiguously, and the semantics of its payload.  Trying to
>> > come at it from the other direction (specifying data structures and then
>> > extracting syntax) leads to huge mistakes like CORBA and WS-*.
>>
>
> They are only mistakes because they are huge. Microsoft and IBM wanted to
> build a consulting business which meant that they liked a specification
> that was overly complex.
>
> The whole ProtoGen system is less than about 5,000 lines of code and it
> has equivalent functionality to the WS-* stack.
>
> XML isn't the farce that SGML was and JSON is even simpler than XML
> without loss of power (though it does not make a very good document markup).
>
> C# and Java are not the screwups that C++ was either. And C is not Ada.
>
> The fact that the B-crew has botched a job in the past does not mean the
> approach is wrong. The collapse of Ada did not prove that high-level
> languages were a bad idea, though there were folk on Usenet making that
> argument at the time. My college tutor was known for having walked out of
> the Ada design meetings, but he still worked on many other programming
> languages afterwards.
>
>
>
>> I agree absolutely with Tim, if it wasn't clear before.  Semantics is
>> vague and fuzzy (my semantics of your JSON may consist solely in
>> counting the number of fields in the top-level object, for example).
>> Syntax is not.
>> Agreement on syntax promotes interoperation; agreement on semantics
>> takes forever, pushing syntax to the rear, thus tending to create bad
>> syntax.
>>
>
> JSON has no semantics. All the JSON syntax has is a mapping to an implicit
> data model.
>
> Semantics can be defined very precisely, and most modern computer science
> courses teach methods of doing that. Z, VDM and the rest are all very
> precise ways of specifying the behavior of a protocol, with as high a
> degree of precision as LR(0) parsers allow for syntax.
>
>
> What I have found is that making use of syntax to encode semantics is
> always a mistake. For example, we might have a specification that says that
> a Date field can either be an RFC 3339 DateTime or an integer giving the
> offset in seconds from the current time. That is certainly possible, but we
> now require the parser to recognize the format of the data and select the
> representation appropriately. Maybe 95% of people will do that right, but
> at least 5% will do it wrong, and if one of those implementations becomes
> popular we end up with a specification that now has three different cases
> to code: the two in the spec and the real-life situation that isn't
> documented.
>
> Using different tags for "DateTime" (=String) and "DeltaTime" (=Integer)
> avoids that whole business.
>
>
> When people reach for Regular Expressions they are almost always doing
> something that is better done without.
>
>

I agree completely with the above.
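
As a rough illustration of the separate-tags point, here is a minimal sketch. It is not taken from the thread or from any spec: the function name read_expiry and the message shape are made up for the example, and it assumes Python and an ISO 8601-style timestamp string. The reader decides what it has from the tag name alone and never has to guess from the shape of the value.

```python
from datetime import datetime, timedelta, timezone


def read_expiry(message: dict, now: datetime) -> datetime:
    """Resolve a point in time from a message that uses distinct tags.

    Hypothetical message shapes (assumptions for this sketch):
      {"DateTime": "2014-02-19T19:00:00Z"}   absolute time, always a string
      {"DeltaTime": 720}                     offset in seconds, always an integer
    """
    has_absolute = "DateTime" in message
    has_delta = "DeltaTime" in message
    if has_absolute == has_delta:
        raise ValueError("expected exactly one of DateTime or DeltaTime")

    if has_absolute:
        value = message["DateTime"]
        if not isinstance(value, str):
            raise TypeError("DateTime must be a string")
        # Normalise a trailing "Z" so older Python versions accept it too.
        return datetime.fromisoformat(value.replace("Z", "+00:00"))

    value = message["DeltaTime"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("DeltaTime must be an integer number of seconds")
    return now + timedelta(seconds=value)


now = datetime(2014, 2, 19, 18, 48, tzinfo=timezone.utc)
absolute = read_expiry({"DateTime": "2014-02-19T19:00:00Z"}, now)
relative = read_expiry({"DeltaTime": 720}, now)
```

Each tag admits exactly one JSON type, so a wrong type is an error rather than a cue to try the other interpretation, and there is no undocumented third case to grow.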

-+ Tatu +-