Re: [Uri-review] Comments on draft-hoehrmann-javascript-scheme

Mykyta Yevstifeyev <evnikita2@gmail.com> Fri, 01 July 2011 02:59 UTC

Message-ID: <4E0D3834.2020102@gmail.com>
Date: Fri, 01 Jul 2011 06:00:04 +0300
From: Mykyta Yevstifeyev <evnikita2@gmail.com>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; ru; rv:1.9.2.18) Gecko/20110616 Thunderbird/3.1.11
MIME-Version: 1.0
To: Bjoern Hoehrmann <derhoermi@gmx.net>
References: <4E0B5015.7010508@gmail.com> <p3km07hckpdl1ugp9lpk336h6v6dvgsaai@hive.bjoern.hoehrmann.de> <4E0B539E.6040009@gmail.com> <70lm0796citvujkm477f4oithbimerefas@hive.bjoern.hoehrmann.de> <4E0C4C17.7010207@gmail.com> <4jso07ti5osjn6aub6hlsur8t7qg51jpdj@hive.bjoern.hoehrmann.de>
In-Reply-To: <4jso07ti5osjn6aub6hlsur8t7qg51jpdj@hive.bjoern.hoehrmann.de>
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
Content-Transfer-Encoding: 7bit
Cc: "uri-review@ietf.org" <uri-review@ietf.org>
Subject: Re: [Uri-review] Comments on draft-hoehrmann-javascript-scheme

30.06.2011 16:43, Bjoern Hoehrmann wrote:
> * Mykyta Yevstifeyev wrote:
>> I'm copying this to uri-review list.
> Please see RFC 1855, section 2.1.1., bullet point #4.
You personally encouraged me to copy comments to uri-review.
>> As I identified before, the document is missing the syntax description.
>> URIs have a limited number of allowed characters; some disallowed ones
>> might be legal in Javascript code.  Therefore, if you define the URI
>> scheme, those characters which do not suit to <unreserved> production of
>> RFC 3986 should be percent-encoded within such URI, and this should be
>> mentioned in the specification.  If you define IRI scheme, <reserved> in
>> RFC 3987 are the same as in RFC 3986, which should be considered.
>> Anyway, it's better to have formal definition of syntax using ABNF, even
>> though it is going to be smth. like:
> I disagree. If I were to add a definition like:
>
>>>       javascript-uri = "javascript:" code
>>>       code           = segment
> Then people would read this as saying certain resource identifiers are
> conforming because they match the grammar even though they would not be
> conforming under the current definition.
The closest thing to a syntax definition in the current draft is:

>     A resource identifier conforms to this specification if and only if
>     it is a valid IRI and application of the source text retrieval
>     operation yields a valid application/javascript entity without
>     generating any error.
which in effect refers the reader to RFC 4329, specifically Section 4,
which says:

>    Implementations that support binary source text MUST support binary
>    source text encoded using the UTF-8 [RFC3629] character encoding
>    scheme.
However, a number of characters that can appear in UTF-8 encoded source
text are not allowed in URIs, nor in IRIs: those which do not match
<unreserved> and <iunreserved>, respectively.  Moreover, as UTF-8 is the
dominant character encoding for application/javascript, I don't see any
reason to define an IRI scheme.  UTF-8 encodes any UCS character as a
sequence of octets, and any octet not allowed in a URI can be
percent-encoded.  The same applies to UTF-16 and UTF-32, which are also
allowed by

> Other character encoding schemes MAY be supported.
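
To illustrate the percent-encoding point, here is a minimal sketch in
Python.  It is my own illustration and not taken from the draft; the
function name to_javascript_uri is hypothetical, and it assumes the
source text is encoded as UTF-8 and that everything outside <unreserved>
is percent-encoded.

```python
from urllib.parse import quote


def to_javascript_uri(source_text: str) -> str:
    """Build a 'javascript' URI from script source text (illustrative only).

    Assumption: the source text is encoded as UTF-8, and every octet that
    does not correspond to an RFC 3986 <unreserved> character is
    percent-encoded.
    """
    # safe="" makes quote() escape everything except letters, digits
    # and "-", ".", "_", "~" (the <unreserved> set).
    return "javascript:" + quote(source_text, safe="", encoding="utf-8")


if __name__ == "__main__":
    # "\u00e9" (e with acute accent) takes two octets in UTF-8: C3 A9.
    print(to_javascript_uri("alert('h\u00e9llo')"))
```

The result contains only ASCII characters, which is why a URI scheme
(rather than an IRI scheme) would be sufficient here.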
As for the formal syntax.  From RFC 4395, Section 2.2:

>    All URI scheme
>    specifications MUST define their own syntax such that all strings
>    matching their scheme-specific syntax will also match the
>    <absolute-URI> grammar described in Section 4.3 of RFC 3986.

This presupposes that the URI syntax is defined in ABNF (so that it can
be matched against <absolute-URI>).  Moreover, this is a MUST, i.e. a
normative requirement.  I don't think registration without a syntax
definition will be acceptable.
>>>         4.  Decode the octet sequence using the UTF-8 character encoding
>>>             and transform the result into source text.
>> Probably it would be better to mention "Decode the octet sequence using
>> the algorithm defined in Section 3 of RFC 3629; obtained ASCII text
>> should be considered to be Javascript source code."
> As pointed out in the draft, "source text" is a concept used in the
> ECMAScript specification, and while people recognize "UTF-8", they
> likely need to look up what is "RFC 3629"; that UTF-8 is defined in
> STD 63 is pointed out in the draft.
Section 3 of RFC 3629 defines how to decode UTF-8 encoded octets into
Unicode characters.  Saying (to cite the draft):

> UTF-8, including the term byte
>     order mark, in STD 63
is too generic a pointer to this information.
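
For concreteness, the decoding step under discussion could be sketched
in Python roughly as follows.  This is my own illustration, not the
draft's algorithm: the function name is hypothetical, and dropping a
leading byte order mark is an assumption on my part.

```python
from urllib.parse import unquote_to_bytes

SCHEME_PREFIX = "javascript:"
UTF8_BOM = b"\xef\xbb\xbf"


def retrieve_source_text(uri: str) -> str:
    """Recover script source text from a 'javascript' URI (illustrative)."""
    if not uri.lower().startswith(SCHEME_PREFIX):
        raise ValueError("not a 'javascript' URI")

    # Replace percent-encoded triplets with the octets they represent.
    octets = unquote_to_bytes(uri[len(SCHEME_PREFIX):])

    # Assumption: a leading UTF-8 byte order mark is not part of the
    # source text and is dropped.
    if octets.startswith(UTF8_BOM):
        octets = octets[len(UTF8_BOM):]

    # Strict UTF-8 decoding as defined in RFC 3629; malformed sequences
    # raise UnicodeDecodeError instead of being silently replaced.
    return octets.decode("utf-8")


if __name__ == "__main__":
    print(retrieve_source_text("javascript:alert%28%27h%C3%A9llo%27%29"))
```

Note that the result is Unicode text in general, not necessarily ASCII.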
>> Section 3.2 (and global): I don't see why you chose the name "In-context
>> evaluation".  Shouldn't "In-context execution" be better?
> You execute statements (instructions) and evaluate expressions. It would
> be weird to say for javascript:'1' you "execute" the string literal.
OK, I agree on this now.
>> Section 7 is missing registration template required by RFC 4395
>> (http://tools.ietf.org/html/rfc4395#section-5.4).
> See http://www.w3.org/mid/455CCAAD.2040407@att.com and list archives.
From Section 5.4 of RFC 4395:

>     This template describes the fields that must be supplied in a URI
>     scheme registration request:
where the "must" is non-normative.

Section 5.2 of the same document, with a normative SHOULD, requires the
template for the registration process.
>> Global: s/resource identifier/URI (should be explained in Abstract when
>> first mentioned as Uniform Resource Identifier) or IRI, dependent on
>> what you define.  I suppose URI is a better term for 'javascript' RI.
> My usage is consistent with draft-hansen-iri-4395bis-irireg-00.
First, that is an Internet-Draft, and RFC 4395 is still BCP 35.  You need
to be guided by the procedural documents currently in force.  Second,
draft-ietf-iri-4395bis-irireg-01
(http://tools.ietf.org/html/draft-ietf-iri-4395bis-irireg-01), which is
the most current version, uses "URI/IRI" rather than "resource
identifier".  Finally, the term has not been used in any recently
published URI scheme specification.
>> Abstract:
>>
>>>      Using
>>>      this scheme, executable script code can be specified in contexts that
>>>      support resource identifiers.
>> I propose:
>>
>>>      Using 'javascript' URIs, Javascript (also known as ECMAScript)
>>>      executable code may be specified in such URI and executed by
>>>      the application resolving it.
> JavaScript is not also known as ECMAScript. JavaScript is essentially a
> proprietary language where parts of it are standardized under the ECMA-
> Script moniker. The JavaScript scheme is for JavaScript. "Execution" and
> "URI" have the same problems as discussed above.
>
>> Section 2:
>>
>>>     Resource identifiers, including percent-encoding and requirements for
>>>     IRIs, are defined in STD 66, [RFC3986], and [RFC3987].  Source text
>>>     and the media type application/javascript are defined in [RFC4329],
>>>     the 'data' scheme in [RFC2397], and UTF-8, including the term byte
>>>     order mark, in STD 63, [RFC3629].
>> I think mentioning specification for these terms in the text of the
>> document itself rather than combining them in one section seems to be a
>> better option.  This will add clarity to the document.
> I disagree, I would find inline references for terms like "UTF-8" in the
> middle of the text rather distracting.
However, this is accepted practice.  Recently published RFCs defining URI
schemes, such as RFC 6068 (http://tools.ietf.org/html/rfc6068), RFC 6167
(http://tools.ietf.org/html/rfc6167), and RFC 6270
(http://tools.ietf.org/html/rfc6270), cite references inline in this way.
>> Section 1:
>>
>>>      The first operation, source text retrieval, defines which script code
>>>      a given 'javascript' resource identifier represents.
>> Maybe here s/defines which script code/defined how to determine the
>> script code.
> I don't see why that would be an improvement.
Because "which script code" implies that there is a finite set of valid
JavaScript programs and that a 'javascript' URI refers to a member of
that set.  However, since JavaScript programs are not limited in size,
the set is infinite.  This is mostly a question of wording.
>> Ibid:
>>
>>> <a href='javascript:doSomething()'>...</a>
>> For the purpose of an example, it would be better to mention something like
>>
>>> <a href='javascript:doSomething()'>Run doSomething</a>
> No, that would draw undue attention and I would regard this kind of link
> text a bad practise, much as using this kind of link is a bad practise
> for reasons mentioned in the document. It's what people are likely to be
> familiar with though, so that's why I use the example.
OK, let's agree on this as well.
>> Ibid:
>>
>>>      script, style sheet, HTML document, resource identifier, or other
>>>      type of resource, as appropriate for the context.
>> Considered you provide this list as an illustration of the previous
>> phrase "like an Internet media type, how to process it" I should note
>> URIs aren't classified as some media type so it's better to remove URIs
>> from this list.  Moreover, probably you meant "MIME media type", not
>> "Internet media type".  Also, informative reference to RFC 2046 is
>> probably missing.
> The term "MIME media type" is obsolete, RFC 4288 for instance just calls
> them "media types", and Internet media type is the common term to tell
> those apart from some other "media types". I do not think a reference to
> RFC 2046 would be useful here, I also don't have a reference to HTML or
> Unicode even though I mention HTML documents and use "U+002F SOLIDUS".
RFC 4288 doesn't draw a distinction between (MIME) media types and
Internet media types.  Neither does RFC 2046, which is currently on the
Standards Track.  Moreover, in some places you use "media type" while in
others "Internet media type".  I personally think "media type" is fine,
and it doesn't conflict with current RFCs.

Mykyta Yevstifeyev