[ftpext] ftpext2-typeu-02

Frank Ellermann <hmdmhdfmhdjmzdtjmzdtzktdkztdjz@gmail.com> Wed, 10 August 2011 11:26 UTC

From: Frank Ellermann <hmdmhdfmhdjmzdtjmzdtzktdkztdjz@gmail.com>
Date: Wed, 10 Aug 2011 13:26:12 +0200
Message-ID: <CAHhFybpCXj42D2SjpuwbgvPB=ZZQuZ43dcJf0r+X_0y6rT-tsA@mail.gmail.com>
To: ftpext@ietf.org
Content-Type: text/plain; charset="ISO-8859-1"
Cc: john+ietf@jck.com
Subject: [ftpext] ftpext2-typeu-02

Hi, I've read the "hash" and "typeu" drafts.  I read the former only to
look for new MD5 examples for a test suite; no luck, no problem.  Some of
the links did not work for me, but maybe they were in a section to be
removed.

For the "typeu" draft I hope that section 1.3 will NOT be removed.  If it
could be a distraction for implementors maybe move this to an appendix,
but please keep it, same idea as for the appendices in RFC 5198.

I'm surprised that TYPE became "less used" by the late 1980s, because all
troubles I ever had with FTP were caused by a missing "binary" or similar
command, when the default TYPE A turned out to be unsuited for binaries,
and from my POV unnecessary for text files -- my favourite text editor is
an XEDIT-clone and can handle any input line end or tab style.  Of course
I recall only the miserable "missing TYPE I" failures, and none of the
cases where it worked automatically (as mentioned in section 1.3).

In section 2.1 TYPE L was news to me:  So the "typecode" in FTP URIs
does not only add a somewhat obscure D, it also removes an obscure L.
RFC 959 says it stands for "LOCAL TYPE", and the draft says "logical";
is that as it should be?

The next surprise was that my recipe "always use binary" could fail with
servers that MAY reject any TYPE other than A.  RFC 959 confirms this,
but it also "recommends" (lower case in the old RFC) offering TYPE I in
3.1.1.3.  It would be clearer if the draft translated this into the
current style of "mustard" (SHOULD or RECOMMENDED).

There is no logical difference between "MAY reject" and "SHOULD support,
and MAY reject", but for readers the latter is hopefully clearer -- and
I think the "recommended" in RFC 959 would back it.

The next interesting point for me was the RFC 5198 profile in section 3.
I didn't know (back when RFC 5198 was a draft) that there would be any
"profiles"; I considered it the final word of the IETF wrt UTF-8 in
Internet protocols.  Now it turns out to be not as simple as I thought.

And I'm not sure that I understand it correctly:  Some characters are
discouraged in RFC 5198, but now permitted for FTP in the draft.  That
mostly affects the control codes in RFC 5198 section 2 clause 3, where
"SHOULD be avoided" matches "discouraged".  The exception for SPace was
fixed in an erratum, leaving CR or LF or FF as exceptions in RFC 5198.

Appendix B explains the discouraged BEL, BS, HT, VT, and FF (again).
The "bare LF" barely escapes the "SHOULD be avoided" in RFC 5198, and
I guess that 0x00..0x0C + 0x0E..0x1F + 0x7F are now on the "permitted"
side wrt the "typeu" draft.  If I got this wrong it might be only me,
or an issue in section 3 of the draft.

Apparently 0x0D (CR) is still only permitted in CRLF and still MUST NOT
occur outside of CRLF or CR+NUL, because the draft does not say that it
overrules a "MUST NOT" in RFC 5198.
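
To make my reading of the CR rule concrete, here is a minimal sketch in
Python.  It assumes the rule is exactly "CR only as part of CRLF or
CR+NUL"; the function name bare_cr_offsets is made up for illustration
and comes from neither the draft nor RFC 5198.

```python
CR, LF, NUL = 0x0D, 0x0A, 0x00

def bare_cr_offsets(data: bytes) -> list:
    """Return the offsets of CR bytes that are not followed by LF or NUL.

    Assumption: a CR at the very end of the data counts as bare,
    because nothing follows it.
    """
    offsets = []
    for i, byte in enumerate(data):
        if byte != CR:
            continue
        following = data[i + 1] if i + 1 < len(data) else None
        if following not in (LF, NUL):
            offsets.append(i)
    return offsets
```

An empty result means the data has no CR outside of CRLF or CR+NUL; any
offsets returned point at the bytes that the "MUST NOT" would catch.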

That leaves 0x80..0x9F, first listed as "SHOULD be avoided" like other
control codes in RFC 5198.  Two statements later, and still in section 2
clause 3, 0x80..0x9F are caught by a "MUST NOT appear".  That's not only
discouraged, that's verboten:  Does the "typeu" draft stick to RFC 5198
wrt C1 controls, or are they now permitted?

I'm not sure if it helps, but IIRC C1 controls have 7-bit representations
ESC+d2c(c2d(x)-128+64), e.g., CSI (0x9B) can be given as ESC+'[' (0x5B)
in some 7-bit encodings -- but that's of course ancient ISO 2022 history
and not the topic of "typeu".  Maybe permit C1 controls explicitly in
the draft, if there is no good reason to treat NEL and CSI completely
differently from BEL and ESC.
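
For readers who do not speak REXX, the ESC+d2c(c2d(x)-128+64) mapping
above can be sketched in Python as follows.  The function name
c1_to_7bit is hypothetical, and the sketch only restates the arithmetic;
it is not something the draft or RFC 5198 defines.

```python
ESC = 0x1B

def c1_to_7bit(code: int) -> bytes:
    """Map a C1 control (0x80..0x9F) to its two-byte ESC sequence.

    The second byte is the C1 code minus 128 plus 64, which lands in
    the range 0x40..0x5F ('@' through '_').
    """
    if not 0x80 <= code <= 0x9F:
        raise ValueError("not a C1 control: 0x%02X" % code)
    return bytes([ESC, code - 128 + 64])
```

With this, CSI (0x9B) becomes ESC followed by '[' (0x5B), and NEL (0x85)
becomes ESC followed by 'E' (0x45).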

RFC 5198 discusses the Telnet IAC character.  The only case where I met
this beast is in FTP: getting rid of an (erroneous) file or directory
with 0xFF in its name required sending 0xFF twice.  I had to hardcode
this oddity in <http://purl.net/xyzzy/src/ftpsynch.rex> (and the very
similar classic REXX ftpsynch.cmd), but can't tell when I last needed
this -- years ago, both REXX FTP scripts would never create such names.
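
The doubling itself is trivial; a minimal sketch in Python (not the code
from ftpsynch.rex, and with made-up function names) could look like
this.  It assumes the control connection carries no other Telnet
commands, so every 0xFF in a pathname is data and must be sent as
0xFF 0xFF.

```python
IAC = b"\xff"

def escape_iac(raw: bytes) -> bytes:
    """Double every 0xFF so that it survives as data on the wire."""
    return raw.replace(IAC, IAC + IAC)

def unescape_iac(wire: bytes) -> bytes:
    """Undo the doubling; assumes the input holds only escaped data."""
    return wire.replace(IAC + IAC, IAC)

def dele_command(name: bytes) -> bytes:
    """Build a DELE command line for a raw (unescaped) file name."""
    return b"DELE " + escape_iac(name) + b"\r\n"
```

A client would pass the raw name bytes to dele_command and write the
result to the control connection unchanged.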

If that is as it should be (= not only bad luck with an odd FTP server)
please mention it somewhere in the draft.  RFC 1123 lists it in section
3.2.6 (with a reference to RFC 854 page 13), and maybe RFC 1123 should
be among the normative references in the draft.

-Frank