Re: [RFC 959] FTP in ASCII mode

"Sandeep Srivastava" <sandeep.kumar.srivastava@gmail.com> Tue, 21 February 2006 07:23 UTC

Message-ID: <802d52a30602202323o27715984lecfe7d8fa835565f@mail.gmail.com>
Date: Tue, 21 Feb 2006 12:53:32 +0530
From: "Sandeep Srivastava" <sandeep.kumar.srivastava@gmail.com>
To: ietf@ietf.org
In-Reply-To: <43FA82F1.9010401@necom830.hpcl.titech.ac.jp>
MIME-Version: 1.0
References: <43F9ED6A.5090901@peter-dambier.de> <30C473BA61A6E355540B54DE@p3.JCK.COM> <43FA82F1.9010401@necom830.hpcl.titech.ac.jp>
Subject: Re: [RFC 959] FTP in ASCII mode

First of all, thanks to everybody for the responses.

I knew that an FTP transfer in ASCII mode does EOL and EOF conversions based
on the OS of the receiving system, and I very much expected my UTF-8 encoded
file to get garbled when I FTPed it in ASCII mode. But guess what, it was
not garbled on the receiving system. Maybe I was lucky, or maybe it's because
UTF-8 is backward compatible with ASCII. But then, as ASCII is purely
7-bit, FTP in ASCII mode should have corrupted the UTF-8 encoded file,
because UTF-8 uses all 8 bits.
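
One way to see why the file could survive is to look at the bytes. The
sketch below is only an illustration: the three helper functions are
hypothetical stand-ins for what an ASCII-mode transfer might do, not code from
any real FTP client or server. It assumes a Unix-like receiver (LF line
endings) and compares an 8-bit-clean transfer, which only rewrites line
endings, with a strictly 7-bit path that clears the top bit of every byte.

```python
# Sketch only: these helpers imitate what an ASCII-mode transfer might do;
# they are not taken from any real FTP implementation.

def to_network_eol(payload: bytes) -> bytes:
    """Sender side: normalise line endings to CRLF for the wire."""
    return payload.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")

def to_local_eol(payload: bytes) -> bytes:
    """Receiver side on a Unix-like host: CRLF back to LF."""
    return payload.replace(b"\r\n", b"\n")

def strip_high_bit(payload: bytes) -> bytes:
    """What a strictly 7-bit path would do: clear the top bit of each byte."""
    return bytes(b & 0x7F for b in payload)

text = "नमस्ते\nFTP in ASCII mode\n"   # Devanagari plus ASCII, LF line endings
data = text.encode("utf-8")

# In UTF-8, every byte of a non-ASCII character is >= 0x80, so the byte
# values 0x0D (CR) and 0x0A (LF) can only ever mean a real CR or LF.
devanagari_bytes = "नमस्ते".encode("utf-8")
print(all(b >= 0x80 for b in devanagari_bytes))    # True

# An 8-bit-clean transfer that only touches line endings is lossless here.
print(to_local_eol(to_network_eol(data)) == data)  # True

# A path that really enforced 7 bits would destroy the non-ASCII characters.
print(strip_high_bit(data) == data)                # False
```

So whether a UTF-8 file survives ASCII mode seems to depend on whether the
implementations involved pass 8-bit bytes through untouched, which the
standard does not promise.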

Moreover, in the ASCII code page, code point 13=CR and code point 10=LF, but
that might not be the case in every other code page. Hence the EOL
conversion (in FTP ASCII mode) might corrupt a text file if it is encoded
using a non-ASCII encoding. And what about handling the Unicode newline
characters? Anyway...
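
A few concrete cases, as a sketch using Python's built-in codecs. The choice
of cp037 (an EBCDIC code page) and UTF-16 as examples is mine, and the
`replace` call is only a stand-in for a byte-wise CRLF-to-LF conversion:

```python
# In an EBCDIC code page such as cp037, LF is not byte 10.
print("\n".encode("cp037") == b"\x25")   # True: LF is 0x25 there
print("\r".encode("cp037") == b"\x0d")   # True: CR happens to stay 0x0D

# In UTF-16, the bytes 0x0D 0x0A can be a single letter rather than CRLF:
# U+0D0A (a Malayalam letter) is exactly those two bytes in big-endian order.
letter = "\u0d0a"
encoded = letter.encode("utf-16-be")
print(encoded == b"\r\n")                # True

# A byte-wise CRLF -> LF conversion then eats half of the character.
converted = encoded.replace(b"\r\n", b"\n")
try:
    converted.decode("utf-16-be")
    decodes_cleanly = True
except UnicodeDecodeError:
    decodes_cleanly = False
print(decodes_cleanly)                   # False

# Unicode also defines line separators that are neither CR nor LF.
# Python's str.splitlines() treats U+0085 and U+2028 as line boundaries,
# but a byte-level CR/LF conversion never sees them.
lines = "a\u2028b\u0085c".splitlines()
print(lines)                             # ['a', 'b', 'c']
```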

After reading all the wonderful replies, my conclusion is that, even though my
FTP client/server handled the UTF-8 encoded text file (which BTW contained
Devanagari characters) correctly, a text file in an encoding other than ASCII
runs a risk of being corrupted when FTPed in ASCII mode. Therefore, use ASCII
mode only to transfer ASCII encoded files, and Binary mode to transfer
non-ASCII encoded files.
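
That rule of thumb could be sketched with Python's standard `ftplib`, whose
`storlines` sends the file in ASCII mode (TYPE A) and whose `storbinary` sends
it in binary mode (TYPE I). The helper names `is_pure_ascii` and `upload`, and
the host and file names in the usage comment, are made up for illustration:

```python
import ftplib
from pathlib import Path

def is_pure_ascii(path: Path) -> bool:
    """True if every byte of the file is in the 7-bit ASCII range."""
    return path.read_bytes().isascii()

def upload(ftp: ftplib.FTP, path: Path, remote_name: str) -> str:
    """Upload in ASCII mode only when the file is pure ASCII.

    Returns the mode that was used, "ascii" or "binary".
    """
    pure_ascii = is_pure_ascii(path)
    with path.open("rb") as fp:
        if pure_ascii:
            # TYPE A: line endings are converted during the transfer.
            ftp.storlines(f"STOR {remote_name}", fp)
            return "ascii"
        # TYPE I: bytes are sent unchanged, so any encoding survives.
        ftp.storbinary(f"STOR {remote_name}", fp)
        return "binary"

# Usage (hypothetical server and file names):
#   with ftplib.FTP("ftp.example.org") as ftp:
#       ftp.login()
#       upload(ftp, Path("notes.txt"), "notes.txt")
```

The price of binary mode is that line endings are no longer converted, so the
receiver gets whatever EOL convention the sender's file used.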

I was wondering why there isn't something like a "Text" mode for FTPing text
files, which could handle text files in any encoding while the FTP
client/server still does the EOL and EOF conversions properly?

Thanks,
Sandeep.


On 2/21/06, Masataka Ohta <mohta@necom830.hpcl.titech.ac.jp> wrote:
>
> John C Klensin wrote:
>
> > Sandeep's question raises another interesting issue.  I just
> > went back and reread RFC 2640.   It does not seem to address the
> > "TYPE A" issue at all.  It does say (Section 2, paragraph 1)
> > "Clients and servers are, however, under no obligation to
> > perform any conversion on the contents of a file for operations
> > such as STOR or RETR", which I would take to imply that it
> > anticipates I18N FTP operations to be entirely binary ("TYPE I")
> > although that is not explicit.
>
> As for Japanese processing, UTF-8 is not visible by users and on
> the network, because UTF-8 is not only useless but also harmful.
>
> Instead, ISO-2022-JP, ShiftJIS and EUC are the major character sets.
> Some ftp implementations do assume (sometimes depending on environment
> variables) network character code ShiftJIS or EUC and perform appropriate
> conversions, which garbles UTF-8.
>
> On the other hand, if you use ISO-2022-JP, which is 7 bit pure and ASCII
> compatible (in a sense, it is pure ASCII), we can safely use ASCII mode
> of vanilla ftp and there is no confusion as long as we are in ASCII
> environment.
>
> Similar encoding can be profiled using ISO 2022 to obtain a fully
> internationalized, 7 bit pure, ASCII compatible character encoding.
>
> The only problem for RFC 2640 was that it does not need MIME for
> charset and 8bit extension that it makes it clear that MIME is
> useless.
>
> Note that long term state maintenance of full ISO 2022 is not
> more complex than that of UTF-8. Note also that, carefully profiled
> ISO 2022, such as ISO-2022-JP, requires state maintenance a lot
> simpler than that of UTF-8.
>
> > Whether the characters in use are UTF-8 or not, we've still got
> > that issue with line-endings.
>
> Line-ending issues of any ISO 2022 based encoding are just as simple
> as those of ASCII.
>
>                                                         Masataka Ohta
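
The "7 bit pure" property of ISO-2022-JP mentioned in the quoted message can
be checked with Python's built-in `iso2022_jp` codec. This is only a sketch,
and the sample string is my own choice:

```python
text = "日本語"

jis = text.encode("iso2022_jp")
utf8 = text.encode("utf-8")

# ISO-2022-JP switches character sets with escape sequences and encodes
# each kanji as two bytes in the printable ASCII range, so no byte has
# its top bit set.
print(jis.isascii())    # True
print(utf8.isascii())   # False

# The escape byte (0x1B) marks the switch into and out of the kanji set;
# this is the state a decoder has to track.
print(b"\x1b" in jis)   # True

# No CR or LF byte appears inside the encoded kanji, so a byte-wise EOL
# conversion has nothing to mistake for a line ending.
print(0x0D in jis or 0x0A in jis)   # False
```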