Re: [ftpext] draft-peterson-streamlined-ftp-command-extensions

Damin <damin@emuxperts.net> Wed, 24 November 2010 05:20 UTC

Hi,
I knew this THMB command would be a pain as soon as I saw it.

I think the real problem we're having with it is that it's TOO versatile. Do
we really need to let the client specify whatever options it wants?
An obvious way to abuse this would be to send THMB PNG 100 100, then THMB PNG
100 101, THMB PNG 101 100, and so on, each one forcing a fresh render.

Perhaps it would be better to simply limit the versatility: the server could
advertise a fixed set of valid sizes and formats (somehow...).
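
One rough way to picture this: the server keeps a short list of presets and
maps any request onto the closest one. This is only a sketch. The preset
list, the snap_to_preset name and the "closest" rule are my own assumptions;
nothing in the draft defines them.

```python
# Hypothetical presets; a real server would load these from its configuration.
PRESETS = [("JPG", 100, 100), ("JPG", 250, 250), ("JPG", 500, 500)]


def snap_to_preset(fmt, width, height, presets=PRESETS):
    """Map an arbitrary request onto the closest server-approved preset."""
    fmt = fmt.upper()
    # Prefer presets in the requested format; fall back to all presets otherwise.
    candidates = [p for p in presets if p[0] == fmt] or list(presets)
    # "Closest" here means the smallest total difference in width and height.
    return min(candidates, key=lambda p: abs(p[1] - width) + abs(p[2] - height))
```

With something like that in place, THMB PNG 100 100, 100 101 and 101 100 all
land on the same thumbnail, so the abuse above costs the server one render.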

In the practical world, we don't really care whether the client gets exactly
the size it wants! The point is to give a preview of an image. Say we
configure a server to support ONLY 500x500 in JPG format. Such an image is
fairly small, easy to transfer, and easy to render. If need be, the client
can (very quickly, and using its own resources) resize the image for actual
viewing.
This is a win-win for both the client and the server. The client gets a
speedy preview, and the server can minimise abuse.

I might also point out that there could be a big problem with this command.
How does the client know the aspect ratio of the original image? It can't
really specify an appropriate thumb size without that information (stretchy
images, anyone?). I think this can be solved with some appropriate commands.
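
To make the aspect-ratio point concrete: if the requested width and height
are treated as a bounding box rather than an exact size, whoever has the
original (most sensibly the server) can work out the real thumbnail
dimensions. A minimal sketch under that assumption; fit_within is a made-up
name, not something from the draft:

```python
def fit_within(orig_w, orig_h, max_w, max_h):
    """Largest size with the original aspect ratio that fits in max_w x max_h."""
    if min(orig_w, orig_h, max_w, max_h) <= 0:
        raise ValueError("all dimensions must be positive")
    # Pick the smaller of the two scale factors so neither side overflows.
    # Comparing max_w/orig_w with max_h/orig_h by cross-multiplication keeps
    # the arithmetic in integers.
    if max_w * orig_h <= max_h * orig_w:
        # Width is the limiting side.
        return max_w, max(1, max_w * orig_h // orig_w)
    # Height is the limiting side.
    return max(1, max_h * orig_w // orig_h), max_h
```

Note that this sketch will also scale a small original up to the box; a real
implementation would probably want to cap the result at the original size.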

On 23 November 2010 23:26, Robert McMurray <robmcm@microsoft.com> wrote:

> Thanks, Mark.
>
> > Mark wrote:
> >
> > That's the point of the command,
> > though.  It's to reduce the amount of
> > data that is required to travel over
> > the wire.  The idea is to shorten
> > delivery time to the client, letting
> > the client decide what is best for
> > the client, not the server.
>
> I understood that to be the case, but looking back to Anthony's original
> question - he had asked which commands were more valuable to me than others.
> With that in mind, when I considered the THMB command from a server
> implementation point-of-view as it appears in the current draft, I thought
> that the current proposal was a little impractical, which would make me lean
> away from implementing it.
>
> Here's a real-world scenario that illustrates why I think that the current
> THMB proposal breaks down quickly in a normal use scenario:
>
> Let's say that I have a friend who works in astrophotography, and let's say
> that he's doing his doctoral work on the Crab Nebula. He books some time at
> one of the national observatories and takes several series of images of the
> Crab Nebula that are stitched together into 50 large composite images.
> (These have very large dimensions, 100K by 100K pixels, and each is over
> 100MB in size, which makes them perfect candidates for thumbnails.) When he
> publishes his thesis for peer review, he drops the full-sized images in a
> directory on an FTP site that I manage where his colleagues can access them.
> Let's say that his thesis garners a modicum of interest on the day of its
> publication for review - perhaps just 10 users. Each of these users has an
> FTP client that supports the THMB command as it is currently designed, and
> each FTP client can use varying thumbnail dimensions. So here's what
> happens:
>
> Client 1 has the following transaction:
>
> Client01> THMB JPG 100 100 Crab_Nebula_01.jpg
> Server00> 150 Starting thumbnail transfer for Crab_Nebula_01.jpg
> -         //Note: 5K bytes are transferred over the data channel//
> Server00> 226 Transfer complete.
>
> Since the thumbnail was generated dynamically based on the client's input,
> one of my server's CPUs spikes while it loads the 100MB image into RAM,
> renders the thumbnail in the user-requested image format and dimensions, and
> dumps the image on the wire. (Because a client can request any image
> dimensions and format, my server implementation would probably not cache the
> thumbnail in memory, but I'll come back to that.) Still, transferring only
> 5K over the wire is a lot better than transferring 100MB, so we're in
> agreement that reducing the bandwidth is a good thing.
>
> But I have 9 other clients that are hitting my FTP site and they issue FTP
> thumbnail requests like the following examples:
>
> Client02> THMB PNG 100 100 Crab_Nebula_01.jpg
> Client03> THMB GIF 100 100 Crab_Nebula_01.jpg
> Client04> THMB JPG 200 200 Crab_Nebula_01.jpg
> Client05> THMB PNG 200 200 Crab_Nebula_01.jpg
> Client06> THMB GIF 150 150 Crab_Nebula_01.jpg
> Client07> THMB PNG 640 480 Crab_Nebula_01.jpg
> Client08> THMB GIF 320 240 Crab_Nebula_01.jpg
> Client09> THMB JPG 150 150 Crab_Nebula_01.jpg
> Client10> THMB JPG 320 240 Crab_Nebula_01.jpg
>
> So my server has received 10 completely dissimilar thumbnail requests, and
> all of those requests are just for the first physical image. If each FTP
> client attempts to download just 20 thumbnails in order to show the first
> set of thumbnails in each graphical FTP client, I have to dynamically
> generate 200 different thumbnails. Since these are all 100MB images, I am
> reducing my bandwidth at the expense of the CPU and RAM resources that are
> required to process the THMB requests.
>
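
Just to put numbers on the example above (a back-of-the-envelope sketch of my
own; renders_needed is a made-up name): every distinct format/size
combination is a separate render, even if the server caches perfectly.

```python
# The ten requests from the example above, as (format, width, height) tuples.
requests = [
    ("JPG", 100, 100), ("PNG", 100, 100), ("GIF", 100, 100),
    ("JPG", 200, 200), ("PNG", 200, 200), ("GIF", 150, 150),
    ("PNG", 640, 480), ("GIF", 320, 240), ("JPG", 150, 150),
    ("JPG", 320, 240),
]


def renders_needed(requests, images_per_client):
    """Thumbnails to generate if the server renders once per distinct variant."""
    distinct_variants = len(set(requests))
    return distinct_variants * images_per_client


total = renders_needed(requests, images_per_client=20)  # 10 variants * 20 images
```

If all ten clients were held to a single format and size, the same
calculation would give 20 renders instead of 200.
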
> I mentioned earlier that I would probably not cache the thumbnails in RAM
> because the client can request any thumbnail dimensions, so let's say that
> my server implementation at least caches thumbnails to disk so I don't have
> to generate them dynamically for every request. Since each FTP client can
> request multiple image formats and any image dimensions, I may have to keep
> track of hundreds of thumbnails for each single physical image. That will
> quickly start to eat up disk space, so now I have to implement some form of
> garbage collection to clean up stale thumbnails. But even then, since my
> friend has 50 physical images and I configure my FTP server implementation
> to only keep the 10 most-recent thumbnails around, I'm still managing 500
> thumbnails for his original 50 physical images.
>
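
For what it's worth, the "keep only the 10 most-recent thumbnails per image"
bookkeeping might look roughly like this. It is an in-memory sketch only;
the ThumbCache class is hypothetical, and a real server would be tracking
files on disk rather than bytes in a dict.

```python
from collections import OrderedDict


class ThumbCache:
    """Keep at most `per_image` most-recently-used variants for each image."""

    def __init__(self, per_image=10):
        self.per_image = per_image
        self.entries = {}  # image name -> OrderedDict of variant -> thumbnail

    def put(self, image, variant, data):
        variants = self.entries.setdefault(image, OrderedDict())
        variants[variant] = data
        variants.move_to_end(variant)        # mark as most recently used
        while len(variants) > self.per_image:
            variants.popitem(last=False)     # evict the least recently used

    def get(self, image, variant):
        variants = self.entries.get(image)
        if variants is None or variant not in variants:
            return None
        variants.move_to_end(variant)        # a hit also counts as a use
        return variants[variant]

    def total(self):
        """Number of thumbnails currently held across all images."""
        return sum(len(v) for v in self.entries.values())
```

Even with the eviction, 50 images at 10 variants each still leaves 500
thumbnails to manage, which is the point being made above.
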
> All in all I find the approach for letting only the client have full
> control very impractical, and that would prevent me from wanting to
> implement the THMB command as it is currently documented in the draft. But
> that being said, I like the idea of having a thumbnail command, just not the
> way that it's currently proposed. That's why I was suggesting some
> alternatives, and I'll expound a little on that.
>
> In my last email I had suggested letting the server be a little more in
> control, and I had suggested that the server could optionally tell an FTP
> client that the client can't specify the image format. If I use the
> astrophotography scenario that I just gave, that means that I could have my
> friend pre-create the 50 thumbnails for the 50 physical images in some
> fashion where the FTP server implementation would pick them up. (I could use
> shadow folders, unique thumbnail naming, etc.) This means that there is no
> spike in CPU or RAM when the 10 FTP clients issue their THMB requests; it
> also means that I'm only managing 50 static thumbnails. So now the requests
> could be as simple as the following:
>
> Client01> THMB Crab_Nebula_01.jpg
> Server00> 150 Starting thumbnail transfer for Crab_Nebula_01.jpg
> Server00> 226 Transfer complete.
> Client02> THMB Crab_Nebula_01.jpg
> Server00> 150 Starting thumbnail transfer for Crab_Nebula_01.jpg
> Server00> 226 Transfer complete.
> ... etc ...
>
> The reason why I had proposed using the OPTS command was to give the client
> the level of control that you mentioned in your reply, e.g. "letting the
> client decide what is best for the client, not the server." If you were to
> combine your client request concepts with elements of my OPTS-based
> suggestion, you could create a hybrid of the two approaches that might
> address all concerns. Here are some examples:
>
> In this example, the client simply asks for the server's current thumbnail
> configuration:
> C> OPTS THMB
> S> 200 JPG 100 100
>
> In this example, the client specifies a new thumbnail format:
> C> OPTS THMB PNG
> S> 200 PNG 100 100
>
> In this example, the client specifies a new thumbnail format and
> dimensions:
> C> OPTS THMB GIF 150 150
> S> 200 GIF 150 150
>
> I think that something more of a hybrid approach works better - the client
> can ask for a file format and dimensions, but the server can still say "no"
> to a client-specified file format and dimensions when it wants to, but still
> return a thumbnail when the client asks for it.
>
> C> OPTS THMB PNG 100 100
> S> 504 Specifying thumbnail properties is unsupported.
>
> C> THMB widget.png
> S> 150 JPG Starting thumbnail transfer for widget.png
> S> 226 Transfer complete.
>
> This makes it easier for the server implementation to have a pre-cached
> collection of thumbnails, especially when given the realistic scenario that
> I listed above. But if you omit the OPTS command and stick with using a
> single THMB command, when an FTP client sends a request that your server
> doesn't want to fulfill for some reason, your only recourse is to fail the
> whole request. Whereas, if you break the process into separate OPTS and THMB
> commands, you can fulfill a THMB request even if you reject the custom
> parameters that the client had requested with an OPTS command.
>
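
To check that I follow the OPTS/THMB split, here is a toy model of the
exchange. The reply texts for 200, 504, 150 and 226 are copied from the
examples above; the ThumbSession class, the use of 501 for malformed input,
and letting a bare "OPTS THMB" query succeed even when changes are refused
are my own assumptions. The data-channel transfer is left out entirely.

```python
SYNTAX_ERROR = "501 Syntax error in parameters or arguments."


class ThumbSession:
    """Toy model of the OPTS THMB / THMB exchange; not a real FTP server."""

    def __init__(self, default=("JPG", 100, 100), allow_opts=True):
        self.current = default        # (format, width, height) in effect
        self.allow_opts = allow_opts  # may the client change the settings?

    def handle(self, line):
        command, _, argument = line.strip().partition(" ")
        command = command.upper()
        if command == "OPTS":
            words = argument.split()
            if words and words[0].upper() == "THMB":
                return self._opts(words[1:])
        elif command == "THMB" and argument:
            # The thumbnail is always served with the settings in effect.
            return ["150 %s Starting thumbnail transfer for %s"
                    % (self.current[0], argument),
                    "226 Transfer complete."]
        return [SYNTAX_ERROR]

    def _opts(self, args):
        if not args:
            return ["200 %s %d %d" % self.current]  # query only
        if not self.allow_opts:
            return ["504 Specifying thumbnail properties is unsupported."]
        fmt, width, height = self.current
        try:
            if len(args) == 1:
                fmt = args[0].upper()
            elif len(args) == 3:
                fmt, width, height = args[0].upper(), int(args[1]), int(args[2])
            else:
                raise ValueError("expected a format, or format width height")
            if width <= 0 or height <= 0:
                raise ValueError("dimensions must be positive")
        except ValueError:
            return [SYNTAX_ERROR]
        self.current = (fmt, width, height)
        return ["200 %s %d %d" % self.current]
```

The useful property is that a 504 on OPTS never blocks the THMB that follows
it: the client still gets a thumbnail, just in the server's chosen format.
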
> > Mark wrote:
> >
> > This could waste server-side
> > resources, the client might not even
> > make a request for thumbnail images.
>
> This is true, but using the astrophotography scenario once again, I'd
> rather have 50 5K thumbnail files eating up a tiny fraction of disk space
> (which is a dirt cheap resource) rather than trying to generate dynamic
> thumbnails for 50 100MB images and eating CPU and RAM (which are expensive
> resources).
>
> If my server implementation didn't pre-cache thumbnails, and I configured
> my server to only allow JPG format thumbnails at 100x100 pixels, it would
> still be possible to implement some form of in-memory or dynamic to disk
> caching for subsequent requests, because now the list of variables has been
> reduced. For example:
>
> Client01> THMB Crab_Nebula_01.jpg
> -        //Note: the thumbnail was not//
> -        //pre-cached, so the server  //
> -        //creates it dynamically and //
> -        //caches it to disk after it //
> -        //sends the thumbnail to the //
> -        //client                     //
> Client02> THMB Crab_Nebula_01.jpg
> -        //Note: the thumbnail was    //
> -        //cached during the previous //
> -        //request, so the server can //
> -        //send the thumbnail to the  //
> -        //client with no additional  //
> -        //processing required        //
> Client03> THMB Crab_Nebula_01.jpg
> ... etc ...
> Client10> THMB Crab_Nebula_01.jpg
>
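
With the format and size fixed by the server, the cache key shrinks to just
the file name, so the "render on first request, reuse afterwards" flow above
is nearly trivial. A sketch, with a hypothetical FixedSizeThumbs class and
the actual rendering passed in as a callable:

```python
class FixedSizeThumbs:
    """Render each image once at the server's fixed settings, then reuse it."""

    def __init__(self, render):
        self.render = render   # callable: image name -> thumbnail bytes
        self.cache = {}        # image name -> thumbnail bytes
        self.render_count = 0  # how many times the expensive step actually ran

    def thumbnail(self, image):
        if image not in self.cache:
            # First request for this image: pay the CPU/RAM cost once.
            self.render_count += 1
            self.cache[image] = self.render(image)
        return self.cache[image]
```

Ten clients asking for Crab_Nebula_01.jpg then cost one render and nine
cache hits.
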
> > Mark wrote:
> >
> > This isn't really the purpose of this
> > command.  The purpose is to take an
> > original image file and reduce it to
> > a specified size, not to return the
> > OS's icon representation for that
> > particular file type.
>
> Perhaps that was not the originally-intended purpose for this command, but
> then I think that you're limiting the usefulness of the command. My
> suggestion to return an icon may not have been the best example, but
> limiting the THMB command to just image files is not very useful, since
> there are other files that would yield beneficial results. For example,
> video files are typically much larger than images, so why shouldn't video
> files be able to have unique thumbnails? What about thumbnails for new image
> types that are introduced later, like SVG files? I would say that it's
> certainly possible that any server implementation could refuse to send a
> thumbnail for any file that it chooses, but why shouldn't a server
> implementation be able to return a thumbnail for any file? I am simply
> suggesting that limiting the functionality of the THMB command to just
> images reduces the overall value of the command.
>
> All that being said, as I stated earlier, I like the idea of a THMB
> command, but at the moment I'm not fond of the current proposal. When I
> consider that implementers of FTP clients might read about the THMB command
> in an RFC and start creating FTP clients that can issue requests for
> thumbnails in any number of image formats and pixel dimensions, I start to
> back away from this command pretty quickly.
>
> I admit that it would be different if I implemented my own graphical FTP
> client and my own FTP server, because I could control the THMB
> interoperability in a way of my choosing - for example, I could define a
> specific set of thumbnail dimensions and only use JPG format. But since I
> would only be implementing an FTP server, I don't like the way the odds are
> stacked. (In some ways this is like writing a function to strip whitespace
> from text files - if you generate all of your own text files, then you only
> have to anticipate what you've defined as whitespace. But when you're
> stripping whitespace from someone else's text files, you have to anticipate
> 0x20, 0x09, 0xA0, multiple character sets, what to do when you get character
> codes that you don't recognize, etc.)
>
> Just the same, I'd like to see some form of thumbnail functionality if
> possible, because the scenario about astrophotography wasn't hypothetical -
> I actually have a friend that works with astrophotography who generates
> those types of huge image files; having an effective method to retrieve
> thumbnails for those large images would be great. But I don't think that the
> current draft offers an example of an effective method when I consider
> having to create dynamically-generated thumbnails for a near-infinite number
> of possible client request parameters.
>
> Thanks again!
>
> Robert McMurray
> robmcm@microsoft.com