Re: POP3 maildrop size and message size

Mark Crispin <MRC@panda.com> Wed, 25 May 1994 21:18 UTC

Date: Wed, 25 May 1994 13:45:52 -0700
Sender: ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Mark Crispin <MRC@panda.com>
X-Orig-Sender: Mark Crispin <mrc@ikkoku-kan.panda.com>
Subject: Re: POP3 maildrop size and message size
To: Matt Madison <MADISON@tgv.com>
Cc: ietf-pop3@andrew.cmu.edu
In-Reply-To: <940525130657.60802877@TGV.COM>
Message-Id: <MailManager.769898752.3148.mrc@Ikkoku-Kan.Panda.COM>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; CHARSET="US-ASCII"

On Wed, 25 May 1994 13:06:57 -0700 (PDT), Matt Madison wrote:
> John Gardiner Myers <jgm+@CMU.EDU> writes:
> >I think the current specs quite clearly define the octet sizes as
> >being exact values.
> On the contrary, they do not do so "clearly".  Exactness is implied,
> but "size" is not precisely defined. Is it simply the size as stored on the
> server, or is it supposed to be the size as transferred to the client?
> Does it include the CRLF's between lines, or not?  How about the extra
> dots tacked onto the front of lines beginning with dots?

The only definition of size which makes sense is the size of the message in
RFC822 format.  This includes the CRLFs between lines, but not transmission
artifacts such as the dots in front of lines beginning with dot.  It is not
the size of the message as it resides on the server or as it may reside on the
client.
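To make the definition concrete, here is a minimal sketch (not from the original thread) that computes the RFC822 size of a message held with UNIX-style line endings and shows that the dot-stuffed form sent on the wire is larger. The function names rfc822_size, to_crlf and dot_stuff are hypothetical, and the sketch assumes the stored copy uses bare LF (or already CRLF) line endings.

```python
import re


def rfc822_size(stored: bytes) -> int:
    """Size in octets of the message in RFC822 (CRLF) form.

    Assumption: `stored` ends lines with bare LF or with CRLF.
    Every bare LF grows by one octet when it becomes CRLF.
    """
    bare_lf = stored.count(b"\n") - stored.count(b"\r\n")
    return len(stored) + bare_lf


def to_crlf(stored: bytes) -> bytes:
    """Convert bare LF line endings to CRLF, leaving existing CRLFs alone."""
    return re.sub(rb"(?<!\r)\n", b"\r\n", stored)


def dot_stuff(msg_crlf: bytes) -> bytes:
    """Prefix an extra dot to every line that begins with a dot.

    This is the transmission artifact; it is not part of the message size.
    """
    lines = msg_crlf.split(b"\r\n")
    return b"\r\n".join(b"." + ln if ln.startswith(b".") else ln for ln in lines)


stored = b"Subject: hi\n\n.hidden\nbody\n"   # 26 octets as stored with LF
size = rfc822_size(stored)                    # 26 + 4 line endings = 30
wire = dot_stuff(to_crlf(stored))             # one stuffed dot: 31 octets
```

The size reported to the client is `size`, not `len(stored)` and not `len(wire)`.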

> What I'm getting at is some leniency in the definition.  The main reason
> for this is to alleviate what could be substantial, unnecessary, overhead
> on the server end of things

``Efficiency'' is a truly horrible excuse for doing things wrong.  I could
spend hours telling horror stories about mailers (and the authors of such
mailers) which send bare LFs instead of CRLF because it is ``more efficient
that way.''  Interoperability has certain dictates that cannot be tossed
aside on the false notion of ``efficiency''.

>     1. Not all operating systems maintain file sizes in bytes.

This is true, but usually the RFC822 message size in octets can be calculated
fairly easily.

>     2. Not all maildrops are structured such that messages are stored
>        in the same form as they will be transmitted -- for example,
>        the local operating system might use just linefeeds to end
>        lines, rather than CRLF's.

This is a fact of life that those of us stuck on UNIX have to deal with on a
daily basis.  Yet somehow we manage.  c-client software deals exclusively with
CRLF-format strings internally.

> And to be a little more forward-thinking, what about message stores capable
> of handling multimedia and multipart messages stored in a binary format?
> Isn't it a little much to ask that POP servers built on these sorts of
> message stores encode an entire maildrop in base64 (or whatever), just
> to get the size?

You are postulating a future message store for the Internet which fails to
engineer in a capability to calculate an RFC822 size.  This is a strawman
argument.  It is perfectly reasonable to mandate a requirement for future
designs.

It isn't as if it is rocket science to build such a message store which has
the capabilities you outline and yet still has the ability to derive the
RFC822 size trivially.
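One way such a store could keep the size cheap to report is to compute it once at delivery time and cache it next to the message. The following is only a sketch under that assumption: the Maildrop and StoredMessage classes are hypothetical, and an LF-terminated text body stands in for whatever internal format a real store would use.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class StoredMessage:
    body: bytes        # internal representation (here: LF line endings)
    rfc822_size: int   # octets in RFC822 form, computed once at delivery


class Maildrop:
    """Hypothetical message store that caches the RFC822 size per message."""

    def __init__(self) -> None:
        self._messages: List[StoredMessage] = []

    def deliver(self, raw: bytes) -> None:
        # Count the octets the message would occupy with CRLF line endings.
        bare_lf = raw.count(b"\n") - raw.count(b"\r\n")
        self._messages.append(StoredMessage(raw, len(raw) + bare_lf))

    def stat(self) -> Tuple[int, int]:
        """Message count and total size, as a STAT reply would report them."""
        return len(self._messages), sum(m.rfc822_size for m in self._messages)

    def list_sizes(self) -> List[Tuple[int, int]]:
        """(message number, size) pairs, as a LIST reply would report them."""
        return [(i, m.rfc822_size) for i, m in enumerate(self._messages, start=1)]


md = Maildrop()
md.deliver(b"a\nb\n")   # 4 octets stored, 6 in RFC822 form
md.deliver(b"xyz\n")    # 4 octets stored, 5 in RFC822 form
```

With the size cached, answering STAT or LIST never requires re-reading or re-encoding the maildrop.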

> After all, the exact size isn't needed for the purposes of the protocol -
> a retrieved message is terminated with CRLF.CRLF.

This is BAD BAD BAD news for clients.  If you don't know the size in advance,
it is difficult to allocate resources for the data.  You end up having to have
a bigbuf (whether disk or memory based) to read in the data.  The consequence
of having an inadequate bigbuf is that either your application crashes or you
have to undergo an expensive reallocation operation.
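A rough sketch of what an exact size buys the client: the buffer is allocated once, up front, and the dot-stuffed lines are unstuffed straight into it. The function name retrieve_into_prealloc is hypothetical, and an in-memory stream stands in for the server connection.

```python
import io


def retrieve_into_prealloc(stream, announced_size: int) -> bytes:
    """Read a dot-stuffed, CRLF.CRLF-terminated message into a buffer
    allocated once from the size the server announced.

    Assumption: `stream` is a binary file-like object with readline().
    """
    buf = bytearray(announced_size)   # single allocation, no regrowth
    pos = 0
    while True:
        line = stream.readline()
        if not line:
            raise EOFError("connection closed before terminating dot")
        if line == b".\r\n":          # end-of-message marker
            break
        if line.startswith(b"."):     # undo dot-stuffing
            line = line[1:]
        end = pos + len(line)
        if end > len(buf):
            # An inexact size leaves the client with no good option here.
            raise ValueError("message larger than the announced size")
        buf[pos:end] = line
        pos = end
    return bytes(buf[:pos])


wire = b"Subject: hi\r\n\r\n..hidden\r\nbody\r\n.\r\n"
message = retrieve_into_prealloc(io.BytesIO(wire), 30)
```

If the announced size were only approximate, the client would have to over-allocate, reallocate, or fail, which is the situation described above.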

> If a client might need the exact size of a message, how about adding a
> SIZE command to the protocol -- with appropriate definition of "size"
> and appropriate admonitions about the possible overhead the use of
> the command could incur?

So, there would be two sizes in POP, the true size and the false size (hmm,
gotta add a random number generator to ipop3d).  That's fine by me.  But it
could be argued that I have a vested interest in seeing POP become crufty and
laden with kludges. ;-)