Comments on draft IAFA doc

Hank Nussbacher <HANK@vm.biu.ac.il> Mon, 25 May 1992 00:50 UTC
Date: Sun, 24 May 1992 06:11:50 -0400
From: Hank Nussbacher <HANK@vm.biu.ac.il>
Subject: Comments on draft IAFA doc
To: IAFA@cc.mcgill.ca
Message-Id: <92May24.190201edt.8794@ugw.utcs.utoronto.ca>
I am new to this list so please excuse any comments I make that have already
been discussed on this list.

>
>
>Mar  2 22:54 1992   Page 1
>
>
>	           	   IAFA-WG
>		Guide to FTP Site Administration
>
>		        DRAFT 92.03.02
>
>
>Introduction
>------------
>
>As the growth of the Internet continues it is now fair to speak of an
>"Internet infostructure". Companion to the extensive physical
>infrastructure that is itself responsible for the specific routing and
>delivery of messages, the Internet infostructure is comprised of that
>growing body of information and the structure that supports it.  Much of
>this infostructure is available equally to all users. Other portions are
>available to anyone participating in specific network-based workgroups.
>The Internet acts as an enabling technology that makes available this
>wealth of information to those who know how to access it.

In the introduction you should state why an RFC for IAFA is needed.
You state that the information is available equally to all users, but
you fail to mention that "many users, especially novice users, don't
know where to look for such information.  This RFC will attempt to
provide one means of assisting users in finding various publicly
accessible files scattered throughout the Internet."

>
>In this document we concentrate on the remote file transfer model
>for sharing information in an Internet environment. On the
>Internet, this model is primarily implemented using the File
>Transfer Protocol (FTP) [1]. Available at most
>sites, an FTP service provides users with a secure and reliable
>mechanism to copy specific files from one host to another across
>the network.
>
>In particular, we aim to provide information to anyone
>contemplating setting up or maintaining an Internet information
>archive using the facilities of FTP. A companion document will
>cover the use and operation of FTP archives from the user's
>perspective.
>
>Users of FTP  normally must go through a login sequence when
>connecting to a foreign host. They are then allowed to copy those
>files to which they have been granted access permission. The login
>sequence provides basic authentication and security in an open
>systems environment with hundreds of thousands of interconnected
>hosts and millions of users. The underlying FTP protocol provides
>needed error checking and thus ensures reliability.
>
>
>What is Anonymous FTP?
>----------------------
>
>The FTP service has been around since the early days of the
>Internet and it has been a successful service, with over 40% of
>current network traffic being used for this purpose [* ref *].

It is currently 50%, based on the t1-9204.ports statistics file in
nsfnet/statistics/1992 on nic.merit.edu.

>
>The FTP system is designed around a client/server model. Users
>invoke the client to connect to the server process running on the
>remote host. The server is responsible for verifying the authenticity
>of the user and performing the operations requested by the user
>through the client, enforcing the security and integrity of the host
>system.  In ordinary FTP sessions one would log into an account on the
>remote host from where one either wanted to retrieve the file or to
>which the file (or files) were to be placed. Basic commands allow the
>user to navigate through the remote file system and copy or delete
>files. This form of access requires one to have the name and password
>of the account on the remote system.
>
>
>The basic FTP-based file sharing model has been extended through the
>creation of a network of universally accessible FTP archive sites.
>Information at such sites is available to all users of the Internet,
>without the usual authentication step using the convention of
>"anonymous FTP".  Under this mechanism, site administrators make
>available a collection of files to the Internet community by creating
>a special "anonymous" user account. Normal authentication mechanisms
>are disabled (hence the use of the term "anonymous") thus allowing
>anyone who connects to such a site to copy information back to their
>own host.
>
>The use of anonymous FTP began as a convention among relatively few sites
>on the Internet. Details on establishing and operating such an archive
>and even the names of sites supporting this mechanism were shared among
>users of the net through ad hoc methods. With the continued growth of the
>Internet [2] such methods are now seen to be inadequate.
>
>
>Anonymous FTP indexing tools
>
>It is expected that as the amount of information on the network grows,
>such resource discovery tools will become increasingly more important.
>The Internet Anonymous FTP Archive Working Group (IAFA-WG) has been
>formed under the auspices of the Internet Engineering Task Force to
>foster better utilization of the anonymous FTP archive mechanism for
>sharing information on the Internet.
>
>This guide is intended for current and potential archive
>administrators and includes procedures for setting up a site and
>guidelines for its operation. In addition, we include proposals for
>using anonymous archives as the first step to "publishing"
>information in the Internet environment.
>
>Despite their success, anonymous FTP archives (AFA) are not the ideal
>manner for publishing information. They do, however, have the advantage of
>being relatively cheap and easy to establish and provide near universal
>access to their contents. With proper attention by archive site
>administrators, they provide a relatively simple way to distribute
>information.
>
>
>
>Organization of this document
>-----------------------------
>
>This document is divided into four sections. Part I discusses
>the reasons why an organization might wish to establish an anonymous
>FTP archive site. Specific issues, both technical and non-technical,
>are addressed to help site administrators determine whether establishing
>such an archive is appropriate for their site.
>
>Part II describes the steps needed to set up and maintain an
>anonymous FTP archive site. Specific examples for the most common
>operating system environments are included.
>
>Part III offers a set recommended information that a site may wish
                   set of recommended

>to compile and make available to archive users. It is expected that
>archive indexing tools would automatically gather and make
>available this information to a wider audience. Although not
>required, it is expected that providing such information would make
>your archive a more useful resource.
>
>At the heart of this section are specific recommendations that are
>intended to provide a standardized means for sharing information
>about the contents of a specific archive site. These
>recommendations include information concerning services provided by
>the institution, document abstracts and information such as
>administrative contacts, local Timezone and other site-specific
>details.
>
>Part IV contains a set of recommended encoding procedures for
>the information outlined in Part III. These procedures allow the
>administrator to take into account site-specific issues, such as
>whether a particular operating system offers the capability of
>creating and using subdirectories, any limitations on filename
>length or the inability to use specific characters in filenames.
>
>Using these recommendations, it is assumed that automated tools
>will be developed that can incorporate information about that
>site into larger infostructures which enable the general public to
>locate and access the information at the site in a timely and
>efficient way.
>
>
>Part I: What is an anonymous FTP archive and why set one up ?
>
>Internet archives are repositories of information of common interest
>to a group. For example, researchers sharing a common set of data
>will often put the information in a central location so that it can
>be accessed by all those in the group. How this access is performed
>can vary, but on the Internet the FTP service and the associated
>remote file sharing paradigm are often used.
>
>
>Why set up an anonymous FTP archive ?
>
>Site administrators set up anonymous ftp archives for any of
>several reasons:
>
>a) Sharing of useful information. Many sites contain data which their
>owners would like to make publicly available.  Research papers, locally
>produced software and datasets are some of the most common offerings.
>An anonymous ftp archive allows you to make this information available
>to a large audience that would not otherwise be able to easily access
>it.
>
>b) Caching and redundancy. Sites at the end of slow network links often
>set up an AFA site to redistribute information obtained from other
>sites so that the operation need not be repeated multiple times for
>the same piece of information. Large software offerings such as X11 or
>TeX which can total several hundred Megabytes are often prime
>candidates for caching at the closer end of a slow network link.
>
>c) To raise a site's profile by providing a valuable network resource.
>A useful large and well maintained archive site is a valuable resource
 A useful,

>to the general Internet community. This can give the group providing
>the archive higher visibility which in turn can call attention to
>other work by that site.
       work done by

>
>d) A site with a large internal population of machines that are not
>themselves directly connected to the Internet (typically making use of a
>Secure Gateway) will often cache packages of interest to their internal
>population on a machine that is visible to both the internal machines as
>well as the rest of the Internet. This can often ease the fears of
>management about Internet connectivity and still be a useful service to
>the Internet as a whole.
>
>
>Initially, most ftp archives resided on centrally controlled mainframe or
>minicomputers. The huge growth in the number of workstations and PCs on
>the Internet has led to the growth of a number of smaller, more
>site-specific archives.  The current population of archives now offers
>everything from small collections of specialized data to collections
>consisting of hundreds or even thousands of Megabytes of information,
>much of it shadowed or copied from other sites on the Internet.
>
>
>Part II: Setting up and maintaining an anonymous ftp archive site.
>
>Once it has been decided that an anonymous ftp account is to be created
>it is up to the system administrator to configure the FTP server to allow
>such access. How this is done exactly is operating system dependent and
>may be as simple as creating a password entry with the appropriate
>information for an FTP pseudo-user. In most modern systems, support for
>the anonymous account is built into the FTP server program primarily to
>enforce security. It is important to bear in mind that once the account
>is enabled, by its very definition, _anyone_ on the network can access
>the account. Examples for some common operating systems are given below:
>
>
>UNIX
>
>In most implementations of UNIX, the ftp server (ftpd) is launched from
>inetd running as the super user (root). The anonymous facility is enabled
>by adding an account for the user "ftp" to the password file. A typical
>/etc/passwd entry would look like:
>
>ftp::67:20:Anonymous FTP account:/home/ftp:/bin/true
>
>Note that a) there is no entry in the second (password) field and b) that
>the shell is listed as /bin/true so as to prevent access to the account
>by telnet(1) or rlogin(1). The line
>
>/bin/true
>
>may have to be added to the file /etc/shells to allow /bin/true to be
>used as a "shell". Most UNIX systems will have the ftp server perform a
>chroot(2) call in order to enforce the confinement of the process to the
>specified anonymous ftp home directory (in this case /home/ftp).
>
>Subdirectories of the ftp home directory called "bin" and "etc" should be
>created. root should own the home directory.
>
>a) ~ftp/bin should have a copy of the ls(1) program with execute-only
>permissions for all, and the directory should not be writable by
>anyone. Both should be owned by root.
>
>b) ~ftp/etc should be created with owner root and readonly permissions.
>It can optionally contain a file called passwd with one entry of the form
>
>ftp:*:67:20:Anonymous FTP account:/:/bin/true
>
>Note in this case that the passwd field is "*" and the home directory is
>listed as "/". A file called "group" can also be placed in this directory
>with an entry for the ftp group conforming to the group(5) manual entry.
>
>In systems with dynamic libraries (e.g. SunOS 4.X), a copy of these
>libraries and certain devices may also need to be created. Consult your
>documentation.
>
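To make the two passwd entries quoted above concrete, here is a minimal
sketch in Python that splits such a line into its seven colon-separated
fields and checks the conventions the draft describes (account name
"ftp", a password field that is either empty or "*", and /bin/true as
the shell). The names parse_passwd_line and check_anonymous_entry are
my own and purely illustrative; this is not part of any ftp server.

```python
from typing import List, NamedTuple


class PasswdEntry(NamedTuple):
    name: str
    password: str
    uid: int
    gid: int
    gecos: str
    home: str
    shell: str


def parse_passwd_line(line: str) -> PasswdEntry:
    """Split one passwd line into its seven colon-separated fields."""
    fields = line.rstrip("\n").split(":")
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}")
    name, password, uid, gid, gecos, home, shell = fields
    return PasswdEntry(name, password, int(uid), int(gid), gecos, home, shell)


def check_anonymous_entry(entry: PasswdEntry) -> List[str]:
    """Return deviations from the conventions described in the draft."""
    problems = []
    if entry.name != "ftp":
        problems.append("account name is not 'ftp'")
    if entry.password not in ("", "*"):
        problems.append("password field is neither empty nor '*'")
    if entry.shell != "/bin/true":
        problems.append("shell is not /bin/true")
    return problems


if __name__ == "__main__":
    entry = parse_passwd_line(
        "ftp::67:20:Anonymous FTP account:/home/ftp:/bin/true"
    )
    print(entry.home, check_anonymous_entry(entry))
```

Both the /etc/passwd form and the ~ftp/etc/passwd form from the draft
pass this check; anything else is reported as a list of plain-text
problems.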
>[VMS]
>
>[other OS?]

VM/CMS

The IBM implementation for anonymous FTP is quite straightforward.
In the PROFILE EXEC for the FTPSERVE server machine there is a line
that is commented out that needs to be altered:

/*ftp_options = 'RACF TRACE ANONYMOU'    some examples of arguments  */

By changing the above line to

  ftp_options = 'ANONYMOU'    /* some examples of arguments */

you will have enabled anonymous FTP.  A user connecting to your system
via anonymous FTP is not placed in any default directory and must
issue a 'cd' to reach the desired directory.  The user must know in
advance which 'cd' to perform, since the VM/CMS system does not provide
any information about where to find searchable subdirectories.  Any
directory with a read password of ALL (meaning publicly available) can
be connected to, as can any password-protected directory, so long as
the user knows the correct password.

>
>There are a few areas of potential problems on both the security and
>administrative sides of running an anonymous ftp archive site which will
>be mentioned here. If you are not sure of the capabilities of your server
>it is a good idea to consult your system documentation or your software
>vendor. Some of these problems can be solved by using one of the freely
>distributable ftp servers now available.
>
>Technical
>
>a) The view of the filesystem that the ftp client has access to should be
>restricted so that only those files specifically intended for
>distribution are actually visible. In the best case, this restriction should
>be enforced at the lowest possible level, preferably by the operating
>system itself. Application-level enforcement should be avoided. For
>example, some ftp servers try to restrict the movement of the clients by
>filtering pathname requests. This is a weaker enforcement of
>access policies than those supplied by the operating system and alternate
>servers which utilize OS support should be used in preference.
>
>b) Many sites maintain "incoming" directories which allow the uploading
>of information into the archive by the general public. These can be very
>useful for the easy distribution of data. However, they can also be used
>as a transfer point for files that should not be on your system.  Most
>operating systems allow having a directory be world writable but not world
                      ...a directory to be world...
>readable. If you really want to have one of these directories, it is a
>good idea to configure them in this way to allow the site administrator to
>examine and approve the data before it is moved to its final location and
>made generally available.
>
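As an illustration of item b) on a UNIX-like system, the following
Python sketch creates an upload directory that the world can write into
but cannot list. The mode 0o733 and the function names are my own
assumptions; the draft does not prescribe a particular mode, and other
operating systems will need a different mechanism.

```python
import os
import stat

# Owner: read/write/search. Group and others: write/search only, no read,
# so files can be dropped in but the directory cannot be listed.
INCOMING_MODE = 0o733


def make_incoming(path: str) -> None:
    """Create the directory if needed and set its mode explicitly."""
    os.makedirs(path, exist_ok=True)
    os.chmod(path, INCOMING_MODE)  # chmod is not affected by the umask


def is_write_only_for_world(path: str) -> bool:
    """True if 'others' may write to the directory but not read it."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    world_writable = bool(mode & stat.S_IWOTH)
    world_readable = bool(mode & stat.S_IROTH)
    return world_writable and not world_readable
```

The administrator can then review uploads as the owner and move
approved files into the readable part of the archive.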
>c) Check the permissions and ownerships of the files in your archive.
>Many administrators have adopted the practice of transferring ownership
>of archive files to the ftp pseudo-user. The file permissions should then
>be scrutinized to make sure that individual files cannot now be modified
>by that user (unless of course, that is specifically the intention). The
>"ftp user" is anyone using the anonymous account. The replacement of
>files with corrupted versions (viruses, trojan horses etc) has been known
>to occur.
>
>d) The anonymous ftp subtree of the file system is usually self
>contained. This means that references (UNIX symbolic links for example)
>outside of this subtree will not be resolved and are thus inaccessible to
>users of the system.
>
>e) Care should be taken when naming or renaming files in archives. The
>truism that names should be meaningful takes on a greater significance in
>this environment since this is often all that the remote user has to work
>with when trying to discover the contents of the file without actually
>retrieving it. If one is caching a file from another ftp site, renaming
>is usually not recommended since the ability to determine if the two
>files contain identical information can be lost. Additionally whitespace
>an non printable characters (on operating systems which allow this) is
 and

>frowned upon since this can make the file inaccessible to the remote user.
>Additionally, characters such as '@', '!', '|', or "_" may not be available
>or have special significance on remote systems and should be used with
>caution.
>
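The naming advice in item e) can be turned into a simple check. The
sketch below, in Python, flags whitespace, non-printable characters and
the four characters the draft lists as needing caution. The function
name and the wording of the warnings are hypothetical.

```python
from typing import List

# Characters the draft singles out as unavailable or special on some systems.
RISKY_CHARS = set("@!|_")


def filename_warnings(name: str) -> List[str]:
    """Return human-readable warnings for a proposed archive filename."""
    warnings = []
    if any(ch.isspace() for ch in name):
        warnings.append("contains whitespace")
    if any(not ch.isprintable() for ch in name):
        warnings.append("contains non-printable characters")
    risky = sorted(set(name) & RISKY_CHARS)
    if risky:
        warnings.append(
            "contains characters to use with caution: " + " ".join(risky)
        )
    return warnings
```

An empty list means the name raises none of the concerns listed above;
it says nothing about whether the name is meaningful.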
>f) Very large files should be split into smaller pieces when placed in
>ftp archives. The retrieval of large files can be difficult on unreliable
>or congested links since if a failure during transfer occurs, it is
>usually not possible to restart from the point of failure and continue.
>The entire transfer has to be restarted. This can be time consuming and
>costly in terms of network bandwidth. Currently, files of 500 - 600 K are
>usually considered as the maximum desirable size. Files larger than this
>should be split.
>
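A minimal sketch of the splitting described in item f), in Python. The
512 K chunk size is simply a value I picked near the 500 - 600 K range
quoted above, and the ".001", ".002" suffix scheme is an assumption of
mine, not a convention from the draft.

```python
from typing import List

CHUNK_SIZE = 512 * 1024  # assumed; near the 500 - 600 K range quoted above


def split_file(path: str, chunk_size: int = CHUNK_SIZE) -> List[str]:
    """Write path.001, path.002, ... and return the part names in order."""
    parts = []
    with open(path, "rb") as src:
        index = 1
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            part_name = f"{path}.{index:03d}"
            with open(part_name, "wb") as dst:
                dst.write(chunk)
            parts.append(part_name)
            index += 1
    return parts


def join_files(parts: List[str], dest: str) -> None:
    """Concatenate the parts, in the order given, into dest."""
    with open(dest, "wb") as dst:
        for part_name in parts:
            with open(part_name, "rb") as src:
                dst.write(src.read())
```

A failed transfer then costs at most one part rather than the whole
file. Note that the added suffix lengthens the filename, which matters
on systems with short filename limits.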
>g) As the site administrator you might want to consider creating a CNAME
>record in the Domain Name System for your AFA. This record is usually
>"ftp.<your domain>". This allows you to move the archive from one
>physical host to another without the need for your users to find the new
>host. For example the machine quiche.cs.mcgill.ca would have a CNAME
>record which gave it the alternate name of ftp.cs.mcgill.ca. Thus if the
>archive for the domain cs.mcgill.ca moved to another host, only the CNAME
>record would need to change. This change would in most cases be
>completely transparent to your users.
>
>
>Non technical
>
>a) Check the contents of the archive to make sure that the files stored
>there can legally (and ethically) be obtained by the general public.
>Information (programs, documents, datasets) which is freely distributable
>or in the public domain should be the only information placed in an
>anonymous ftp archive site. That information of unknown legal status
>should not be made generally available until the question is resolved: do
>not assume that because the information might have been retrieved from
>another archive that it is supposed to be generally available. There have
>been many instances in the past of proprietary information being
>unwittingly distributed by uninformed archive administrators. This could
>prove to be an expensive mistake. Know what is in your archive.
>
>b) It is wise to only obtain files for caching on your system from
>"reputable" sites around the net. These are well known and are run in a
>professional manner.
>
>c) Many ftp servers allow the logging of operations requested by their
>users. This logged information usually contains the names or IP addresses
>of the hosts from which the client is logged on. In the past when there
>were many users on a system, this information didn't say much about who
>was doing what. However, in today's network environment where individual
>computers have in fact become _personal_ computers, this information can
>easily identify the actual user to a high degree of probability. It is
>considered unethical behavior to release this logged information to
>individuals or groups not directly associated with the maintenance of the
>archive. Privacy rights have in many respects not been legally
>defined for computer environments and as such it is up to each site
>administrator to see that privileged information is not consciously or
>inadvertently distributed.
>
>d) Anonymous ftp site administrators should be aware that the storage of
>pornographic material in their archives may cause problems of a legal or
>(more likely) political nature. This is also true of other potentially
>offensive material such as that related to weaponry or terrorism. There
>are a number of cases where the network provider for sites carrying such
>material has threatened termination of network access until the offending
>files have been removed.
>
>
>Part III: Useful Configuration and Contents Information
>
>In this section we define a minimum recommended collection of
>information that you could offer as the administrator of an archive
>site. In doing so, you would extend the functionality of your
>archive, as well as the functionality of indexing and resource
>discovery tools that choose to pick up and redistribute this
>information.
>
>It is expected that this information will itself be made available
>through the anonymous ftp archive mechanism. The specific encoding
>method used will be site-specific and encoding methods for some of
>the most popular computing environments are presented in Part IV.
>
>Note that these recommendations do not mandate or require that any
>particular piece of information be offered. However it is expected that
>those sites wishing to participate in the system offered here will
>adhere to the formats given.
>
>
>Site-Specific Configuration Information:
>
>Information about the site itself can often be very valuable to users of
>your system in order for them to utilize the resource in an efficient
>manner.
>
>Description Information
>-----------------------
>
>Archive profile:
>
>	- A brief description of the purpose of the anonymous archive. If
>	  your site is intended to specialize in a particular type of
>	  information (examples might include software for a specific
>	  machine type, on-line copies of a particular type of literature
>	  or research papers and information in a particular branch of
>	  science or engineering) you should indicate this.
>
>
>Configuration Information
>-------------------------
>
>Site configuration information will help users better understand your
>wishes on how and when to access your site.  This would include such
>information as:
>
>Access:
>
>	- A summary of the access policies of this site. This
>	  should include such information as preferred times of
>	  usage, conventions or restrictions for uploading files
>	  to this site etc.
>
>Contact:
>
>	- The name of the site
>	- The name of the organization or group owning the site
>	- The name of the person responsible for administering the site
>	- The postal address
>	- The telephone number
>	- Email address of person or persons responsible for site
>	  administration.
>	- The location of the site by city, state, country
>	- The geographical (latitude/longitude) location
>	- The timezone of site
>
>[* Suggestions welcome for additions to this list *]
>
>Site-Specific Content Information:
>
>The preceding section pertains to access and utilization policies
>for a site. You may also wish to make available a selection of
>information about the actual contents of your archive or the services
>available from your organization or institution.
>
>The following categories have been identified.
>
>
>Services:
>	- The archive can offer an overall description of each of the
>	  various Internet services offered by the associated
>	  organization's systems, along with corresponding contact
>	  information. This description would then indicate whether
>	  the parent organization offers such services as:
>
>		- on-line library catalogues
>		- WAIS [3], gopher [4], Prospero [5], WWW [6], archie [7]
>		  or any other such online information services
>		- specialized information servers such as those for weather,
>		  geographic information, newswire feeds etc.
>		- any other kind of information service
>
>	The following information will be made available:
>
>		- Name of service
>		- Host providing service
>		- Description
>		- Access protocol (telnet, ftp, Prospero etc)
>		- Keywords
>
>
>Document abstracts:
>	- An description of documents contained in the archive. This might
    A  description

>	  correspond to the actual abstract for each technical report or
>	  other document served from this archive site but any document
>	  offered could have a corresponding abstract made available for
>	  it.
>		- Filename or directory name of the document
>		- Title of the document
>		- Filename of the document
>		- Name(s) of author(s)
>		- Last revision date
>		- Scope (technical report, conference paper etc)
>		- The summary or abstract of the document being referenced
>		- Appropriate keywords
>		- Format that the document is stored in (ASCII text,
>		  PostScript, DVI etc.)
>		- Publication status (draft, published etc.)
>		- Document size [* length ? *]
>
>Datasets:
>	- A description of any datasets (star catalogs, DNA sequences,
>	  census statistics etc.) stored on the archive.
>		- Name of the dataset
>		- Title
>		- Version number or string of the dataset
>		- Date last revised for dataset
>		- Source of data
>		- Name of individual or group responsible for compilation
>		- Size of dataset
>		- Format of data (special record format name etc.)
>		- Programs used to manipulate the data in the set
>
>Software Packages:
>	- Outlines for each of the program packages offered at this site.
>		- Name of file containing the package
>		- Title of package
>		- Version of the package
>		- Description of the function of package
>		- Name of author
>		- Package maintainer
>		- Package origin (original site, copied)
>		- Special considerations or restrictions on
>		  the package's use (GNU copyleft, hardware
>		  restrictions, etc).
>		- The copying policy (Public Domain, Freely
>		  Redistributable)
>		- Keywords appropriate for the package
>
>Mailing Lists:
>	- Publicly available mailing lists maintained by this
>	  organization or institution. It is assumed that only lists
>	  admitting general subscriptions will be listed.
>	  	- Name of mailing list
>		- Description of list (function)
>		- Email address of list
>		- Email address of administrative contact
>		- Archives of list (if any)
>		- Keywords appropriate for describing the function of the
>		  list
>
>Complete File listing:
>	- A listing of all archive entries at this archive in a format
>	  appropriate for that environment. Such listings, if properly
>	  maintained, reduce network traffic while simplifying the task of
>	  archive indexing services.
>
>
>Part IV: Information Encoding for Specific Environments
>
>In this section we offer recommended encoding methods for the standard
>items of information listed in Part III. In many cases these
>recommendations should be applicable to all environments.  Where this is
>not true, standardized encodings are offered for specific environments.
>
>We offer such a standardized format so that if such information _is_ to
>be offered, it is formatted in such a way that it can be utilized by
>automated indexing and retrieval tools. The encoding methods proposed
>were developed to be extensible, so that additional information can be
>offered in a similar format, if the site administrator so wishes.
>
>Developing such recommendations offers several challenges. It is
>hoped that the encoding conventions should be applicable to as wide a
>variety of operating systems, file structures and encoding schemes as
>possible. In addition, the globalization of the Internet requires
>attention to constraints such as the language in use at an archive site.
>
>In addition, the encoding methods proposed must be easy to implement and
>for the moment, use existing methods of access and retrieval.  We
>currently assume that the site language is English. It is assumed that
>additional formats for other languages will be proposed at a later date.
>
>
>An Encoding for UNIX Systems
>----------------------------
>
>All information encoded using this scheme should reside in a standard
>configuration and content directory located at the root of the anonymous
>ftp file tree. Specific categories of are to reside in specific named
                                    ????????

>files.
>
>Files should be made world readable and it is assumed that size and last
>modification dates [* times? *] can be obtained through the existing FTP
>mechanism.
>
>The encoding form for each entry is included in its description.
>
>The advantages to this system are that this information need only be
>constructed once with infrequent periodic updates as changes occur. Many
>of these files may never change during the lifetime of the host as an
>anonymous ftp site. They require no special programs or protocols to
>construct: a text editor is all that is needed.
>
>
>Configuration directory
>-----------------------
>
>All information will reside in a directory called "CONFIG", which is
>located in the file structure directly under ("the child of") the
>directory in which the FTP client is located on login to the anonymous
>ftp archive host.
>
>
>Configuration files
>-------------------
>
>A file of the given name will exist for each category listed in Part III.
>For the sake of consistency with other operating systems and for the
>ability to distinguish them from non configuration files of the same name,
>the filenames will be in all uppercase letters. Because of restrictions
>on older systems, filenames will be kept to a maximum of 14 characters.
>
>Files that contain multiple instances of a given category (Mailing lists
>for example) will logically be divided into "records", with each record
>containing multiple "fields".  The start of each field is marked by a
>special fieldname on a new line in the leftmost column followed by a
>colon (:). Field data may be separated from fieldname by whitespace. Any
>field may be continued on the next line by placing whitespace (blank,
>tab) in the first column.
>
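To check that the field rules above are implementable by an automated
tool, here is a short Python sketch that reads one record. It assumes
that field names start in the first column and that a line beginning
with a blank or tab continues the previous field. How records are
separated from one another is not stated in the draft, so this handles
a single record only; parse_record is a name I made up.

```python
from typing import Dict, Optional


def parse_record(text: str) -> Dict[str, str]:
    """Parse one record of 'Fieldname: value' lines with continuations."""
    fields: Dict[str, str] = {}
    current: Optional[str] = None
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        if line[0] in " \t":
            # Whitespace in the first column continues the previous field.
            if current is None:
                raise ValueError("continuation line before any field")
            fields[current] = (fields[current] + " " + line.strip()).strip()
        else:
            # Split at the first colon only, so values may contain colons.
            name, sep, value = line.partition(":")
            if not sep:
                raise ValueError(f"missing ':' in line: {line!r}")
            current = name.strip()
            fields[current] = value.strip()
    return fields
```

Continuation lines are joined to the field with a single space, which
is a choice of this sketch rather than something the draft specifies.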
>
>Site Information
>----------------
>
>[* General consensus such as it is, seems to favor a single file
>containing contact information... so here it is *]
>
>This file contains 1 record with the following fields.
>
>Filename: CONTACT
>
>Fields for the is file.
            ???????????

>
>Name:			Primary DNS name
>
>Cname:			Preferred DNS-registered canonical name for the site
>
>Postal-Address:		The postal address of the site
>
>Telephone:		The telephone number of the site. Should be in
>			international format and include the country code
>
>Organization:		Name of institution/organization/individual to
>			which the site belongs
>
>Electronic-Address:	Email address in RFC 822 format for the AFA
>			administrator (See Note <1>)
>
>Contact:		Name of person or group responsible for AFA
>			administration
>
>Location:		City, State and Country of the site
>
>Latitude-Longitude:	Latitude and longitude of site (See Note <2>)
>
>Timezone:		Timezone as hours and minutes from UTC (See Note <3>)
>
>Written-by:		Name of person writing this file (See Note <4>)
>
>Frequency:		Preferred frequency of retrieval of all AFA
>			extended configuration information by automated
>			retrieval tools (See Note <5>)
>
>
>
>Notes for this file.
>
><1> Email addresses must be in RFC 822 format. Names may be included in
>    the Email address.
>
>    For example:
>
>    		"Alan Emtage" <bajan@cc.mcgill.ca>
>            or
>	    	bajan@cc.mcgill.ca (Alan Emtage)
>
>    are valid Email addresses.
>
><2> Latitude and longitude are specified in that order as
>
>	DD MM SS C / DD MM SS C
>
>    Where
>        DD is in degrees
>	MM is in minutes
>	SS is in seconds
>	C is the direction designator which is
>	  For latitude
>		"+"	is north of the equator
>		"-"	is south of the equator
>
>	  For longitude
>		"+"	is west of the Greenwich meridian
>		"-"	is east of the Greenwich meridian
>
>    The double quotes (") are not part of the designator, but are used
>    here to delimit the symbols.
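As a sketch only, the Latitude-Longitude value could be checked and converted
to decimal degrees along these lines (Python). The name parse_lat_long is
hypothetical. The designator is returned untouched rather than interpreted,
and up to three digits of degrees are accepted because the draft's own
example uses a longitude of 121 degrees; both choices are assumptions.

```python
import re

# One coordinate: degrees, minutes, seconds, then the "+" or "-" designator.
_COORD = re.compile(r"^\s*(\d{1,3})\s+(\d{1,2})\s+(\d{1,2})\s*([+-])\s*$")

def parse_lat_long(value):
    """Return ((degrees, designator), (degrees, designator)) for the field."""
    parts = value.split("/")
    if len(parts) != 2:
        raise ValueError("expected two coordinates separated by '/'")
    result = []
    for part in parts:
        m = _COORD.match(part)
        if m is None:
            raise ValueError("bad coordinate: %r" % part)
        deg, minutes, sec = (int(g) for g in m.group(1, 2, 3))
        if minutes > 59 or sec > 59:
            raise ValueError("minutes or seconds out of range: %r" % part)
        result.append((deg + minutes / 60 + sec / 3600, m.group(4)))
    return tuple(result)
```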
>
>
><3> Timezone is specified in hours and minutes from UTC (GMT). Specified
>    as
>
>    DHHMM
>
>    Where D is one of
>		"+"	is west of the Greenwich meridian
>		"-"	is east of the Greenwich meridian
>
>    HH is hours from UTC
>    MM is minutes from UTC
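A minimal sketch of reading the DHHMM value (Python, with the hypothetical
name parse_timezone). It assumes that free text may follow the offset, as in
the draft's example CONTACT file, and it returns the designator as written
instead of deciding which direction it denotes.

```python
import re

# Designator, two digits of hours, two digits of minutes, then a word boundary.
_TZ = re.compile(r"\s*([+-])(\d{2})(\d{2})\b")

def parse_timezone(value):
    """Return (designator, minutes from UTC) for a 'DHHMM' value."""
    m = _TZ.match(value)
    if m is None:
        raise ValueError("expected DHHMM, got %r" % value)
    hours, minutes = int(m.group(2)), int(m.group(3))
    if minutes > 59:
        raise ValueError("minutes out of range in %r" % value)
    return m.group(1), hours * 60 + minutes
```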
>
>[* I'd prefer not to use "E" and "W" for this etc. since it is so language
>   specific *]
>
><4> Email address and date of document composition may be included in
>    this field.
>
><5> The period is measured in days. This value should be chosen to
>    reflect the turnover of information at the archive.
>
>
>An example of a CONTACT file:
>
>Name:			gatekeeper.3com.com
>Cname:			ftp.3com.com
>Organization:		3Com Corporation
>Contact:		Mark D. Baushke
>Electronic-Address:	ftp@3Com.COM
>Telephone:		+1 408 764 5000 (general operator)
>Postal-Address:		5400 Bayfront Plaza, P.O. Box 58145, Santa Clara, CA 950
>			2-8145, USA
>Location:		Santa Clara, California, USA
>Latitude-Longitude:	37 24 43 + / 121 58 54 +
>Timezone:		-0800 (Pacific Standard Time)
>Written-by:		mdb@NSD.3Com.COM (Mark D. Baushke); Mon Feb 10 22:43:31 PST 1992
>Frequency:		10
>
>
>Site Access Policy Information
>------------------------------
>
>Description of the access policies of this site.
>
>Filename: ACCESS
>
>This file contains 1 record with the following fields.
>
>Access-times:	Period of preferred times of access to anonymous ftp
>		users in UTC. (See Note <1>)
>
>Policy: 	Information such as conventions or restrictions for uploading
>		files to this site etc.
>
>
>Notes for this file.
>
><1> Times in UTC between which access to this site is preferred. This
>    takes the format
>
>    HHMM / HHMM
>
>    Where
>       HH is in hours
>       MM is in minutes
>
>    The first HHMM is starting time, the second the ending time.
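For illustration, an automated retrieval tool might honour the Access-times
window roughly as follows (Python). The function names are invented here.
Treating a window whose end is earlier than its start as wrapping past
midnight UTC is an assumption; the draft does not say what such a value
should mean.

```python
def parse_access_times(value):
    """Return (start, end) in minutes after midnight UTC for 'HHMM / HHMM'."""
    def to_minutes(hhmm):
        if len(hhmm) != 4 or not hhmm.isdigit():
            raise ValueError("expected HHMM, got %r" % hhmm)
        hours, minutes = int(hhmm[:2]), int(hhmm[2:])
        if hours > 23 or minutes > 59:
            raise ValueError("time out of range: %r" % hhmm)
        return hours * 60 + minutes

    parts = [p.strip() for p in value.split("/")]
    if len(parts) != 2:
        raise ValueError("expected 'HHMM / HHMM', got %r" % value)
    return to_minutes(parts[0]), to_minutes(parts[1])

def in_window(window, minute_of_day):
    """True if minute_of_day (UTC) falls inside the preferred access window."""
    start, end = window
    if start <= end:
        return start <= minute_of_day < end
    # Assumed meaning: the window wraps past midnight.
    return minute_of_day >= start or minute_of_day < end
```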
>
>
>
>Example of ACCESS file.
>
>Access-Times:	0200 / 1300
>Policy:		Non-proprietary data may be uploaded to this site in the
>		"incoming" directory. Please contact site administrators if
>		you do so. Proprietary or offensive material found in
>		this directory will be removed. This site is not to be
>		used as a temporary storage area.
>
>
>Site Description
>----------------
>
>Filename: DESCRIPTION
>
>This file contains 1 record with the following fields.
>
>
>Description:	Contains text describing any field or area of
>		specialization that the site subscribes to. For example,
>		if the site was concerned with molecular biology a
>		paragraph or two with that keyword and some further
>		description would be in order.
>Keywords:	Appropriate keywords describing contents of this AFA
>
>
>Example for DESCRIPTION file.
>
>Description:	This site contains data relating to DNA sequencing
>		particularly Yeast chromosome 1. Datasets are available.
>		There is also a selection of programs available for
>		manipulating this information.
>Keywords:	DNA, sequencing, yeast, genome, chromosome
>
>
>
>For the following categories the assumption shouldn't be made that the
>information applies to the anonymous ftp host itself. Rather, the group
>or organization may publish general information: the specific information
>will be contained inside the file describing the category.
>
>Services information
>--------------------
>
>Filename: SERVICES
>
>This file contains records with the following fields. Each record is
>started and delimited by the "Service-Name" field.

I am not so sure this should be contained within a document entitled
"Guide to FTP Site Administration".  There is a need to keep a list of
services such as those listed below, but I don't believe this is the
place to do it.

>
>Service-Name:		Name of the service (See Note <1>)
>
>Description:		Short description of the service provided
>
>Access-protocol:	Method required to access service (See Note <2>)
>
>Keywords:		Keywords appropriate for the service
>
>Notes on this file.
>
><1> This can be a generic name such as "NNTP" or "WAIS" or something more
>    specific such as "Geographic Name Server"
>
><2> A description of how the service is to be accessed. This may be as
>   simple as "Email" or "telnet to port 201" or more complex such as
>   "Prospero protocol on port 5678"
>
>
>Example of SERVICES file.
>
>Service-Name:		Census Bureau information server
>Hostname:		census.foo.com (127.0.0.2)
>Description:		This server provides information from the latest USA
>			Census Bureau statistics (1990).
>Access-protocol:	telnet protocol to port 3000. A server-specific
>			query language is used. Type "help" for more
>			information.
>Keywords:		census, population, 1990, statistics
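A sketch of how a multi-record file such as SERVICES might be split into
records, each one beginning at its delimiting field (Python). The name
split_records is hypothetical. The fieldname is matched without regard to
case, which is an assumption made because the draft's own examples vary in
capitalisation.

```python
def split_records(text, first_field):
    """Group lines into records; each record begins at a first_field line."""
    records, current = [], None
    prefix = first_field.lower() + ":"
    for line in text.splitlines():
        if line.lower().startswith(prefix):
            current = [line]  # the delimiting field opens a new record
            records.append(current)
        elif current is not None and line.strip():
            current.append(line)  # any other non-blank line joins the open record
    return ["\n".join(record) for record in records]
```

Lines that appear before the first delimiting field are dropped in this
sketch; a stricter tool might report them as errors.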
>
>
>Document Abstracts information
>------------------------------
>
>Filename: ABSTRACTS
>
>This file contains records with the following fields. Each record is
>started and the previous record is delimited by the "Document-Name" field.
>
>
>Document-Name:		Filename containing the document
>
>Title:			Title of the document
>
>Authors:		Name of authors (See Note <1>)
>
>Revision-Date:		Last date that document was revised
>
>Category:		Type of document. (See Note <2>)
>
>Abstract:		Summary of the document
>
>Format:			Format or formats in which the document is
>			available (See Note <3>)
>
>Citation:		The official bibliographic entry for the document
>
>Publication-Status: 	Current status of document (draft, published etc)
>
>Keywords:		Keywords relevant to the document
>
>Size:			Length of document in pages
>
>
>
>Notes for this file.
>
><1> The names of the individuals or group appearing on the document as
>    authors. Names can be separated by a semicolon. An RFC 822 Email address
>    for each author should be included where appropriate. Postal addresses
>    may also be included where appropriate.
>
>    For example
>
>Authors:     Alan Emtage <bajan@cc.mcgill.ca>; Peter Deutsch
>       	     <peterd@cc.mcgill.ca> 805 Sherbrooke W., Rm 222, Montreal,
>	     Quebec CANADA H3A 2K6
>
><2> The intention of this field is to define the category of the document.
>    It can be "Technical Report", or perhaps the name and date of the
>    conference at which the paper was presented. It may also be something
>    like "General guide" or "User manual"
>
><3> Documents are often available in several formats. Examples include
>    "PostScript", "ASCII text", "DVI" etc.
>
>
>
>Example of ABSTRACTS file.
>
>Document-Name:	 	yeast-homeobox
>Title:			The function of homeoboxes in Yeast Chromosome 1
>Authors:		John Doe jdoe@yeast.foobar.com; Jane Buck
>			jane@fungus.newu.edu
>Revision-Date:		25 November 1991
>Category:		Yeastcon, January 1992, San Francisco
>Abstract:		Homeoboxes have been shown to have a
>			significant impact on the expressions of genes
>			in Chromosome 1 of bakers yeast. This paper
>			surveys	this impact.
>Format:			PostScript, ASCII (without graphs)
>Citation:		J. Doe, J. Buck, The function of homeoboxes in
>			Yeast Chromosome 1, Conf. proc. Yeastcon, January
>			1992, San Francisco, pp. 33-50
>Keywords:		yeast, chromosome, DNA, sequencing
>Publication-Status:	Published
>Size:			18 pages
>
>
>
>Mailing List Information

I would suggest making this 'MAILING LIST ARCHIVE' information rather
than attempting to be the "List of Lists" for mailing lists.  This
document should only handle those items that pertain to anonymous FTP.

>------------------------
>
>Filename: MAILINGLISTS
>
>This file contains records with the following fields. New records are
>marked and delimited by the "Mailinglist-Name" field.
>
>
>Mailinglist-Name:	The name of the list
>
>Address:		The address (in RFC 822 format) that mail
>			intended for this list should be sent to.
>
>Administration: 	The address (in RFC 822 format) of the
>			administrative contact for the list. Additions
>			and deletions to the list as well as questions
>			about the list should be directed to this address.
>
>Description:		A description of the purpose of the list. Any
>			special conditions for the list should be
>			included.
>
>Keywords:		Keywords useful for anyone looking for the list
>
>Archive:		Location and access method for any archive for
>			this mailing list.
>
>
>Example of the MAILINGLISTS file.
>
>Mailinglist-Name:	Internet Engineering Task Force (IETF) Internet
>			Anonymous FTP Archive working group (IAFA-WG)
>			mailing list
>Address:		iafa@cc.mcgill.ca
>Administration:		iafa-request@cc.mcgill.ca
>Description:		Discussion list for the IAFA Working Group
>			concerning the administration of anonymous FTP
>			archive sites.
>Keywords:		IETF, IAFA, anonymous, FTP, archive, Internet
>Archive:		The archive for this mailing list is available on
>			archive.cc.mcgill.ca via anonymous FTP in the file
>			pub/mailing-lists/iafa
>
>
>
>Software Packages Information
>-----------------------------
>
>Filename: PACKAGES
>
>This file contains records with the following fields. The record is
>started and delimited from other records by the "Package-Name" field.
>
>Package-Name:	Name of the file or directory containing the package
>
>Title:		Title of the package
>
>Version:	This field can be used if a version number or string is
>		associated with the package
>
>Description:	Description of the function of the programs in the package
>
>Author:		Name and Email address of authors if available
>
>Maintained-by:	This field should be included when the current maintainer
>		of the package is known. Contact information should be
>		included
>
>Maintained-at:	Host name of the "home" of the package if known. This is
>		the site at which the most uptodate version of the package
>		would be expected to be found
>
>Platforms:	Any requirements or restrictions that the package may
>		have in terms of hardware or software (OS) platforms.
>		The programming language the package is written in should
>		be included.
>
>Copying-Policy:	The status of the package for copying purposes. (See Note
>		<1>)
>
>Keywords:	Keywords appropriate for users trying to locate the
>		package
>
>
>Notes on this file.
>
><1> The most common entries for this field would be "Public Domain" or
>    "Freely Redistributable" or "Voluntary Payment" (shareware). However
>    since a record may exist for software packages not resident on the
>    AFA, this field may be entered as "Proprietary" or some other form of
>    restricted access
>
>
>Example record for the PACKAGES file.
>
>Package-Name:	xarchie.tar.Z
>Title:		xarchie
>Version:	1.3
>Description:	This program provides an X11 interface to the archie
>		database. It allows the user to locate and retrieve files
>		found on anonymous FTP archive sites around the world.
>Author:		George Ferguson (ferguson@cs.rochester.edu)
>Maintained-by:	George Ferguson (ferguson@cs.rochester.edu)
>Maintained-at:	ftp.cs.rochester.edu
>Platforms:	X11 based program written in C. Known to run on Sun 3's
>		and Sun 4's under SunOS 3.X and 4.X
>Copying-policy:	Freely Redistributable. Copyright held by author.
>Keywords:	archie, X11, anonymous FTP archive, software location
>
>
>
>
>Datasets Information
>--------------------
>
>Filename: DATASETS
>
>This file contains records with the following fields. The record is
>started and delimited by the "Dataset-Name" field.
>
>Dataset-Name:	Name of the file or directory containing the dataset
>
>Title:		Title of the dataset.
>
>Version:	Version number can be used if one is associated with the
>		dataset
>
>Revision-Date:	The date of last revision of the dataset
>
>Source:		The group or organization providing the source for the
>		dataset. Email or postal addresses should be included
>		where possible
>
>Compiled-by:	The group or organization responsible for compiling the
>		dataset into the format for which this description
>		applies. Email or postal addresses should be included
>		where possible
>
>Size:		Size of the dataset. (See Note <1>)
>
>Format:		The format in which the dataset is distributed. (See Note
>		<2>)
>
>Software:	A list of any programs used to manipulate the dataset.
>		Contact names and addresses (Email,postal) should be
>		included where possible.
>
>
>Notes on this file.
>
><1> This information should be in well-known units such as octets
>    (bytes). Alternatively the number of records in the dataset and the
>    record size may be given.
>
><2> The format may be well-known or specific to a specific dataset.
>    Additional information on programs used with this dataset should be
>    provided in the "Software" field.
>
>
>[* Example of the DATASETS file. *]
>
>
>Listing Information
>-------------------
>
>Filename: LISTINGS
>
>This file differs from the others in that it has no user defined fields.
>For UNIX sites this file should contain a long recursive listing (ls -lR)
>from the directory in which the anonymous ftp client would find itself on
>initial login. The file may be compressed with the UNIX compress(1) program.
>Any anonymous ftp retrieval of this file should, by default, be in binary
>mode to accommodate the case of the file being compressed.
>
>This file should be automatically generated on a frequent basis,
>depending on how often the files at your site change. Compressing the
>file lessens the load on the network since less traffic has to flow, and
>generating it automatically lessens the load on the anonymous ftp archive
>host since the administrators can schedule the program to run when it is
>most convenient.
>
>
>Bibliography [* To be fixed up *]
>------------
>
>[1] RFC 959		Postel, J.B.; Reynolds, J.K. File Transfer
>                        Protocol. 1985 October.
>
>[2] RFC 1296		Lottor, M. Internet Growth (1981-1991). 1992
>                        January.

One area not covered is a "quick index" to files.  This should be
a file with one record per available file, which a user can retrieve
and quickly scan to find the file he is searching for, rather
than having to go through the ABSTRACTS, MAILINGLISTS, PACKAGES, and
DATASETS indices.  Since each entry in those indices may be 10-20 lines
long, they will grow quite large.  I maintain a "quick index" for an
anonymous FTP archive on my VM system that looks like this (it is almost
a LISTINGS type file):

 Filename Filetype   Size     Last     Description
                     bytes    update
 -------- --------   -------  -------  ---------------------------------
 $INDEX   INDEX        12288  17May92  An index to all files on hank.400
 NSF-MAP  PS          110592  14May92  NSFNET backbone map
 NSF-NREN PS          253952  14May92  NSF NREN Implementation plan
 RIPE     DB         1753088  13May92  Ripe name and link database
 RFC1325  TEXT         98304    May92  FYI for new Internet users
 ST-V12   PS          139264    May92  Simple Times May/Jun '92 - SNMP
 CRUISE   README        4096  12Apr92  What is cruise?
 CRUISE   HQX        1142784  12Apr92  Read the README first
 RFC1314  TEXT         57344    Apr92  NetFAX: A file format for fax
 RFC      INDEX        61440   7Apr92  Sorted by Title thru RFC1313
 ATM-LAN  PS          598016   3Apr92  ATM for LANs

(For the complete file: ftp vm.tau.ac.il; cd hank.400; get $index.index).
The users I support have found this file extremely useful, and it takes
much less effort to maintain than the large indices described in this
draft RFC.  Network administrators may not want to spend 10-15
minutes adding an entry to the index for each file.

Perhaps this type of file (with the 4 major fields) could be
automagically generated from the various index files.
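
A rough sketch of such a generator, in Python. Everything here is assumed
for illustration: the function names, the column widths, and the idea that
each index entry has already been reduced to a dictionary with "name",
"size", "updated" and "title" keys. None of this comes from the draft.

```python
_ROW = "%-20s %9s  %-7s  %s"

def quick_index_line(name, size, updated, description, width=72):
    """Format one fixed-width line, truncated to the given width."""
    return (_ROW % (name, size, updated, description))[:width].rstrip()

def quick_index(records):
    """Build the whole quick index, one line per record, sorted by name."""
    header = _ROW % ("Name", "Size", "Updated", "Description")
    lines = [header, "-" * len(header)]
    for rec in sorted(records, key=lambda r: r["name"].lower()):
        lines.append(quick_index_line(
            rec["name"], rec["size"], rec["updated"], rec["title"]))
    return "\n".join(lines)
```

The size and date would have to come from the file system or from a
LISTINGS file, since most of the index files in the draft do not carry them.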

Hank Nussbacher
hank@vm.tau.ac.il
Israel