Re: how to make progress on the URL document

Tim Berners-Lee <timbl@ptpc00.cern.ch> Thu, 24 March 1994 02:31 UTC

Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa24955; 23 Mar 94 21:31 EST
Received: from CNRI.RESTON.VA.US by IETF.CNRI.Reston.VA.US id aa24951; 23 Mar 94 21:30 EST
Received: from mocha.bunyip.com by CNRI.Reston.VA.US id aa13127; 23 Mar 94 21:29 EST
Received: by mocha.bunyip.com (5.65a/IDA-1.4.2b/CC-Guru-2b) id AA05871 on Wed, 23 Mar 94 12:58:18 -0500
Received: from dxmint.cern.ch by mocha.bunyip.com with SMTP (5.65a/IDA-1.4.2b/CC-Guru-2b) id AA05842 (mail destined for /usr/lib/sendmail -odq -oi -furi-request uri-out) on Wed, 23 Mar 94 12:56:06 -0500
Received: from ptpc00.cern.ch by dxmint.cern.ch (5.65/DEC-Ultrix/4.3) id AA14936; Wed, 23 Mar 1994 18:55:48 +0100
Received: by ptpc00.cern.ch (NX5.67d/NX3.0S) id AA14263; Wed, 23 Mar 94 18:58:05 +0100
Date: Wed, 23 Mar 1994 18:58:05 +0100
Sender: ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Tim Berners-Lee <timbl@ptpc00.cern.ch>
Message-Id: <9403231758.AA14263@ptpc00.cern.ch>
Received: by NeXT.Mailer (1.95)
To: "Mark P. McCahill" <mpm@boombox.micro.umn.edu>
Subject: Re: how to make progress on the URL document
Cc: mitra@pandora.sf.ca.us, uri@bunyip.com
Reply-To: timbl@www0.cern.ch

Mitra and Mark, you ask for diffs.  You're not going to like them
because the formatting messes them up quite a lot, but for what it's
worth, here they are.

Tim





diff url-spec.txt /pub/www/doc/draft-uri-url-02.txt
2,3c2,3
< draft-ietf-uri-url-03.{ps,txt}                URI working Group
< Expires 21 September 1994                         21 March 1994
---
> draft-ietf-uri-url-02.{ps,txt}                             CERN
> Expires 1 July 1994                                  1 Jan 1994
8,9c8,9
<                   A Syntax for the Expression of
<              Access Information of Objects on the Network
---
>              A Unifying Syntax for the Expression of
>           Names and Addresses of Objects on the Network
12,23c12
<                          ABOUT THIS DOCUMENT
<                                    

<    This document specifies a Uniform Resource Locator (URL), the
<    syntax and semantics of formalized information for location and
<    access of resources on the Internet.
<    

<    This document was written by the URI working group of the Internet
<    Engineering Task Force.  Comments may be addressed to the editor,
<    Tim Berners-Lee <timbl@info.cern.ch>, or to the URI-WG
<    <uri@bunyip.com>. Discussions of the group are archived at 

<    

<   <http://www.acl.lanl.gov/URI/archive/uri-archive.index.html>
---
> Status of this memo
25,41d13
<    This document is bound by the Requirements Specification in
<    preparation.
<    

<    The work is derived from concepts introduced by the World-Wide Web
<    global information initiative,  whose use of  such objects dates
<    from 1990 and is described in "Universal Resource Identifiers for
<    the World-Wide Web", RFCXXX.
<    

<    This document is available in hypertext form, with links to
<    background information, as: 

<    

<   <http://info.cern.ch/hypertext/WWW/Addressing/URL/Overview.html>
< 

<    .
<    

<   STATUS OF THIS MEMO
<   

53c25,29
<    Distribution of this document is unlimited. 

---
>    Distribution of this document is unlimited.  Please send comments
>    to the author as timbl@info.cern.ch or to the discussion list

>    ietf-url@merit.edu. 

>    

> Abstract  

54a31,53
>    Many protocols and systems for document search and retrieval are
>    currently in use, and many more protocols or refinements of
>    existing protocols are to be expected in a field whose expansion is
>    explosive.  

>    

>    These systems are aiming to achieve global search and readership of
>    documents across differing computing platforms, and despite a
>    plethora of protocols and data formats.   As protocols evolve,
>    gateways can allow global access to remain possible. As data
>    formats evolve, format conversion programs can preserve global
>    access. There is one area, however, in which it is impractical to
>    make conversions, and that is in the names and addresses used to
>    identify objects.  This is because names and addresses of objects
>    are passed on in so many ways, from the backs of envelopes to
>    hypertext objects, and may have a long life.
>    

>    A common feature of almost all the data models of past and proposed
>    systems is something which can be mapped onto a concept of "object"
>    and some kind of name, address, or identifier for that object.  One
>    can therefore define a set of name spaces in which these objects
>    can be said to exist.
>    

>    Practical systems need to access and mix objects which are part of
56a56
> 

58a59,467
>    different existing and proposed systems. 

>    

>    This paper discusses the requirements on a universal syntax which
>    can be used to encapsulate a name in any registered name space. 

>    This will allow names in different spaces to be treated in a common
>    way, even though names in different spaces have differing
>    characteristics, as do the objects to which they refer.
>    

>    The universal syntax applies to objects available using existing protocols,
>    and may be extended with technology.  It makes a recommendation for
>    a generic syntax, and for specific forms for "Uniform Resource
>    Locators" (URLs) of objects accessible using existing Internet
>    protocols.
>    

>    The syntax has been in widespread use by World-Wide Web software
>    since 1990.
>    

> Terms
> 

>    The objects on the network which are to be named and addressed
>    include typically objects which can be retrieved, and objects which
>    can be searched.  There is a great variety of other objects which
>    may support other operations. We imply nothing about the contents
>    of objects in this document. Whereas human-readable documents are
>    currently the center of interest of the field, we envisage all
>    aspects discussed in this paper applying to generalized objects
>    when systems to handle them become available. The "object" is the
>    unit of reference and need not correspond to any unit of storage.
>    We refer to objects which can be searched as "indexes".  We
>    emphasize that this is the abstract view of the client, and these
>    objects need not correspond to physical files on computers. We
>    refer to the person who does the retrieval or searching as the
>    user.  

>    

>    Within this document, we use the term "name" very generally for a
>    string of characters describing an object,  whatever its
>    combination of properties mentioned below.  (The term usually has a
>    narrower meaning but we needed some term for the universal set.)

>    This uniform syntax applied to a generic name is known as a Uniform
>    Resource Identifier (URI). The term "address" is reserved for a
>    string which specifies a more or less physical location.  The term
>    "locator" refers to a URL as here defined.  URIs which have a
>    greater persistence than URLs are referred to as URNs.
>    

> Characteristics  

> 

>    This section describes characteristics of various naming schemes,
>    requirements which some of the existing schemes meet, and
>    requirements for the URL scheme itself, as an introduction to and
>    background for the Recommendations section.

>    

>   USES OF NAMES AND ADDRESSES  

>   

> 

> 

> 

> Berners-Lee                                                          2
> 


>    A name allows a user, with the help of a "client" program, to
>    retrieve or operate on objects via a "server" program.  A name may
>    be passed for example:   

>    

>       In communication of any form between two people, to refer to a
>       document, or part of a document; 

>       

>       As part of the description of a link associated with a hypertext
>       document; 

>       

>       As part of the result of searching an index.   

>       

>    Some typical requirements on a name which are met to a varying
>    degree by various schemes are for example that the name is 

>    

>   Persistent              A given name will remain valid as long as it
>                          is needed;   

>                          

>   Extensible              A given naming syntax will remain valid
>                          through the introduction of new protocols and
>                          directory technologies; 

>                          

>   Resolvable              A name will contain enough information to
>                          allow the document or index to which it
>                          refers to be accessed, perhaps via resolution
>                          into an intermediate, more physical, name.  

>                          

>   Unique                  Each object can only have one such name. 

>                          The fact that two such names are different
>                          implies that the objects to which they refer
>                          are different (in some way).   

>                          

>   Unambiguous             The fact that two names are identical
>                          implies that the objects named are the same
>                          (in some way). 

>                          

>    The syntax discussed is the syntax of one name, be it a lasting
>    name or a physical address.  When a directory server or hypertext
>    link contains a set of alternative names, then that is beyond the
>    scope of this syntax.  Similarly, a syntax for describing a
>    compound object is outside the scope of this syntax.  The specific
>    locator name spaces (defined under the umbrella of the general
>    syntax) each meet the requirements above to a greater or lesser
>    extent. 

>    

>   CURRENT PRACTICE  

>   

>    Current protocols use many different standards for names. For some
>    protocols, such as ISO-10163 Search and Retrieve protocol[16], the
>    names returned in a search are only valid during the session. For
>    others, such as FTP[9], they are lasting names which may be used
>    for object retrieval at a later time.  Typically, however, they are
>    not long-lasting names which are independent of the location of the
> 

> 

> 

> Berners-Lee                                                          3
> 


>    object. Such names may be provided using directory servers such as
>    x.500. They will refer to the registration, however formal or
>    informal, of an object with a particular organisation or person.

>    Both hypertext and manual references rely on long-lasting names.

>    Current names are basically location specifiers (addresses). These
>    may be known as Uniform Resource Locators (URLs). They give the
>    necessary parts of an address for a reader to access an information
>    provider using the given protocol, and ask for the object required.
>    Examples of names used by various protocols include 

>    

>     File Transfer Protocol (Postel 1985):
>     

>       Host name or IP-address 

>       

>       [TCP port]   

>       

>       [user name, password]   

>       

>       Filename 

>       

>     W.A.I.S. (Kahle 1990)
>     

>       Host name or IP-address 

>       

>       [TCP port]   

>       

>       local document id 

>       

>     Gopher (Alberti 1991)
>     

>       Host name or IP-address   

>       

>       [TCP port] 

>       

>       database name 

>       

>       selector string 

>       

>     HTTP (Berners-Lee 1991)
>     

>       Host name or IP-address   

>       

>       [TCP port]   

>       

>       local object id 

>       

>     NNTP (Kantor 1986)
>     

>       NNTP  group
>       

>       Group name 

>       

>       NNTP article
> 

> 

> 

> Berners-Lee                                                          4
> 


>       Host name 

>       

>       unique message identifier 

>       

>     Prospero links (Neuman 1992)
>     

>       Host name or IP address   

>       

>       [UDP port]   

>       

>        Host specific object name   

>       

>       [version]   

>       

>       [identifier]* 

>       

>     x.500 distinguished name
>     

>       Country   

>       

>       Organisation   

>       

>       Organisational unit   

>       

>       Person   

>       

>       Local object identifier   

>       

>    Other systems with their own naming schemes include the BITNET
>    "LISTSERV" application, FTAM file retrieval, SQLnet(TM) remote
>    database search, proprietary  distributed file systems, etc.
>    Conventional syntax for writing these addresses involves various
>    forms of punctuation to separate these parts.  This sometimes,  but
>    not always, allows the naming scheme to be deduced from the
>    punctuation. For example, a name of the form
>    xxx.yyy.zz.edu:/pub.aa.bb.cc often implies anonymous FTP access.
>    However, there is no well-defined algorithm for parsing an
>    arbitrary name, as there is no common syntax. 

>    

>   EXPANDABILITY  

>   

>    There will necessarily be a phase during which lasting names will
>    become more  common, as the deployment of directory services
>    increases to the point where  every user has direct or indirect
>    access to one.  Even then, however, one can envisage more than one
>    competing directory system, and cases in which physical  names are
>    still required.  A directory service takes a lasting name and
>    reduces it  to a physical address (or set of addresses) which,
>    though less useful for lasting reference, is the only way to
>    actually retrieve the object. An addressing syntax is required
>    which will be able to encompass existing  physical address spaces,
>    and be extendible to any future protocols.  This  requires that it
>    contain an identifier for the protocol in use. The format of the
> 

> 

> 

> Berners-Lee                                                          5
> 


>    rest  of the address will necessarily depend to a certain extent on
>    the protocol. 

>    

>   RELEVANCE  

>   

>    The life of a name is limited by any information contained within
>    it which may  become prematurely invalid. It is therefore necessary
>    to limit the contents of a name to the information required for the
>    operations above.  Other extraneous information about the object
>    (its size, data format, authorisation details, etc.) may in general
>    change with time and should not be part of the name.  One might
>    expect such information to be part of the "header" of an object, and
>    for protocols to allow the header information to be retrieved
>    independently of the objects themselves.  Any physical address may
>    be subject to change with time: hence we encourage the move to
>    lasting names and directory services. 

>    

>   UNIQUENESS  

>   

>    Clearly one requires unambiguous names in the sense that one name
>    should refer to only one logical object. This is the case with all
>    the addressing schemes in use, whether they are directory systems
>    or physical addresses. (The internet addresses all rely on the
>    domain name (Mockapetris 1987) of the host to achieve this).
>    However, given that names can be translated, many apparently
>    different names  may lead to the same object. Any object may
>    therefore be referred to by many  names. One needs to be able to
>    know whether two objects, retrieved through  different paths, are
>    in fact the same object.  It is suggested that each object have a
>    unique "official" name. This name could be stored in the object in
>    some representations, or stored in a database  accessible to the
>    server, for example.  Any references within that object should be
>    parsed in the context of the official name.  In the presence of a 

>    directory service, the official name will normally be the
>    registered name of the object. However, a name in any scheme will
>    do, so long as it is completely specified. On systems which do not
>    allow the name to be stored (such as anonymous FTP archive sites),
>    a possible ambiguity will always exist as to whether two similarly
>    named objects are in fact the same.  Note that Internet newsgroup
>    names are unique world-wide, and news articles carry a unique
>    message id. In most other cases, however, there is no guarantee
>    that dereferencing a URL will work, or that if it does the object
>    it refers to will in fact be the object intended.  URLs such as FTP
>    addresses are transient in that files may be moved and even
>    replaced by different files of the same name.  This disorganisation
>    may be limited by good server management, but a naming scheme which
>    is independent also of internet host name is obviously preferable. 

>    

>   READABILITY BY PEOPLE  

>   

>    This requirement has been put forward by several people (Clifford
>    Lynch, Douglas Engelbart among others), and disputed by others. 

>    The author's view is that it will be a while before technology and
> 

> 

> 

> Berners-Lee                                                          6
> 


>    standardisation have reached the point at which names and addresses
>    will be hidden from human beings. As long as they must be written
>    on the backs of envelopes and "cut and pasted" between workstation
>    windows, there is a strong need for names to be   

>    

>       Short   

>       

>        Composed of printable (preferably non-white) characters   

>       

>       To a certain extent, understandable by a human being.

>       

>   STRUCTURE OF NAMES AND ADDRESSES
>   

>    A physical address is required in order for: 

>    

>       The user's program to contact the server; 

>       

>       The server to perform the operation (e.g. search and index,
>       retrieve an object, or look up the name) and return a result;

>       

>       The user's program to locate an individual position or element
>       within a returned object.  

>       

>    This suggests that a name be structured, such that the parts
>    necessary for these  three operations be separate and only used by
>    those system elements which need  those parts. This corresponds to
>    the basic principle of information hiding.  In fact,  four parts
>    are necessary, including the indicator of the naming scheme to be
>    used: 

>    

>       The naming scheme: a registered identifier for the protocol.   

>       

>       The name of a suitable server. The format of this part must be
>       well defined. It will depend on the lower-layer protocols in
>       use.  Systems which use widely distributed information, such as
>       x.500 and NNTP, do not need this part as each client generally
>       contacts his nearest server (or a particular server). 

>       

>       Information to be passed to the server. This may be private to
>       the server, as all names may be generated and used by the same
>       server. This part of the name should be opaque to the client. 

>       

>       Information to be used by the application once the object has
>       been retrieved. This part is private to the application (or,
>       more strictly, the data format) and so cannot be defined here. 

>       

>    Both lasting names and physical addresses often share a
>    hierarchical structure. This follows often from the organisation of
>    the system. From the naming point of view, it has the advantage
>    that a reference in one object to another object need not include
>    that part of the structure which is common to both names. 

>    

>   CHOICES FOR A UNIVERSAL SYNTAX 

> 

> 

> 

> Berners-Lee                                                          7
> 


>    The requirements above leave little room for choice save for the
>    order and punctuation of the elements of an address.  It is only
>    reasonable for the order of writing of the parts to be consistently
>    from left to right (or right to left) with increasing specificity. 

>    Punctuation schemes fall into two categories (Huitema 1991): tagged
>    schemes in which fields are given names, and schemes which use
>    special characters and field order. The latter tend to be more
>    compact schemes. 

>    

> 

>         protocol: aftp host: xxx.yyy.edu path: /pub/doc/README
> 

>         PR=aftp; H=xx.yy.edu; PA=/pub/doc/README;
> 

>         PR:aftp/xx.yy.edu/pub/doc/README
>   

>         /aftp/xx.yy.edu/pub/doc/README
> 

>    Fig 1. Some alternative tagged and untagged representations 

>    

>    The choice of special symbols for punctuation tends to be a matter
>    of taste. It is easier to read  addresses whose symbols correspond
>    to those of one's favourite operating system. A variety of symbols
>    is needed so that when a name is abbreviated it is possible to tell
>    which parts have been omitted. 

>    

>    The  recommendation below uses special characters in order to
>    achieve a compact name, and uses where possible punctuation symbols
>    established in the internet or unix community.
>    

>    The choice of escape character for introducing representations of
>    non-allowed characters also tends to be a matter of taste. An ANSI
>    standard exists in the C language, using the back-slash character
>    "\". The use of this character on unix command lines, however, can
>    be a problem as it is interpreted by many shell programs, and would
>    have itself to be escaped. 

>    

>    There is a conflict between the need to be able to represent many
>    characters including spaces within a URL directly, and the need to
>    be able to use a URL in environments which have limited character
>    sets or in which certain characters are prone to corruption. This
>    conflict has been resolved by use of a hexadecimal escaping method
>    which may be applied to any characters forbidden in a given
>    context. When URLs are moved between contexts, the set of
>    characters escaped may be enlarged or reduced unambiguously.
>    

>    The use of multiple white space characters is discouraged  in URLs
>    to be printed or sent by electronic mail.  This is because of the
>    frequent introduction of extraneous white space when lines are
>    wrapped by systems such as mail, or sheer necessity of narrow
>    column width, and because of the  inter-conversion of various forms
> 

> 

> 

> Berners-Lee                                                          8
> 


>    of white space which occurs during character code conversion and
>    the transfer of text between applications.
>    

72c481
<   URL SYNTAX  

---
>   FULL FORM  

82,90c491,492
<     PrePrefix
<     

<    To be a Uniform Resource Locator as currently defined by the URI
<    working group, the whole string must start with a constant prefix
<    "URL:". Note that to save space in this document, URLs have been
<    quoted throughout without this preprefix. 

<    

<     Scheme  

<     

---
>   SCHEME  

>   

97,99c499,501
<    Those schemes which refer to internet protocols mostly have a
<    common syntax for the rest of the object name. This starts with a
<    double slash "//" to indicate its presence, and continues until the
---
>    Those schemes which refer to internet protocols have a common
>    syntax for the rest of the object name. This starts with a double
>    slash "//" to indicate its presence, and continues until the
112,116d513
< 

< 

< 

< Berners-Lee                                                          2
< 


121c518,522
<                          

---
> 

> 

> 

> Berners-Lee                                                          9
> 


156c557
<    the syntax shall not be used unencoded in a URL. 

---
>    the syntax shall not be used in a URL. 

162,167c563,566
<    awkward in a given environment.  Because a % sign always indicates
<    an encoded character, a URL may be made safer simply by encoding
<    any characters considered unsafe, while leaving already encoded
<    characters still encoded.  Similarly, in cases where a larger set
<    of characters is acceptable, % signs can be selectively and
<    reversibly expanded.
---
>    awkward in a given environment.  As a % sign always indicates an
>    encoded character, a URL may be made safer simply by encoding any
>    characters considered unsafe, while leaving already encoded
>    characters still encoded.  
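
The "make safer" rule in the hunk above can be sketched in a few lines of
Python. This is only an illustration, not part of either draft: the
function name make_safer and the caller-supplied set of unsafe characters
are assumptions, and the sketch assumes ASCII input so that each character
becomes exactly one %XX escape.

```python
def make_safer(url, unsafe):
    """Return url with every character in `unsafe` written as %XX.

    "%" itself is never re-encoded: per the draft, a "%" always
    introduces an existing escape, so earlier escapes survive unchanged.
    """
    out = []
    for ch in url:
        if ch in unsafe and ch != "%":
            # Assumption: ASCII input, so one character maps to one escape.
            out.append("%{:02X}".format(ord(ch)))
        else:
            out.append(ch)
    return "".join(out)
```

Because "%" is skipped, applying the function a second time with the same
or a larger unsafe set does not double-encode anything.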

170,174d568
< 

< 

< 

< Berners-Lee                                                          3
< 


176c570
<    hexadecimal or base 64 would be more appropriate.) 

---
>    hex or base 64 would be more appropriate.)  

177a572,574
>    The same considerations apply to mapping local fragment identifiers
>    onto the fragmentid part of a URL.
>    

179a577,580
> 

> 

> Berners-Lee                                                          10
> 


182c583
<    protocols follow. The schemes covered are 

---
>    protocols follow. 

184,208c585,593
<   http                    Hypertext Transfer Protocol 

<                          

<   ftp                     File Transfer protocol 

<                          

<   gopher                  The Gopher protocol 

<                          

<   mailto                  Electronic mail address 

<                          

<   mid                     Message identifiers for electronic mail

<                          

<   cid                     Content identifiers for MIME body part 

<                          

<   news                    Usenet news 

<                          

<   nntp                    Usenet news for local NNTP access only 

<                          

<   prospero                Access using the prospero protocols 

<                          

<   telnet , rlogin and tn3270 

<                           Reference to interactive sessions 

<                          

<   wais                    Wide Area Information Servers 

<                          

<    The schemes for x.500, network management database and whois++ have
<    not been specified and may be the subject of further study.
---
>   HTTP  

>   

>    The HTTP protocol specifies that the path is handled transparently
>    by those who handle URLs, except for the servers which de-reference
>    them.   The path is passed by the client to the server with any
>    request, but is not otherwise understood by the client.  The
>    fragmentid part is not sent with the request.  The search part, if
>    present, is sent. Spaces in URLs should be escaped for transmission
>    in HTTP. 
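
As a rough illustration of the HTTP paragraph above (path and search part
are sent, the fragmentid is not, spaces are escaped), here is a minimal
Python sketch. The helper name request_target is an assumption, and
urllib.parse.urlsplit from the modern standard library stands in for
whatever parser a 1994 client would have used.

```python
from urllib.parse import urlsplit

def request_target(url):
    """Return the part of an HTTP URL that a client would send to the server."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        # The search part, if present, travels with the request.
        target += "?" + parts.query
    # parts.fragment is deliberately dropped: it stays with the client.
    return target.replace(" ", "%20")
```

For example, applied to
http://info.cern.ch/hypertext/WWW/Addressing/URL/Overview.html#intro
the sketch keeps only the path and discards "#intro".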

210,214d594
<    The url: prefix is reserved for use in encoding a Uniform Resource
<    Name when that has been developed by the IETF working group.
<    

<    New schemes may be registered at a later time.
<    

218,223c598,603
<    file system of the given host. The FTP protocol is used, as defined
<    in RFC 959 or any successor. The port number, if present, gives the
<    port of the FTP server if not the FTP default. (A client may in
<    practice use local file access to retrieve objects which are
<    available through more efficient means such as local file open or
<    NFS mounting, where this is available and equivalent). 

---
>    file system of the given host. The FTP protocol is used. The port
>    number if given gives the port of the FTP server if not the FTP
>    default. (A client may in practice use local file access to
>    retrieve objects which are available through more efficient means
>    such as local file open or NFS mounting, where this is available
>    and equivalent). 

225,232c605
<       User name and password
<       

<    The syntax allows for the inclusion of a user name and even a
< 

< 

< 

< Berners-Lee                                                          4
< 


---
>     The syntax allows for the inclusion of a user name and even a
236,237c609
<    is "anonymous" and the password the user's Internet-style mail
<    address.
---
>    is "anonymous" and the password the user's mail address. 

239,242c611,620
<    Where possible, this mail address should correspond to a usable
<    mail address for the user, and preferably give a DNS host name
<    which resolves to the IP address of the client. Note that servers
<    currently vary in their treatment of the anonymous password.  

---
>    The adoption of a unix-style syntax involves the conversion into
>    non-unix local forms by either the client or server. Some non-unix
>    servers do this, but clients wishing to access sites which do not
>    have unix-style naming will need certain algorithms to enable 

>    other file systems to be identified and treated.  Client software
>    may also have to be flexible in terms of the sequence of FTP
>    commands used with different varieties of server.  In view of a
>    tendency for file systems to look increasingly similar, it was felt
>    that the URL convention should not be weighed down by extra
>    mechanisms for identifying these cases. 

244,296d621
<       Path
<       

<    The FTP protocol allows for a sequence of CWD commands (change
<    working directory) prior to a RETR (retrieve) which actually
<    accesses a file.  The arguments of any CWD commands are successive
<    segment parts of the URL, and the filename argument to the RETR
<    command is the final segment of the URL path. 

<    

<         Note
<         

<    In the case in which the file system of the server is known or
<    guessed by the client, the path may possibly be converted into a
<    filename.  This may (in some cases)  allow the file to be retrieved
<    in one RETR command with no CWD command. In the case of unix, the
<    filename will in fact look the same as the URI path.  This must NOT
<    be taken to indicate that the URL is a unix filename.   In
<    practice, as many FTP servers in fact have or emulate unix file
<    systems, it may in fact be time-efficient to attempt first a direct
<    retrieval guessing unix syntax, and, if that fails, to attempt the
<    official sequence of successive directory changes followed by a
<    RETR command.
<    

<    There is no common hierarchical model to the FTP protocol, so if a
<    directory change command has been given, it is impossible in
<    general to deduce what sequence should be given to navigate to
<    another directory for a second retrieval, if the paths are
<    different.  The only reliable algorithm is to disconnect and
<    reestablish the control connection.  However, if no directory
<    changes have been made, but direct retrieval has been done, then
<    the control connection may be kept.  Another possible
<    uninvestigated method is to use CDUP on the trial assumption of a
<    hierarchical structure to return to a point in common between the
<    first and second URLs.
<    

<    (This note previously read:  "The adoption of a unix-style syntax
<    involves the conversion into non-unix local forms by either the
<    client or server. Some non-unix servers do this, but clients
<    wishing to access sites which do not have unix-style naming will
<    need certain algorithms to enable other file systems to be
<    identified and treated.  Client software may also have to be
<    flexible in terms of the sequence of FTP commands used with
<    different varieties of server. In view of a tendency for file
< 

< 

< 

< Berners-Lee                                                          5
< 


<    systems to look increasingly similar, it was felt that the URL
<    convention should not be weighed down by extra mechanisms for
<    identifying these cases." ) 

<    

<       Data type
<       
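
The Path text removed in the hunk above maps each URL path segment to a
CWD command and the final segment to RETR. A minimal Python sketch of that
mapping follows; the function name ftp_commands is an assumption, and
decoding %XX escapes per segment before issuing the command is also an
assumption rather than something the quoted text states.

```python
from urllib.parse import unquote

def ftp_commands(path):
    """Turn a URL path into the "official" CWD ... RETR command sequence."""
    if path.startswith("/"):
        path = path[1:]
    # Assumption: each segment is unescaped on its own, so an encoded "/"
    # inside a segment does not become a directory separator.
    segments = [unquote(s) for s in path.split("/")]
    *dirs, filename = segments
    return ["CWD " + d for d in dirs] + ["RETR " + filename]
```

With the path /pub/doc/README from Fig 1 this gives two CWD commands
followed by a single RETR.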

303c628
<    but it is outside the scope of this paper. 

---
>    but it outside the scope of this paper. 

305,328c630
<    An FTP URL may specify the method by which an object is to be
<    retrieved.  Two of the modes correspond to the FTP "Data Types"
<    ASCII and IMAGE for the retrieval of a document, as specified in
<    FTP by the TYPE command.  One mode indicates directory access.
<    

<    The data type is specified by a suffix to the URL separated by an
<    unencoded exclamation mark (ASCII 21 hex).  Possible suffixes are: 

<    

<   !I                     Use FTP image (I) mode to perform data
<                          transfer. 

<                          

<   !A                     Use FTP ASCII (A) mode to perform data
<                          transfer 

<                          

<   !D                     Use FTP directory list commands to read
<                          directory 

<                          

<    [suggestion: tenex. reference?] 

<    
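The "!" suffix rule above can be sketched in a few lines of Python. This is only an illustration, not part of the draft: the function name split_ftp_type and the FTP_TYPES table are hypothetical, and the path is assumed to be still in its encoded form, so that an encoded "%21" is never mistaken for the separator.

```python
# Hypothetical helper: names are illustrative, not taken from the draft.
FTP_TYPES = {"I": "image", "A": "ascii", "D": "directory"}


def split_ftp_type(encoded_path):
    """Split a still-encoded FTP URL path into (path, data type or None).

    Only an unencoded "!" followed by I, A or D counts as a type suffix;
    an exclamation mark written as %21 stays part of the path.
    """
    head, sep, suffix = encoded_path.rpartition("!")
    if sep and suffix in FTP_TYPES:
        return head, FTP_TYPES[suffix]
    return encoded_path, None


print(split_ftp_type("/pub/www/doc/url-spec.txt!A"))
print(split_ftp_type("/pub/file%21I"))
```

A client would then issue the TYPE command (or a directory list command for "directory") before the retrieval.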

<       Transfer Mode
<       

<    Stream Mode is always used.
<    

<   HTTP  

---
>   NEWS  

330,343c632,633
<    The HTTP protocol specifies that the path is handled transparently
<    by those who handle URLs, except for the servers which de-reference
<    them.   The path is passed by the client to the server with any
<    request, but is not otherwise understood by the client.  The
<    fragmentid part is not sent with the request.  The search part, if
<    present, is sent. Spaces and control characters in URLs must be
<    escaped for transmission in HTTP.
<    

<   GOPHER
<   

<    Gopher selector strings may contain any characters other than tab,
<    return, or  linefeed, so it is important to encode all disallowed
<    characters and encode any  space characters so these characters are
<    not altered during transport of the  URL. Note that since gopher
---
>    The news locators refer to either news group names or article
>    message identifiers which must conform to the rules of RFC 850.  A
347c637
< Berners-Lee                                                          6
---
> Berners-Lee                                                          11
349,357c639,642
<    selector strings are opaque and in many cases map to the native file
<    system of the gopher server, so encoding of disallowed characters 

<    in the selector string is to map to binary codes rather than ISO
<    character  sets. In other words, the "%" character followed by two
<    hexadecimal digits is  used to encode binary data. Clients shall
<    not interpret gopher selector strings. While many Gopher servers
<    map to Unix file systems, you cannot assume that "/"  characters
<    imply a hierarchy since Gopher servers on non-Unix file systems may
<     use the "/" as part of a file name. 

---
>    message identifier may be distinguished from a news group name by
>    the presence of the commercial at "@" character. These rules imply
>    that within an article, a reference to a news group or to another
>    article will be a valid URL (in the partial form). 

359,361c644,645
<  

< 

<    The format of a gopher URL is: 

---
>    A news URL may be dereferenced using NNTP or using any other
>    protocol for the conveyance of usenet news articles. 

363,508c647
<       1. A single-character field to denote the Gopher type of the
<       resource to which the URL refers. 

<       

<       2. The gopher selector string.  Note that some gopher selector
<       strings begin with a copy of the gopher type character, in which
<       case that character will occur twice consecutively. Also note
<       that the gopher selector string may be an empty string since
<       this is how  gopher clients refer to the top-level directory on
<       a gopher server. 

<       

<       3. An encoded tab character (%09) to separate the gopher
<       selector string from the optional search string (see 4 below).  

<       

<       4. If the URL does not refer to a Gopher+ item and if there is
<       no gopher search  string then parts 3, 4, 5, and 6 of the URL
<       are optional  

<       

<       4.) The gopher search string.  If the URL refers to a search to
<       be submitted to a gopher search engine, the  search string is
<       required. Otherwise this is an empty string. 

<       

<       5.) A question mark  [suggestion: an encoded tab character
<       (%09)] to separate the gopher search string from the optional
<       gopher+ string (see 6 below). [suggestion: Note that if the URL
<       refers to a  gopher+ item and does not have a gopher search
<       string, there will be two  encoded tab characters in a row.] 

<       

<       6.) The Gopher+ string. Gopher+ strings consist of one or more
<       characters and are used to represent information required for
<       retrieval  of the Gopher+ item. Gopher+ items may have alternate
<       views, arbitrary sets  of attributes, and may have electronic
<       forms associated with them. To  accommodate the various Gopher+
<       objects, the Gopher+ string in the URL must  accommodate a
<       mapping of the information a Gopher+ client sends to the server.
<       This makes this section a bit long since we basically cover the
<       entire Gopher+ protocol here. 

<       

<    When a Gopher server returns a directory listing to a client,
<    Gopher+ items are tagged with either a "+" (denoting gopher+ items)
< 

< 

< 

< Berners-Lee                                                          7
< 


<    or a "?" (denoting items  which have a +ASK form associated with
<    them). A Gopher+ string which is only a  "+" refers to the default
<    view (data representation) of the item. To retrieve  this item a
<    gopher+ client should send 

<    

<        a_gopher_selector<tab>+<cr><lf>
< 

<    to the gopher+ server.
<    

<    Note that items which have a +ASK associated with them (i.e.
<    Gopher+ items  tagged with a "?") require the client to fetch the
<    item's +ASK attribute to  get the form definition, and then ask the
<    user to fill out the form and return  the user's responses along
<    with the selector string to retrieve the item.  Gopher+ clients
<    know how to do this but depend on the "?" tag in the gopher+  item
<    description to know when to handle this case. The "?" is used in
<    the Gopher+ string to be consistent with Gopher+ protocol's use of
<    this symbol.
<    

<    To refer to the Gopher+ attributes of an item, the Gopher+ string
<    might consist of "!" or "$". "!" refers to all of a gopher+
<    item's attributes. "$" refers to all the item attributes for all
<    items in a Gopher directory. To retrieve an item or directory's
<    attributes, a gopher client will send: 

<    

<        a_gopher_selector<tab>!<cr><lf>
< 

<    for items or 

<    

<        a_gopher_selector<tab>$<cr><lf>
< 

<    for directories to the gopher+ server.
<    

<    To refer to specific attributes, the Gopher+ string is
<    "!attribute_name" or "$attribute_name". For example, to refer to
<    the attribute containing the  abstract of an item, the Gopher+
<    string would be "!+ABSTRACT". To refer to  several attributes,
<    clients send the server the attribute names separated by spaces so
<    it is necessary to separate the attribute names with coded spaces.
<    To retrieve a collection of item attributes specified with a
<    gopher+ string of "!+ABSTRACT%20+SMELL" a gopher client would send 

<    

<        a_gopher_selector<tab>!+ABSTRACT +SMELL<cr><lf>
< 

<    to the gopher server.
<    

<    Gopher+ allows for optional alternate data representations
<    (alternate views) of items. To retrieve a Gopher+ alternate view,
<    the gopher+ client sends the appropriate view and language
<    identifier (found in the item's +VIEW attribute). To refer to a
<    specific Gopher+ alternate view, the URL's Gopher+ string would be
<    in the form "+view_name%20language_name". For example, a gopher+
<    string of "+application/postscript%20Es_ES" refers to the Spanish
< 

< 

< 

< Berners-Lee                                                          8
< 


<    language postscript alternate view of a gopher+ item. To retrieve
<    this alternate view the client would send 

<    

<        a_gopher_selector<tab>+application/postscript Es_ES<cr><lf>
< 

<    to the gopher server.
<    

<    The gopher+ string for a URL that refers to an item referenced by
<    an ASK form  filled out with specific values is essentially a coded
<    version of what the  client sends to the server. The gopher+ string
<    will be of the form  

<    

<   +%091%0D%0A+-1%0D%0Aask_item1_value%0D%0Aask_item2_value%0D%0A.%0D%0A

< 

<    To retrieve this item, the gopher client sends: 

<    

<        a_gopher_selector<tab>+<tab>1<cr><lf>
<        +-1<cr><lf>
<        ask_item1_value<cr><lf>
<        ask_item2_value<cr><lf>
<        .<cr><lf>
< 

<    to the gopher server.
<    

<    For a really complex example, consider a URL that refers to an
<    alternate view of an item that is referenced with a filled-out
<    Gopher +ASK form. The  gopher+ string will be of the form:  

<    

<    

<     +view_name%20language_name%091%0D%0A+-1%0D%0Aask_item1_value%0D%0A
<     ask_item2_value%0D%0A.%0D%0A 

< 

<    To retrieve this item, the gopher client sends: 

<    

<        a_gopher_selector<tab>+view_name language_name<tab>1<cr><lf>
<        +-1<cr><lf>
<        ask_item1_value<cr><lf>
<        ask_item2_value<cr><lf>
<        .<cr><lf>
< 

<    to the gopher server. 

<    
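The mapping from the encoded Gopher+ string to what the client sends can be sketched as below. This is a minimal illustration under assumptions: gopher_plus_request is a hypothetical name, the selector and Gopher+ string are assumed to have been split out of the URL already, and decoding goes to raw bytes because the text says the escapes stand for binary codes rather than a character set.

```python
from urllib.parse import unquote_to_bytes


def gopher_plus_request(selector, gplus_string):
    """Return the bytes a client would send for an encoded selector and
    an encoded Gopher+ string (hypothetical helper, not from the draft)."""
    request = unquote_to_bytes(selector)
    if gplus_string:
        # A tab separates the selector from the Gopher+ part on the wire.
        request += b"\t" + unquote_to_bytes(gplus_string)
    # ASK-form strings already end in an encoded CR LF; others need one.
    if not request.endswith(b"\r\n"):
        request += b"\r\n"
    return request


print(gopher_plus_request("a_gopher_selector",
                          "+application/postscript%20Es_ES"))
```

The same function handles the filled-out ASK form, since the encoded tab, CR and LF characters in the Gopher+ string decode to the multi-line request shown above.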

<     Summary: gopher+ string part of Gopher URL
---
>     Note1: 

510,621c649
< 

< 

<    To refer to an item which has an ASK form associated with it where
<    the  intent is to allow the user to enter values into the form as
<    part of the  retrieval process: 

<    

<    %3F [was: ?]  

< 

< 

< 

< 

< Berners-Lee                                                          9
< 


<    To refer to all or specific attributes of a gopher item: 

<    

<    ![attribute_name][%20attribute_name][%20attribute_name]...
< 

< 

<    To refer to all or specific attributes of a gopher directory: 

<    

<    $[attribute_name][%20attribute_name][%20attribute_name]...
< 

< 

<    To refer to the content of a gopher+ item (including an item
<    referred to by specific values in a filled-out ASK form): 

<    

<    +[view_name[%20language_name]]
<     [%091%0D%0A+-1%0D%0Aask_item1_value%0D%0Aask_item2_value...%0D%0A.%0D%0A]
< 

< 

< 

<     Overall summary and examples
<     

< 

<    The general format of a Gopher URL path referring to a gopher  type
<    "T" item is: 

<    

<   gopher://host [port]/T[gopher_selector]%09[search_string]?[gopher+_string]
< 

< 

<       Examples:
<       

<    An example of a URL pointing to a gopher type 0 item (a document)
<    is: 

<    

<   gopher://host [port]/0a_gopher_selector
< 

< 

<    An example of a URL pointing to a gopher type 7 item (a search
<    engine) where the string foobar is to be submitted to the search
<    engine is: 

<    

<   gopher://host [port]/7a_gopher_selector%09foobar
< 

< 

<    An example of a URL pointing to a Gopher+ type 0 item (a document)
<    is: 

<    

<   gopher://host [port]/0a_gopher_selector%09%09some_gplus_stuff
< 

< 

<    An example of a URL pointing to a Gopher+ type 0 (document) item's
<    attribute  information is: 

<    

< 

< 

< 

< Berners-Lee                                                          10
< 


<   gopher://host [port]/0a_gopher_selector%09%09!
< 

< 

<    An example of a URL pointing to a Gopher+ document's Spanish
<    postscript representation is: 

<    

<   gopher://host [port]/0a_gopher_selector%09%09+application/postscript%20Es_ES
< 

<    .
<    

<   MAILTO
<   

<    This allows a URL to specify an RFC822 addr-spec mail address. 

<    Note that use of % , for example as used in forming a gatewayed
<    mail address, requires conversion to %25 in a URL.
<    
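As a small illustration of the "%" rule, the following Python sketch builds a mailto URL from an addr-spec. The function name mailto_url and the example address are made up for the illustration; the standard library's quote() is used because it always escapes "%" unless told otherwise.

```python
from urllib.parse import quote, unquote


def mailto_url(addr_spec):
    """Build a mailto URL, escaping "%" (and other unsafe characters).

    "@" is kept literal so the address stays readable; this choice of
    safe characters is an assumption, not something the draft fixes.
    """
    return "mailto:" + quote(addr_spec, safe="@")


url = mailto_url("user%host@gateway.example")
print(url)                             # the "%" appears as %25
print(unquote(url[len("mailto:"):]))   # decoding restores the address
```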

<    This semantics may be considered to be that the object referred to
<    by the mailto: URL is the set of messages sent to or from that
<    address. There is no algorithm to retrieve this set, but the SMTP
<    protocol allows messages to be added to it, and any given user may
<    be aware of a subset of its members.
<    

<   NEWS
<   

<    The news locators refer to either news group names or article
<    message identifiers which must conform to the rules for a
<    Message-Id of RFC 1036 (Horton 1987).  A message identifier may be
<    distinguished from a news group name by the presence of the
<    commercial at "@" character. These rules imply that within an
<    article, a reference to a news group or to another article will be
<    a valid URL (in the partial form). 

<    

<    A news URL may be dereferenced using NNTP (RFC977, Kantor 86)  (The
<    ARTICLE by message-id command ) or using any other protocol for the
<    conveyance of usenet news articles, or by reference to a body of
<    news articles already received. 

<    
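The "@" test described above is simple enough to show directly. In this sketch the function name classify_news_locator and the group name are illustrative; the locator may be given with or without the "news:" prefix, the latter corresponding to the partial form used inside articles.

```python
def classify_news_locator(locator):
    """Return "message-id" or "newsgroup" for a news locator.

    Hypothetical helper: a locator containing the commercial at "@"
    is taken to be a message identifier, anything else a group name.
    """
    body = locator[5:] if locator.lower().startswith("news:") else locator
    return "message-id" if "@" in body else "newsgroup"


print(classify_news_locator("news:comp.infosystems.www"))
print(classify_news_locator("news:9403231758.AA14263@ptpc00.cern.ch"))
```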

<       Note1: 

<       

<    Among URLs the "news" URLs are anomalous in that they are
---
>    Among URLs the news: URLs are anomalous in that they are
629,630c657,658
<       Note 2:
<       

---
>     Note 2:
>     

634,638d661
< 

< 

< 

< Berners-Lee                                                          11
< 


641,643c664,666
<    Suggested subject of study in conjunction with NNTP working group. 

<    Further extension possible may be to allow the naming of subject
<    threads as addressable objects.
---
>    Suggested subject of study in conjunction with NNTP WG.  Further
>    extension possible may be to allow the naming of subject threads as
>    addressable objects. 

645,646c668,669
<     NNTP
<     

---
>   NNTP
>   

650,651c673
<    message identifier.  In all other cases the "news" scheme should be
<    used.
---
>    message identifier.
655d676
<    The NNTP protocol must be used. 

657,661c678,684
<       Note1.
<       

<    This form of URL is not of global accessibility, as typically NNTP
<    servers only allow access from local clients.   Note that the
<    article numbers within groups vary from server to server.
---
>     Note1.
>     

>    This form of URL is not of global accessiablity, as typically NNTP
>    servers only allow access from local clients.  This form or URL
>    should not be quoted outside this local area.  It should not be
>    used within news articles for wider circulation than the one
>    server. 

663,668c686,699
<    This form of URL should not be quoted outside this local area.  It
<    should not be used within news articles for wider circulation than
<    the one server.  This is a local identifier for a resource which is
<    often available globally, and so is not recommended except in the
<    case in which incomplete NNTP implementations on the local server
<    force its adoption.
---
>   WAIS  

>   

>    The current WAIS implementation public domain requires that a
>    client know the "type" of a object prior to retrieval. This value
>    is returned along with the internal object identifier in the search
>    response. It has been encoded into the path part of the URL in
> 

> 

> 

> Berners-Lee                                                          12
> 


>    order to make the URL sufficient for the retrieval of the object.
>    Within the WAIS world, names do not of course not need to be
>    prefixed by "wais:"  (by the partial form rules). 

679c710
<    version number. If present, the version number is separated from
---
>    version number. If present, the version number is seperated from
681c712
<    zero zero), this being an escaped string terminator (null).
---
>    zero zero), this being an escaped string terminator (null). 

683c714
<    access method and are not represented as Prospero URLs.
---
>    access method and are not represented as Prospero URLs. 

684a716,740
>   GOPHER  

>   

>    The first character of the URL path part (after the initial single
>    slash) is a single-character "type" field which is that used by the
>    Gopher protocol.  The rest of the path is the "selector string",
>    with disallowed characters encoded. Note that some selector strings
>    begin with a copy of the gopher type character, in which case that
>    character will occur twice consecutively in the URL. If the type
>    character and selector are omitted, the type defaults to "1".
>    Gopher links which refer to non-Gopher protocols are represented
>    directly as URLs of the underlying access method and are not
>    represented as Gopher URLs. 

>    

>   MAILTO
>   

>    This allows a URL to specify an RFC822 addr-spec mail address. 

>    Note that use of % , for example as used in forming a gatewayed
>    mail address, requires conversion to %25 in a URL.
>    

>    This semantics may be considered to be that the object referred to
>    by the mailto: URL is the set of messages sent to or from that
>    address. There is no algorithm to retrieve this set, but the SMTP
>    protocol allows messages to be added to it, and any given user may
>    be aware of a subset of its members. 

>    

691a748,749
>    this is a less desirable, though currently common, solution. 

>    

695c753
< Berners-Lee                                                          12
---
> Berners-Lee                                                          13
697c755,762
<    this is a less desirable, though currently common, solution.
---
>   X500  

>   

>    The mapping of x500 names onto URLs is not defined here. A decision
>    is required as to whether "distinguished names" or "user friendly
>    names" (ufn), or both, should be allowed. If any punctuation
>    conversions are needed from the adopted x500 representation (such
>    as the use of slashes between parts of a ufn) they must be defined.
>    This is a subject for study. 

699c764
<   WAIS  

---
>   WHOIS  

701,707c766,770
<    The current WAIS implementation public domain requires that a
<    client know the "type" of a object prior to retrieval. This value
<    is returned along with the internal object identifier in the search
<    response. It has been encoded into the path part of the URL in
<    order to make the URL sufficient for the retrieval of the object.
<    Within the WAIS world, names do not of course need to be prefixed
<    by "wais:"  (by the partial form rules).
---
>    This prefix describes the access using the "whois++" scheme in the
>    process of definition. The host name part is the same as for other
>    IP based schemes. The path part can be either a whois handle for a
>    whois object, or it can be a valid whois query string. This is a
>    subject for further study. 

708a772,775
>   NETWORK MANAGEMENT DATABASE  

>   

>    This is a subject for study. 

>    

712,715c779,785
<    conforming URL syntax, using a new prefix. Experimental prefixes
<    may be used by mutual agreement between parties, and must start
<    with the characters "x-".  The scheme name "urn:" is reserved for
<    the work in progress on a scheme for more persistent names.  

---
>    conforming URL syntax, using a new scheme identifier. Experimental
>    scheme identifiers may be used by mutual agreement between parties,
>    and must start with the characters "x-".  The scheme name "urn:" is
>    reserved for the work in progress on a scheme for more persistent
>    names.  Therefore URNs (Names) and URLs (Locators)  be
>    distinguishable. An object which is either a URL or a URN is known
>    as a URI (Identifier).
731c801
<    retrieval by URL, that the client software have provision for being
---
>    retrieval by URI, that the client software have provision for being
735c805
< BNF for specific URL schemes
---
> BNF syntax
737a808,812
> 

> 

> 

> Berners-Lee                                                          14
> 


739,742c814,817
<    [brackets]  indicate optional parts.  Spaces are represented by the
<    word "space", and the vertical line character by "vline".   Single
<    letters stand for single letters. All words of more than one letter
<    below are entities described somewhere in this description.  

---
>    [brackets]  indicate optional parts.  Spaces are representated by
>    the word "space", and the vertical line character by "vline".  

>    Single letters stand for single letters. All words of more than one
>    letter below are entities described somewhere in this description. 

744,745c819,820
<    The current IETF URI working group preference  is for the
<    prefixedurl production. (Nov 1993. July 93: url).
---
>    The current IETF URI working group prefereence  is for the
>    prefiexedurl production. (Nov 1993. July 93: url).
749,754c824
<    characters do not appear in any productions and therefore may not
< 

< 

< 

< Berners-Lee                                                          13
< 


---
>    characters fo not appear in any productions and therefore may not
769c839
<                          | mailtoaddress  | midaddress | cidaddress 

---
>                          | mailtoaddress  

778c848
<   ftpaddress              f t p : / / login / path [ ! ftptype ] 

---
>   ftpaddress              f t p : / / login / path 

786,789d855
<   midaddress              m i d  :  addr-spec 

<                          

<   cidaddress              c i d : content-identifier 

<                          

799a866,870
> 

> 

> 

> Berners-Lee                                                          15
> 


808,812d878
< 

< 

< 

< Berners-Lee                                                          14
< 


839,840d904
<   ftptype                A | I | D 

<                          

851c915
<   path                    void |  segment  [  / path ] 

---
>   path                    void |  xpalphas  [  / path ]   

853,854d916
<   segment                 xpalphas 

<                          

862,865d923
<                          

<   gtype                   xalpha   

<                          

<   xalpha                  alpha | digit | safe | extra | escape   

869c927
< Berners-Lee                                                          15
---
> Berners-Lee                                                          16
870a929,932
>   gtype                   xalpha   

>                          

>   xalpha                  alpha | digit | safe | extra | escape   

>                          

885c947
<   digit                   0 |1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9   

---
>                           0 |1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9   

889c951
<   extra                   " |  ' | ( | ) | : | ; | , | space   

---
>   extra                   ! | * | " |  ' | ( | ) | : | ; | , | space  

891,892d952
<   reserved               ! | * 

<                          

910,911d969
<    (end of URL BNF)      

<    

920,923c978,980
<    A URL-related security threat is that it is sometimes possible to
<    construct a URL such that an attempt to perform a harmless
<    idempotent operation such as the retrieval of the object will in
<    fact cause a possibly damaging remote operation to occur.  The
---
>    The use of URLs containing passwords is clearly unwise.
>    

> Conclusion
927c984,985
< Berners-Lee                                                          16
---
> 

> Berners-Lee                                                          17
929,938c987,994
<    unsafe URL is typically constructed by specifying a port number
<    other than that reserved for the network protocol in question.  The
<    client unwittingly contacts a server which is in fact running a
<    different protocol.  The content of the URL contains instructions
<    which when interpreted according to this other protocol cause an
<    unexpected operation. An example has been the use of gopher URLs
<    to cause a rude message to be sent via an SMTP server.  Caution
<    should be used when using any URL which specifies a port number
<    other than the default for the protocol, especially when it is a
<    number within the reserved space.
---
>    A need has been demonstrated, and a number of requirements have
>    been stated for uniform resource locators (URLs). A scheme has been
>    proposed which builds on existing conventions to define a syntax
>    for URLs.  This scheme has been in serious use by World-Wide Web
>    (W3) initiative since 1991.  Adoption of the scheme in
>    correspondence, standards and software will ease the use of
>    references to on-line information in a flexible way as the coming
>    information age arrives.
940,948d995
<    Care should be taken when URLs contain embedded encoded delimiters
<    for a given protocol (for example,  CR and LF characters for telnet
<    protocols) that these are not unencoded before transmission.  This
<    would violate the protocol but could be used to simulate an extra
<    operation or parameter, again causing an unexpected and possibly
<    harmful remote operation to be performed.
<    

<    The use of URLs containing passwords is clearly unwise.
<    

968c1015
<    Amsterdam IETF and refined in net discussion. 

---
>    Amsterdam IETF and refined in net discussion.
970,972d1016
<    The draft 03 includes changes made at Houston in Nov 93, and on the
<    net before Seattle March 1994.
<    

977c1021
< Wrappers for URIs in plain text
---
> Fragment-id  

979c1023,1027
<    This section does not formally form part of the URL specification .
---
>    This represents a part of, fragment of, or a sub-function within,
>    an object or object. Its syntax and semantics are defined by the
>    application responsible for the object, or the specification of the
>    content type of the object. The only definition here is of the
>    allowed characters by which it may be represented in a URL.  

981c1029,1039
<    URIs, including URLs, will ideally be transmitted though protocols
---
>    The fragment-id follows the URL of the whole object from which it
>    is separated by a hash sign (#).  If the fragment-id is void, the
>    hash sign may be omitted: A void fragment-id with or without the
>    hash sign means that the URL refers to the whole object.
>    

>    While this hook is allowed for identification of fragments, the
>    question of addressing of parts of objects, or of the grouping of
>    objects and relationship between contained and containing objects,
>    is not addressed by this object.
>    

>    This object does not address the question of objects which are
985c1043
< Berners-Lee                                                          17
---
> Berners-Lee                                                          18
986a1045,1111
>    different versions of a "living" object, nor of expressing the
>    relationships between different versions and the living object.
>    

> Partial form  

> 

>    In a certain limited set of cases, generally within a certain
>    application, it may be useful to pass only a section of the URL.
>    Within a object whose URL is well defined, the URL of another
>    object may be given in abbreviated form, where parts of the two
>    URLs are the same. This allows objects within a group to refer to
>    each other without requiring the space for a complete reference,
>    and it incidentally allows the group of objects  to be moved
>    without changing any references. This is not discussed in detail
>    here, it is only mentioned so that the characters required by the
>    technique be reserved for that purpose.  It must be emphasised that
>    when a reference is passed in anything other than a well controlled
>    context, the full form must always be used.  

>    

>    The partial form relies on a property of the URL syntax that
>    certain characters ("/") and certain path elements ("..", ".") have
>    a significance reserved for representing a hierarchical space, and
>    must be recognised as such by both clients and servers.  

>    

>    A partial form can be distinguished from a full form in that a full
>    form must have a colon and that colon must occur before any slash
>    characters.
>    

>    The rules for the use of a partial name are:   

>    

>       If the scheme parts  are different, the whole absolute locator
>       must be given. Otherwise, the scheme is omitted, and: 

>       

>       If the host and/or port parts are different, the host, port
>       name and all the rest of the locator must be given. 

>       

>       If the access and host parts are the same, then the path may be
>       given in absolute (fully qualified) or relative form. Within the
>       path: 

>       

>       If a leading slash is present, the path is absolute. Otherwise,
>       a relative path is interpreted as follows:  

>       

>       The last part of the path of the context locator (anything
>       following the rightmost slash) is removed, and the given partial
>       URL appended in its place. 

>       

>       Within the result,  all occurrences of "xxx/../"  or "/." are
>       recursively removed, where xxx, ".." and "." are complete path
>       elements. 

>       

>    Note:  If a path of the context locator end in slash, partial URLs
>    will be treated differently to their treatment with respect to the
>    same path without a slash.   Using a trailing slash on a directory
> 

> 

> 

> Berners-Lee                                                          19
> 


>    name is not therefore recommended.  The significance of a trailing
>    slash may be considered as that of the locator of a file with void
>    name within that  directory.
>    
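The partial-form rules above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the draft's own algorithm: the names is_full_form and resolve_partial are hypothetical, the context locator is assumed to have the "scheme://host[:port]/path" shape, and a partial form that supplies a host is assumed to begin with "//".

```python
def is_full_form(ref):
    """A full form has a colon that occurs before any slash."""
    colon, slash = ref.find(":"), ref.find("/")
    return colon != -1 and (slash == -1 or colon < slash)


def resolve_partial(context, ref):
    """Resolve a partial URL against a context locator (sketch only)."""
    if is_full_form(ref):
        return ref                      # scheme differs: whole locator given
    scheme, rest = context.split(":", 1)
    if ref.startswith("//"):
        return scheme + ":" + ref       # host/port and the rest given
    host, _, path = rest[2:].partition("/")
    prefix = scheme + "://" + host
    if ref.startswith("/"):
        return prefix + ref             # leading slash: absolute path
    # Drop whatever follows the rightmost slash of the context path.
    base = path.rsplit("/", 1)[0] + "/" if "/" in path else ""
    segments = []
    for seg in (base + ref).split("/"):
        if seg == ".":
            continue                    # remove "/." elements
        if seg == ".." and segments and segments[-1] != "..":
            segments.pop()              # remove "xxx/../" pairs
        else:
            segments.append(seg)
    return prefix + "/" + "/".join(segments)


ctx = "http://info.cern.ch/pub/www/doc/url-spec.txt"
print(resolve_partial(ctx, "../talks/intro.html"))
print(resolve_partial("http://h/a/b/", "c"), resolve_partial("http://h/a/b", "c"))
```

The last line shows the effect described in the note: the same partial URL resolves differently depending on whether the context path ends in a slash.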

> Wrappers for URIs in plain text
> 

>    This section does not formally form part of the URL specification.
>    

>    URIs, including URLs, will ideally be transmitted though protocols
1005,1006c1130,1133
< Yes, Jim, I found it under <ftp://info.cern.ch/pub/www/doc> but
<     you can probably pick it up from <ftp://ds.internic.net/rfc>.
---
>                 Yes, Jim, I found it under <ftp://info.cern.ch/pub> bu
> t
>                 you can probably pick it up from <ftp://ds.internic.ne
> t/rfc>.
1009d1135
< 

1022,1024c1148,1150
<                          December 1991, as updated from time to time, 

<                          <ftp://info.cern.ch/pub/www/doc/http-spec.txt
<                          > 

---
>                          December 1991, 

>                          <ftp://info.cer
>                          n.ch/pub/www/doc/http-spec.txt> 

1029a1156,1160
> 

> 

> 

> Berners-Lee                                                          20
> 


1040,1047d1170
< 

< 

< 

< Berners-Lee                                                          18
< 


<   Horton (1987)           M. Horton, R. Adams, "Standard for
<                          interchange of USENET messages", Internet RFC
<                          1036 , 12/01/1987. 

1062c1185
<                          transmission of news" , Internet RFC-977,
---
>                          transmission of news", Internet RFC-977,
1066,1068d1188
<   Kunze, 1994            J. Kunze, Requirements for URLs, to be
<                          published. 

<                          

1092,1094d1211
<   Sollins 1994           K. Sollins and L. Masinter, Requirements for
<                          URNs, to be published. 

<                          

1097d1213
<                          Performance Systems International, Inc. 

1101c1217
< Berners-Lee                                                          19
---
> Berners-Lee                                                          21
1102a1219
>                          Performance Systems International, Inc. 

1109,1112c1226,1228
<    .
<    

<                           AUTHOR'S ADDRESS  

<                                    

---
> Author's address  

> 

> 

1122a1239
>    

1126d1242
<    

1160c1276
< Berners-Lee                                                          20
---
> Berners-Lee                                                          22