Re: SQUID 1.1.4 fixes from Don Lewis on 1997-01-17 (squid-users)

From: Don Lewis <Don.Lewis@dont-contact.us>
Date: Fri, 17 Jan 1997 22:36:59 -0800

On Jan 17, 11:06pm, Edward Henigin wrote:
} Subject: Re: SQUID 1.1.4 fixes
}
} Honestly, all this stuff sounds like the domain of the end
} client. Because, how can the Squid Proxy server know what the CR/LF
} conventions are on the client box using the proxy?

Squid doesn't need to know the line termination convention of the client.
The standard line termination over the wire (which is only meaningful
for text files) is CR/LF in the case of FTP, SMTP, and NNTP. I would
imagine that HTTP is the same. Ack, it's not! HTTP is different from
everything else. RFC 2068 sez:

   When in canonical form, media subtypes of the "text" type use CRLF as
   the text line break. HTTP relaxes this requirement and allows the
   transport of text media with plain CR or LF alone representing a line
   break when it is done consistently for an entire entity-body. HTTP
   applications MUST accept CRLF, bare CR, and bare LF as being
   representative of a line break in text media received via HTTP.
     [ deleted ]
   This flexibility regarding
   line breaks applies only to text media in the entity-body; a bare CR
   or LF MUST NOT be substituted for CRLF within any of the HTTP control
   structures (such as header fields and multipart boundaries).

} My Squid server is running on a Solaris box. Most of its ASCII
} transfers don't change the file at all. If it serves to a Windows
} client, however, the local copy it has still has the wrong CR/LF
} convention. If it transfers to a Mac client (out of the cache, say)
} then it for sure has the wrong CR/LF or whatever convention.
}
} Make sense?

I guess so, though if a file is being fetched from a Mac server, we'd
better be sure to use ASCII transfer mode so that the Mac's bare CR line
endings get translated to CR/LF.

I guess the client has to examine the text file that it gets and, if
it wants to store it with native line termination, determine which line
termination the file uses and then perform the proper translation.
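
A minimal sketch of that client-side step, in Python. The function names
detect_line_ending and to_native are made up for illustration and are not
part of Squid or of any client; the sketch assumes the whole text body is
already in memory as bytes.

```python
import os


def detect_line_ending(data: bytes) -> bytes:
    """Return the most common terminator in data: CRLF, bare CR, or bare LF."""
    crlf = data.count(b"\r\n")
    counts = {
        b"\r\n": crlf,
        b"\r": data.count(b"\r") - crlf,  # CRs that are not part of a CRLF
        b"\n": data.count(b"\n") - crlf,  # LFs that are not part of a CRLF
    }
    best = max(counts, key=counts.get)
    # Assumption: report LF when the data contains no line breaks at all.
    return best if counts[best] > 0 else b"\n"


def to_native(data: bytes, native: bytes = os.linesep.encode("ascii")) -> bytes:
    """Translate every CRLF, bare CR, and bare LF in data to `native`."""
    # Collapse everything to LF first (CRLF before bare CR), then expand.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", native)
```

Since RFC 2068 only asks that one convention be used consistently within
an entity-body, detection is strictly optional here: to_native accepts all
three forms, even mixed, and the detector is useful mainly for deciding
whether any translation is needed.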

We do have a way out of this problem though. According to RFC 1738:

3.2.2. FTP url-path

     The url-path of a FTP URL has the following syntax:

          <cwd1>/<cwd2>/.../<cwdN>/<name>;type=<typecode>

     Where <cwd1> through <cwdN> and <name> are (possibly encoded) strings
     and <typecode> is one of the characters "a", "i", or "d". The part
     ";type=<typecode>" may be omitted. The <cwdx> and <name> parts may be
     empty. The whole url-path may be omitted, including the "/"
     delimiting it from the prefix containing user, password, host, and
     port.

     The url-path is interpreted as a series of FTP commands as follows:

        Each of the <cwd> elements is to be supplied, sequentially, as the
        argument to a CWD (change working directory) command.

        If the typecode is "d", perform a NLST (name list) command with
        <name> as the argument, and interpret the results as a file
        directory listing.

        Otherwise, perform a TYPE command with <typecode> as the argument,
        and then access the file whose name is <name> (for example, using
        the RETR command.)

    [ deleted ]

  3.2.3. FTP Typecode is Optional

     The entire ;type=<typecode> part of a FTP URL is optional. If it is
     omitted, the client program interpreting the URL must guess the
     appropriate mode to use. In general, the data content type of a file
     can only be guessed from the name, e.g., from the suffix of the name;
     the appropriate type code to be used for transfer of the file can
     then be deduced from the data content of the file.
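
As a rough illustration of the section 3.2.2 rules quoted above, the
url-path can be turned into an FTP command sequence along these lines.
The function name ftp_commands is hypothetical, the fallback to binary
mode when ;type= is missing is an assumption of this sketch (the RFC only
says the client must guess), and empty <cwd> elements are not given any
special treatment.

```python
from typing import List
from urllib.parse import unquote


def ftp_commands(url_path: str) -> List[str]:
    """Map an RFC 1738 FTP url-path (without the leading "/") to commands."""
    path, sep, param = url_path.partition(";type=")
    typecode = param if sep else None  # None means the typecode was omitted
    *cwds, name = path.split("/")

    # Each <cwd> element becomes one CWD command, in order.
    commands = ["CWD " + unquote(cwd) for cwd in cwds]
    if typecode == "d":
        commands.append("NLST " + unquote(name))
    else:
        # Assumption: default to binary ("i") when no typecode is given.
        # The typecode is upper-cased only to match common FTP usage.
        commands.append("TYPE " + (typecode or "i").upper())
        commands.append("RETR " + unquote(name))
    return commands
```

For example, ftp_commands("pub/squid/README;type=a") yields CWD pub,
CWD squid, TYPE A, RETR README.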

So in those cases where we can't figure out what type the file is, we
could display two links for it. One with ;type=a, which we would transfer
in ASCII mode, and return as text, and one with ;type=i, which we would
transfer in binary mode and return as application/octet-stream. There
should be no confusion in the cache, since these have different URLs.
The only confusion will be on the part of the poor user who doesn't
know which one to pick.
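
Something like the following could generate those links for a directory
listing. This is only a sketch: listing_links and the KNOWN_TYPECODES
suffix table are invented for illustration, the table contents are
arbitrary, and text/plain for the ;type=a case is my assumption about
what "return as text" would mean in practice.

```python
import posixpath

# Hypothetical suffix table; a real one would be much longer.
KNOWN_TYPECODES = {".txt": "a", ".html": "a", ".gz": "i", ".zip": "i"}

# Content type returned to the browser for each typecode.
# "text/plain" for "a" is an assumption of this sketch.
CONTENT_TYPES = {"a": "text/plain", "i": "application/octet-stream"}


def listing_links(dir_url: str, name: str):
    """Return (url, content_type) pairs to display for one directory entry."""
    url = dir_url.rstrip("/") + "/" + name
    suffix = posixpath.splitext(name)[1].lower()
    known = KNOWN_TYPECODES.get(suffix)
    # One link when the suffix tells us the type, otherwise both choices.
    typecodes = [known] if known is not None else ["a", "i"]
    return [(url + ";type=" + t, CONTENT_TYPES[t]) for t in typecodes]
```

Because the typecode is part of the URL, the two variants of an ambiguous
file are distinct cache keys, which is what keeps the cache unconfused.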

--- Truck
Received on Fri Jan 17 1997 - 22:52:03 MST
