Re: [squid-users] Incorrect HTTP GET request from squid

From: Colin Campbell <sgcccdc@dont-contact.us>
Date: Tue, 16 Jul 2002 16:54:17 +1000 (EST)

Hi,

On Tue, 16 Jul 2002, Eric Lawson wrote:

> OK here's the deal. I have 5 brand new Sun Cobalt CacheRaq4's straight
> out of the box using the default configuration. When I was doing some
> testing before putting the boxes into production I was encountering
> some strange problems with certain websites. Prime example is
> http://www.smh.com.au. If I go to this site bypassing the cache server
> I get the proper site. If I configure my browser to go through the
> cache server I get http://www.f2.com.au returned to my browser. Now
> both of these sites have the same IP address returned from a DNS
> query. I put a sniffer on my network to capture traffic to/from the
> cache server to see what was happening and the following is an extract
> from these captures. This is the packet from my browser to the cache
> server with the HTTP GET request
>
> Ethernet II
> Internet Protocol, Src Addr: (192.168.21.111), Dst Addr: (10.65.9.132)
> Transmission Control Protocol, Src Port: 1112 (1112), Dst Port: 3128 (3128), Seq: 5
> Hypertext Transfer Protocol
>     GET http://www.smh.com.au/ HTTP/1.0\r\n
>     Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms
>     Accept-Language: en-au\r\n
>     User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0)\r\n
>     Host: www.smh.com.au\r\n
>     Proxy-Connection: Keep-Alive\r\n
>     \r\n
>
> This is the packet from the cache server to the www.smh.com.au server
>
> Ethernet II
> Internet Protocol, Src Addr: (10.65.9.132), Dst Addr: (203.26.51.42)
> Transmission Control Protocol, Src Port: 3100 (3100), Dst Port: 80 (80), Seq: 30173
> Hypertext Transfer Protocol
>     GET / HTTP/1.0\r\n
>     Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms
>     Accept-Language: en-au\r\n
>     Via: 1.0 sydcache:3128 (Squid/2.3.STABLE4)\r\n
>     X-Forwarded-For: 192.168.21.111\r\n
>     Host: www.smh.com.au\r\n
>     Cache-Control: max-age=172800\r\n
>     Connection: keep-alive\r\n
>     \r\n
>
> Now as you can see the HTTP GET request has been changed, it is not
> passing the URL that was typed in http://www.smh.com.au/ , it has
> replaced it with a /
>
> My question is why is it doing this, and what can I do to fix it?

This is perfectly normal behaviour. The browser puts the full URL in the
request line because that's how it tells the cache server where to go. The
cache server pulls the URL apart into host:port and path. It connects to
the host:port and sends the web server only the path component (/). In
case there are multiple web servers on the same IP, the request also
carries a "Host: www.smh.com.au" header so the server can tell which site
is wanted.
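
To make the split concrete, here is a rough Python sketch of the idea. It
is not Squid's actual code; the function name split_proxy_request and the
default port of 80 are my own assumptions for illustration.

```python
from urllib.parse import urlsplit

def split_proxy_request(request_line):
    """Split a proxy-style request line into (host, port, origin request line).

    Hypothetical helper for illustration only; Squid does this internally
    in its own way.
    """
    method, url, version = request_line.split()
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 80          # assume plain HTTP when no port is given
    path = parts.path or "/"         # an empty path is sent as "/"
    if parts.query:
        path += "?" + parts.query
    return host, port, f"{method} {path} {version}"

host, port, origin_line = split_proxy_request("GET http://www.smh.com.au/ HTTP/1.0")
print(host, port)      # where the cache connects
print(origin_line)     # what the cache sends to the web server
print(f"Host: {host}") # header that tells the server which site is wanted
```

With the request line from your first capture, this gives the
"GET / HTTP/1.0" request line and "Host: www.smh.com.au" header seen in
your second capture.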

If you sniffed your browser bypassing the cache you'd see the same
behaviour.

I tried www.smh.com.au and had no problem and I go:

        browser->squid->plug-gw->squid->world

Maybe you need to sniff the response from the web server to your cache.
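
When you look at that response, the first things I'd check are the status
line and any Location header, in case the server is redirecting. A small
Python sketch of pulling those out of captured bytes follows. The function
name summarize_response is hypothetical, and the sample bytes are made up
to show the shape of the input; they are not a capture from the real
server.

```python
def summarize_response(raw):
    """Return (status line, Location header or None) from raw HTTP response bytes."""
    # Everything before the first blank line is the status line plus headers.
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return lines[0], headers.get("location")

# Made-up sample input, NOT real captured data.
sample = b"HTTP/1.0 302 Found\r\nLocation: http://example.invalid/\r\n\r\n"
status_line, location = summarize_response(sample)
print(status_line)
print(location)
```

If the real response turns out to be a redirect, the Location value would
show where the browser is being sent.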

Colin

--
Colin Campbell
Unix Support/Postmaster/Hostmaster
CITEC
+61 7 3227 6334
Received on Tue Jul 16 2002 - 00:54:33 MDT
