I have been a squid user for a long time (going on 10 years now), and
have been experiencing an issue that I believe is squid's fault. It may
cause me to drop squid entirely, because we have lost some customers
due to this behavior.
I run squid 2.6.STABLE6-5.el5_1 as a reverse proxy to two different
web servers on a decently-spec'd server (OS is RHEL 5) that only runs
squid (and iptables), and has a constant load of 1. The vast majority
of the time, everything works fine. Sometimes, however, the following
happens:
* A user tries to connect to one of the web sites via a browser, and
downloads either only some of the page elements or none of them. This
is reproducible from that user's computer within a certain time
window. Usually the first symptom is that only some of the page
elements download; on retries, none of them do.
* While a user is experiencing this issue, if the user drops to a shell
(or Windows command line), opens a telnet session on port 80 to the
same server that the browser was trying to reach, and enters a
properly-formed HTTP request for the same page, squid responds by
dropping the connection with a blank response.
* I suspect that it is squid returning this blank response, for the following reasons:
(1) A squid log entry where the page was returned correctly looks like this:
1219668745.767 165 70.43.203.242 TCP_MISS/200 8056 GET http://wiki.myserver.com/index.php?title=Home_Page/Work_Center/Page_Title&action=edit - FIRST_UP_PARENT/74.213.131.84 text/html
(2) A squid log entry where the page never made it to the browser
looks like this:
1219665920.576 194 66.169.93.6 TCP_MISS/200 8056 GET http://wiki.myserver.com/index.php?title=Home_Page/Work_Center/Page_Title&action=edit - FIRST_UP_PARENT/74.213.131.84 text/html
... so the two entries look the same apart from the timestamp, the
elapsed time, and the client IP.
(3) Apache log entry for #1:
74.213.131.82 - - [25/Aug/2008:07:48:42 -0500] "GET /index.php?title=Home_Page/Work_Center/Page_Title&action=edit HTTP/1.0" 200 7558
(4) Apache log entry for #2:
74.213.131.82 - - [25/Aug/2008:07:02:49 -0500] "GET /index.php?title=Home_Page/Work_Center/Page_Title&action=edit HTTP/1.0" 200 7558
(5) If I use a different ISP (either VNC to my home, or get on the
server running squid, or get on another server we administer halfway
across the country), the same page loads fine. And yes, I can have a
VNC session up while having the other web browser up, and hit "reload"
on the page on both machines, and the one on the client exhibiting the
issue will fail every time, and the one on the other machine will work
every time.
(6) Here are trace routes from two locations; the first is one that
was working properly, the second from a client that was exhibiting the
issue:
Tracing route to wiki.myserver.com [74.213.131.85]
over a maximum of 30 hops:
1 <1 ms <1 ms <1 ms 10.0.0.1
2 5 ms 6 ms 13 ms 10.116.240.1
3 8 ms 8 ms 7 ms 172.22.81.13
4 6 ms 6 ms 14 ms 172.22.33.161
5 9 ms 19 ms 9 ms 172.22.32.110
6 14 ms 12 ms 14 ms atl-edge-18.inet.qwest.net [216.206.221.149]
7 13 ms 11 ms 21 ms atl-core-01.inet.qwest.net [205.171.21.161]
8 27 ms 39 ms 28 ms cer-core-01.inet.qwest.net [67.14.8.202]
9 27 ms 28 ms 26 ms chp-brdr-02.inet.qwest.net [205.171.139.114]
10 27 ms 28 ms 27 ms ber1-ge-7-4.chicagoequinix.savvis.net [208.173.180.25]
11 27 ms 26 ms 31 ms ber1-vlan-241.chicago.savvis.net [204.70.196.21]
12 * 29 ms 26 ms cr1-tengig-0-0-5-0.chicago.savvis.net [204.70.195.113]
13 40 ms 39 ms 39 ms 204.70.200.86
14 43 ms 39 ms 42 ms acr2-so-4-0-0.washington.savvis.net [204.70.196.182]
15 37 ms 40 ms 39 ms iar2-loopback.Washington.savvis.net [206.24.226.13]
16 51 ms 47 ms 48 ms 208.174.125.110
17 52 ms 47 ms 49 ms cr2-g2-1.clt.hostedsolutions.com [216.27.69.226]
18 48 ms 49 ms 48 ms dr3-g6-1.clt.hostedsolutions.com [216.27.69.250]
19 50 ms 48 ms 50 ms shared-fw0.clt.hostedsolutions.com [216.27.72.227]
20 47 ms 47 ms 50 ms 74.213.131.85
Trace complete.
Tracing route to wiki.myserver.com [74.213.131.85]
over a maximum of 30 hops:
1 5 ms 3 ms 3 ms 192.168.2.1
2 10 ms 13 ms 15 ms 10.117.64.1
3 14 ms 13 ms 11 ms 172.22.81.9
4 18 ms 12 ms 20 ms 172.22.33.161
5 16 ms 38 ms 17 ms 172.22.33.34
6 18 ms 18 ms 19 ms atl-edge-18.inet.qwest.net [216.206.221.149]
7 24 ms 23 ms 22 ms atl-core-02.inet.qwest.net [205.171.21.165]
8 42 ms 37 ms 37 ms cer-core-02.inet.qwest.net [67.14.8.206]
9 40 ms 37 ms 40 ms chp-brdr-02.inet.qwest.net [205.171.139.118]
10 42 ms 37 ms 41 ms ber1-ge-7-4.chicagoequinix.savvis.net [208.173.180.25]
11 38 ms 38 ms 37 ms ber1-vlan-241.chicago.savvis.net [204.70.196.21]
12 40 ms 37 ms 36 ms cr1-tengig-0-0-5-0.chicago.savvis.net [204.70.195.113]
13 50 ms 49 ms 50 ms 204.70.200.90
14 50 ms 49 ms 50 ms acr1-so-5-0-0.washington.savvis.net [204.70.196.170]
15 51 ms 49 ms 49 ms iar2-loopback.Washington.savvis.net [206.24.226.13]
16 58 ms 61 ms 57 ms 208.174.125.110
17 56 ms 59 ms 59 ms cr1-g2-1.clt.hostedsolutions.com [216.27.69.222]
18 58 ms 57 ms 57 ms dr3-g5-1.clt.hostedsolutions.com [216.27.69.242]
19 62 ms 59 ms 57 ms shared-fw0.clt.hostedsolutions.com [216.27.72.227]
20 62 ms 70 ms 59 ms 74.213.131.85
Trace complete.
Both trace routes are basically taking the same path (one is using
Charter Cable residential; the other Charter Cable business).
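To line up the squid entries in (1) and (2) with the Apache entries in
(3) and (4), the squid timestamps (seconds since the epoch) have to be
converted to the local time that Apache logs. Here is a minimal Python
sketch of that, assuming squid's native access.log format and the -0500
offset shown in the Apache entries. The function names are my own for
illustration and are not part of squid.

```python
from datetime import datetime, timedelta, timezone


def parse_squid_line(line):
    """Split one native-format squid access.log line into named fields."""
    f = line.split()
    result, status = f[3].split("/")      # e.g. TCP_MISS/200
    hierarchy, peer = f[8].split("/")     # e.g. FIRST_UP_PARENT/74.213.131.84
    return {
        "time": float(f[0]),              # seconds since the epoch
        "elapsed_ms": int(f[1]),
        "client": f[2],
        "result": result,
        "status": int(status),
        "bytes": int(f[4]),
        "method": f[5],
        "url": f[6],
        "hierarchy": hierarchy,
        "peer": peer,
    }


def to_local(epoch, utc_offset_hours=-5):
    """Convert an epoch timestamp to a datetime at a fixed UTC offset.

    The -5 default is an assumption taken from the Apache log entries.
    """
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(epoch, tz)


line = ("1219668745.767 165 70.43.203.242 TCP_MISS/200 8056 GET "
        "http://wiki.myserver.com/index.php?title=Home_Page/Work_Center/"
        "Page_Title&action=edit - FIRST_UP_PARENT/74.213.131.84 text/html")
entry = parse_squid_line(line)
print(entry["client"], entry["status"], entry["bytes"])
print(to_local(entry["time"]).strftime("%H:%M:%S %z"))
```

For the two squid entries above this gives 07:52:25 and 07:05:20 local
time, against 07:48:42 and 07:02:49 in the Apache log. The gap of a few
minutes might just mean the clocks on the two machines are not in sync,
but that is only a guess.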
Can anyone help me with this issue? Why would squid repeatedly "diss"
users coming from one IP? And if it isn't squid, why does squid log the
requests but not give any response? If you think it's apache, how would
apache be distinguishing between the two different users, since apache
just sees the requests coming from a single IP?
Please also give me any other information on tests I could run to
figure this issue out. I really want to keep running squid, but I'm
pretty close to pulling the plug here.
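In case it helps anyone reproduce this, the telnet test described above
can be scripted so that an affected user can run it repeatedly and
report how many bytes came back. This is a rough Python sketch, assuming
Python is available on the client machine; the function names and the
10-second timeout are arbitrary choices of mine.

```python
import socket


def build_request(host, path="/"):
    """Return a minimal HTTP/1.0 GET request, as one would type it into telnet."""
    return f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")


def probe(host, path="/", port=80, timeout=10.0):
    """Send the request and return (bytes_received, status_line).

    A result of (0, "") means the server closed the connection without
    sending anything. A stalled connection raises a timeout exception
    instead of returning.
    """
    received = b""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))
        while True:
            chunk = sock.recv(4096)
            if not chunk:          # peer closed the connection
                break
            received += chunk
    status_line = received.split(b"\r\n", 1)[0].decode("latin-1")
    return len(received), status_line
```

Calling probe("wiki.myserver.com", "/index.php?title=Home_Page") from a
working client and from a failing one, at the same time, should show
whether the failing client gets zero bytes, a partial response, or a
reset (which would surface as a ConnectionResetError).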
(By the way, this issue first appeared when we moved to these new
servers, and we have had it from day one on this set of servers. We
also installed the RHEL5 version of squid on those servers, which was
a significant upgrade from the version we had been running.)
Joe
Received on Mon Aug 25 2008 - 13:23:23 MDT