Thanks for your answer, David.
I'm seeing a lot of features being added in squid 3.x, but it's getting
slower as new features are added.
I think squid 3.2 with 1 worker should be as fast as 2.7, but it's getting
slower and hungrier.
Marcos
----- Original message ----
From: "david_at_lang.hm" <david_at_lang.hm>
To: Marcos <mczueira_at_yahoo.com.br>
Cc: Amos Jeffries <squid3_at_treenet.co.nz>; squid-users_at_squid-cache.org;
squid-dev_at_squid-cache.org
Sent: Friday, April 22, 2011 15:10:44
Subject: Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues
ping, I haven't seen a response to this additional information that I sent out
last week.
squid 3.1 and 3.2 are a significant regression in performance from squid 2.7 or
3.0
David Lang
On Thu, 14 Apr 2011, david_at_lang.hm wrote:
> Subject: Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues
>
> Ok, I finally got a chance to test 2.7STABLE9
>
> it performs about the same as squid 3.0, possibly a little better.
>
> with my somewhat stripped down config (smaller regex patterns, replacing CIDR
>blocks and names that would need to be looked up in /etc/hosts with individual
>IP addresses)
>
> 2.7 gives ~4800 requests/sec
> 3.0 gives ~4600 requests/sec
> 3.2.0.6 with 1 worker gives ~1300 requests/sec
> 3.2.0.6 with 5 workers gives ~2800 requests/sec
>
> the numbers for 3.0 are slightly better than what I was getting with the full
>ruleset, but the numbers for 3.2.0.6 are pretty much exactly what I got from the
>last round of tests (with either the full or simplified ruleset)
>
> so 3.1 and 3.2 are a very significant regression from 2.7 or 3.0, and the
>ability to use multiple worker processes in 3.2 doesn't make up for this.
>
> the time taken seems to almost all be in the ACL evaluation as eliminating all
>the ACLs takes 1 worker with 3.2 up to 4200 requests/sec.
>
> one theory is that even though I have IPv6 disabled on this build, the added
>space and more expensive checks needed to compare IPv6 addresses instead of IPv4
>addresses accounts for the single worker drop of ~66%. that seems rather
>expensive, even though there are 293 http_access lines (and one of them uses
>external file contents in its ACLs, so it's a total of ~2400 source/destination
>pairs, however due to the ability to shortcut the comparison the number of tests
>that need to be done should be <400)
>
>
>
> In addition, there seems to be some sort of locking between the multiple worker
>processes in 3.2 when checking the ACLs as the test with almost no ACLs scales
>close to 100% per worker while with the ACLs it scales much more slowly, and
>above 4-5 workers actually drops off dramatically (to the point where with 8
>workers the throughput is down to about what you get with 1-2 workers) I don't
>see any conceptual reason why the ACL checks of the different worker threads
>should impact each other in any way, let alone in a way that limits scalability
>to ~4 workers before adding more workers is a net loss.
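
For reference, the scaling figures quoted above can be turned into a speedup and a per-worker efficiency. This is only a sketch of the arithmetic; the function names are made up for illustration and the inputs are the request rates reported in this thread (~1300 requests/sec with 1 worker, ~2800 with 5).

```c
#include <assert.h>

/* Speedup: throughput with N workers divided by throughput with 1 worker. */
static double speedup(double rate_n, double rate_1)
{
    assert(rate_1 > 0.0);
    return rate_n / rate_1;
}

/* Efficiency: fraction of ideal linear scaling actually achieved. */
static double efficiency(double rate_n, double rate_1, int workers)
{
    assert(workers > 0);
    return speedup(rate_n, rate_1) / workers;
}
```

With the reported numbers, speedup(2800, 1300) is about 2.15 and efficiency(2800, 1300, 5) is about 0.43, so five workers deliver less than half of ideal linear scaling.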
>
> David Lang
>
>
>> On Wed, 13 Apr 2011, Marcos wrote:
>>
>>> Hi David,
>>>
>>> could you run and publish your benchmark with squid 2.7?
>>> I'd like to know whether there is any regression between the 2.7 and 3.x series.
>>>
>>> thanks.
>>>
>>> Marcos
>>>
>>>
>>> ----- Original message ----
>>> From: "david_at_lang.hm" <david_at_lang.hm>
>>> To: Amos Jeffries <squid3_at_treenet.co.nz>
>>> Cc: squid-users_at_squid-cache.org; squid-dev_at_squid-cache.org
>>> Sent: Saturday, April 9, 2011 12:56:12
>>> Subject: Re: [squid-users] squid 3.2.0.5 smp scaling issues
>>>
>>> On Sat, 9 Apr 2011, Amos Jeffries wrote:
>>>
>>>> On 09/04/11 14:27, david_at_lang.hm wrote:
>>>>> A couple more things about the ACLs used in my test
>>>>>
>>>>> all of them are allow ACLs (no deny rules to worry about precedence of)
>>>>> except for a deny-all at the bottom
>>>>>
>>>>> the ACL line that permits the test source to the test destination has
>>>>> zero overlap with the rest of the rules
>>>>>
>>>>> every rule has an IP based restriction (even the ones with url_regex are
>>>>> source -> URL regex)
>>>>>
>>>>> I moved the ACL that allows my test from the bottom of the ruleset to
>>>>> the top and the resulting performance numbers were up as if the other
>>>>> ACLs didn't exist. As such it is very clear that 3.2 is evaluating every
>>>>> rule.
>>>>>
>>>>> I changed one of the url_regex rules to just match one line rather than
>>>>> a file containing 307 lines to see if that made a difference, and it
>>>>> made no significant difference. So this indicates to me that it's not
>>>>> having to fully evaluate every rule (it's able to skip doing the regex
>>>>> if the IP match doesn't work)
>>>>>
>>>>> I then changed all the acl lines that used hostnames to have IP
>>>>> addresses in them, and this also made no significant difference
>>>>>
>>>>> I then changed all subnet matches to single IP address (just nuked /##
>>>>> throughout the config file) and this also made no significant difference.
>>>>>
>>>>
>>>> Squid has always worked this way. It will *test* every rule from the top down
>>>>to the one that matches. Also testing each line left-to-right until one fails or
>>>>the whole line matches.
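
The evaluation order described above (rules top-down until one matches, each line left-to-right until one test fails) can be sketched roughly as follows. This is not Squid's code: the `request` and `access_rule` types, the sample predicates, and the addresses are all hypothetical, and the `tests_run` counter exists only to show how many individual tests a lookup costs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request and rule types, for illustration only. */
typedef struct { uint32_t src; uint32_t dst; } request;
typedef int (*acl_test)(const request *);

typedef struct {
    int allow;              /* 1 = allow, 0 = deny */
    const acl_test *tests;  /* ACLs on the line, in left-to-right order */
    size_t ntests;
} access_rule;

enum { CLIENT_A = 0x0A000001, CLIENT_B = 0x0A000002, SERVER_B = 0x0A000102 };

/* Sample predicates standing in for src/dst ACLs. */
static int src_is_client_a(const request *r) { return r->src == CLIENT_A; }
static int src_is_client_b(const request *r) { return r->src == CLIENT_B; }
static int dst_is_server_b(const request *r) { return r->dst == SERVER_B; }

/* Walk the rules top-down. Within a rule, stop at the first failing test.
 * Return the action of the first rule whose tests all pass, or
 * default_action if none does. *tests_run counts individual tests. */
static int check_access(const access_rule *rules, size_t nrules,
                        const request *req, int default_action,
                        size_t *tests_run)
{
    assert(tests_run != NULL);
    for (size_t i = 0; i < nrules; i++) {
        size_t j = 0;
        while (j < rules[i].ntests) {
            (*tests_run)++;
            if (!rules[i].tests[j](req))
                break;      /* short-circuit: skip the rest of this line */
            j++;
        }
        if (j == rules[i].ntests)
            return rules[i].allow;  /* first full match wins */
    }
    return default_action;
}
```

In this model a rule placed at the bottom of a long list costs at least one test per preceding line, which is consistent with the observation that moving the matching rule to the top removed the overhead.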
>>>>
>>>>>
>>>>> so why are the address matches so expensive
>>>>>
>>>>
>>>> 3.0 and older IP address is a 32-bit comparison.
>>>> 3.1 and newer IP address is a 128-bit comparison with memcmp().
>>>>
>>>> If something like a word-wise comparison can be implemented faster than
>>>>memcmp() we would welcome it.
>>>
>>> I wonder if there should be a different version that's used when IPv6 is
>>>disabled. this is a pretty large hit.
>>>
>>> if the data is aligned properly, on a 64 bit system this should still only be 2
>>>compares. do you do any alignment on the data now?
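
As a rough illustration of the "2 compares" idea, a 128-bit equality test can be written as two 64-bit comparisons. This is a sketch under assumptions, not Squid's implementation: `addr128` is a hypothetical 16-byte holder, and the function only tests equality. An ordering comparison equivalent to memcmp() would additionally need the words converted to big-endian order on little-endian machines. Whether this beats a well-optimized memcmp() would have to be measured.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte address holder; not Squid's actual address class. */
typedef struct {
    unsigned char bytes[16];
} addr128;

/* Equality test using two 64-bit words instead of a byte-wise memcmp().
 * The fixed-size memcpy() calls sidestep alignment and strict-aliasing
 * issues; compilers typically reduce each one to a single load. */
static int addr128_equal(const addr128 *a, const addr128 *b)
{
    uint64_t a_hi, a_lo, b_hi, b_lo;

    assert(a != NULL && b != NULL);
    memcpy(&a_hi, a->bytes, 8);
    memcpy(&a_lo, a->bytes + 8, 8);
    memcpy(&b_hi, b->bytes, 8);
    memcpy(&b_lo, b->bytes + 8, 8);
    return a_hi == b_hi && a_lo == b_lo;
}
```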
>>>
>>>>> and as noted in the e-mail below, why do these checks not scale nicely
>>>>> with the number of worker processes? If they did, the fact that one 3.2
>>>>> process is about 1/3 the speed of a 3.0 process in checking the acls
>>>>> wouldn't matter nearly as much when it's so easy to get an 8+ core system.
>>>>>
>>>>
>>>> There you have the unknown.
>>>
>>> I think this is a fairly critical thing to figure out.
>
Received on Mon Apr 25 2011 - 19:15:36 MDT