Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues

From: <david_at_lang.hm>
Date: Wed, 4 May 2011 10:41:38 -0700 (PDT)

ping,

anything new on this issue? (including any patches for me to test?)

David Lang

On Mon, 25 Apr 2011, david_at_lang.hm wrote:

> Date: Mon, 25 Apr 2011 17:14:52 -0700 (PDT)
> From: david_at_lang.hm
> To: Alex Rousskov <rousskov_at_measurement-factory.com>
> Cc: Marcos <mczueira_at_yahoo.com.br>, squid-users_at_squid-cache.org,
> squid-dev_at_squid-cache.org
> Subject: Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues
>
> On Mon, 25 Apr 2011, Alex Rousskov wrote:
>
>> On 04/25/2011 05:31 PM, david_at_lang.hm wrote:
>>> On Mon, 25 Apr 2011, david_at_lang.hm wrote:
>>>> On Mon, 25 Apr 2011, Alex Rousskov wrote:
>>>>> On 04/14/2011 09:06 PM, david_at_lang.hm wrote:
>>>>>
>>>>>> In addition, there seems to be some sort of locking between the multiple
>>>>>> worker processes in 3.2 when checking the ACLs
>>>>>
>>>>> There are pretty much no locks in the current official SMP code. This
>>>>> will change as we start adding shared caches in a week or so, but even
>>>>> then the ACLs will remain lock-free. There could be some internal
>>>>> locking in the 3rd-party libraries used by ACLs (regex and such), but I
>>>>> do not know much about them.
>>>>
>>>> what are the 3rd party libraries that I would be using?
>>
>> See "ldd squid". Here is a sample based on a randomly picked Squid:
>>
>> libnsl, libresolv, libstdc++, libgcc_s, libm, libc, libz, libepol
>>
>> Please note that I am not saying that any of these have problems in SMP
>> environment. I am only saying that Squid itself does not lock anything
>> at runtime, so if our suspect is SMP-related locks, they would have to
>> reside elsewhere. The other possibility is that we should suspect
>> something else, of course. IMHO, it is more likely to be something else:
>> after all, Squid does not use threads, where such problems are expected.
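
For anyone repeating the "ldd squid" check above, here is a minimal sketch
of collecting the linked library names programmatically. It assumes a
Linux-style ldd whose lines look like "name => path (address)"; the helper
names parse_ldd and linked_libraries are made up for illustration, and the
binary path you pass in is whatever your installation uses.

```python
import subprocess


def parse_ldd(output):
    """Return the library names found in ldd-style output text."""
    names = []
    for line in output.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The first field is either a bare name ("libm.so.6") or a full
        # path ("/lib64/ld-linux-x86-64.so.2"); keep only the file name.
        names.append(fields[0].rsplit("/", 1)[-1])
    return names


def linked_libraries(binary):
    """Run ldd on a binary and return the names of its shared libraries."""
    result = subprocess.run(
        ["ldd", binary], capture_output=True, text=True, check=True
    )
    return parse_ldd(result.stdout)
```

This only lists what is linked in; it says nothing about whether any of
those libraries take locks.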
>
>
>> BTW, do you see more-or-less even load across CPU cores? If not, you may
>> need a patch that we find useful on older Linux kernels. It is discussed
>> in the "Will similar workers receive similar amount of work?" section of
>> http://wiki.squid-cache.org/Features/SmpScale
>
> the load is pretty even across all workers.
>
> with the problems described on that page, I would expect uneven utilization
> at low loads, but at high loads (with the workers busy servicing requests
> rather than waiting for new connections), I would expect the work to even out
> (and the types of hacks described in that section to end up costing
> performance, but not in a way that would scale with the ACL processing load)
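
One way to put numbers on "pretty even" is to compare two snapshots of
/proc/stat and compute a busy fraction per core. The sketch below assumes
the usual Linux layout of the per-CPU lines (user, nice, system, idle,
iowait, irq, softirq, steal); the function names are illustrative and not
part of Squid or any other tool.

```python
import time


def parse_proc_stat(text):
    """Map each "cpuN" line to a (total_ticks, idle_ticks) pair."""
    cpus = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        name = fields[0]
        # Skip the aggregate "cpu" line and anything that is not "cpuN".
        if not (name.startswith("cpu") and name[3:].isdigit()):
            continue
        # Use the first eight counters only; guest time is already
        # included in user time, so later fields would double count.
        values = [int(v) for v in fields[1:9]]
        idle = values[3] + (values[4] if len(values) > 4 else 0)
        cpus[name] = (sum(values), idle)
    return cpus


def busy_fractions(before, after):
    """Return {cpu_name: busy fraction} between two /proc/stat snapshots."""
    first, second = parse_proc_stat(before), parse_proc_stat(after)
    result = {}
    for name, (total_a, idle_a) in first.items():
        if name not in second:
            continue
        total = second[name][0] - total_a
        idle = second[name][1] - idle_a
        if total > 0:
            result[name] = 1.0 - idle / total
    return result


def sample_busy(interval=1.0, path="/proc/stat"):
    """Take two snapshots `interval` seconds apart (Linux only)."""
    with open(path) as handle:
        before = handle.read()
    time.sleep(interval)
    with open(path) as handle:
        after = handle.read()
    return busy_fractions(before, after)
```

Calling sample_busy() while the test load is running gives one figure per
core; a large spread between cores would suggest uneven distribution of
work, while similar figures match the observation above.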
>
>>> one thought I had is that this could be locking on name lookups. how
>>> hard would it be to create a quick patch that would bypass the name
>>> lookups entirely and only do the lookups by IP?
>>
>> I did not realize your ACLs use DNS lookups. Squid internal DNS code
>> does not have any runtime SMP locks. However, the presence of DNS
>> lookups increases the number of suspects.
>
> they don't, everything in my test environment is by IP. But I've seen other
> software that still runs everything through name lookups, even if what's
> presented to the software (both in what's requested and in the ACLs) is all
> done by IPs. It's an easy way to bullet-proof the input (if it's a name it
> gets resolved, if it's an IP, the IP comes back as-is, and it works for IPv4
> and IPv6, no need to have logic that looks at the value and tries to figure
> out if the user intended to type a name or an IP). I don't know how squid is
> working internally (it's a pretty large codebase, and I haven't tried to
> really dive into it) so I don't know if squid does this or not.
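
To illustrate the "run everything through the resolver" pattern described
above, here is a minimal sketch using the standard getaddrinfo interface.
This is not Squid code, and nothing in this thread establishes that Squid
works this way; the function names resolve and is_ip_literal are invented
for the example.

```python
import socket


def resolve(value):
    """Resolve a host name or an IP literal through one code path.

    An IPv4 or IPv6 literal comes back as-is; a name is looked up
    through the system resolver.
    """
    infos = socket.getaddrinfo(value, None, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); the
    # address string is the first element of sockaddr.
    return sorted({info[4][0] for info in infos})


def is_ip_literal(value):
    """Return True if value parses as an IP address without any lookup."""
    try:
        # AI_NUMERICHOST forbids name resolution, so names fail here.
        socket.getaddrinfo(value, None, flags=socket.AI_NUMERICHOST)
    except socket.gaierror:
        return False
    return True
```

The appeal of the first function is that the caller never has to guess
whether the input was a name or an address. The cost is that every value,
including plain IPs, passes through the resolver interface, which is why
a shared or slow resolver would be a plausible suspect if a program did
this. A check like is_ip_literal is one way a program could skip the
resolver for literals.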
>
>> A patch you propose does not sound difficult to me, but since I cannot
>> contribute such a patch soon, it is probably better to test with ACLs
>> that do not require any DNS lookups instead.
>>
>>
>>> if that regains the speed and/or scalability it would point fingers
>>> fairly conclusively at the DNS components.
>>>
>>> this is the only thing that I can think of that should be shared between
>>> multiple workers processing ACLs
>>
>> but it is _not_ currently shared from Squid's point of view.
>
> Ok, I was assuming from the description of things that there would be one DNS
> process that all the workers would be accessing. from the way it's described
> in the documentation it sounds as if it's already a separate process, so I
> was thinking that it was possible that if each ACL IP address is being put
> through a single DNS process, I could be running into contention on that
> process (and having to do name lookups for both IPv6 and then falling back to
> IPv4 would explain the severe performance hit far more than the difference
> between IPs being 128 bit values instead of 32 bit values)
>
> David Lang
>
>
Received on Wed May 04 2011 - 17:41:49 MDT
