On Wed, Dec 2, 2009 at 1:40 AM, Amos Jeffries <squid3_at_treenet.co.nz> wrote:
> I'd like to assure you we are taking this seriously. Henrik was just
> pointing out to us that the superficially perfect vfork() is not as reliably
> supported by Linux kernel groups as we would like.
The exploding unused/unaccounted-for memory usage occurs during
regular usage, without any forking at all.
At this point, routine forking has been eliminated by moving rotation
to the log child process, so vfork() isn't really a factor for us; it
never gets called. If you guys do choose to go the vfork route, I
hope you test it very carefully because "the parent process is
suspended while the child is using its resources" sounds like a bad
idea to me.
Thus that problem, terminally slow reconfigures due to process
forking, is resolved as far as I'm concerned. And I am really
grateful to everyone who pointed us toward using external log
daemons.
At the same time, the fork problem turned out to be related to the
page table size, not the actual memory usage, anyway. Thus, to
reflect the remaining problem, the title of this thread could more
accurately be changed to "'Double' memory usage with squid."
What I've seen about this issue in searching past discussions is the
general conclusion "oh that's just virtual address space, it doesn't
really mean anything." But in this case it does. Unlike "empty"
virtual address space, it gets paged out, takes up swap space, forces
other stuff to be paged out and back in, and eventually runs the
machine entirely out of swap and crashes. But this memory never gets
used (never gets paged back in, as far as I can tell) and squid
doesn't account for it in its memory usage reports.
Henrik suggested it could be due to undocumented/unreported metadata
related to in-memory objects, but didn't follow up on my answers to
his questions about average object size. And I don't understand why
roughly half of squid's memory usage would be undocumented/unreported
like that.
Stats right now:
Storage Mem size: 9959600 KB
Total accounted: 10404108 KB
Maximum Resident Size: 19531068 KB
342716 StoreEntries with MemObjects
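
For reference, the gap between those figures can be worked out with a few lines of Python. This is only a sketch of the arithmetic: the numbers are copied from the stats above, and the function name unaccounted() is made up for illustration, not something squid provides.

```python
# Figures copied from the stats above, all in KB.
total_accounted_kb = 10404108
max_resident_kb = 19531068


def unaccounted(resident_kb, accounted_kb):
    """Return (gap in KB, gap as a fraction of the resident size)."""
    gap_kb = resident_kb - accounted_kb
    return gap_kb, gap_kb / resident_kb


gap_kb, fraction = unaccounted(max_resident_kb, total_accounted_kb)
print(f"unaccounted: {gap_kb} KB ({fraction:.1%} of resident size)")
print(f"resident / accounted: {max_resident_kb / total_accounted_kb:.2f}x")
```

The same two numbers from any other installation can be dropped in to compare against these.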
I really wanted to see if people running squid on OSes other than
FreeBSD had similar or different measurements for "Total accounted"
versus the actual size of the process.
I did find one thread in ancient history where somebody said he saw
this exact same problem with squid on FreeBSD. In his case, I believe
he solved it by using dlmalloc. Unfortunately, that's not an
option for me due to its 32-bit limits. But to me that implies that
there's something about the way squid allocates memory on FreeBSD
that's inefficient or problematic.
For example, I noticed that the size of the mem_node object is 4112
bytes and we have 2499136 of them at present. Suppose that each one
of those allocations had to be backed by two 4K pages (8192 bytes).
That would "waste" 4080 bytes per allocation, or about 9957495 KB,
which would explain what we're seeing. I totally understand that
that's a grossly oversimplified example and not how squid's
memorypool-based allocation really works, so that's probably not
*actually* what's happening. I just offer it as an explanation of
what the problem "feels like." I, unfortunately, understand neither
squid's memory allocation practices nor the FreeBSD malloc()
implementation well enough to offer a more accurate guess.
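
To make the arithmetic in that guess concrete, here is a minimal Python sketch. It assumes, purely for illustration, that every mem_node allocation is rounded up to whole 4K pages, which as noted is probably not what really happens. PAGE_SIZE and rounded_up_to_pages() are names invented for this example; they are not part of squid or of FreeBSD's malloc().

```python
PAGE_SIZE = 4096  # assumed 4K pages, as in the example above


def rounded_up_to_pages(size, page_size=PAGE_SIZE):
    """Round an allocation size up to a whole number of pages."""
    pages = -(-size // page_size)  # ceiling division
    return pages * page_size


# Figures quoted in the text above.
mem_node_size = 4112      # bytes per mem_node
mem_node_count = 2499136  # mem_node objects currently allocated

per_alloc = rounded_up_to_pages(mem_node_size)
waste_per_alloc = per_alloc - mem_node_size
total_waste_kb = waste_per_alloc * mem_node_count // 1024

print(f"bytes backing each mem_node: {per_alloc}")
print(f"wasted per allocation:       {waste_per_alloc} bytes")
print(f"total wasted:                {total_waste_kb} KB")
```

Under that assumption the wasted space comes out close to the "Storage Mem size" figure, which is why the symptom looks like a doubling.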
The process is on track to start pushing into swap later today; I will
try to gather more information as it gets larger.
Received on Wed Dec 02 2009 - 15:07:11 MST