Re: [squid-users] "Quadruple" memory usage with squid

From: Linda Messerschmidt <linda.messerschmidt_at_gmail.com>
Date: Wed, 25 Nov 2009 00:47:46 -0500

On Tue, Nov 24, 2009 at 9:52 PM, Marcus Kool
<marcus.kool_at_urlfilterdb.com> wrote:
> Linda started this thread with huge performance problems
> when Squid with a size of 12 GB forks 15 times.

I'm still not convinced that our memory usage is correct. Here's
where the cache_mem 10GB process is at right now:

        Maximum Resident Size: 19376000 KB
        Total accounted: 9735764 KB
        Storage Mem size: 9296872 KB
        338229 StoreEntries
        338229 StoreEntries with MemObjects

So there is (9735764 - 9296872) 438,892 KB "overhead" (memory
accounted for that is not used for storing objects) and (19376000 -
9735764) 9,640,236 KB "unaccounted."
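
Just to make the arithmetic explicit, here is a trivial C sketch; the three inputs are hard-coded from the cachemgr lines quoted above (all in KB), and the variable names are mine:

#include <stdio.h>

int main(void)
{
    /* Figures from the cachemgr output above, in KB. */
    long max_rss         = 19376000; /* Maximum Resident Size (OS view)        */
    long total_accounted =  9735764; /* memory Squid's pools account for       */
    long storage_mem     =  9296872; /* of that, memory holding cached objects */

    printf("overhead:    %ld KB\n", total_accounted - storage_mem); /* 438892  */
    printf("unaccounted: %ld KB\n", max_rss - total_accounted);     /* 9640236 */
    return 0;
}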

I will put the whole memory report at the bottom of this message,
though it is very hard to read. :-)

> Linda emailed me that she is doing a test with
> vm.pmap.pg_ps_enabled set to 1 (the kernel will
> transparently transform 4K pages into superpages)
> which gives a big relief for TLB management
> and will most likely reduce performance problems with
> forks of very large processes.

I apologize; I meant to send that to the whole list.

In any case, it did not help. This is from top while squid was doing
its daily log rotation a few moments ago:

  PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
 1205 squid 1 118 0 18650M 18675M CPU6 1 57:22 97.85% squid
74418 squid 1 118 0 16259M 24K CPU2 2 0:05 97.66% squid

It doesn't look like enabling superpages had much effect:

$ sysctl vm.pmap
vm.pmap.pmap_collect_active: 0
vm.pmap.pmap_collect_inactive: 0
vm.pmap.pv_entry_spare: 6673
vm.pmap.pv_entry_allocs: 105465793
vm.pmap.pv_entry_frees: 100913786
vm.pmap.pc_chunk_tryfail: 0
vm.pmap.pc_chunk_frees: 610560
vm.pmap.pc_chunk_allocs: 637695
vm.pmap.pc_chunk_count: 27135
vm.pmap.pv_entry_count: 4552007 <--------
vm.pmap.pde.promotions: 12657 <------
vm.pmap.pde.p_failures: 222291
vm.pmap.pde.mappings: 0
vm.pmap.pde.demotions: 11968 <-------
vm.pmap.shpgperproc: 200
vm.pmap.pv_entry_max: 7330186
vm.pmap.pg_ps_enabled: 1

Here are similar readings from the 16GB machine, where the squid
process has a VSZ of 8599M:

vm.pmap.pmap_collect_active: 0
vm.pmap.pmap_collect_inactive: 0
vm.pmap.pv_entry_spare: 16299
vm.pmap.pv_entry_allocs: 2196230824
vm.pmap.pv_entry_frees: 2193907723
vm.pmap.pc_chunk_tryfail: 0
vm.pmap.pc_chunk_frees: 13548719
vm.pmap.pc_chunk_allocs: 13562644
vm.pmap.pc_chunk_count: 13925
vm.pmap.pv_entry_count: 2323101 <------------
vm.pmap.pde.promotions: 0
vm.pmap.pde.p_failures: 0
vm.pmap.pde.mappings: 0
vm.pmap.pde.demotions: 0
vm.pmap.shpgperproc: 200
vm.pmap.pv_entry_max: 4276871
vm.pmap.pg_ps_enabled: 0

Squid uses almost all of the memory in use on both systems, so a
rough pages-per-MB figure (pv_entry_count / squid VSZ in MB) is:

No superpages: 2323101 / 8599 = 270 pages per MB
With superpages: 4552007 / 18650 = 244 pages per MB

That's not quite a fair comparison, since it's the whole system's page
table entries versus just squid's address space, but these machines do
little besides run squid, so it's close enough.

I also added up the VSZ of every process on the system and divided it
by the updated pv_entry count; the system averages just 4.554 KB per
mapping, so it seems like there must be barely any superpages in use.
That is borne out by the (promotions - demotions) figure of
(12657 - 11968) = 689 active superpages, assuming that means what I
think it means.
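
If anyone wants to watch these counters over time, here is a minimal C
sketch of the check I have been doing by hand. It reads the vm.pmap
counters shown above with sysctlbyname() and prints the
promotions-minus-demotions figure; read_counter() is just an
illustrative helper, and treating that difference as "currently active
superpages" is my assumption:

#include <sys/types.h>
#include <sys/sysctl.h>
#include <stdio.h>

/* Read a numeric vm.pmap counter.  The counters are int- or long-sized
 * depending on the kernel, so read into a zeroed 64-bit slot; that is
 * fine on little-endian amd64, which I assume these boxes are. */
static long long read_counter(const char *name)
{
    long long val = 0;
    size_t len = sizeof(val);
    if (sysctlbyname(name, &val, &len, NULL, 0) != 0)
        return -1;
    return val;
}

int main(void)
{
    long long pv   = read_counter("vm.pmap.pv_entry_count");
    long long prom = read_counter("vm.pmap.pde.promotions");
    long long dem  = read_counter("vm.pmap.pde.demotions");

    printf("pv_entry_count:          %lld\n", pv);
    printf("promotions - demotions:  %lld\n", prom - dem);
    return 0;
}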

> can we modify Squid to fork at a slower pace?

I think there is a config option (sleep_after_fork) for that already,
but it would not help: the processes are already forked one at a time,
and the page-table duplication happens before the exec. Even if it did
help, sleep_after_fork would have to be several seconds, and squid
would still be unresponsive for sleep_after_fork * 16.
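
For what it's worth, the expensive part is building a copy of the
parent's page tables in fork() only to throw it away at exec(). Just
to illustrate that the copy is avoidable in principle, here is a
hedged sketch using vfork(), which lets the child borrow the parent's
address space until the exec. spawn_helper() is an illustrative name,
not Squid's actual helper-spawning code, and whether this would be
safe given whatever Squid does between the fork and the exec (fd
setup, signals) is a separate question:

#include <sys/types.h>
#include <sys/wait.h>
#include <stdio.h>
#include <unistd.h>

/* Illustration only: start a child process without duplicating the
 * parent's page tables.  After vfork() the child shares the parent's
 * address space, so it must do nothing but exec or _exit. */
static pid_t spawn_helper(const char *path, char *const argv[])
{
    pid_t pid = vfork();
    if (pid == 0) {
        execv(path, argv);
        _exit(127);            /* only reached if the exec failed */
    }
    return pid;                /* parent: child pid, or -1 on error */
}

int main(void)
{
    char *args[] = { "echo", "pretend-helper", NULL };
    pid_t pid = spawn_helper("/bin/echo", args);
    if (pid == -1) {
        perror("vfork");
        return 1;
    }
    int status;
    waitpid(pid, &status, 0);
    return 0;
}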

It sounds like this problem is a bit more serious and complicated than
I thought. I really hoped it was a config error on our part. :-(

Here is the full memory report, taken as the process VSZ crossed
19,001 MB; the pool totals come to 9,794,810 KB allocated
(9,790,117 KB in use):

Current memory usage:
Pool Obj Size Allocated In Use Idle Allocations Saved Hit Rate
          (bytes) (#) (KB) high (KB) high (hrs) impact (%total) (#)
(KB) high (KB) portion (%alloc) (#) (KB) high (KB) (number)
(%num) (%vol) (%num) (number)
2K Buffer 2048 96 192 248 0.52 0 91 182 248 95
5 10 94 34958835 4.42 18.12 99.96 34972340
4K Buffer 4096 199 796 3904 11.98 0 194 776 3904
97 5 20 1720 7225790 0.91 7.49 98.41 7342427
8K Buffer 8192 21 168 272 12.05 0 17 136 272 81
4 32 176 1880965 0.24 3.90 98.40 1911548
16K Buffer 16384 3 48 64 13.39 0 0 0 64 0 3 48
64 3600 0.00 0.01 98.41 3658
32K Buffer 32768 2 64 128 14.56 0 0 0 128 0 2
64 128 369 0.00 0.00 98.14 376
64K Buffer 65536 2 128 192 12.81 0 0 0 192 0 2
128 128 88 0.00 0.00 95.65 92
Store Client Buffer 4096 22 88 1052 12.05 0 11 44 1052 50
11 44 812 30568572 3.87 31.69 99.74 30646811
acl 64 11 1 1 15.35 0 11 1 1 100 0 0
1 11 0.00 0.00 50.00 22
acl_ip_data 24 7 1 1 15.35 0 7 1 1 100 0 0 1
7 0.00 0.00 50.00 14
acl_list 24 18 1 1 15.35 0 18 1 1 100 0 0
1 18 0.00 0.00 50.00 36
dlink_node 24 337 8 8 0.52 0 0 0 8 0 337 8 8
1088 0.00 0.00 76.35 1425
FwdServer 24 41 1 6 12.05 0 9 1 6 22 32 1 5
3875752 0.49 0.02 99.87 3880603
HttpReply 168 338927 55606 55606 0.00 1 338922
55605 55606 100 5 1 23 10817354 1.37 0.46 96.17 11248509
HttpHeaderEntry 40 2602377 101656 101656 0.00 1 2602282
101652 101656 100 95 4 101 200511069 25.36 2.03 98.06
204487269
HttpHdrCc 40 27921 1091 1091 0.01 0 27914 1091
1091 100 7 1 3 7294400 0.92 0.07 99.17 7355123
HttpHdrRangeSpec 16 96 2 2 13.43 0 0 0 2 0 96 2 2
25596 0.00 0.00 99.31 25775
HttpHdrRange 16 2 1 1 12.07 0 0 0 1 0 2 1 1
21759 0.00 0.00 99.44 21881
HttpHdrContRange 24 5 1 1 13.46 0 0 0 1 0 5 1 1
10713 0.00 0.00 97.21 11021
intlist 16 3 1 1 15.35 0 3 1 1 100 0 0 1
3 0.00 0.00 50.00 6
MemObject 272 338920 90026 90026 0.00 1 338915
90025 90026 100 5 2 39 5022931 0.64 0.35 93.21 5388889
mem_node 4112 2338883 9392078 9392078 0.00 96
2338871 9392029 9392078 100 12 49 3908 15349368 1.94 15.98
81.20 18902877
relist 48 8 1 1 15.35 0 8 1 1 100 0 0 1
8 0.00 0.00 50.00 16
request_t 1360 341 453 893 12.05 0 326 433 893
96 15 20 496 9456631 1.20 3.26 98.60 9590792
StoreEntry 88 338920 29126 29126 0.00 0 338915
29126 29126 100 5 1 13 5025544 0.64 0.11 93.26 5388889
wordlist 16 14 1 1 15.35 0 11 1 1 79 3 1 1
88 0.00 0.00 86.27 102
ClientInfo 344 28096 9439 14657 12.40 0 15714 5279
14657 56 12382 4160 4852 653528 0.08 0.06 93.63 697962
MD5 digest 16 338920 5296 5296 0.00 0 338915 5296
5296 100 5 1 3 13112283 1.66 0.05 97.40 13462778
helper_request 40 353 14 14 0.52 0 1 1 14 0 352
14 14 4805587 0.61 0.05 99.99 4805971
storeSwapTLV 24 5 1 1 15.32 0 0 0 1 0 5 1 1
47627493 6.02 0.29 100.00 47627498
Short Strings 36 2939320 103336 103336 0.00 1 2939241
103333 103336 100 79 3 75 219162796 27.72 2.00 98.15
223283726
Medium Strings 128 15166 1896 1951 0.52 0 15126 1891
1951 100 40 5 89 50029245 6.33 1.62 99.39 50334410
Long Strings 512 5927 2964 3002 0.52 0 5911 2956
3002 100 16 8 143 9251916 1.17 1.20 99.22 9325056
cbdata acl_access (1) 56 13 1 1 15.35 0 13 1 1 100 0 0
1 13 0.00 0.00 50.00 26
cbdata aclCheck_t (2) 312 1 1 1 15.32 0 0 0 1 0 1 1 1 22575783 2.86 1.78 100.00 22575832
cbdata clientHttpRequest (3) 1096 21 23 481 0.52 0 15 17 481
71 6 7 387 4735510 0.60 1.31 98.21 4821935
cbdata ConnStateData (4) 320 194 61 299 11.98 0 190 60 299
98 4 2 137 2604058 0.33 0.21 97.41 2673373
cbdata ErrorState (5) 160 200 32 32 0.06 0 199 32 32 100
1 1 1 16266 0.00 0.00 98.79 16466
cbdata FwdState (6) 96 31 3 24 12.05 0 9 1 24 29 22 3
20 3871284 0.49 0.09 99.76 3880603
cbdata generic_cbdata (7) 32 30 1 4 15.32 0 0 0 4 0 30
1 4 14750620 1.87 0.12 100.00 14750938
cbdata helper (8) 136 1 1 1 15.35 0 1 1 1 100 0 0 0
0 0.00 0.00 0.00 1
cbdata helper_server (9) 152 16 3 3 15.35 0 16 3 3 100 0
0 3 32 0.00 0.00 66.67 48
cbdata HttpStateData (12) 152 38 6 37 12.05 0 9 2 37 24
29 5 32 3874204 0.49 0.15 99.85 3880194
cbdata ps_state (14) 200 1 1 1 15.32 0 0 0 1 0 1 1 1
3880602 0.49 0.20 100.00 3880603
cbdata RemovalPolicy (15) 104 1 1 1 15.35 0 1 1 1 100 0
0 0 0 0.00 0.00 0.00 1
cbdata store_client (18) 128 167 21 45 12.05 0 133 17 45
80 34 5 29 8643422 1.09 0.28 99.89 8652735
event 48 34 2 3 12.38 0 31 2 3 91 3 1 1
8007262 1.01 0.10 99.96 8010458
cbdata http_port_list (19) 128 2 1 1 14.75 0 2 1 1 100 0
0 0 0 0.00 0.00 0.00 2
cbdata body_size (20) 64 1 1 1 15.35 0 1 1 1 100 0 0 1
1 0.00 0.00 50.00 2
cbdata ConnectStateData (21) 96 16 2 11 15.32 0 0 0 11 0
16 2 11 3826157 0.48 0.09 99.96 3827561
close_handler 24 236 6 31 12.05 0 224 6 31 95 12
1 13 14200692 1.80 0.09 99.57 14261946
ipcache_entry 128 978 123 137 13.32 0 951 119 137
97 27 4 22 611237 0.08 0.02 98.86 618298
fqdncache_entry 160 20 4 4 15.35 0 20 4 4 100 0 0
4 24 0.00 0.00 54.55 44
cbdata idns_query (22) 8680 2 17 941 15.32 0 0 0 941 0 2
17 941 616884 0.08 1.36 99.78 618218
cbdata redirectStateData (23) 72 353 25 25 0.52 0 1 1 25
0 352 25 25 4805595 0.61 0.09 99.99 4805971
cbdata Logfile (26) 1120 3 4 4 15.35 0 3 4 4 100 0 0 4
3 0.00 0.00 50.00 6
pconn_data 32 6 1 1 13.34 0 1 1 1 17 5 1 1
61876 0.01 0.00 99.69 62069
pconn_fds 32 6 1 1 13.34 0 1 1 1 17 5 1 1
61889 0.01 0.00 99.71 62069
cbdata AddVaryState (29) 160 1 1 2 12.38 0 0 0 2 0 1 1
2 305954 0.04 0.01 99.84 306435
cbdata LocateVaryState (30) 136 3 1 1 12.38 0 0 0 1 0 3
1 1 571615 0.07 0.02 99.99 571661
VaryData 32 6 1 1 13.62 0 1 1 1 17 5 1 1
570046 0.07 0.00 99.72 571661
cbdata StoreUpdateState (31) 4152 2 9 41 12.34 0 0 0 41 0
2 9 41 1044654 0.13 1.10 99.71 1047729
Total - 9317346 9794810 9794810 0.00 100
9303255 9790117 9790185 100 14091 4693 5120 776329091 98.19
95.79 98.19 790614788
Cumulative allocated volume: 395.07 GB
Current overhead: 301304 bytes (0.003%)
Idle pool limit: 5.00 MB
memPoolAlloc calls: 790614788
memPoolFree calls: 781311532
String Pool Impact
          (%strings) (%volume)
Short Strings 99 96
Medium Strings 1 2
Long Strings 0 3
Other Strings 0 0

Large buffers: 0 (0 KB)

Are we 100% certain this is normal / correct? Cutting the process
size in half would double the process spawn rate. :)

Thanks!
Received on Wed Nov 25 2009 - 05:47:54 MST
