Re: [squid-users] dynamic content "pattern_refresh".

From: Eliezer Croitoru <eliezer_at_ngtech.co.il>
Date: Thu, 24 May 2012 15:24:50 +0300

I'm not a big fan of just linking to web pages that you will then need
to read, but sometimes it's a must.
Start with this:
http://www.mnot.net/cache_docs/
to understand the basics of HTTP caching.

This page:
http://www.squid-cache.org/Doc/config/refresh_pattern/
is specific to squid refresh_pattern usage.

I must say that the squid document above took me a while to understand,
and to this day I don't like the way it's organized (just my opinion), but
it contains everything about refresh_pattern.

This document:
http://etutorials.org/Server+Administration/Squid.+The+definitive+guide/Chapter+7.+Disk+Cache+Basics/7.7+refresh_pattern/

was taken from the squid book and is nicely organized.
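
To give a rough idea of what the three numbers in a refresh_pattern line
(min, percent, max) do, here is a small Python sketch of the freshness
decision in the order the book chapter describes it, as far as I understand
it. It is not squid code: the RULES list, the is_fresh name and its
arguments are made up for the illustration, times are in minutes, and it
ignores Expires / Cache-Control headers and all refresh_pattern options.

```python
import re

# (regex, min_minutes, percent, max_minutes), like refresh_pattern lines.
# The first matching rule wins. These rules are only an illustration.
RULES = [
    (re.compile(r"(/cgi-bin/|\?)", re.I), 0, 0, 0),
    (re.compile(r"\.(gif|png|jpg|jpeg|ico)$", re.I), 10080, 90, 43200),
    (re.compile(r"."), 0, 20, 4320),
]


def is_fresh(url, age_min, lm_age_min=None):
    """Return True if a cached response would be treated as fresh.

    age_min    -- minutes since the response was stored (response age)
    lm_age_min -- minutes between Last-Modified and the time the response
                  was stored (resource age), or None if unknown
    """
    for pattern, min_m, percent, max_m in RULES:
        if pattern.search(url):
            break
    else:
        return False  # no rule matched: treat as stale in this sketch

    # 1. Older than max: stale.
    if age_min > max_m:
        return False
    # 2. LM-factor (response age / resource age) below percent: fresh.
    if lm_age_min is not None and lm_age_min > 0:
        if age_min / lm_age_min < percent / 100:
            return True
    # 3. Younger than min: fresh.
    if age_min < min_m:
        return True
    # 4. Otherwise stale.
    return False


print(is_fresh("http://example.com/logo.png", age_min=100))
print(is_fresh("http://example.com/page.aspx?id=1", age_min=1))
```

With these rules an image stored 100 minutes ago is still fresh because of
min, while any URL containing a "?" is never fresh because all three
numbers are 0.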

I will give you one site that is organized like heaven for HTTP caching:
http://www.djmaza.com/

About frontera.info: you will want to check the main page instead of
the one you checked, using:
http://redbot.org/?descend=True&uri=http://www.frontera.info/Home.aspx

The server time is not correct, but for cache age that is not important.
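
As for the "private cache" line in the redbot output quoted below: a shared
cache like squid is not supposed to store a response marked private (or
no-store), while the browser's own cache may. A minimal Python sketch of
that single check, looking only at the Cache-Control value. The function
name is made up, and real squid looks at much more than this one header.

```python
def shared_cache_may_store(cache_control):
    """Return True if the Cache-Control value alone does not forbid a
    shared cache (a proxy such as squid) from storing the response."""
    directives = set()
    for part in cache_control.split(","):
        # Keep only the directive name, e.g. "max-age" from "max-age=60".
        name = part.strip().split("=", 1)[0].strip().lower()
        if name:
            directives.add(name)
    # "private" and "no-store" both rule out storage in a shared cache.
    # Simplification: private="field" is treated like plain private.
    return not (directives & {"private", "no-store"})


print(shared_cache_may_store("private, max-age=0"))
print(shared_cache_may_store("public, max-age=3600"))
```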

Regards,

Eliezer
(you can look me up on the squid IRC channel)

On 24/05/2012 03:59, Beto Moreno wrote:
> Hi, thanks for your info.
>
> I tried that tool, I just need to understand the output:
>
> frontera.info says:
>
> General
> The server's clock is 3 min 58 sec behind.
> Content Negotiation
> The resource doesn't send Vary consistently.
> The server's clock is 3 min 58 sec behind.
> Content negotiation for gzip compression is supported, saving 42%.
> The server's clock is 3 min 58 sec behind.
> Caching
> This response only allows a private cache to store it.
> This response allows a cache to assign its own freshness lifetime.
>
> Now the embedded:
>
> Problems
> The server's clock is 3 min 58 sec behind.
> This response allows a cache to assign its own freshness lifetime.
> The resource doesn't send Vary consistently.
> The Content-Disposition header doesn't have a 'filename' parameter.
> Cache-Control: public is rarely necessary.
> The If-Modified-Since response is missing required headers.
>
> Now yahoo.com
>
> General
> The server's clock is correct.
> Caching
>
> This response only allows a private cache to store it.
> This response allows a cache to assign its own freshness lifetime.
>
> Now: noticiasmvs.com
>
> General
> The server's clock is correct.
> The Content-Length header is correct.
>
> Content Negotiation
> Content negotiation for gzip compression is supported, saving 22%.
>
> Caching
> This response allows all caches to store it.
> This response allows a cache to assign its own freshness lifetime.
>
> What I understand is that yahoo/frontera won't let squid save some
> of their data, and noticiasmvs is open for squid, right?
>
> I would very much appreciate it if someone could explain a little more
> about this output from this site. I want to go deeper with squid: what
> can we do in this situation (private cache)?
>
> Thanks!!!
>
> On Wed, May 23, 2012 at 2:57 AM, Eliezer Croitoru<eliezer_at_ngtech.co.il> wrote:
>> You can try this tool:
>> http://redbot.org/
>> to check what the site's cacheability options are.
>> Maybe some objects there need some cache enforcement rules in the
>> refresh_pattern specified for them.
>>
>> Eliezer
>>
>>
>> On 23/05/2012 02:54, Beto Moreno wrote:
>>>
>>> I have been working on the setting:
>>>
>>> refresh_pattern.
>>>
>>> The doc says that it is better for the new websites that use dynamic
>>> content, and a friend here on the list explained the difference to me.
>>>
>>> My test was simple:
>>>
>>> use 2 browsers: firefox/iexplore.
>>> Run the test twice for each site.
>>>
>>> first run
>>> firefox site1, site2,site3,site4
>>> iexplore site1, site2,site3,site4
>>>
>>> Run ccleaner, repeat the test.
>>>
>>> Run srg and free-sa to get my squid cache performance.
>>>
>>> There were 3 settings I tried, running the same test with each.
>>>
>>> NOTE: every time I start a setting, I delete my cache, clean my logs
>>> and start from 0.
>>>
>>> setting 1 default settings
>>> acl QUERY urlpath_regex cgi-bin \?
>>> cache deny QUERY
>>>
>>> setting 2 new way:
>>> disable the old way:
>>>
>>> #acl QUERY urlpath_regex cgi-bin \?
>>> #cache deny QUERY
>>> refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
>>> refresh_pattern . 0 20% 4320
>>>
>>> setting 3:
>>>
>>> refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
>>> refresh_pattern -i \.(gif|png|jpg|jpeg|ico)$ 10080 90% 43200
>>> refresh_pattern -i \.index.(html|htm)$ 0 40% 10080
>>> refresh_pattern -i \.(html|htm|css|js)$ 1440 40% 40320
>>> refresh_pattern . 0 20% 4320
>>>
>>> Then, after I finished my test, I started reviewing my logs and comparing.
>>> The sites I used were:
>>>
>>> yahoo.com
>>> osnews.com
>>> frontera.info (local newspaper)
>>> noticiasmvs.com
>>> centos.org
>>>
>>> I didn't interact with the sites; I just went to the first page, let it
>>> finish loading, and continued with the next one.
>>>
>>> Once I checked my reports I didn't see much difference. I found just
>>> 1 log entry showing something that the old way didn't "cache":
>>>
>>> setting 2/3 have this:
>>>
>>> 1337667655.898 0 192.168.50.100 TCP_MEM_HIT/200 21280 GET
>>> http://www.frontera.info/WebResource.axd? - NONE/-
>>> application/x-javascript
>>>
>>> setting 1 gives TCP_MISS for the same request.
>>>
>>> Example of part of my logs:
>>>
>>> 1337667655.596 43 192.168.50.100 TCP_MISS/302 603 GET
>>> http://frontera.info/ - DIRECT/216.240.181.163 text/html
>>> 1337667655.748 54 192.168.50.100 TCP_MISS/200 1454 GET
>>> http://www.frontera.info/HojasEstilos/Horoscopos.css -
>>> DIRECT/216.240.181.163 text/css
>>> 1337667655.749 52 192.168.50.100 TCP_MISS/200 1740 GET
>>> http://www.frontera.info/Includes/Controles/LosEconomicos.css -
>>> DIRECT/216.240.181.163 text/css
>>> 1337667655.749 49 192.168.50.100 TCP_MISS/200 1557 GET
>>> http://www.frontera.info/Includes/Controles/ReporteroCiudadano.css -
>>> DIRECT/216.240.181.163 text/css
>>> 1337667655.754 54 192.168.50.100 TCP_MISS/200 1697 GET
>>> http://www.frontera.info/Includes/Controles/Elementos.css -
>>> DIRECT/216.240.181.163 text/css
>>> 1337667655.780 24 192.168.50.100 TCP_MISS/200 1406 GET
>>> http://www.frontera.info/Includes/Controles/Finanzas.css -
>>> DIRECT/216.240.181.163 text/css
>>> 1337667655.817 124 192.168.50.100 TCP_MISS/200 21639 GET
>>> http://www.frontera.info/HojasEstilos/Estilos2009.css -
>>> DIRECT/216.240.181.163 text/css
>>> 1337667655.898 0 192.168.50.100 TCP_MEM_HIT/200 21280 GET
>>> http://www.frontera.info/WebResource.axd? - NONE/-
>>> application/x-javascript
>>> 1337667655.903 20 192.168.50.100 TCP_MISS/200 1356 GET
>>> http://www.frontera.info/Interactivos/lib/jquery.jcarousel.css -
>>> DIRECT/216.240.181.163 text/css
>>> 1337667655.907 308 192.168.50.100 TCP_MISS/200 116552 GET
>>> http://www.frontera.info/Home.aspx - DIRECT/216.240.181.163 text/html
>>> 1337667655.935 23 192.168.50.100 TCP_MISS/200 3934 GET
>>> http://www.frontera.info/Interactivos/skins/fotos/skin.css -
>>> DIRECT/216.240.181.163 text/css
>>> 1337667655.966 27 192.168.50.100 TCP_MISS/200 3995 GET
>>> http://www.frontera.info/Interactivos/skins/elementos/skin.css -
>>> DIRECT/216.240.181.163 text/css
>>> 1337667655.971 23 192.168.50.100 TCP_MISS/200 4260 GET
>>> http://www.frontera.info/HojasEstilos/ui.tabs.css -
>>> DIRECT/216.240.181.163 text/css
>>> 1337667655.972 24 192.168.50.100 TCP_MISS/200 4953 GET
>>> http://www.frontera.info/HojasEstilos/thickbox.css -
>>> DIRECT/216.240.181.163 text/css
>>> 1337667655.993 21 192.168.50.100 TCP_MISS/200 4380 GET
>>> http://www.frontera.info/js/finanzas.js - DIRECT/216.240.181.163
>>> application/x-javascript
>>> 1337667655.997 47 192.168.50.100 TCP_MISS/200 9341 GET
>>> http://www.frontera.info/Interactivos/lib/jquery.jcarousel.pack.js -
>>> DIRECT/216.240.181.163 application/x-javascript
>>> 1337667656.023 25 192.168.50.100 TCP_MISS/200 4239 GET
>>> http://www.frontera.info/videos/external_script.js -
>>> DIRECT/216.240.181.163 application/x-javascript
>>>
>>> All 3 settings give the same TCP_MISS.
>>>
>>> I was thinking that maybe I would get more TCP_HIT, MEM_HIT, but no.
>>> noticiasmvs.com gives a lot of HITs, but with all 3 settings.
>>>
>>> Do these sites disable caching of their content? Is there a way to find out?
>>> What could cause me to still get a lot of MISSes?
>>> Were my settings wrong?
>>> Was my test not the best way?
>>> How can I see if these new settings make a difference?
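
One rough way to compare the settings is to count the result codes in
access.log after each run. A minimal Python sketch, assuming the native
log format where the 4th field is code/status and each entry is on one
line (the lines in this mail are wrapped). The function names are only
for illustration, and the two sample lines are taken from the log above.

```python
from collections import Counter


def count_result_codes(lines):
    """Count squid result codes (TCP_MISS, TCP_MEM_HIT, ...) per log line."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 4 or "/" not in fields[3]:
            continue  # skip blank or malformed lines
        code = fields[3].split("/", 1)[0]
        counts[code] += 1
    return counts


def hit_ratio(counts):
    """Share of requests whose result code contains HIT (0.0 if no requests)."""
    total = sum(counts.values())
    hits = sum(n for code, n in counts.items() if "HIT" in code)
    return hits / total if total else 0.0


sample = [
    "1337667655.596 43 192.168.50.100 TCP_MISS/302 603 GET "
    "http://frontera.info/ - DIRECT/216.240.181.163 text/html",
    "1337667655.898 0 192.168.50.100 TCP_MEM_HIT/200 21280 GET "
    "http://www.frontera.info/WebResource.axd? - NONE/- application/x-javascript",
]
counts = count_result_codes(sample)
print(counts, hit_ratio(counts))
```

Running this on the access.log from each setting, with the same sites
and a clean cache, gives one number per setting to compare.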
>>>
>>> Any input will be appreciated, thanks for your time!!!
>>>
>>> I'm using squid 2.7.x
>>
>>
>>
>> --
>> Eliezer Croitoru
>> https://www1.ngtech.co.il
>> IT consulting for Nonprofit organizations
>> eliezer<at> ngtech.co.il
>>

-- 
Eliezer Croitoru
https://www1.ngtech.co.il
IT consulting for Nonprofit organizations
eliezer <at> ngtech.co.il
Received on Thu May 24 2012 - 12:25:06 MDT
