Re: Vector vs std::vector

From: Kinkie <gkinkie_at_gmail.com>
Date: Wed, 29 Jan 2014 22:32:35 +0100

On Wed, Jan 29, 2014 at 7:52 PM, Alex Rousskov
<rousskov_at_measurement-factory.com> wrote:
> On 01/29/2014 07:08 AM, Kinkie wrote:
>
>> Amos has asked me over IRC to investigate any performance
>> differences between Vector and std::vector. To do that, I've
>> implemented a std::vector-based version of Vector
>> (feature branch: lp:~squid/squid/vector-to-stdvector).
>
> Does Launchpad offer a way of generating a merge patch/diff on the site?
> Currently, I have to checkout the branch and do "bzr send" to get the
> right diff. Is there a better way?

Sure:
(in a trunk checkout)
bzr diff -r lp:~squid/squid/vector-to-stdvector

The resulting diff is reversed, but that should be easy enough to manage.
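
For example, assuming you save the diff to a file, patch's -R switch
applies it in the intended (forward) direction; the -p0 strip level is
a guess at the path depth bzr emits:

(in a trunk checkout)
bzr diff -r lp:~squid/squid/vector-to-stdvector > stdvector.diff
patch -R -p0 --dry-run < stdvector.diff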

>> I've then done the performance testing using ab. The results are in: a
>> Vector-based squid is about 3% speedier than a std::vector based
>> squid.
>
>
>> This may also be due to some egregious layering violations by users
>> of Vector. I have seen things I would like to correct, also with the
>> objective of having Vector implement the exact same API as
>> std::vector, to make future porting easier.
>
> Can you give any specific examples of the code change that you would
> attribute to a loss of performance when using std::vector? I did not
> notice any obvious cases, but I did not look closely.

I suspect that it's all those lines doing vector.items[accessor] and
thus using C-style unchecked accesses.
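
To illustrate, here is a minimal, self-contained sketch of the three
access styles involved (plain std::vector and made-up names, not
actual Squid code):

  #include <cassert>
  #include <cstddef>
  #include <vector>

  int main()
  {
      std::vector<int> v(10, 1);  // ten elements, all set to 1
      const std::size_t i = 3;

      // What Vector's public "items" member effectively hands callers:
      // raw pointer arithmetic, no bounds checks, nothing to inline.
      const int *raw = &v[0];
      const int a = raw[i];

      // The straightforward std::vector ports of the same call site:
      const int b = v[i];     // operator[]: still unchecked, but a
                              // member call the compiler must inline
                              // away to match the raw access
      const int c = v.at(i);  // at(): bounds-checked, throws
                              // std::out_of_range on a bad index
      assert(a == b && b == c);
      return 0;
  }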

>> test conditions:
>> - done on rs-ubuntu-saucy-perf (4-core VM, 4 GB RAM)
>> - testing with ab: 1M requests @ 10 parallelism with keepalive,
>> stressing the TCP_MEM_HIT code path on a cold cache
>> - test on a multicore VM; default out-of-the-box configuration, ab
>> running on same hardware over the loopback interface.
>> - immediately after ab exits, collect counters (mgr:counters)
>>
>> numbers (for trunk / stdvector)
>> - mean response time: 1.032/1.060 ms
>> - req/sec: 9685/9430
>> - cpu_time: 102.878167/106.013725 seconds
>
>
> I hate to be the one asking this, but with so many red flags in the
> testing methodology, are you reasonably sure that the 0.028 millisecond
> difference does not include 0.029+ milliseconds of noise? At the very
> least, do you consistently get the same set of numbers when repeating
> the two tests in random order?

I know that the testing methodology is very rough, and I am not
offended by you pointing that out. In fact, that is one of the reasons
why I tried to be thorough in describing the method. I hope that you
or maybe Pawel can obtain more meaningful measurements without
investing too much effort in it.

> BTW, your req/sec line is redundant. In your best-effort test, the proxy
> response time determines the offered request rate:
>
> 9685/9430 = 1.027 (your "3% speedier")
> 1.060/1.032 = 1.027 (your "3% speedier")
>
> 10*(1000/1.032) = 9690 (10-robot request rate from response time)
> 10*(1000/1.060) = 9434 (10-robot request rate from response time)

Yes.
In fact, I consider that the interesting value is neither of those,
but the ~3 seconds of extra CPU time: 106.01 - 102.88 = 3.14 seconds
over 1M requests, i.e. roughly 3 extra microseconds per request.

If you can suggest a more thorough set of commands using the rough
tools I have, I'll gladly run them. As another option, I hope Pawel
can take the time to run the standard Polygraph workload on that
branch. (Can you, Pawel?)
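
For concreteness, the kind of ab run described above would look
roughly like this (proxy address, origin URL, and object name are
placeholders, not the actual test setup; -X points ab at the proxy):

(on the squid VM, over loopback)
ab -k -n 1000000 -c 10 -X 127.0.0.1:3128 http://origin.example/object

Repeating that a few times per binary, alternating the two squids in
random order as you suggest, should show whether the ~3% gap is
stable or noise.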

Thanks!

-- 
    /kinkie