Message-ID: <48B297D5.8020402@gmail.com>
Date: Mon, 25 Aug 2008 14:30:29 +0300
From: edwin <edwintorok@...il.com>
To: Peter Zijlstra <peterz@...radead.org>
CC: Ingo Molnar <mingo@...e.hu>, rml@...h9.net,
Linux Kernel <linux-kernel@...r.kernel.org>,
"Thomas Gleixner mingo@...hat.com" <tglx@...utronix.de>,
"H. Peter Anvin" <hpa@...or.com>
Subject: Re: Quad core CPUs loaded at only 50% when running a CPU and mmap
intensive multi-threaded task
edwin wrote:
> Peter Zijlstra wrote:
>> On Mon, 2008-08-25 at 13:22 +0300, Török Edwin wrote:
>>
>>
>>> Well, the real program (clamd) that this test program tries to
>>> simulate does an mmap for almost every file, and I have lots of
>>> small files.
>>> 6.5G, 114122 files, average size 57k.
>>>
>>> I'll run latencytop again; last time it showed 100ms - 500ms
>>> latency
>
> Latencytop output attached.
> There is 4 - 60 ms latency for mmap/munmap, and the more threads there
> are, the higher the total latency gets (latencytop says the sum was ~480ms).
>
> Running with MaxThreads 4 gets me 300-400% CPU usage, but with
> MaxThreads 8 CPU usage drops to around 120-250%.
> Now, MaxThreads 4 looks like a good choice from a CPU usage point of
> view, but it is actually bad because it means that threads get stuck
> in iowait, and the CPU won't have anything to do. MaxThreads 8 looked
> like a good alternative to fill the iowait gaps, but we run into the
> mmap_sem issue.
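(Side note to illustrate what I mean by the mmap_sem issue: each scanning
thread basically does an open/mmap/scan/munmap cycle per file, roughly like
the sketch below. This is only the general shape, not the actual clamd code;
since mmap() and munmap() both take mmap_sem for writing, many threads doing
this end up serializing on that semaphore.)

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int scan_one_file(const char *path)
{
	struct stat st;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;
	if (fstat(fd, &st) < 0 || st.st_size == 0) {
		close(fd);
		return -1;
	}
	/* one mmap/munmap per file; both take mmap_sem for writing,
	 * so with many scanning threads these calls serialize */
	unsigned char *map = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
	if (map == MAP_FAILED) {
		close(fd);
		return -1;
	}
	unsigned char sum = 0;
	for (off_t i = 0; i < st.st_size; i++)	/* stand-in for the real scan */
		sum ^= map[i];
	munmap(map, st.st_size);
	close(fd);
	return sum;	/* returned so the scan loop is not optimized away */
}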
> In a real world environment MaxThreads influences how many mails you
> can process in parallel with your MTA, so generally it should be as
> high as possible.
>
> On 2.6.27-rc4:
>
> MaxThreads 4 time, empty database (all cached, almost no I/O):
> 1m9s
>
> MaxThreads 4 time, after echo 3 > /proc/sys/vm/drop_caches:
> 1m29s
>
> MaxThreads 8 time, empty database (all cached, almost no I/O):
> 2m16s
>
> MaxThreads 8 time, after echo 3 > /proc/sys/vm/drop_caches:
> 2m15s
>
MaxThreads 8, full DB (13% slower than 2.6.24):
4m42s

MaxThreads 4, full DB (8% faster than 2.6.24):
2m35s

MaxThreads 8, full DB, 2.6.24:
4m3s

MaxThreads 4, full DB, 2.6.24:
2m50s
I ran echo 3 > /proc/sys/vm/drop_caches before each run; I hope that
clears all caches.
I have XFS on top of LVM, on top of RAID10, and iostat shows only
0 - 20% activity (%util). That could of course also mean that the disks
can provide data fast enough for clamd.
>
> Of course running with a full database will give different results, so
> I'll do some timing with that too (will take a little longer though).
>
>>> for clamd, and it was about mmap; I'll provide you with the exact
>>> output.
>>>
>>
>> Right - does it make sense to teach clamav about pread() ?
>
> If it is preferred over mmap, then maybe yes.
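(For comparison, a pread()-based scan loop would look roughly like the sketch
below. This is just an illustration of the idea, not actual ClamAV code; it
reads through a per-thread buffer and never takes mmap_sem.)

#include <fcntl.h>
#include <unistd.h>

int scan_one_file_pread(const char *path)
{
	unsigned char buf[64 * 1024];	/* per-thread scan buffer */
	unsigned char sum = 0;
	off_t off = 0;
	ssize_t n;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;
	/* plain reads into a preallocated buffer: no mmap/munmap,
	 * so nothing here touches mmap_sem */
	while ((n = pread(fd, buf, sizeof(buf), off)) > 0) {
		for (ssize_t i = 0; i < n; i++)	/* stand-in for the real scan */
			sum ^= buf[i];
		off += n;
	}
	close(fd);
	return n < 0 ? -1 : sum;
}

The downside is an extra copy into the buffer, but for mostly small files
that may well be cheaper than the map/unmap round trips.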
>
> Peter Zijlstra wrote:
>> OK, I'll poke a little more at it later today to see if I can spot
>> something
>
> Thanks!
>
> Best regards,
> --Edwin
Still, if I have more threads, performance *decreases* almost linearly
with 2.6.27 (and probably with 2.6.25+ if clamd behaves the same as my
test program); with 2.6.24 the degradation from adding threads is much
smaller (4m3s vs 2m50s there, compared to 4m42s vs 2m35s on 2.6.27).
With Debian etch having 2.6.24 (etchnhalf actually), and lenny shipping
with 2.6.25 or 2.6.26, users upgrading from etch to lenny could see a
performance decrease.
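In case it helps to reproduce this without clamd: the test program is
essentially N threads doing mmap/munmap in a tight loop. A minimal sketch of
that kind of reproducer (not my exact test code, which maps the real files
rather than anonymous memory, but it should hit mmap_sem the same way):

#define _GNU_SOURCE
#include <pthread.h>
#include <stdlib.h>
#include <sys/mman.h>

#define MAP_SIZE	(64 * 1024)	/* close to the 57k average file size */
#define ITERATIONS	100000

/* each thread maps, touches and unmaps memory in a loop,
 * so all threads keep taking mmap_sem */
static void *worker(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERATIONS; i++) {
		unsigned char *p = mmap(NULL, MAP_SIZE, PROT_READ | PROT_WRITE,
					MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (p == MAP_FAILED)
			break;
		p[0] = 1;		/* touch the mapping */
		munmap(p, MAP_SIZE);
	}
	return NULL;
}

int main(int argc, char **argv)
{
	int i, nthreads = argc > 1 ? atoi(argv[1]) : 8;	/* "MaxThreads" */
	pthread_t tid[64];

	if (nthreads < 1 || nthreads > 64)
		return 1;
	for (i = 0; i < nthreads; i++)
		pthread_create(&tid[i], NULL, worker, NULL);
	for (i = 0; i < nthreads; i++)
		pthread_join(tid[i], NULL);
	return 0;
}

Compile with gcc -O2 -pthread and compare the wall-clock time for 4 vs 8
threads on the different kernels.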
Best regards,
--Edwin