Message-Id: <20130416055721.B8415E0085@blue.fi.intel.com>
Date: Tue, 16 Apr 2013 08:57:21 +0300 (EEST)
From: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
To: Dave Hansen <dave@...1.net>
Cc: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Al Viro <viro@...iv.linux.org.uk>,
Hugh Dickins <hughd@...gle.com>,
Wu Fengguang <fengguang.wu@...el.com>, Jan Kara <jack@...e.cz>,
Mel Gorman <mgorman@...e.de>, linux-mm@...ck.org,
Andi Kleen <ak@...ux.intel.com>,
Matthew Wilcox <matthew.r.wilcox@...el.com>,
"Kirill A. Shutemov" <kirill@...temov.name>,
Hillf Danton <dhillf@...il.com>, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [RESEND] IOZone with transparent huge page cache
Dave Hansen wrote:
> On 04/15/2013 11:17 AM, Kirill A. Shutemov wrote:
> > I run iozone using mmap files (-B) with different number of threads.
> > The test machine is 4s Westmere - 4x10 cores + HT.
>
> How did you run this, exactly? Which iozone arguments?
iozone -B -s 21822226/$threads -t $threads -r 4 -i 0 -i 1 -i 2 -i 3
It's a slightly modified iozone test from mmtests.
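
As a rough sketch of how the command line above expands per thread count: the helper name iozone_cmd and the use of integer division for "21822226/$threads" are assumptions for illustration, not the actual mmtests script. The flags are the ones from the command above.

```python
# Total data set size taken from the command line above
# (a bare -s value is taken here to be in KB).
TOTAL_KB = 21822226


def iozone_cmd(threads, total_kb=TOTAL_KB):
    """Build the iozone argument list for one thread count (hypothetical helper)."""
    # Assumption: "21822226/$threads" means integer division, so the
    # total amount of data stays roughly constant across runs.
    per_thread_kb = total_kb // threads
    return ["iozone", "-B", "-s", str(per_thread_kb), "-t", str(threads),
            "-r", "4", "-i", "0", "-i", "1", "-i", "2", "-i", "3"]


if __name__ == "__main__":
    # Thread counts used in the results quoted in this thread.
    for threads in (1, 2, 4, 8, 16, 32, 64, 128, 256):
        print(" ".join(iozone_cmd(threads)))
```

This only prints the command lines; it does not run iozone.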
> It was run on ramfs, since that's the only thing that transparent huge page
> cache supports right now?
Correct.
> > ** Initial writers **
> > threads:               1        2        4        8       16       32       64      128      256
> > baseline:        1103360   912585   500065   260503   128918    62039    34799    18718     9376
> > patched:         2127476  2155029  2345079  1942158  1127109   571899   127090    52939    25950
> > speed-up(times):    1.93     2.36     4.69     7.46     8.74     9.22     3.65     2.83     2.77
>
> I'm a _bit_ surprised that iozone scales _that_ badly especially while
> threads<nr_cpus. Is this normal for iozone? What are the units and
> metric there, btw?
The units are KB/sec per process (I used 'Avg throughput per process' from
the iozone report), so it does not scale that badly.
I will use total children throughput next time to avoid confusion.
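
To make the per-process vs. total distinction concrete, here is a minimal sketch that multiplies the per-process averages from the "Initial writers" table quoted above by the thread count. This only approximates iozone's total children throughput (an assumption: the average times the number of processes); the function name total_throughput is made up for illustration.

```python
# Per-process averages (KB/sec) from the "Initial writers" table in this thread.
THREADS = [1, 2, 4, 8, 16, 32, 64, 128, 256]
BASELINE = [1103360, 912585, 500065, 260503, 128918, 62039, 34799, 18718, 9376]
PATCHED = [2127476, 2155029, 2345079, 1942158, 1127109, 571899, 127090, 52939, 25950]


def total_throughput(per_process, threads):
    """Approximate aggregate KB/sec as per-process average times thread count."""
    return [p * t for p, t in zip(per_process, threads)]


if __name__ == "__main__":
    base_total = total_throughput(BASELINE, THREADS)
    patched_total = total_throughput(PATCHED, THREADS)
    print("threads  baseline_total  patched_total")
    for t, b, p in zip(THREADS, base_total, patched_total):
        print(f"{t:7d}  {b:14d}  {p:13d}")
```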
> > Minimal speed up is in 1-thread reverse readers - 23%.
> > Maximal is 9.2 times in 32-thread initial writers. It's probably due to
> > batched radix tree insert - we insert 512 pages at a time. It reduces
> > mapping->tree_lock contention.
>
> It might actually be interesting to see this at 10, 20, 40, 80, etc...
> since that'll actually match iozone threads to CPU cores on your
> particular system.
Okay.
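
A small sketch of the thread counts suggested above, assuming the machine layout mentioned earlier in the thread (4 sockets x 10 cores, plus HT) and that the series simply doubles from one socket's worth of cores. The constant and function names are made up for illustration.

```python
# Machine layout as described in the thread: 4 sockets, 10 cores each, HT on.
SOCKETS = 4
CORES_PER_SOCKET = 10
THREADS_PER_CORE = 2  # assumption: HT gives two hardware threads per core


def thread_counts(max_threads):
    """Doubling series starting at one socket's cores: 10, 20, 40, 80, ..."""
    counts = []
    n = CORES_PER_SOCKET
    while n <= max_threads:
        counts.append(n)
        n *= 2
    return counts


if __name__ == "__main__":
    physical = SOCKETS * CORES_PER_SOCKET
    logical = physical * THREADS_PER_CORE
    print("physical cores:", physical, "hardware threads:", logical)
    # Go a couple of steps past the hardware thread count to see oversubscription.
    print("thread counts:", thread_counts(4 * logical))
```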
--
Kirill A. Shutemov