Date:	Wed, 21 Apr 2010 15:25:24 +0200
From:	Miklos Szeredi <mszeredi@...e.cz>
To:	Corrado Zoccolo <czoccolo@...il.com>
Cc:	Jens Axboe <jens.axboe@...cle.com>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Jan Kara <jack@...e.cz>, Suresh Jayaraman <sjayaraman@...e.de>
Subject: Re: CFQ read performance regression

Corrado,

On Tue, 2010-04-20 at 22:50 +0200, Corrado Zoccolo wrote:
> can you give more information about the setup?
> How much memory do you have, what is the disk configuration (is this
> HW RAID?), and so on.

8G of memory, an 8-way Xeon CPU, and a fiber-channel-attached storage
array (HP HSV200).  I don't know the configuration of the array.

> > low_latency is set to zero in all tests.
> >
> > The layout difference doesn't explain why setting the scheduler to
> > "noop" consistently almost doubles read throughput in 8-thread
> > tiobench.  This fact alone pretty clearly indicates that the I/O
> > scheduler is the culprit.
> From the attached btt output, I see that a lot of time is spent
> waiting to allocate new request structures.
> >                       MIN (s)       AVG (s)       MAX (s)       N
> > S2G               0.022460311   6.581680621  23.144763751          15
> Since noop doesn't attach fancy data to each request, it can save
> those allocations, thus resulting in no sleeps.
> The delays in allocation, though, may not be entirely attributable to
> the I/O scheduler, and running under memory pressure will make them
> worse.
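
For reference, btt works on blktrace data, so numbers like the S2G row
above can be reproduced with a capture along these lines (sdX and the
trace file names are placeholders; low_latency only exists while cfq
is the active scheduler):

	# select the scheduler under test
	echo cfq > /sys/block/sdX/queue/scheduler
	echo 0 > /sys/block/sdX/queue/iosched/low_latency

	# trace the device for the duration of the benchmark
	blktrace -d /dev/sdX -o trace &
	# ... run the read workload here ...
	kill -INT %1; wait

	# convert to binary and summarize per-phase latencies (S2G etc.)
	blkparse -i trace -d trace.bin -O
	btt -i trace.bin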

I verified with the simple dd test that this happens even when there
is no memory pressure from the page cache: after clearing the caches I
dd only 5G of files, so ~2G of memory stays completely free throughout
the test.
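
The dd variant has roughly this shape (the mount point, file count and
sizes below are illustrative; the real test reads 5G of files in
total):

	sync
	echo 3 > /proc/sys/vm/drop_caches

	# read the files back in parallel with plain dd
	time (
		for f in /mnt/test/file[1-5]; do
			dd if=$f of=/dev/null bs=1M &
		done
		wait
	)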

> > I've also tested with plain "dd" instead of tiobench, where the
> > filesystem layout stayed exactly the same between tests.  The speed
> > difference is still there.
> Does dropping caches before the read test change the situation?

In all my tests I drop the caches before running the read test.
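
Concretely, "dropping caches" here means writing to the drop_caches
sysctl (documented in Documentation/sysctl/vm.txt):

	sync                               # flush dirty pages first
	echo 1 > /proc/sys/vm/drop_caches  # free pagecache
	echo 2 > /proc/sys/vm/drop_caches  # free dentries and inodes
	echo 3 > /proc/sys/vm/drop_caches  # free both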

Please let me know if you need more information.

Thanks,
Miklos
