linux-kernel - Re: cfq performance gap

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20061211140845.GL4576@kernel.dk>
Date:	Mon, 11 Dec 2006 15:08:45 +0100
From:	Jens Axboe <jens.axboe@...cle.com>
To:	Avantika Mathur <mathur@...ibm.com>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: cfq performance gap

On Fri, Dec 08 2006, Avantika Mathur wrote:
> On Fri, 2006-12-08 at 13:05 +0100, Jens Axboe wrote:
> > On Thu, Dec 07 2006, Avantika Mathur wrote:
> > > Hi Jens,
> > 
> > (you probably noticed now, but the axboe@...e.de email is no longer
> > valid)
> 
> I saw that, thanks!
> > > I've noticed a performance gap between the cfq scheduler and other io
> > > schedulers when running the rawio benchmark.
> > > Results from rawio on 2.6.19, cfq and noop schedulers:
> > >
> > > CFQ:
> > >
> > > procs           device    num read   KB/sec     I/O Ops/sec
> > > -----  ---------------  ----------  -------  --------------
> > >   16         /dev/sda       16412     8338            2084
> > > -----  ---------------  ----------  -------  --------------
> > >   16                        16412     8338            2084
> > >
> > > Total run time 0.492072 seconds
> > >
> > >
> > > NOOP:
> > >
> > > procs           device    num read   KB/sec     I/O Ops/sec
> > > -----  ---------------  ----------  -------  --------------
> > >   16         /dev/sda       16399    29224            7306
> > > -----  ---------------  ----------  -------  --------------
> > >   16                        16399    29224            7306
> > >
> > > Total run time 0.140284 seconds
> > >
> > > The benchmark workload is 16 processes running 4k random reads.
> > >
> > > Is this performance gap a known issue?
> > 
> > CFQ could be a little slower at this benchmark, but your results are
> > much worse than I would expect. What is the queueing depth of sda? How
> > are you invoking rawio?
> 
> I am running rawio with the following options:
> rawread -p 16 -m 1 -d 1 -x -z -t 0 -s 4096
>  
> The queue depth on sda is 4.
> 
> > 
> > Your runtime is very low, how does it look if you allow the test to run
> > for much longer? 30MiB/sec random read bandwidth seems very high, I'm
> > wondering what exactly is being tested here.
> > 
> 
> rawio is actually performing sequential reads, but I don't believe it is
> purely sequential with the multiple processes.
> I am currently running the test with longer runtimes and will post
> results once it is complete. 
> I've also attached the rawio source.

It's certainly the slice and idling hurting here. But at the same time,
I don't really think your test case is very interesting. The test area
is very small and you have 16 threads trying to read the same thing,
optimizing for that would be silly as I don't think it has much real
world relevance.

That said, I might add some logic to detect when we can cheaply switch
queues instead of waiting for a new request from the same queue.
Averaging slice times over a period of time instead of 1:1 with that
logic, should help cases like this while still being fair.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/