Message-ID: <20090905161635.GI18599@kernel.dk>
Date: Sat, 5 Sep 2009 18:16:35 +0200
From: Jens Axboe <jens.axboe@...cle.com>
To: Corrado Zoccolo <czoccolo@...il.com>
Cc: Jeff Moyer <jmoyer@...hat.com>,
Linux-Kernel <linux-kernel@...r.kernel.org>
Subject: Re: [RFC] cfq: adapt slice to number of processes doing I/O
On Thu, Sep 03 2009, Corrado Zoccolo wrote:
> Hi Jens,
> On Thu, Sep 3, 2009 at 3:07 PM, Jens Axboe<jens.axboe@...cle.com> wrote:
> > On Thu, Sep 03 2009, Jeff Moyer wrote:
> >> Corrado Zoccolo <czoccolo@...il.com> writes:
> >>
> >> > When the number of processes performing I/O concurrently increases, a
> >> > fixed time slice per process will cause large latencies.
> >> > In the patch, if there are more than 3 processes performing concurrent
> >> > I/O, we scale the time slice down proportionally.
> >> > To safeguard sequential bandwidth, we impose a minimum time slice,
> >> > computed from cfq_slice_idle (the idea is that cfq_slice_idle
> >> > approximates the cost for a seek).
> >> >
> >> > I performed two tests, on a rotational disk:
> >> > * 32 concurrent processes performing random reads
> >> > ** the bandwidth is improved from 466KB/s to 477KB/s
> >> > ** the maximum latency is reduced from 7.667s to 1.728s
> >> > * 32 concurrent processes performing sequential reads
> >> > ** the bandwidth is reduced from 28093KB/s to 24393KB/s
> >> > ** the maximum latency is reduced from 3.781s to 1.115s
> >> >
> >> > I expect the numbers to be even better on SSDs, where the penalty for
> >> > disrupting sequential reads is much smaller.
> >>
> >> Interesting approach. I'm not sure what the benefits will be on SSDs,
> >> as the idling logic is disabled for them (when nonrot is set and they
> >> support ncq). See cfq_arm_slice_timer.
> >
> > Also, the problem with scaling the slice a lot is that throughput has a
> > tendency to fall off a cliff at some point.
>
> This is the reason that I have a minimum slice. It is already reached
> for 32 processes as in my example, so the throughput drop is at most
> 20%.
> Currently it is computed as 2*slice_idle for sync, and 1*slice_idle
> for async queues.
> I think this causes the leveling of data transferred regardless of
> priority. I'll cook up a formula that also scales the minimum slice
> according to priority, to fix this issue.
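(For reference, the scaling described above amounts to roughly the
following; this is only an illustrative sketch with made-up names, not
the actual patch code.)

  /*
   * Illustrative sketch (hypothetical names): with more than 3 busy
   * queues the per-queue slice shrinks proportionally, but never
   * drops below a floor derived from slice_idle.
   */
  static unsigned int scaled_slice(unsigned int base_slice,
                                   unsigned int slice_idle,
                                   unsigned int busy_queues,
                                   int sync)
  {
          /* floor: 2*slice_idle for sync queues, 1*slice_idle for async */
          unsigned int min_slice = sync ? 2 * slice_idle : slice_idle;
          unsigned int slice = base_slice;

          /* scale down proportionally once more than 3 queues compete */
          if (busy_queues > 3)
                  slice = base_slice * 3 / busy_queues;

          return slice > min_slice ? slice : min_slice;
  }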
That holds for your case; it may be different for other hardware. But I think
the approach is sane to some degree, though it will require more work. One
problem is that the count of busy queues will fluctuate a lot for sync
IO, so you'll have fairness issues. The number of potentially interested
processes needs to be a rolling average of some sort, not just looking
at ->busy_queues.
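Something along these lines would do, i.e. a decaying average that is
updated as queues come and go; this is only a sketch with made-up names,
not existing cfq code.

  #define BUSY_AVG_SHIFT  3       /* newest sample gets weight 1/8 */

  struct busy_avg {
          unsigned int avg;       /* running average << BUSY_AVG_SHIFT */
  };

  /* call whenever a queue is added to or removed from the busy list */
  static void busy_avg_update(struct busy_avg *ba, unsigned int busy_queues)
  {
          ba->avg -= ba->avg >> BUSY_AVG_SHIFT;
          ba->avg += busy_queues;
  }

  static unsigned int busy_avg_read(const struct busy_avg *ba)
  {
          return ba->avg >> BUSY_AVG_SHIFT;
  }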
> > Have you tried benchmarking
> > buffered writes with reads?
>
> Yes. I used that workload for benchmarks while tuning the patch.
> Adding async writes doesn't change the results, mostly because cfq
> preempts async queues when sync queues have new requests, and with
> many readers, there are always plenty of incoming reads. Writes have
> almost no chance to happen.
OK, it should not, if the slice start logic is working. Just wanted to
make sure :-)
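
(For reference, the preemption behaviour referred to above boils down to
roughly this; a simplified sketch with hypothetical names, not the exact
cfq code.)

  struct queue {
          int sync;               /* 1 for reads/sync writes, 0 for buffered writes */
          int has_new_request;
  };

  /* a sync queue that gets a new request kicks out an active async queue */
  static int should_preempt(const struct queue *active, const struct queue *newq)
  {
          if (newq->sync && !active->sync && newq->has_new_request)
                  return 1;
          return 0;
  }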
--
Jens Axboe