linux-kernel - Re: [patch,rfc] cfq: merge cooperating cfq


Message-ID: <x49ljj470ig.fsf@segfault.boston.devel.redhat.com>
Date:	Wed, 21 Oct 2009 20:09:11 -0400
From:	Jeff Moyer <jmoyer@...hat.com>
To:	Corrado Zoccolo <czoccolo@...il.com>
Cc:	jens.axboe@...cle.com,
	Linux Kernel Mailing <linux-kernel@...r.kernel.org>
Subject: Re: [patch,rfc] cfq: merge cooperating cfq_queues

Corrado Zoccolo <czoccolo@...il.com> writes:

Hi, Corrado!  Thanks for looking at the patch.

> Hi Jeff,
[...]
> I'm not sure that 3 broken userspace programs justify increasing the
> complexity of a core kernel part such as the I/O scheduler.

I think it's wrong to call the userspace programs broken.  They worked
fine when CFQ was quantum based, and they work well with noop and
deadline.  Further, the patch I posted is fairly trivial, in my opinion.

> The original close cooperator code is not limited to those programs.
> It can actually result in a better overall scheduling on rotating
> media, since it can help with transient close relationships (and
> should probably be disabled on non-rotating ones).
> Merging queues, instead, can lead to bad results in case of false
> positives. I'm thinking, for example, of two programs that are loading
> shared libraries (that are close on disk, being in the same dir) on
> startup, and end up being tied to the same queue.

The idea is not to leave cfqq's merged indefinitely.  I'm putting
together a follow-on patch that will split the queues back up when they
are no longer working on the same area of the disk.
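
To make the merge-then-split idea concrete, here is a toy model in C. It is
only a sketch, not the CFQ code: the struct, the function names, and the
threshold value are all hypothetical, and it assumes that "working on the same
area of the disk" can be approximated by comparing the last sector each queue
touched.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical closeness threshold in sectors; not a value taken from CFQ. */
#define TOY_CLOSE_THRESHOLD 8192ULL

struct toy_queue {
    uint64_t last_sector;  /* position of the most recent request */
    int      merged_into;  /* index of the queue we share with, or -1 */
};

/* Two queues count as "close" if their last requests are within the threshold. */
bool toy_queues_close(const struct toy_queue *a, const struct toy_queue *b)
{
    uint64_t dist = a->last_sector > b->last_sector
                  ? a->last_sector - b->last_sector
                  : b->last_sector - a->last_sector;
    return dist <= TOY_CLOSE_THRESHOLD;
}

/* Re-evaluate after each request: stay merged while close, split once apart. */
void toy_update(struct toy_queue *qs, int self, int other, uint64_t sector)
{
    qs[self].last_sector = sector;
    if (toy_queues_close(&qs[self], &qs[other]))
        qs[self].merged_into = other;
    else if (qs[self].merged_into == other)
        qs[self].merged_into = -1;
}
```

In this model, the false positive Corrado describes (two programs loading
nearby shared libraries) merges the queues only until one of them moves to a
distant part of the disk.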

> Can't the userspace programs be fixed to use the same I/O context for
> their threads?
> qemu already has a bug report for it
> (https://bugzilla.redhat.com/show_bug.cgi?id=498242).

I submitted a patch to dump(8) to address this.  I think the SCSI target
mode driver folks also patched their code.  The qemu folks are working
on a couple of different fixes to the problem.  That leaves nfsd, which
I could certainly try to whip into shape, but I wonder if there are
others.
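
For reference, sharing an I/O context from userspace comes down to passing
CLONE_IO to clone(2).  The following is a minimal sketch assuming glibc's
clone() wrapper on Linux.  The helper names, the stack size, and the example
worker are made up for illustration and are not taken from any of the patches
mentioned above.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define WORKER_STACK_SIZE (64 * 1024)  /* arbitrary size for this sketch */

/*
 * Start fn(arg) in a child process that shares the caller's I/O context.
 * Returns the child's pid and stores the stack in *stack_out, or returns -1.
 */
pid_t spawn_io_sharing_worker(int (*fn)(void *), void *arg, void **stack_out)
{
    char *stack = malloc(WORKER_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The stack grows down on most architectures, so pass its top. */
    pid_t pid = clone(fn, stack + WORKER_STACK_SIZE, CLONE_IO | SIGCHLD, arg);
    if (pid == -1) {
        free(stack);
        return -1;
    }
    *stack_out = stack;
    return pid;
}

/* Wait for the worker, release its stack, and return its exit status or -1. */
int reap_worker(pid_t pid, void *stack)
{
    int status = 0;
    pid_t r = waitpid(pid, &status, 0);
    free(stack);
    if (r != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

/* Hypothetical worker; a real one would issue its share of the reads. */
int example_worker(void *arg)
{
    (void)arg;
    return 7;
}
```

A caller would pair spawn_io_sharing_worker() with reap_worker() for each
worker.  Because parent and children share one I/O context, the scheduler
treats their requests as coming from a single process.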

>> The next step will be to break apart the cfqq's when the I/O patterns
>> are no longer sequential.  This is not very important for dump(8), but
>> for NFSd, this could make a big difference.  The problem with sharing
>> the cfq_queue when the NFSd threads are no longer serving requests from
>> a single client is that instead of having 8 scheduling entities, NFSd
>> only gets one.  This could considerably hurt performance when serving
>> shares to multiple clients, though I don't have a test to show this yet.
>
> I think it will hurt performance only if it is competing with other
> I/O. In that case, having 8 scheduling entities will get 8 times more
> disk share (but this can be fixed by adjusting the nfsd I/O priority).

It may be common that nfsd is the only thing accessing the device, good
point.
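
The priority adjustment Corrado mentions can be made from userspace with the
ioprio_set(2) system call.  Below is a sketch assuming Linux and a process ID
supplied by the caller.  glibc has no wrapper for this call, so the constants
are defined locally to mirror the kernel's ioprio encoding; the TOY_ names and
helper functions are hypothetical.

```c
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Local copies of the kernel's ioprio encoding; names are made up here. */
#define TOY_IOPRIO_CLASS_SHIFT 13
#define TOY_IOPRIO_CLASS_BE    2   /* best-effort scheduling class */
#define TOY_IOPRIO_WHO_PROCESS 1   /* "who" argument selects a single process */

/* Pack a scheduling class and a priority level into one ioprio value. */
int toy_ioprio_value(int ioclass, int level)
{
    return (ioclass << TOY_IOPRIO_CLASS_SHIFT) | level;
}

/*
 * Give `pid` best-effort priority `level` (0 is highest, 7 is lowest).
 * Returns 0 on success and -1 on error, with errno set by the kernel.
 */
int toy_set_best_effort(pid_t pid, int level)
{
    return (int)syscall(SYS_ioprio_set, TOY_IOPRIO_WHO_PROCESS, pid,
                        toy_ioprio_value(TOY_IOPRIO_CLASS_BE, level));
}
```

The ionice(1) utility sets the same priorities from the command line.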

> For the I/O pattern, instead, sorting all requests in a single queue
> may still be preferable, since they will be at least sorted in disk
> order, instead of the random order given by which thread in the pool
> received the request.
> This is, though, an argument in favor of using CLONE_IO inside nfsd,
> since having a single queue, with proper priority, will always give a
> better overall performance.

Well, I started to work on a patch to nfsd that would share and unshare
I/O contexts based on the client with which the request was associated.
So, much like there is the shared readahead state, there would now be a
shared I/O scheduler state.  However, believe it or not, it is much
simpler to do in the I/O scheduler.  But maybe that's because cfq is my
hammer.  ;-)
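
Corrado's disk-order argument can be illustrated with a small calculation.
This sketch compares the total head movement for a batch of requests served in
arrival order against the same batch sorted by sector, which a single shared
queue allows.  It is a simplified model that counts only seek distance in
sectors; the function names are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Sum of absolute head movements when serving sectors in the given order. */
uint64_t toy_total_seek(const uint64_t *sectors, size_t n, uint64_t start)
{
    uint64_t pos = start, total = 0;
    for (size_t i = 0; i < n; i++) {
        total += sectors[i] > pos ? sectors[i] - pos : pos - sectors[i];
        pos = sectors[i];
    }
    return total;
}

/* qsort comparator for ascending sector numbers, safe against overflow. */
static int cmp_sector(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a, y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/* Sort requests into ascending sector order, as a single shared queue could. */
void toy_sort_by_sector(uint64_t *sectors, size_t n)
{
    qsort(sectors, n, sizeof sectors[0], cmp_sector);
}
```

For requests arriving at sectors 800, 100, 900, 200 with the head at sector 0,
the arrival order costs 3000 sectors of movement and the sorted order costs
900.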

Thanks again for your review, Corrado.  It is much appreciated.

Cheers,
Jeff