Message-ID: <20090709154446.GD30832@redhat.com>
Date:	Thu, 9 Jul 2009 11:44:46 -0400
From:	Vivek Goyal <vgoyal@...hat.com>
To:	Jens Axboe <jens.axboe@...cle.com>
Cc:	Jeff Moyer <jmoyer@...hat.com>, linux-kernel@...r.kernel.org,
	akpm@...ux-foundation.org
Subject: Re: [PATCH 2/2] cfq-iosched: get rid of the need for __GFP_FAIL in
	cfq_find_alloc_queue()

On Sat, Jun 27, 2009 at 08:26:17PM +0200, Jens Axboe wrote:
> On Fri, Jun 26 2009, Jeff Moyer wrote:
> > Jens Axboe <jens.axboe@...cle.com> writes:
> > 
> > > Setup an emergency fallback cfqq that we allocate at IO scheduler init
> > > time. If the slab allocation fails in cfq_find_alloc_queue(), we'll just
> > > punt IO to that cfqq instead. This ensures that cfq_find_alloc_queue()
> > > never fails without having to ensure free memory.
> > >
> > > Signed-off-by: Jens Axboe <jens.axboe@...cle.com>
> > > ---
> > >  block/cfq-iosched.c |  124 +++++++++++++++++++++++++++-----------------------
> > >  1 files changed, 67 insertions(+), 57 deletions(-)
> > >
> > > diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> > > index c760ae7..91e7e0b 100644
> > > --- a/block/cfq-iosched.c
> > > +++ b/block/cfq-iosched.c
> > > +	/*
> > > +	 * Fallback dummy cfqq for extreme OOM conditions
> > > +	 */
> > > +	struct cfq_queue oom_cfqq;
> > 
> > OK, so you're embedding a cfqq into the cfqd.  That's 136 bytes, so I
> > guess that's not too bad.
> > 
> > > +	/*
> > > +	 * Our fallback cfqq if cfq_find_alloc_queue() runs into OOM issues.
> > > +	 * Grab a permanent reference to it, so that the normal code flow
> > > +	 * will not attempt to free it.
> > > +	 */
> > > +	cfq_init_cfqq(cfqd, &cfqd->oom_cfqq, 1, 0);
> > > +	atomic_inc(&cfqd->oom_cfqq.ref);
> > > +
> > 
> > I guess this is so we never try to free it, good.  ;)
> > 
> > One issue I have with this patch is that, if a task happens to run into
> > this condition, there is no way out.  It will always have the oom_cfqq
> > as its cfqq.  Can't we fix that if we recover from the OOM condition?
> 
> Yeah, I fixed that about an hour after posting the patches. See:
> 
> http://git.kernel.dk/?p=linux-2.6-block.git;a=commit;h=0370bc158cb1d5faa4b8a38c0de3211f0fd5bd64
> 

Hi Jens,

I think the above patch might not fix the issue of oom_cfqq getting stuck
with an io context. Once we allocate the cfqq, it is cached in the cic,
and when the next request comes in we retrieve it from the cic and never
call cfq_get_queue()/cfq_find_alloc_queue() again.

I think we probably need to do a cfqq == oom_cfqq check in
cfq_set_request() as well.
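
To illustrate the caching problem outside the kernel, here is a minimal
userspace sketch. It is not the cfq code: the structs are cut-down
stand-ins, and alloc_fails, real_cfqq, get_queue(), set_request() and
stuck_after_recovery() are hypothetical names made up for this example.
Only the shape of the lookup condition follows cfq_set_request().

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures, not the real layouts. */
struct cfq_queue {
	int unused;
};

struct cfq_data {
	struct cfq_queue oom_cfqq;	/* embedded fallback queue */
	struct cfq_queue real_cfqq;	/* stands in for a slab-allocated cfqq */
	bool alloc_fails;		/* hypothetical knob simulating OOM */
};

struct cfq_io_context {
	struct cfq_queue *cfqq;		/* queue cached for this context */
};

/* Mimics cfq_get_queue(): hand out oom_cfqq when allocation fails. */
static struct cfq_queue *get_queue(struct cfq_data *cfqd)
{
	return cfqd->alloc_fails ? &cfqd->oom_cfqq : &cfqd->real_cfqq;
}

/*
 * Mimics the lookup in cfq_set_request(). With recheck_oom == false only
 * a missing cfqq triggers a new lookup; with recheck_oom == true a cached
 * oom_cfqq is treated like a missing one.
 */
static struct cfq_queue *set_request(struct cfq_data *cfqd,
				     struct cfq_io_context *cic,
				     bool recheck_oom)
{
	struct cfq_queue *cfqq = cic->cfqq;

	if (!cfqq || (recheck_oom && cfqq == &cfqd->oom_cfqq)) {
		cfqq = get_queue(cfqd);
		cic->cfqq = cfqq;
	}
	return cfqq;
}

/* True if the context still uses oom_cfqq after allocations work again. */
static bool stuck_after_recovery(bool recheck_oom)
{
	struct cfq_data cfqd = { .alloc_fails = true };
	struct cfq_io_context cic = { .cfqq = NULL };

	set_request(&cfqd, &cic, recheck_oom);	/* first request, under OOM */
	cfqd.alloc_fails = false;		/* memory pressure is gone */
	return set_request(&cfqd, &cic, recheck_oom) == &cfqd.oom_cfqq;
}
```

In this model, stuck_after_recovery(false) reports that the context stays
on oom_cfqq, while stuck_after_recovery(true) reports that it moves to a
regular queue on the next request.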


---
 block/cfq-iosched.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux5/block/cfq-iosched.c
===================================================================
--- linux5.orig/block/cfq-iosched.c	2009-07-04 13:58:48.000000000 -0400
+++ linux5/block/cfq-iosched.c	2009-07-09 11:33:45.000000000 -0400
@@ -2311,7 +2311,7 @@ cfq_set_request(struct request_queue *q,
 		goto queue_fail;
 
 	cfqq = cic_to_cfqq(cic, is_sync);
-	if (!cfqq) {
+	if (!cfqq || cfqq == &cfqd->oom_cfqq) {
 		cfqq = cfq_get_queue(cfqd, is_sync, cic->ioc, gfp_mask);
 		cic_set_cfqq(cic, cfqq, is_sync);
 	}

