Date:	Thu, 9 Jul 2009 19:38:23 +0200
From:	Jens Axboe <jens.axboe@...cle.com>
To:	Vivek Goyal <vgoyal@...hat.com>
Cc:	Jeff Moyer <jmoyer@...hat.com>, linux-kernel@...r.kernel.org,
	akpm@...ux-foundation.org
Subject: Re: [PATCH 2/2] cfq-iosched: get rid of the need for __GFP_FAIL in
	cfq_find_alloc_queue()

On Thu, Jul 09 2009, Vivek Goyal wrote:
> On Sat, Jun 27, 2009 at 08:26:17PM +0200, Jens Axboe wrote:
> > On Fri, Jun 26 2009, Jeff Moyer wrote:
> > > Jens Axboe <jens.axboe@...cle.com> writes:
> > > 
> > > > Setup an emergency fallback cfqq that we allocate at IO scheduler init
> > > > time. If the slab allocation fails in cfq_find_alloc_queue(), we'll just
> > > > punt IO to that cfqq instead. This ensures that cfq_find_alloc_queue()
> > > > never fails without having to ensure free memory.
> > > >
> > > > Signed-off-by: Jens Axboe <jens.axboe@...cle.com>
> > > > ---
> > > >  block/cfq-iosched.c |  124 +++++++++++++++++++++++++++-----------------------
> > > >  1 files changed, 67 insertions(+), 57 deletions(-)
> > > >
> > > > diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> > > > index c760ae7..91e7e0b 100644
> > > > --- a/block/cfq-iosched.c
> > > > +++ b/block/cfq-iosched.c
> > > > +	/*
> > > > +	 * Fallback dummy cfqq for extreme OOM conditions
> > > > +	 */
> > > > +	struct cfq_queue oom_cfqq;
> > > 
> > > OK, so you're embedding a cfqq into the cfqd.  That's 136 bytes, so I
> > > guess that's not too bad.
> > > 
> > > > +	/*
> > > > +	 * Our fallback cfqq if cfq_find_alloc_queue() runs into OOM issues.
> > > > +	 * Grab a permanent reference to it, so that the normal code flow
> > > > +	 * will not attempt to free it.
> > > > +	 */
> > > > +	cfq_init_cfqq(cfqd, &cfqd->oom_cfqq, 1, 0);
> > > > +	atomic_inc(&cfqd->oom_cfqq.ref);
> > > > +
> > > 
> > > I guess this is so we never try to free it, good.  ;)
> > > 
> > > One issue I have with this patch is that, if a task happens to run into
> > > this condition, there is no way out.  It will always have the oom_cfqq
> > > as its cfqq.  Can't we fix that if we recover from the OOM condition?
> > 
> > Yeah, I fixed that about an hour after posting the patches. See:
> > 
> > http://git.kernel.dk/?p=linux-2.6-block.git;a=commit;h=0370bc158cb1d5faa4b8a38c0de3211f0fd5bd64
> > 
> 
> Hi Jens,
> 
> I think above patch might not fix the issue of an oom_cfqq getting stuck
> with an io context. The reason being that once we allocate the cfqq, it
> will be cached in cic and once next request comes, we will retrieve it
> from cic and never call cfq_get_queue()/cfq_find_alloc_queue().
> 
> I think we probably need to do cfqq == oom_cfqq check in cfq_set_request()
> also.

Yes good catch, this is needed too!  Can you please send as a "real"
patch with signed-off-by added? Thanks!

-- 
Jens Axboe
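
The two points discussed above (an embedded fallback queue that holds a permanent reference, and re-checking a cached queue against the fallback on every request) can be illustrated with a small user-space sketch. This is not the kernel code: `struct queue`, `struct sched_data`, `struct io_ctx`, `find_alloc_queue()`, `put_queue()`, `set_request()` and the `fail_alloc` switch are hypothetical stand-ins for the cfqq, cfqd, cic and the corresponding cfq functions, and the reference counting is deliberately simplified to one reference per cached pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a cfqq: only the reference count is modeled. */
struct queue {
    int ref;
};

/* Hypothetical stand-in for cfqd, with the fallback queue embedded. */
struct sched_data {
    struct queue oom_queue; /* fallback used when allocation fails */
    bool fail_alloc;        /* assumption: switch to simulate OOM */
};

/* Hypothetical stand-in for the cic, which caches the task's queue. */
struct io_ctx {
    struct queue *cached;
};

static void sched_init(struct sched_data *sd)
{
    /* Permanent reference, so the normal put path never frees it. */
    sd->oom_queue.ref = 1;
    sd->fail_alloc = false;
}

/* Never fails: falls back to the embedded queue if allocation fails. */
static struct queue *find_alloc_queue(struct sched_data *sd)
{
    struct queue *q = sd->fail_alloc ? NULL : calloc(1, sizeof(*q));

    if (!q)
        q = &sd->oom_queue;
    q->ref++;
    return q;
}

static void put_queue(struct sched_data *sd, struct queue *q)
{
    assert(q->ref > 0);
    if (--q->ref == 0) {
        /* The permanent reference keeps the fallback from getting here. */
        assert(q != &sd->oom_queue);
        free(q);
    }
}

/*
 * Per-request lookup. Without the "q == &sd->oom_queue" test, a context
 * that once received the fallback would keep it forever, because the
 * cached pointer is non-NULL and the allocation path is never retried.
 */
static struct queue *set_request(struct sched_data *sd, struct io_ctx *ic)
{
    struct queue *q = ic->cached;

    if (!q || q == &sd->oom_queue) {
        struct queue *new_q = find_alloc_queue(sd);

        if (q)
            put_queue(sd, q); /* drop the cached reference to the fallback */
        ic->cached = new_q;
        q = new_q;
    }
    return q;
}
```

In this sketch a context that was handed the fallback under simulated memory pressure gets a real queue on its next request once allocation succeeds again, which is the behavior the extra check in cfq_set_request() is meant to provide.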
