linux-kernel - Re: [RFT PATCH] blkio: alloc per cpu data from worker thread context( Re: kvm deadlock)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20111219173519.GL24519@google.com>
Date:	Mon, 19 Dec 2011 09:35:19 -0800
From:	Tejun Heo <tj@...nel.org>
To:	Vivek Goyal <vgoyal@...hat.com>
Cc:	Nate Custer <nate@...nel.net>, Jens Axboe <axboe@...nel.dk>,
	Avi Kivity <avi@...hat.com>,
	Marcelo Tosatti <mtosatti@...hat.com>, kvm@...r.kernel.org,
	linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [RFT PATCH] blkio: alloc per cpu data from worker thread
 context( Re: kvm deadlock)

On Mon, Dec 19, 2011 at 12:27:17PM -0500, Vivek Goyal wrote:
> On Sun, Dec 18, 2011 at 03:25:48PM -0600, Nate Custer wrote:
> > 
> > On Dec 16, 2011, at 2:29 PM, Vivek Goyal wrote:
> > > Thanks for testing it Nate. I did some debugging and found out that patch
> > > is doing double free on per cpu pointer hence the crash you are running
> > > into. I could reproduce this problem on my box. It is just a matter of
> > > doing rmdir on the blkio cgroup.
> > > 
> > > I understood the cmpxchg() semantics wrong. I have fixed it now and
> > > no crashes on directory removal. Can you please give this version a
> > > try.
> > > 
> > > Thanks
> > > Vivek
> > 
> > After 24 hours of stress testing the machine remains up and working without issue. I will continue to test it, but am reasonably confident that this patch resolves my issue.
> > 
> 
> Hi Nate,
> 
> I have come up with final version of the patch. This time I have used non
> rentrant work queue to queue the stat alloc work. This also gets rid of
> cmpxchg code as there is only one writer at a time. There are couple of
> other small cleanups.
> 
> Can you please give patch also a try to make sure I have not broken
> something while doing changes.
> 
> This version is based on 3.2-rc6. Once you confirm the results, I will
> rebase it on top of "linux-block for-3.3/core" and post it to Jens for
> inclusion.
> 
> Thanks
> Vivek
> 
> Block cgroup currently allocates percpu data for some stats. This allocation
> is happening in IO submission path which can recurse in IO stack.
> 
> Percpu allocation API does not take any allocation flags as input, hence
> to avoid the deadlock problem under memory pressure, alloc per cpu data
> from a worker thread context.
> 
> Only side affect of delayed allocation is that we will lose the blkio cgroup
> stats for that group a small duration.
> 
> In case per cpu memory allocation fails, worker thread re-arms the work
> with a delay of 1 second.
> 
> This patch is generated on top of 3.2-rc6
> 
> Signed-off-by: Vivek Goyal <vgoyal@...hat.com>
> Reported-by: Nate Custer <nate@...nel.net>
> Tested-by: Nate Custer <nate@...nel.net>

Hmmm... I really don't like this approach.  It's so unnecessarily
complex with extra refcounting and all when about the same thing can
be achieved by implementing simple mempool which is filled
asynchronously.  Also, the fix seems way too invasive even for -rc6,
let alone -stable.  If reverting isn't gonna be invasive, maybe that's
a better approach for now?

I've been thinking about it and I think the use case is legit and
maybe making percpu allocator support that isn't such a bad idea.  I'm
not completely sure yet tho.  I'll give it a try later today.

Thank you.

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/