Message-Id: <20090130141744.007fe725.akpm@linux-foundation.org>
Date: Fri, 30 Jan 2009 14:17:44 -0800
From: Andrew Morton <akpm@...ux-foundation.org>
To: Rusty Russell <rusty@...tcorp.com.au>
Cc: travis@....com, mingo@...hat.com, davej@...hat.com,
cpufreq@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/3] work_on_cpu: Use our own workqueue.
On Sat, 31 Jan 2009 08:29:15 +1030
Rusty Russell <rusty@...tcorp.com.au> wrote:
> On Friday 30 January 2009 17:00:42 Andrew Morton wrote:
> > On Fri, 30 Jan 2009 16:33:53 +1030 Rusty Russell <rusty@...tcorp.com.au> wrote:
> >
> > > On Thursday 29 January 2009 12:42:05 Andrew Morton wrote:
> > > > On Thu, 29 Jan 2009 12:13:32 +1030 Rusty Russell <rusty@...tcorp.com.au> wrote:
> > > >
> > > > > On Thursday 29 January 2009 06:14:40 Andrew Morton wrote:
> > > > > > It's vulnerable to the same deadlock, I think? Suppose we have:
> > > > > ...
> > > > > > - A calls work_on_cpu() and takes woc_mutex.
> > > > > >
> > > > > > - Before function_which_takes_L() has started to execute, task B takes L
> > > > > > then calls work_on_cpu() and task B blocks on woc_mutex.
> > > > > >
> > > > > > - Now function_which_takes_L() runs, and blocks on L
> > > > >
> > > > > Agreed, but now it's a fairly simple case. Both sides have to take lock L, and both have to call work_on_cpu.
> > > > >
> > > > > Workqueues are more generic and widespread, and an amazing amount of stuff gets called from them. That's why I felt uncomfortable with removing the one known problematic caller.
> > > > >
> > > >
> > > > hm. it's a bit of a timebomb.
> > > >
> > > > y'know, the original way in which acpi-cpufreq did this is starting to
> > > > look attractive. Migrate self to that CPU then just call the dang
> > > > function. Slow, but no deadlocks (I think)?
> > >
> > > Just buggy. What random thread was it mugging? If there's any path where
> > > it's not a kthread, what if userspace does the same thing at the same time?
> > > We risk running on the wrong cpu, *then* overriding userspace when we restore
> > > it.
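Roughly, the "migrate self, call, migrate back" pattern being objected to looks
like the sketch below; call_on_cpu() and the surrounding details are illustrative,
not the actual acpi-cpufreq code:

    #include <linux/cpumask.h>
    #include <linux/sched.h>

    /* illustrative helper, not a real kernel symbol */
    static long call_on_cpu(int cpu, long (*fn)(void *), void *arg)
    {
            cpumask_t saved_mask = current->cpus_allowed;
            long ret;

            set_cpus_allowed_ptr(current, cpumask_of(cpu));  /* migrate self */
            ret = fn(arg);          /* hope we really are on 'cpu' by now */

            /*
             * If userspace changed our affinity in the meantime (nothing
             * guarantees the caller is a kthread), fn() may have run on
             * the wrong cpu, and this restore silently overrides whatever
             * userspace just set.
             */
            set_cpus_allowed_ptr(current, &saved_mask);
            return ret;
    }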
> >
> > hm, OK, not unfixable but not pleasant.
> >
> > > In general these cpumask games are a bad idea.
> >
> > So we still don't have any non-buggy proposal.
>
> I disagree about the avoiding-workqueue one being buggy.
>
I assume you're talking about the patch I looked at a couple of days ago.
It's vulnerable to the same deadlock as work_on_cpu() has always been.
Just as an example, take a look at allocate_threshold_blocks(). That
function, way down in the innards of x86, has blotted out large amounts of
kernel code, so that code now cannot use work_on_cpu(). Anything which
happens inside ext3 commit (the entire block layer and all drivers
underneath it). Large lumps of networking code. Parts of the page
allocator and the VFS which I haven't started to think about yet.
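The shape of it, as a minimal sketch; lock_L, needs_L() and trivial_fn() are
illustrative names standing in for allocate_threshold_blocks() and whatever
lock the caller happens to hold:

    #include <linux/mutex.h>
    #include <linux/workqueue.h>

    static DEFINE_MUTEX(lock_L);   /* any lock a work_on_cpu() callback takes */

    /* queued by some unrelated subsystem, the way the mce code queues
     * allocate_threshold_blocks() */
    static long needs_L(void *unused)
    {
            mutex_lock(&lock_L);   /* blocks: task B below holds L */
            mutex_unlock(&lock_L);
            return 0;
    }

    static long trivial_fn(void *unused)
    {
            return 0;              /* never gets to run */
    }

    /* task B, somewhere under ext3 commit / the block layer / ... */
    static void task_b(void)
    {
            mutex_lock(&lock_L);
            /*
             * needs_L() is already on (or running from) the dedicated
             * workqueue for this cpu, so trivial_fn() queues behind it,
             * work_on_cpu() never returns and lock_L is never released.
             */
            work_on_cpu(0, trivial_fn, NULL);
            mutex_unlock(&lock_L);
    }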
> The same logic
> applies to any simple callback function.
Not! The difference here is the queueing and serialisation, which introduce
dependencies between unrelated subsystems that happen to use this piece of
core infrastructure.
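Concretely (names made up again): with a direct call, foo() only has to worry
about the locks it and its caller use; routed through the shared, serialised
work_on_cpu() queue it also inherits whatever every other user of that queue
is doing:

    #include <linux/workqueue.h>

    static long foo(void *arg)
    {
            return 0;              /* only foo()'s own locking matters */
    }

    static void caller(void *arg)
    {
            /* direct call: no interaction with anyone else's work */
            foo(arg);

            /*
             * Through the shared queue, foo() also waits behind whatever
             * unrelated subsystems queued earlier, and the caller deadlocks
             * if one of those items wants a lock the caller holds.
             */
            work_on_cpu(0, foo, arg);
    }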