Message-ID: <000001c71fec$8d5ca370$d034030a@amr.corp.intel.com>
Date:	Thu, 14 Dec 2006 17:58:43 -0800
From:	"Chen, Kenneth W" <kenneth.w.chen@...el.com>
To:	"'Andrew Morton'" <akpm@...l.org>, <linux-aio@...ck.org>,
	"'xb'" <xavier.bru@...l.net>, <linux-kernel@...r.kernel.org>
Subject: RE: 2.6.18.4: flush_workqueue calls mutex_lock in interrupt environment

Andrew Morton wrote on Thursday, December 14, 2006 5:20 PM
> it's hard to disagree.
> 
> Begin forwarded message:
> > On Wed, 2006-12-13 at 08:25 +0100, xb wrote:
> > > Hi all,
> > > 
> > > Running some IO stress tests on an 8-way IA64 platform, we got:
> > >      BUG: warning at kernel/mutex.c:132/__mutex_lock_common()  message
> > > followed by:
> > >      Unable to handle kernel paging request at virtual address
> > > 0000000000200200
> > > an oops corresponding to anon_vma_unlink() calling list_del() on a
> > > poisoned list.
> > > 
> > > Looking at the stack, we see that flush_workqueue() calls
> > > mutex_lock() with softirqs disabled.
> > 
> > something is wrong here... flush_workqueue() is a sleeping function and
> > is not allowed to be called in such a context!
> 
> It seems utterly insane to have aio_complete() flush a workqueue. That
> function has to be called from a number of different environments,
> including non-sleep tolerant environments.
> 
> For instance it means that directIO on NFS will now cause the rpciod
> workqueues to call flush_workqueue(aio_wq), thus slowing down all RPC
> activity.
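
The underlying constraint is that flush_workqueue() can sleep, so reaching it
from a completion path that runs with softirqs disabled trips the
sleeping-while-atomic debug check. A minimal userspace model of that failure
mode, not the actual kernel code (the fake_* names and the might_sleep-style
check are only illustrative):

#include <stdio.h>

static int in_atomic_ctx;       /* stands in for "softirqs disabled / irq context" */

static void might_sleep_check(const char *func)
{
        /* models the kernel's sleeping-while-atomic debug warning */
        if (in_atomic_ctx)
                fprintf(stderr, "BUG: %s may sleep, but caller is atomic\n", func);
}

static void fake_flush_workqueue(void)
{
        might_sleep_check("flush_workqueue");   /* mutex_lock() would warn here */
}

static void fake_put_ioctx(int *users)
{
        if (--*users == 0)                      /* last reference dropped...    */
                fake_flush_workqueue();         /* ...so the teardown path runs */
}

static void fake_aio_complete(int *users)
{
        in_atomic_ctx = 1;                      /* completion runs from irq/softirq */
        fake_put_ioctx(users);
        in_atomic_ctx = 0;
}

int main(void)
{
        int users = 1;                          /* the buggy case: only one ref left */
        fake_aio_complete(&users);              /* prints the BUG-style warning      */
        return 0;
}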

The bug appears to be somewhere else; somehow the ref count on the ioctx is
all messed up.

In aio_complete(), __put_ioctx() should not be invoked, because the ref count
on the ioctx is supposedly more than 2; aio_complete() decrements it once and
should return without invoking the free function.
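
Put differently, the intended rule is that an extra reference stays pinned
for as long as a request can still complete, so the decrement in
aio_complete() can never be the one that reaches zero. A rough userspace
sketch of that discipline, using C11 atomics instead of the real
kioctx/get_ioctx/put_ioctx helpers:

#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

struct ioctx_model {
        atomic_int users;               /* stands in for the ioctx ref count */
};

static void get_ioctx(struct ioctx_model *ctx)
{
        atomic_fetch_add(&ctx->users, 1);
}

static void put_ioctx(struct ioctx_model *ctx)
{
        /* only the put that drops the count to zero runs the free path */
        if (atomic_fetch_sub(&ctx->users, 1) == 1) {
                printf("__put_ioctx: freeing %p\n", (void *)ctx);
                free(ctx);
        }
}

int main(void)
{
        struct ioctx_model *ctx = malloc(sizeof(*ctx));

        if (!ctx)
                return 1;
        atomic_init(&ctx->users, 1);    /* reference held by the creating task     */
        get_ioctx(ctx);                 /* extra reference while a request is live */

        put_ioctx(ctx);                 /* aio_complete()'s put: 2 -> 1, must NOT
                                           reach the free path                     */
        put_ioctx(ctx);                 /* final put from io_destroy()/exit_aio():
                                           1 -> 0, free runs where sleeping is OK  */
        return 0;
}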

The real freeing of the ioctx should come from exit_aio() or io_destroy(),
both of which wait until there are no pending AIO requests via
wait_for_all_aios().
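
That ordering, drain first and only then drop the last reference from process
context, is what keeps the free path out of interrupt context. A simplified
stand-in model (wait_for_all_aios_model() is a busy poll, not the real
waitqueue-based helper):

#include <stdatomic.h>
#include <stdio.h>

static atomic_int reqs_active;                  /* in-flight AIO requests on the ctx */

static void aio_complete_model(void)
{
        atomic_fetch_sub(&reqs_active, 1);      /* request retired                   */
}

static void wait_for_all_aios_model(void)
{
        /* the real helper sleeps on a waitqueue; a poll stands in for it */
        while (atomic_load(&reqs_active) > 0)
                ;
}

static void io_destroy_model(void)
{
        wait_for_all_aios_model();              /* drain completions first...        */
        printf("...then drop the last reference and free the ioctx\n");
}

int main(void)
{
        atomic_fetch_add(&reqs_active, 1);      /* io_submit: request queued         */
        aio_complete_model();                   /* completion from interrupt context */
        io_destroy_model();                     /* teardown from process context     */
        return 0;
}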

- Ken
