linux-kernel - Re: [PATCH 0/11] Per-bdi writeback flusher threads v8

Message-ID: <m1octdw8sv.fsf@fess.ebiederm.org>
Date:	Thu, 28 May 2009 08:23:28 -0700
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Jan Kara <jack@...e.cz>
Cc:	Theodore Tso <tytso@....edu>, Jens Axboe <jens.axboe@...cle.com>,
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	chris.mason@...cle.com, david@...morbit.com, hch@...radead.org,
	akpm@...ux-foundation.org, yanmin_zhang@...ux.intel.com,
	richard@....demon.co.uk, damien.wyart@...e.fr,
	Alex Chiang <achiang@...com>,
	"Eric W. Biederman" <ebiederm@...stanetworks.com>
Subject: Re: [PATCH 0/11] Per-bdi writeback flusher threads v8

Jan Kara <jack@...e.cz> writes:

> On Wed 27-05-09 20:49:59, Theodore Tso wrote:
>> On Wed, May 27, 2009 at 09:45:43PM +0200, Jens Axboe wrote:
>> > 
>> > This one has been tested good, where good means that it boots and
>> > functions normally at least. Whether it fixes your issue, that would be
>> > interesting to know :-)
>> > 
>> 
>> Unfortunately, it doesn't seem to have.  Here's a dmesg with the
>> softlockup report and the sysrq-t output.  Unfortunately the dmesg
>> file is too big for LKML, so I've compressed it so you can get the
>> whole thing.
>   Everybody waits for sys_sync() to complete and they never seem to be
> woken up. Jens, wb_work_complete() seems a bit fishy - who does
> wb_clear_work() in sync_mode == WB_SYNC_ALL which is on stack?
>
>> There's also a lockdep warning which fsx triggered.
>   The lockdep warning is definitely unrelated. It's really a possible
> deadlock, although not quite probable. IMHO the problem is that
> sysfs_mutex gets above mmap_sem due to code in sysfs_readdir which calls
> filldir() which may cause page fault. At the same time it gets quite low
> on the lock stack because filesystems call sysfs functions from their
> internal functions (in this case ext4_put_super) holding quite some locks.
> Adding a few CC's for this.

Interesting.

I thought the network stack was the only piece of code silly enough
to hold locks while deleting sysfs files.

Holding any lock while deleting objects from sysfs, sysctl, or proc
is asking for serious mischief, and it is unfixable from the fs side.

The usual problem is that lockdep doesn't yet understand
sysfs_deactivate, which waits for any running sysfs operations to
complete before it deletes the sysfs files.

Which means any lock you hold in a show or store method can deadlock
with any lock you hold while deleting from sysfs.
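
A rough userspace sketch of that dependency, in case it helps. This is
not kernel code and not how lockdep is implemented; it is a toy
lock-order graph in which the wait inside sysfs_deactivate is modelled
as one more "lock" (ATTR_ACTIVE). All names in it (lock_graph,
graph_add, graph_has_cycle, SUBSYS_LOCK, ATTR_ACTIVE) are made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical node ids, for illustration only.
 * SUBSYS_LOCK: some lock the subsystem takes in show/store and on removal.
 * ATTR_ACTIVE: pseudo-lock standing in for "a show/store call is running";
 *              the removal path waits for it to drain. */
enum { SUBSYS_LOCK, ATTR_ACTIVE, MAX_NODES = 8 };

struct lock_graph {
    /* edge[a][b] is true if b was waited for while a was held */
    bool edge[MAX_NODES][MAX_NODES];
};

static void graph_init(struct lock_graph *g)
{
    memset(g, 0, sizeof(*g));
}

/* Record "waited for `wanted` while holding `held`". */
static void graph_add(struct lock_graph *g, int held, int wanted)
{
    assert(held >= 0 && held < MAX_NODES);
    assert(wanted >= 0 && wanted < MAX_NODES);
    g->edge[held][wanted] = true;
}

/* Depth-first search: is `target` reachable from `from`? */
static bool reaches(const struct lock_graph *g, int from, int target,
                    bool *seen)
{
    if (from == target)
        return true;
    seen[from] = true;
    for (int next = 0; next < MAX_NODES; next++)
        if (g->edge[from][next] && !seen[next] &&
            reaches(g, next, target, seen))
            return true;
    return false;
}

/* A cycle exists if some edge a -> b has a path from b back to a. */
static bool graph_has_cycle(const struct lock_graph *g)
{
    for (int a = 0; a < MAX_NODES; a++)
        for (int b = 0; b < MAX_NODES; b++) {
            bool seen[MAX_NODES] = { false };
            if (g->edge[a][b] && reaches(g, b, a, seen))
                return true;
        }
    return false;
}
```

A show method that takes SUBSYS_LOCK records the edge ATTR_ACTIVE ->
SUBSYS_LOCK, which on its own is harmless. Removing the attribute with
SUBSYS_LOCK held records SUBSYS_LOCK -> ATTR_ACTIVE, and
graph_has_cycle() then reports the inversion. Dropping the lock before
the removal means the second edge is never recorded.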

ext4 appears lock loose and fancy free in its show and store methods,
so it might be ok except for this issue of mmap_sem vs sysfs_mutex.
But apparently even that isn't enough to get rid of the requirement
to not hold locks when deleting objects.

Eric