Message-ID: <20090402152853.GB17275@atrey.karlin.mff.cuni.cz>
Date:	Thu, 2 Apr 2009 17:28:53 +0200
From:	Jan Kara <jack@...e.cz>
To:	HongChao Zhang <zhanghc08@...oo.com.cn>
Cc:	linux-fsdevel@...r.kernel.org, viro@...iv.linux.org.uk,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: Problem in "prune_icache"

  Hi,

> I'm from the Lustre team (Lustre is a Sun Microsystems product that
> implements a scalable distributed filesystem), and we have encountered
> a deadlock in prune_icache. The details are as follows.
>
> While truncating a file, a new update is started in the current journal
> transaction. Memory turns out to be low during processing, so
> try_to_free_pages is called to free some pages, which finally calls
> shrink_icache_memory/prune_icache to free cache memory occupied by inodes.
> Note: prune_icache takes "iprune_mutex" and holds it for its whole pruning work.
>
> At the same time, kswapd has called shrink_icache_memory/prune_icache and
> holds "iprune_mutex". It found some inodes to dispose of and called
> clear_inode/DQUOT_DROP/fs-specific-quota-drop-op ("ldiskfs_dquot_drop" in our case)
> to drop the dquot. This fs-specific quota drop op can call journal_start to
> start a new update, but it finds that the number of buffers in the current
> transaction has reached j_max_transaction_buffers, so it wakes up kjournald
> to commit the transaction. kjournald then calls journal_commit_transaction,
> which sets the transaction state to T_LOCKED and checks whether there are
> still pending updates for the committing transaction. It finds one
> (the update started by the truncate operation above), so it waits for that
> update to complete. BUT the update cannot complete, because it cannot get
> "iprune_mutex", which is held by kswapd, so the deadlock is triggered.
  Yes, this has happened with other filesystems as well (ext3,
ext4,...). The usual solution for this problem is to specify GFP_NOFS for
all allocations that happen while the transaction is open. That way we
never recurse back into the filesystem from the allocation. Is
there some reason why that is a no-go for you?
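
  As a rough illustration of the GFP_NOFS point above, here is a user-space
sketch, not kernel code. Every MODEL_* and model_* name is hypothetical and
the flag values are made up; it only models the idea that reclaim may enter
filesystem code (and so need iprune_mutex) when the allocation mask permits
it, and it assumes kswapd is the other party holding the mutex.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits; the names echo the kernel's, the values are made up. */
#define MODEL_GFP_IO     0x1u  /* allocation may start block I/O */
#define MODEL_GFP_FS     0x2u  /* allocation may call back into the filesystem */
#define MODEL_GFP_KERNEL (MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOFS   (MODEL_GFP_IO)

enum model_result { MODEL_OK = 0, MODEL_WOULD_BLOCK = -1 };

/* Does reclaim for this mask run filesystem shrinkers (and so need iprune_mutex)? */
static bool model_reclaim_enters_fs(unsigned int gfp_mask)
{
	return (gfp_mask & MODEL_GFP_FS) != 0;
}

/*
 * Model of a low-memory allocation made while a journal handle is open.
 * other_holds_iprune_mutex stands in for kswapd sitting in prune_icache.
 */
static enum model_result model_alloc_in_transaction(unsigned int gfp_mask,
						    bool other_holds_iprune_mutex)
{
	if (model_reclaim_enters_fs(gfp_mask) && other_holds_iprune_mutex)
		return MODEL_WOULD_BLOCK; /* the deadlock case from the report */
	return MODEL_OK;
}

/* Small self-check of the two cases discussed above. */
static void model_demo(void)
{
	assert(model_alloc_in_transaction(MODEL_GFP_KERNEL, true) == MODEL_WOULD_BLOCK);
	assert(model_alloc_in_transaction(MODEL_GFP_NOFS, true) == MODEL_OK);
}
```

In the kernel itself the change is normally just the flag argument, e.g.
kmalloc(size, GFP_NOFS) instead of kmalloc(size, GFP_KERNEL) for allocations
made between journal_start and journal_stop.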

									Honza

-- 
Jan Kara <jack@...e.cz>
SuSE CR Labs