Message-ID: <CAFgt=MBAGurdr6Cce9zZ5sWX=BQz4RXJ4HHVWuUvM+gPcgpkFg@mail.gmail.com>
Date:	Fri, 26 Aug 2011 09:58:45 -0700
From:	Jiaying Zhang <jiayingz@...gle.com>
To:	"Ted Ts'o" <tytso@....edu>
Cc:	Tao Ma <tm@....ma>, Dave Chinner <david@...morbit.com>,
	linux-ext4@...r.kernel.org
Subject: Re: [URGENT PATCH] ext4: fix potential deadlock in ext4_evict_inode()

Hi Ted,

On Fri, Aug 26, 2011 at 8:52 AM, Ted Ts'o <tytso@....edu> wrote:
> On Fri, Aug 26, 2011 at 05:27:39PM +0800, Tao Ma wrote:
>> No, it doesn't mean the ext4_truncate. But another race pasted below.
>>
>> Flush inode's i_completed_io_list before calling ext4_io_wait to
>> prevent the following deadlock scenario: A page fault happens while
>> some process is writing inode A. During page fault,
>> shrink_icache_memory is called that in turn evicts another inode
>> B. Inode B has some pending io_end work so it calls ext4_ioend_wait()
>> that waits for inode B's i_ioend_count to become zero. However, inode
>> B's ioend work was queued behind some of inode A's ioend work on the
>> same cpu's ext4-dio-unwritten workqueue. As the ext4-dio-unwritten
>> thread on that cpu is processing inode A's ioend work, it tries to
>> grab inode A's i_mutex lock. Since the i_mutex lock of inode A was
>> taken before the page fault and is still held, we enter a deadlock.
>
> ... but that shouldn't be a problem since we're not holding A's
> i_mutex at this point, right?  Or am I missing something?
I think it is possible that we are holding A's i_mutex if the page
fault happens while we are writing inode A. The problem is that if we
call ext4_evict_inode() for inode B during page fault handling and
just call ext4_ioend_wait() to wait for inode B's i_ioend_count to
become zero, we rely on the ext4-dio-unwritten worker thread to
finish any queued work eventually. But as mentioned in the commit
log, B's io_end work may be queued behind A's work on the same cpu.
Since A's i_mutex may still be held during the page fault, the
ext4-dio-unwritten worker thread can't make progress.

Thinking about an alternative way to resolve the deadlock described
above: maybe we can use mutex_trylock() in ext4_end_io_work() and,
if we can't grab the i_mutex for an inode, just requeue the work at
the end of the workqueue?
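
To make the trylock-and-requeue idea concrete, here is a rough userspace
sketch, not kernel code. It assumes a single FIFO standing in for one
cpu's ext4-dio-unwritten workqueue and uses pthread mutexes in place of
i_mutex. The names fake_inode, work_queue, wq_push(), wq_pop() and
run_worker() are made up for illustration only; the real change would
live in ext4_end_io_work().

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAP 16

/* Hypothetical stand-in for an inode: a lock plus a pending-ioend counter. */
struct fake_inode {
	pthread_mutex_t i_mutex;
	int i_ioend_count;
};

/* Hypothetical FIFO modelling one cpu's workqueue (fixed-size ring buffer). */
struct work_queue {
	struct fake_inode *items[QUEUE_CAP];
	size_t head;
	size_t len;
};

/* Append a work item at the tail; returns false if the queue is full. */
static bool wq_push(struct work_queue *wq, struct fake_inode *inode)
{
	if (wq->len == QUEUE_CAP)
		return false;
	wq->items[(wq->head + wq->len) % QUEUE_CAP] = inode;
	wq->len++;
	return true;
}

/* Remove the item at the head; returns NULL if the queue is empty. */
static struct fake_inode *wq_pop(struct work_queue *wq)
{
	struct fake_inode *inode;

	if (wq->len == 0)
		return NULL;
	inode = wq->items[wq->head];
	wq->head = (wq->head + 1) % QUEUE_CAP;
	wq->len--;
	return inode;
}

/*
 * Process up to max_steps queue entries. If an inode's lock cannot be
 * taken without blocking, the item goes back to the tail so that work
 * queued behind it can still run. Returns the number of items completed.
 */
static int run_worker(struct work_queue *wq, int max_steps)
{
	int completed = 0;

	while (max_steps-- > 0) {
		struct fake_inode *inode = wq_pop(wq);

		if (!inode)
			break;
		if (pthread_mutex_trylock(&inode->i_mutex) != 0) {
			/* Lock is busy: requeue instead of sleeping on it. */
			wq_push(wq, inode);
			continue;
		}
		inode->i_ioend_count--;	/* the "end_io" work itself */
		pthread_mutex_unlock(&inode->i_mutex);
		completed++;
	}
	return completed;
}
```

With A's lock held and A queued ahead of B, the worker skips A, finishes
B (so a waiter on B's i_ioend_count could proceed), and picks A up again
once its lock is released. One caveat the sketch makes visible: while
A's lock stays held the worker keeps cycling over A's item, so a real
version would probably want some backoff rather than an immediate retry.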

Jiaying
>
>                                       - Ted
>