linux-ext4 - Re: [RFC PATCH v2 7/7] ext4: fix race between blkdev_releasepage() and ext4_put


Message-ID: <20210420130841.GA3618564@infradead.org>
Date:   Tue, 20 Apr 2021 14:08:41 +0100
From:   Christoph Hellwig <hch@...radead.org>
To:     Zhang Yi <yi.zhang@...wei.com>
Cc:     Christoph Hellwig <hch@...radead.org>, linux-ext4@...r.kernel.org,
        linux-fsdevel@...r.kernel.org, tytso@....edu,
        adilger.kernel@...ger.ca, jack@...e.cz, yukuai3@...wei.com
Subject: Re: [RFC PATCH v2 7/7] ext4: fix race between blkdev_releasepage()
 and ext4_put_super()

On Fri, Apr 16, 2021 at 04:00:48PM +0800, Zhang Yi wrote:
> Now, we already use "if (bdev->bd_super)" to prevent calling into
> ->bdev_try_to_free_page unless the super is alive, and the problem is that
> bd_super can become NULL concurrently after this check. So, IIUC, switching
> to a check of whether the superblock is active has the same problem: the
> active flag could also become inactive (raced by umount) after we call into
> bdev_try_to_free_page().

Indeed.

> In order to close this race, one solution is to introduce a lock to
> synchronize the active state between kill_block_super() and
> blkdev_releasepage(), but the page-release path would have to try to acquire
> this lock in blkdev_releasepage() for each page, and the umount process would
> still need to wait until the page release finishes if someone has called into
> ->bdev_try_to_free_page(). I think this solution may affect performance and
> is not a good way. Thinking about it in depth, a percpu refcount seems to
> have the smallest performance effect on blkdev_releasepage().
> 
> If you don't like the refcount, maybe we could add synchronize_rcu_expedited()
> in ext4_put_super(); it could also prevent this race. Any suggestions?

I really don't like to put a lot of overhead into the core VFS and block
device code.  ext4/jbd does not own the block device inode and really
has no business controlling releasepage for it.  I suspect the right
answer might be to simply revert the commit that added this hook.
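
For readers following the thread, the refcount idea discussed above can be
sketched in userspace C. This is only an illustration under stated
assumptions, not code from the patch series: it uses a plain C11 atomic
counter rather than the kernel's percpu refcount, and every name in it
(struct fake_sb, fake_sb_tryget(), fake_releasepage(), fake_umount()) is
hypothetical. The point is that the release path takes a reference before
calling into the filesystem, instead of checking a pointer or flag that
umount can clear right after the check.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a superblock; NOT the kernel's struct super_block. */
struct fake_sb {
	atomic_int users;          /* 1 = the mount's own reference, 0 = torn down */
	atomic_int release_calls;  /* counts simulated filesystem callbacks */
};

static void fake_sb_init(struct fake_sb *sb)
{
	atomic_init(&sb->users, 1);
	atomic_init(&sb->release_calls, 0);
}

/* Take a reference only if the count has not already dropped to zero. */
static bool fake_sb_tryget(struct fake_sb *sb)
{
	int old = atomic_load(&sb->users);

	while (old > 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&sb->users, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if this was the last one. */
static bool fake_sb_put(struct fake_sb *sb)
{
	int old = atomic_fetch_sub(&sb->users, 1);

	assert(old > 0);
	return old == 1;
}

/* Release path: call into the "filesystem" only while holding a reference. */
static bool fake_releasepage(struct fake_sb *sb)
{
	if (!fake_sb_tryget(sb))
		return false;  /* umount already dropped the last reference */
	atomic_fetch_add(&sb->release_calls, 1);  /* stands in for the callback */
	/* If umount raced with us, this put is the last one and teardown belongs here. */
	fake_sb_put(sb);
	return true;
}

/* Umount path: drop the mount's reference; true means no caller is inside. */
static bool fake_umount(struct fake_sb *sb)
{
	return fake_sb_put(sb);
}
```

In this sketch a call to fake_releasepage() before fake_umount() runs the
callback, and a call after it is refused. The cost is an atomic
read-modify-write on a shared counter for every page, which is the kind of
overhead in common code that the reply above objects to.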