Date:   Tue, 15 Sep 2020 11:27:19 +0200
From:   Jan Kara <jack@...e.cz>
To:     Dave Chinner <david@...morbit.com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Amir Goldstein <amir73il@...il.com>,
        Hugh Dickins <hughd@...gle.com>,
        Michael Larabel <Michael@...haellarabel.com>,
        Ted Ts'o <tytso@...gle.com>,
        Andreas Dilger <adilger.kernel@...ger.ca>,
        Ext4 Developers List <linux-ext4@...r.kernel.org>,
        Jan Kara <jack@...e.cz>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: Kernel Benchmarking

On Mon 14-09-20 09:45:03, Dave Chinner wrote:
> I have my doubts that complex page cache manipulation operations
> like ->migrate_page that rely exclusively on page and internal mm
> serialisation are really safe against ->fallocate based invalidation
> races.  I think they probably also need to be wrapped in the
> MMAPLOCK, but I don't understand all the locking and constraints
> that ->migrate_page has and there's been no evidence yet that it's a
> problem so I've kinda left that alone. I suspect that "no evidence"
> thing comes from "filesystem people are largely unable to induce
> page migrations in regression testing" so it has pretty much zero
> test coverage....

Last time I looked, ->migrate_page seemed safe to me. Page migration
happens under the page lock, so truncate_inode_pages_range() will block
until page migration is done (and this currently covers pretty much
anything fallocate related). And once truncate_inode_pages_range() is done,
there are no pages to migrate :) (plus the migration code checks
page->mapping != NULL after locking the page).
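
To make the argument above concrete, here is a minimal user-space sketch of
the "lock the page, then recheck page->mapping" pattern. It is only a model,
not kernel code: struct model_page, model_truncate() and model_migrate() are
hypothetical names, and a pthread mutex stands in for the page lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct page; not the kernel type. */
struct model_page {
    pthread_mutex_t lock;   /* plays the role of the page lock */
    void *mapping;          /* non-NULL while the page is in the page cache */
    int migrations;         /* how many times the modelled migration ran */
};

static void model_page_init(struct model_page *p, void *mapping)
{
    assert(p != NULL);
    pthread_mutex_init(&p->lock, NULL);
    p->mapping = mapping;
    p->migrations = 0;
}

/* Models truncation: wait for the page lock, then detach the page. */
static void model_truncate(struct model_page *p)
{
    assert(p != NULL);
    pthread_mutex_lock(&p->lock);   /* blocks while a migration holds the lock */
    p->mapping = NULL;
    pthread_mutex_unlock(&p->lock);
}

/* Models migration: lock first, then migrate only if still attached. */
static bool model_migrate(struct model_page *p)
{
    bool migrated = false;

    assert(p != NULL);
    pthread_mutex_lock(&p->lock);
    if (p->mapping != NULL) {       /* recheck after taking the lock */
        p->migrations++;
        migrated = true;
    }
    pthread_mutex_unlock(&p->lock);
    return migrated;
}
```

Whichever side takes the lock first, the other either waits for it or finds
mapping == NULL afterwards and backs off, which is the property relied on
above.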

But I agree testing would be nice. When I was chasing a data corruption in
the block device page cache caused by page migration, I was using the
thpscale [1] or thpfioscale [2] benchmarks from mmtests, which create an
anon hugepage mapping and bang on it from several threads, thus making the
kernel try really hard to compact pages (and thus migrate other pages that
block compaction). In parallel with that I was running the filesystem
stress that seemed to cause issues for the customer... I guess something
like fsx & fsstress runs with this THP stress test in parallel might be
decent fstests to have.
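
For illustration, the core of such a THP stress load can be sketched roughly
as below. This is not the mmtests code, just a much-reduced sketch under my
own assumptions: thp_stress_once() and struct thp_worker are hypothetical
names, madvise(MADV_HUGEPAGE) is only a hint and its failure is ignored, and
whether the kernel actually compacts or migrates anything depends on the THP
configuration and memory fragmentation of the machine.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical per-thread state: one slice of the anonymous mapping. */
struct thp_worker {
    unsigned char *base;   /* start of this thread's slice */
    size_t len;            /* slice length in bytes */
    size_t step;           /* stride between writes (base page size) */
    size_t touched;        /* number of writes performed */
};

/* Write one byte per base page so every page in the slice gets faulted in. */
static void *thp_touch(void *arg)
{
    struct thp_worker *w = arg;

    assert(w != NULL);
    for (size_t off = 0; off < w->len; off += w->step) {
        w->base[off] = 1;
        w->touched++;
    }
    return NULL;
}

/* Map len bytes of anon memory, hint THP, touch it from nthreads threads.
 * Returns the total number of writes, or -1 on failure. */
static long thp_stress_once(size_t len, int nthreads)
{
    enum { MAX_THREADS = 16 };
    pthread_t tids[MAX_THREADS];
    struct thp_worker workers[MAX_THREADS];
    long page = sysconf(_SC_PAGESIZE);
    long total = 0;
    int started = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS || page <= 0 ||
        len < (size_t)nthreads)
        return -1;

    unsigned char *base = mmap(NULL, len, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return -1;
#ifdef MADV_HUGEPAGE
    (void)madvise(base, len, MADV_HUGEPAGE);   /* hint only */
#endif

    size_t slice = len / (size_t)nthreads;
    for (int i = 0; i < nthreads; i++) {
        workers[i] = (struct thp_worker){ base + (size_t)i * slice, slice,
                                          (size_t)page, 0 };
        if (pthread_create(&tids[i], NULL, thp_touch, &workers[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        total += (long)workers[i].touched;
    }
    munmap(base, len);
    return started == nthreads ? total : -1;
}
```

To generate real pressure one would call this in a loop with a mapping many
huge pages in size while fsx or fsstress runs on the filesystem under test.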

								Honza

[1] https://github.com/gormanm/mmtests/blob/master/shellpack_src/src/thpscale/thpscale.c
[2] https://github.com/gormanm/mmtests/blob/master/shellpack_src/src/thpfioscale/thpfioscale.c

-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR
