linux-ext4 - Re: [PATCH] ext4: shrink race window in ext4_should_retry

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <YEI/yFdT2BtIiUrJ@mit.edu>
Date:   Fri, 5 Mar 2021 09:27:20 -0500
From:   "Theodore Ts'o" <tytso@....edu>
To:     Eric Whitney <enwlinux@...il.com>
Cc:     linux-ext4@...r.kernel.org
Subject: Re: [PATCH] ext4: shrink race window in ext4_should_retry_alloc()

On Thu, Feb 18, 2021 at 10:11:32AM -0500, Eric Whitney wrote:
> When generic/371 is run on kvm-xfstests using 5.10 and 5.11 kernels, it
> fails at significant rates on the two test scenarios that disable
> delayed allocation (ext3conv and data_journal) and force actual block
> allocation for the fallocate and pwrite functions in the test.  The
> failure rate on 5.10 for both ext3conv and data_journal on one test
> system typically runs about 85%.  On 5.11, the failure rate on ext3conv
> sometimes drops to as low as 1% while the rate on data_journal
> increases to nearly 100%.
> 
> The observed failures are largely due to ext4_should_retry_alloc()
> cutting off block allocation retries when s_mb_free_pending (used to
> indicate that a transaction in progress will free blocks) is 0.
> However, free space is usually available when this occurs during runs
> of generic/371.  It appears that a thread attempting to allocate
> blocks is just missing transaction commits in other threads that
> increase the free cluster count and reset s_mb_free_pending while
> the allocating thread isn't running.  Explicitly testing for free space
> availability avoids this race.
> 
> The current code uses a post-increment operator in the conditional
> expression that determines whether the retry limit has been exceeded.
> This means that the conditional expression uses the value of the
> retry counter before it's increased, resulting in an extra retry cycle.
> The current code actually retries twice before hitting its retry limit
> rather than once.
> 
> Increasing the retry limit to 3 from the current actual maximum retry
> count of 2 in combination with the change described above reduces the
> observed failure rate to less that 0.1% on both ext3conv and
> data_journal with what should be limited impact on users sensitive to
> the overhead caused by retries.
> 
> A per filesystem percpu counter exported via sysfs is added to allow
> users or developers to track the number of times the retry limit is
> exceeded without resorting to debugging methods.  This should provide
> some insight into worst case retry behavior.
> 
> Signed-off-by: Eric Whitney <enwlinux@...il.com>

Thanks, applied.

						- Ted