[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <YEI/yFdT2BtIiUrJ@mit.edu>
Date: Fri, 5 Mar 2021 09:27:20 -0500
From: "Theodore Ts'o" <tytso@....edu>
To: Eric Whitney <enwlinux@...il.com>
Cc: linux-ext4@...r.kernel.org
Subject: Re: [PATCH] ext4: shrink race window in ext4_should_retry_alloc()
On Thu, Feb 18, 2021 at 10:11:32AM -0500, Eric Whitney wrote:
> When generic/371 is run on kvm-xfstests using 5.10 and 5.11 kernels, it
> fails at significant rates on the two test scenarios that disable
> delayed allocation (ext3conv and data_journal) and force actual block
> allocation for the fallocate and pwrite functions in the test. The
> failure rate on 5.10 for both ext3conv and data_journal on one test
> system typically runs about 85%. On 5.11, the failure rate on ext3conv
> sometimes drops to as low as 1% while the rate on data_journal
> increases to nearly 100%.
>
> The observed failures are largely due to ext4_should_retry_alloc()
> cutting off block allocation retries when s_mb_free_pending (used to
> indicate that a transaction in progress will free blocks) is 0.
> However, free space is usually available when this occurs during runs
> of generic/371. It appears that a thread attempting to allocate
> blocks is just missing transaction commits in other threads that
> increase the free cluster count and reset s_mb_free_pending while
> the allocating thread isn't running. Explicitly testing for free space
> availability avoids this race.
>
> The current code uses a post-increment operator in the conditional
> expression that determines whether the retry limit has been exceeded.
> This means that the conditional expression uses the value of the
> retry counter before it's increased, resulting in an extra retry cycle.
> The current code actually retries twice before hitting its retry limit
> rather than once.
>
> Increasing the retry limit to 3 from the current actual maximum retry
> count of 2 in combination with the change described above reduces the
> observed failure rate to less that 0.1% on both ext3conv and
> data_journal with what should be limited impact on users sensitive to
> the overhead caused by retries.
>
> A per filesystem percpu counter exported via sysfs is added to allow
> users or developers to track the number of times the retry limit is
> exceeded without resorting to debugging methods. This should provide
> some insight into worst case retry behavior.
>
> Signed-off-by: Eric Whitney <enwlinux@...il.com>
Thanks, applied.
- Ted
Powered by blists - more mailing lists