Message-ID: <CAL3q7H5q1ocLa6HjSfiXgVJ67kyfqNBDhcwMqeRVDfbiyr5-tg@mail.gmail.com>
Date: Wed, 17 Jun 2020 19:11:12 +0100
From: Filipe Manana <fdmanana@...il.com>
To: Chris Mason <clm@...com>
Cc: Boris Burkov <boris@....io>, Josef Bacik <josef@...icpanda.com>,
David Sterba <dsterba@...e.com>,
linux-btrfs <linux-btrfs@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
kernel-team@...com
Subject: Re: [PATCH btrfs/for-next] btrfs: fix fatal extent_buffer readahead
 vs releasepage race

On Wed, Jun 17, 2020 at 6:43 PM Chris Mason <clm@...com> wrote:
>
> On 17 Jun 2020, at 13:20, Filipe Manana wrote:
>
> > On Wed, Jun 17, 2020 at 5:32 PM Boris Burkov <boris@....io> wrote:
> >
> >> ---
> >>  fs/btrfs/extent_io.c | 45 ++++++++++++++++++++++++++++----------------
> >>  1 file changed, 29 insertions(+), 16 deletions(-)
> >>
> >> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> >> index c59e07360083..f6758ebbb6a2 100644
> >> --- a/fs/btrfs/extent_io.c
> >> +++ b/fs/btrfs/extent_io.c
> >> @@ -3927,6 +3927,11 @@ static noinline_for_stack int write_one_eb(struct extent_buffer *eb,
> >>  	clear_bit(EXTENT_BUFFER_WRITE_ERR, &eb->bflags);
> >>  	num_pages = num_extent_pages(eb);
> >>  	atomic_set(&eb->io_pages, num_pages);
> >> +	/*
> >> +	 * It is possible for releasepage to clear the TREE_REF bit before we
> >> +	 * set io_pages. See check_buffer_tree_ref for a more detailed comment.
> >> +	 */
> >> +	check_buffer_tree_ref(eb);
> >
> > This is a whole different case from the one described in the
> > changelog, as this is in the write path.
> > Why do we need this one?
>
> This was Josef’s idea, but I really like the symmetry. You set
> io_pages, you do the tree_ref dance. Everyone fiddling with the
> writeback bit right now correctly clears writeback after doing the
> atomic_dec on io_pages, but the race is tiny and prone to getting
> exposed again by shifting code around. Tree ref checks around io_pages
> are the most reliable way to prevent this bug from coming back later.
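>
> Roughly, the pattern is this (a from-memory sketch of
> check_buffer_tree_ref(), not a verbatim copy of fs/btrfs/extent_io.c):
>
> 	static void check_buffer_tree_ref(struct extent_buffer *eb)
> 	{
> 		/*
> 		 * Fast path: a second ref plus TREE_REF already set means
> 		 * releasepage can't free the eb under us.
> 		 */
> 		if (atomic_read(&eb->refs) >= 2 &&
> 		    test_bit(EXTENT_BUFFER_TREE_REF, &eb->bflags))
> 			return;
>
> 		/*
> 		 * Otherwise (re)take the tree ref under refs_lock, the same
> 		 * lock try_release_extent_buffer() holds while deciding
> 		 * whether the eb can go away.
> 		 */
> 		spin_lock(&eb->refs_lock);
> 		if (!test_and_set_bit(EXTENT_BUFFER_TREE_REF, &eb->bflags))
> 			atomic_inc(&eb->refs);
> 		spin_unlock(&eb->refs_lock);
> 	}
>
> Doing that right after atomic_set(&eb->io_pages, num_pages) closes the
> window the patch comment describes, where releasepage clears TREE_REF
> before io_pages is set.
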
Ok, but that still doesn't answer my question.
Is there an actual race/problem this hunk solves?

Before calling write_one_eb() we increment the ref on the eb and we
also call lock_extent_buffer_for_io(), which clears the dirty bit and
sets the writeback bit on the eb while holding its refs_lock spin lock.

Even if we get to try_release_extent_buffer(), it calls
extent_buffer_under_io(eb) while holding the refs_lock spin lock, so at
that point it should return true, as either the dirty or the writeback
bit is set.
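
For reference, the release-side test is roughly this (a sketch from
memory, not a verbatim copy of fs/btrfs/extent_io.c):

	static int extent_buffer_under_io(struct extent_buffer *eb)
	{
		/*
		 * try_release_extent_buffer() calls this with eb->refs_lock
		 * held, the same lock under which lock_extent_buffer_for_io()
		 * clears the dirty bit and sets the writeback bit, so for an
		 * eb in the write path one of these is always visible here.
		 */
		return (atomic_read(&eb->io_pages) ||
			test_bit(EXTENT_BUFFER_WRITEBACK, &eb->bflags) ||
			test_bit(EXTENT_BUFFER_DIRTY, &eb->bflags));
	}
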
Is this purely a safety guard that is being introduced?

Can we at least describe in the changelog why we are adding this hunk
in the write path? All it mentions is a race between reading and
releasing pages; there's nothing about races with writeback.

Thanks.
>
> -chris
--
Filipe David Manana,
“Whether you think you can, or you think you can't — you're right.”