lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220210091648.w5wie3llqri5kfw3@quack3.lan>
Date:   Thu, 10 Feb 2022 10:16:48 +0100
From:   Jan Kara <jack@...e.cz>
To:     "Paul E. McKenney" <paulmck@...nel.org>
Cc:     Jan Kara <jack@...e.cz>, Qian Cai <quic_qiancai@...cinc.com>,
        Theodore Ts'o <tytso@....edu>, Jan Kara <jack@...e.com>,
        Neeraj Upadhyay <quic_neeraju@...cinc.com>,
        Joel Fernandes <joel@...lfernandes.org>,
        Boqun Feng <boqun.feng@...il.com>, linux-ext4@...r.kernel.org,
        rcu@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] jbd2: avoid __GFP_ZERO with SLAB_TYPESAFE_BY_RCU

On Wed 09-02-22 12:11:37, Paul E. McKenney wrote:
> On Wed, Feb 09, 2022 at 07:10:10PM +0100, Jan Kara wrote:
> > On Wed 09-02-22 11:57:42, Qian Cai wrote:
> > > Since the linux-next commit 120aa5e57479 (mm: Check for
> > > SLAB_TYPESAFE_BY_RCU and __GFP_ZERO slab allocation), we will get a
> > > boot warning. Avoid it by calling synchronize_rcu() before the zeroing.
> > > 
> > > Signed-off-by: Qian Cai <quic_qiancai@...cinc.com>
> > 
> > No, the performance impact of this would be just horrible. Can you
> > ellaborate a bit why SLAB_TYPESAFE_BY_RCU + __GFP_ZERO is a problem and why
> > synchronize_rcu() would be needed here before the memset() please? I mean
> > how is zeroing here any different from the memory just being used?
> 
> Suppose a reader picks up a pointer to a memory block, then that memory
> is freed.  No problem, given that this is a SLAB_TYPESAFE_BY_RCU slab,
> so the memory won't be freed while the reader is accessing it.  But while
> the reader is in the process of validating the block, it is zeroed.
> 
> How does the validation step handle this in all cases?
> 
> If you have a way of handling this, I will of course drop the patch.
> And learn something new, which is always a good thing.  ;-)

So I maybe missed something when implementing the usage of journal_heads
under RCU but let's have a look. An example of RCU user of journal heads
is fs/jbd2/transaction.c:jbd2_write_access_granted(). It does:

        rcu_read_lock();

	// This part fetches journal_head from buffer_head - not related to
	// our slab RCU discussion

        if (!buffer_jbd(bh))
                goto out;
        /* This should be bh2jh() but that doesn't work with inline functions */
        jh = READ_ONCE(bh->b_private);
        if (!jh)
                goto out;

	// The validation comes here

        /* For undo access buffer must have data copied */
        if (undo && !jh->b_committed_data)
                goto out;
        if (READ_ONCE(jh->b_transaction) != handle->h_transaction &&
            READ_ONCE(jh->b_next_transaction) != handle->h_transaction)
                goto out;
	
	// Then some more checks unrelated to the slab itself.

        /*
         * There are two reasons for the barrier here:
         * 1) Make sure to fetch b_bh after we did previous checks so that we
         * detect when jh went through free, realloc, attach to transaction
         * while we were checking. Paired with implicit barrier in that path.
         * 2) So that access to bh done after jbd2_write_access_granted()
         * doesn't get reordered and see inconsistent state of concurrent
         * do_get_write_access().
         */
        smp_mb();
        if (unlikely(jh->b_bh != bh))
                goto out;

	// If all passed

	rcu_read_unlock();
	return true;

So if we are going to return true from the function, we know that 'jh' was
attached to handle->h_transaction at some point. And when 'jh' was attached
to handle->h_transaction, the transaction was holding reference to the 'jh'
and our 'handle' holds reference to the transaction so 'jh' could not be
freed since that moment. I.e., we are sure our reference to the handle keeps
'jh' alive and we can safely use it.

I don't see how any amount of scribbling over 'jh' could break this
validation. But maybe it is just a lack of my imagination :).

								Honza

-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ