linux-kernel - Re: KASAN: use-after-free Read in percpu_ref_switch_to_atomic

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200306171911.GA2935@paulmck-ThinkPad-P72>
Date:   Fri, 6 Mar 2020 09:19:11 -0800
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Jens Axboe <axboe@...nel.dk>
Cc:     Jann Horn <jannh@...gle.com>, Dmitry Vyukov <dvyukov@...gle.com>,
        syzbot <syzbot+e017e49c39ab484ac87a@...kaller.appspotmail.com>,
        Al Viro <viro@...iv.linux.org.uk>,
        io-uring <io-uring@...r.kernel.org>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        Borislav Petkov <bp@...en8.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        syzkaller-bugs <syzkaller-bugs@...glegroups.com>,
        Thomas Gleixner <tglx@...utronix.de>, tony.luck@...el.com,
        the arch/x86 maintainers <x86@...nel.org>,
        Dan Carpenter <dan.carpenter@...cle.com>
Subject: Re: KASAN: use-after-free Read in percpu_ref_switch_to_atomic_rcu

On Fri, Mar 06, 2020 at 10:00:19AM -0700, Jens Axboe wrote:
> On 3/6/20 9:44 AM, Paul E. McKenney wrote:
> > On Fri, Mar 06, 2020 at 04:36:20PM +0100, Jann Horn wrote:
> >> On Fri, Mar 6, 2020 at 4:34 PM Jens Axboe <axboe@...nel.dk> wrote:
> >>> On 3/6/20 7:57 AM, Jann Horn wrote:
> >>>> +paulmck
> >>>>
> >>>> On Wed, Mar 4, 2020 at 3:40 PM Jens Axboe <axboe@...nel.dk> wrote:
> >>>>> On 3/4/20 12:59 AM, Dmitry Vyukov wrote:
> >>>>>> On Fri, Feb 7, 2020 at 9:14 AM syzbot
> >>>>>> <syzbot+e017e49c39ab484ac87a@...kaller.appspotmail.com> wrote:
> >>>>>>>
> >>>>>>> Hello,
> >>>>>>>
> >>>>>>> syzbot found the following crash on:
> >>>>>>>
> >>>>>>> HEAD commit:    4c7d00cc Merge tag 'pwm/for-5.6-rc1' of git://git.kernel.o..
> >>>>>>> git tree:       upstream
> >>>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=12fec785e00000
> >>>>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=e162021ddededa72
> >>>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=e017e49c39ab484ac87a
> >>>>>>> compiler:       clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81)
> >>>>>>>
> >>>>>>> Unfortunately, I don't have any reproducer for this crash yet.
> >>>>>>>
> >>>>>>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> >>>>>>> Reported-by: syzbot+e017e49c39ab484ac87a@...kaller.appspotmail.com
> >>>>>>
> >>>>>> +io_uring maintainers
> >>>>>>
> >>>>>> Here is a repro:
> >>>>>> https://gist.githubusercontent.com/dvyukov/6b340beab6483a036f4186e7378882ce/raw/cd1922185516453c201df8eded1d4b006a6d6a3a/gistfile1.txt
> >>>>>
> >>>>> I've queued up a fix for this:
> >>>>>
> >>>>> https://git.kernel.dk/cgit/linux-block/commit/?h=io_uring-5.6&id=9875fe3dc4b8cff1f1b440fb925054a5124403c3
> >>>>
> >>>> I believe that this fix relies on call_rcu() having FIFO ordering; but
> >>>> <https://www.kernel.org/doc/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.html#Callback%20Registry>
> >>>> says:
> >>>>
> >>>> | call_rcu() normally acts only on CPU-local state[...] It simply
> >>>> enqueues the rcu_head structure on a per-CPU list,
> > 
> > Indeed.  For but one example, if there was a CPU-to-CPU migration between
> > the two call_rcu() invocations, it would not be at all surprising for
> > the two callbacks to execute out of order.
> > 
> >>>> Is this fix really correct?
> >>>
> >>> That's a good point, there's a potentially stronger guarantee we need
> >>> here that isn't "nobody is inside an RCU critical section", but rather
> >>> that we're depending on a previous call_rcu() to have happened. Hence I
> >>> think you are right - it'll shrink the window drastically, since the
> >>> previous callback is already queued up, but it's not a full close.
> >>>
> >>> Hmm...
> >>
> >> You could potentially hack up the semantics you want by doing a
> >> call_rcu() whose callback does another call_rcu(), or something like
> >> that - but I'd like to hear paulmck's opinion on this first.
> > 
> > That would work!
> > 
> > Or, alternatively, do an rcu_barrier() between the two calls to
> > call_rcu(), assuming that the use case can tolerate rcu_barrier()
> > overhead and latency.
> 
> If the nested call_rcu() works, that seems greatly preferable to needing
> the rcu_barrier(), even if that would not be a showstopper for me. The
> nested call_rcu() is just a bit odd, but with a comment it should be OK.
> 
> Incremental here I'm going to test, would just fold in of course.
> 
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index f3218fc81943..95ba95b4d8ec 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -5330,7 +5330,7 @@ static void io_file_ref_kill(struct percpu_ref *ref)
>  	complete(&data->done);
>  }
>  
> -static void io_file_ref_exit_and_free(struct rcu_head *rcu)
> +static void __io_file_ref_exit_and_free(struct rcu_head *rcu)
>  {
>  	struct fixed_file_data *data = container_of(rcu, struct fixed_file_data,
>  							rcu);
> @@ -5338,6 +5338,18 @@ static void io_file_ref_exit_and_free(struct rcu_head *rcu)
>  	kfree(data);
>  }
>  
> +static void io_file_ref_exit_and_free(struct rcu_head *rcu)
> +{
> +	/*
> +	 * We need to order our exit+free call again the potentially
> +	 * existing call_rcu() for switching to atomic. One way to do that
> +	 * is to have this rcu callback queue the final put and free, as we
> +	 * could otherwise a pre-existing atomic switch complete _after_
> +	 * the free callback we queued.
> +	 */
> +	call_rcu(rcu, __io_file_ref_exit_and_free);
> +}
> +
>  static int io_sqe_files_unregister(struct io_ring_ctx *ctx)
>  {
>  	struct fixed_file_data *data = ctx->file_data;

Looks good to me!

							Thanx, Paul