Message-ID: <20180615025739.GB29138@mellanox.com>
Date:   Thu, 14 Jun 2018 20:57:39 -0600
From:   Jason Gunthorpe <jgg@...lanox.com>
To:     Cong Wang <xiyou.wangcong@...il.com>
Cc:     Leon Romanovsky <leon@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>, linux-rdma@...r.kernel.org,
        Doug Ledford <dledford@...hat.com>
Subject: Re: [PATCH] infiniband: fix a subtle race condition

On Thu, Jun 14, 2018 at 04:14:13PM -0700, Cong Wang wrote:
> On Thu, Jun 14, 2018 at 10:24 AM, Jason Gunthorpe <jgg@...lanox.com> wrote:
> > On Thu, Jun 14, 2018 at 10:03:09AM -0700, Cong Wang wrote:
> >> On Thu, Jun 14, 2018 at 7:24 AM, Jason Gunthorpe <jgg@...lanox.com> wrote:
> >> >
> >> > This was my first reaction too: this code path almost certainly has a
> >> > use-after-free, and we should fix the concurrency between the two
> >> > places in some correct way.
> >>
> >> First of all, why could a use-after-free trigger an unlock imbalance?
> >> IOW, why do we have to solve the use-after-free to fix this unlock
> >> imbalance?
> >
> > The issue syzkaller hit is that accessing ctx->file does not seem
> > locked in any way and can race with other manipulations of ctx->file.
> >
> > So.. for this patch to be correct we need to understand how this
> > statement:
> >
> >    f = ctx->file
> >
> > avoids f becoming a dangling pointer - and without locking, my
> 
> It doesn't, but that is not the point, and it is not the cause
> of the unlock imbalance either. syzbot didn't report a use-after-free
> or a kernel segfault here.

No, it *is* the point - you've proposed a solution, one of many, and
we need to see an actual sensible design for how the locking around
ctx->file should work correctly.

We need solutions that solve the underlying problem, not just paper
over the symptoms.
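
For reference, the symptom itself is easy to see in a sketch of the
pre-patch handler (my reading of what ucma_event_handler() does, not
the exact source):

  mutex_lock(&ctx->file->mut);    /* locks old_file->mut */
  ...
  /* concurrent ucma_migrate_id() sets ctx->file = new_file here */
  ...
  mutex_unlock(&ctx->file->mut);  /* unlocks new_file->mut: imbalance */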

Stated another way, for a syzkaller report like this there are a few
really obvious fixes.

1) Capture the lock pointer on the stack:
  f = ctx->file;
  mutex_lock(&f->mut);
  mutex_unlock(&f->mut);

2) Prevent ctx->file from changing, eg add more locking:
  mutex_lock(&mut);
  mutex_lock(&ctx->file->mut);
  mutex_unlock(&ctx->file->mut);
  mutex_unlock(&mut);

3) Prevent ctx->file from being changed/freed by flushing the
   WQ at the right times:

   rdma_addr_cancel(...);
   ctx->file = XYZ;

This patch proposed #1. An explanation is required as to why that is a
correct locking design for this code. It sure looks like it isn't.

Looking at this *just a bit*, I wonder why not do something like
this:

  mutex_lock(&mut);
  f = ctx->file;
  mutex_lock(&f->mut);
  mutex_unlock(&mut);
 
? At least that *might* make sense. Though it probably deadlocks, as it
looks like we call rdma_addr_cancel() while holding mut. Yuk.

But maybe that sequence could be done before launching the work..
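
As a sketch, that acquire-then-drop sequence could live in a small
helper (hypothetical name; assumes the global mut here is the same
lock ucma_migrate_id() takes before swapping ctx->file):

  /* Hypothetical helper: read ctx->file under the global mut so a
   * concurrent ucma_migrate_id() cannot swap it between the read and
   * the lock; the caller must mutex_unlock(&f->mut) when done. */
  static struct ucma_file *ucma_lock_ctx_file(struct ucma_context *ctx)
  {
          struct ucma_file *f;

          mutex_lock(&mut);       /* also held by ucma_migrate_id() */
          f = ctx->file;          /* stable while mut is held */
          mutex_lock(&f->mut);    /* pin f before dropping mut */
          mutex_unlock(&mut);
          return f;
  }

Per the deadlock concern above, this only helps if nothing on the
calling path already holds mut.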

> > I'm not sure that race exists, there should be something that flushes
> > the WQ on the path to close... (though I have another email that
> > perhaps that is broken, sigh)
> 
> This is not related to my patch, but to convince you, let me explain:
> 
> struct ucma_file is not refcounted. I know you cancel the work in
> rdma_destroy_id(), but after ucma_migrate_id() the ctx has already
> been moved to the new file; closing the old file won't cancel a ctx
> still in flight on the workqueue. So, I think the following
> use-after-free could happen:
> 
> ucma_event_handler():
> cur_file = ctx->file; // old file
> 
> ucma_migrate_id():
> lock();
> list_move_tail(&ctx->list, &new_file->ctx_list);
> ctx->file = new_file;
> unlock();
> 
> ucma_close():
> // retrieve old file via filp->private_data
> // the loop won't cover the ctx moved to the new_file
> kfree(file);

Yep. That sure seems like the right analysis!

> This is _not_ the cause of the unlock imbalance, and is _not_ expected
> to be solved by my patch either.

What do you mean? Not calling rdma_addr_cancel() prior to freeing the
file is *exactly* the cause of the lock imbalance.

The design of this code *assumes* that rdma_addr_cancel() will be
called before altering or freeing any of the state it is working on;
migration makes a state change that violates that invariant.
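
To make that concrete, here is a minimal sketch of restoring the
invariant on the migration side (hypothetical placement; a real fix
would also have to decide what happens to the cancelled request
rather than silently dropping it):

  /* In ucma_migrate_id(), sketch: stop the address-resolution work
   * before the ctx changes hands, so nothing keeps running against
   * the old file. */
  rdma_addr_cancel(&ctx->cm_id->route.addr.dev_addr);
  mutex_lock(&mut);
  list_move_tail(&ctx->list, &new_file->ctx_list);
  ctx->file = new_file;
  mutex_unlock(&mut);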

Jason
