Message-ID: <CALCETrVpxnRvOZ5=bHU99GOYdVgottczZ0sHgYh3nqzA4Y4+_g@mail.gmail.com>
Date: Mon, 24 Oct 2016 18:09:24 -0700
From: Andy Lutomirski <luto@...capital.net>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: David Sterba <dsterba@...e.com>, Al Viro <viro@...iv.linux.org.uk>,
Dave Jones <davej@...emonkey.org.uk>,
Linux Kernel <linux-kernel@...r.kernel.org>,
Jens Axboe <axboe@...com>, Josef Bacik <jbacik@...com>,
Chris Mason <clm@...com>,
linux-btrfs <linux-btrfs@...r.kernel.org>
Subject: Re: bio linked list corruption.
On Oct 24, 2016 5:00 PM, "Linus Torvalds" <torvalds@...ux-foundation.org> wrote:
>
> On Mon, Oct 24, 2016 at 3:42 PM, Andy Lutomirski <luto@...capital.net> wrote:
>
> > Now the fallocate thread catches up and *exits*. Dave's test makes a
> > new thread that reuses the stack (the vmap area or the backing store).
> >
> > Now the shmem_fault thread continues on its merry way and takes
> > q->lock. But oh crap, q->lock is pointing at some random spot on some
> > other thread's stack. Kaboom!
>
> Note that q->lock should be entirely immaterial, since inode->i_lock
> nests outside of it in all uses.
>
> Now, if there is some code that runs *without* the inode->i_lock, then
> that would be a big bug.
>
> But I'm not seeing it.
>
> I do agree that some race on some stack data structure could easily be
> the cause of these issues. And yes, the vmap code obviously starts
> reusing the stack much earlier, and would trigger problems that would
> essentially be hidden by the fact that the kernel stack used to stay
> around not just until exit(), but until the process was reaped.
>
> I just think that in this case i_lock really looks like it should
> serialize things correctly.
>
> Or are you seeing something I'm not?
No, I missed that.
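
For anyone following along, here is a rough userspace sketch of the hazard
I was describing, and of why taking inode->i_lock on both sides serializes
it. This is an illustrative pthread analogue only, not the real
shmem/fallocate code: the names (fallocate_thread, fault_thread, i_private
and on_stack_waitq stand-ins) are made up for the example.

	#include <pthread.h>
	#include <unistd.h>

	/* Stand-in for a wait queue whose head lives on a thread's stack. */
	struct on_stack_waitq {
		pthread_mutex_t lock;		/* stands in for q->lock */
	};

	static struct on_stack_waitq *i_private;	/* stands in for inode->i_private */
	static pthread_mutex_t i_lock = PTHREAD_MUTEX_INITIALIZER; /* stands in for inode->i_lock */

	/* "fallocate" side: publishes a pointer into its own stack, then exits. */
	static void *fallocate_thread(void *arg)
	{
		struct on_stack_waitq q = { .lock = PTHREAD_MUTEX_INITIALIZER };

		(void)arg;
		pthread_mutex_lock(&i_lock);
		i_private = &q;			/* pointer into this stack frame */
		pthread_mutex_unlock(&i_lock);

		/*
		 * The hazard: return with the pointer still published.  Once
		 * this frame's lifetime ends (and the stack gets reused), the
		 * published pointer is stale.  The point above is that the
		 * real code clears the pointer and wakes waiters under
		 * inode->i_lock before getting here, which is what
		 * serializes everything.
		 */
		return NULL;
	}

	/* "shmem_fault" side: takes the on-stack lock after the owner is gone. */
	static void *fault_thread(void *arg)
	{
		(void)arg;
		sleep(1);			/* let the fallocate thread exit first */

		pthread_mutex_lock(&i_lock);
		if (i_private) {
			pthread_mutex_lock(&i_private->lock);	/* "q->lock" on a dead frame */
			pthread_mutex_unlock(&i_private->lock);
		}
		pthread_mutex_unlock(&i_lock);
		return NULL;
	}

	int main(void)
	{
		pthread_t a, b;

		pthread_create(&a, NULL, fallocate_thread, NULL);
		pthread_create(&b, NULL, fault_thread, NULL);
		pthread_join(a, NULL);
		pthread_join(b, NULL);
		return 0;
	}

(Only a sketch of the lifetime problem, not a reliable reproducer: with
joinable pthreads the stack usually stays mapped until the join, so this
may not actually fault, but the access is still to an object whose
lifetime has ended.)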
--Andy