Message-ID: <03bdbcb2-2ed7-1c0a-3c70-89c5c2e582f3@redhat.com>
Date: Fri, 29 Mar 2019 13:53:22 -0400
From: Waiman Long <longman@...hat.com>
To: Ingo Molnar <mingo@...nel.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Alexander Viro <viro@...iv.linux.org.uk>,
Pedro Cuadra Chamorro <pcuadrac@...cmu.edu>,
linux-kernel@...r.kernel.org
Subject: Re: fs/coda oops bisected to (925b9cd1b8) "locking/rwsem: Make owner
store task pointer of last owning reader"
On 03/29/2019 12:10 PM, Jan Harkes wrote:
> I was testing Coda on the 5.1-rc2 kernel and noticed that when I run a
> binary out of /coda, the binary would never exit and the system would
> detect a soft lockup. I narrowed it down to a very simple reproducible
> case of running a statically linked executable (busybox) from /coda with
> the cwd outside of Coda, so the only Coda file reference is from the
> executable itself.
>
> I knew I definitely had never seen this problem with the stable kernel
> on Ubuntu xenial (4.4) so I bisected between v4.4 and v5.1-rc2 and ended
> up at
>
> # first bad commit: [925b9cd1b89a94b7124d128c80dfc48f78a63098]
> # locking/rwsem: Make owner store task pointer of last owning reader
>
> When I revert this particular commit on 5.1-rc2, I am not able to
> reproduce the problem anymore.
>
> The puzzling thing to me is that a lot of that particular patch touches
> codepaths that are not even enabled in the kernels that I run, because I
> do not have CONFIG_DEBUG_RWSEMS enabled.
>
> $ grep RWSEM .config
> CONFIG_RWSEM_XCHGADD_ALGORITHM=y
> CONFIG_RWSEM_SPIN_ON_OWNER=y
> # CONFIG_DEBUG_RWSEMS is not set
>
> And this patch is for rwsem, while my soft lockup is on a spinlock.
> So either I have a race in fs/coda that got somehow uncovered by this
> patch, or something else is going on here but I have not been able to
> figure it out.
>
> Jan
Without CONFIG_DEBUG_RWSEMS, the only behavioral change in this patch is
an unconditional write of the task_struct pointer into sem->owner
after acquiring the read lock in down_read(). Before this patch, it did a
conditional write of 0x1 into sem->owner, and only if the value was not
already 0x1. The only possible scenario I can think of that could cause
the soft lockup you see is a use-after-free of memory objects.
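
To make the difference concrete, here is a rough user-space sketch of the two
behaviors described above. It is not the kernel code: struct task, struct
sem_sketch, the writes counter and both function names are hypothetical
stand-ins, and any flag bits the kernel may fold into the owner value are not
modeled. It only shows that the old path stops storing once owner is 0x1,
while the new path stores on every read acquisition. If the memory holding
the rwsem had already been freed and reused, each such store would land in
whatever object now lives there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct task_struct. */
struct task {
	int pid;
};

/* Hypothetical stand-in for the owner field of struct rw_semaphore. */
struct sem_sketch {
	uintptr_t owner;       /* models sem->owner */
	unsigned long writes;  /* counts stores to owner; illustration only */
};

#define READER_OWNED ((uintptr_t)0x1)

/* Old behavior as described: store 0x1 only if owner is not already 0x1. */
static void set_reader_owned_old(struct sem_sketch *sem)
{
	if (sem->owner != READER_OWNED) {
		sem->owner = READER_OWNED;
		sem->writes++;
	}
}

/* New behavior as described: always store the reader's task pointer. */
static void set_reader_owned_new(struct sem_sketch *sem, struct task *t)
{
	assert(t != NULL);
	sem->owner = (uintptr_t)t;
	sem->writes++;
}
```

Calling set_reader_owned_old() three times on a zeroed sem_sketch performs one
store; calling set_reader_owned_new() three times performs three.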
Cheers,
Longman