Message-ID: <da3d3b39-ede2-1e0c-e7f6-ea918d20d0f9@redhat.com>
Date: Fri, 19 Jul 2019 16:14:33 -0400
From: Waiman Long <longman@...hat.com>
To: Luis Henriques <lhenriques@...e.com>
Cc: Borislav Petkov <bp@...en8.de>, Will Deacon <will.deacon@....com>,
huang ying <huang.ying.caritas@...il.com>,
Peter Zijlstra <peterz@...radead.org>, x86@...nel.org,
Thomas Gleixner <tglx@...utronix.de>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Tim Chen <tim.c.chen@...ux.intel.com>,
Ingo Molnar <mingo@...hat.com>,
Davidlohr Bueso <dave@...olabs.net>,
linux-kernel@...r.kernel.org, "H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH v8 13/19] locking/rwsem: Make rwsem->owner an
atomic_long_t
On 7/19/19 3:45 PM, Luis Henriques wrote:
> Waiman Long <longman@...hat.com> writes:
>
>> On 7/19/19 2:45 PM, Luis Henriques wrote:
>>> On Mon, May 20, 2019 at 04:59:12PM -0400, Waiman Long wrote:
>>>> The rwsem->owner contains not just the task structure pointer, it also
>>>> holds some flags for storing the current state of the rwsem. Some of
>>>> the flags may have to be atomically updated. To reflect the new reality,
>>>> the owner is now changed to an atomic_long_t type.
>>>>
>>>> New helper functions are added to properly separate out the task
>>>> structure pointer and the embedded flags.
>>> I started seeing KASAN use-after-free with current master, and a bisect
>>> showed me that this commit 94a9717b3c40 ("locking/rwsem: Make
>>> rwsem->owner an atomic_long_t") was the problem. Does it ring any
>>> bells? I can easily reproduce it with xfstests (generic/464).
>>>
>>> Cheers,
>>> --
>>> Luís
>> This patch shouldn't change the behavior of the rwsem code. The code
>> only accesses data within the rw_semaphore structures. I don't know why
>> it would cause a KASAN error. I will have to reproduce it and figure out
>> exactly which statement is doing the invalid access.
> Yeah, screwing the bisection is something I've done in the past so I may
> have got the wrong commit. Another detail is that I was running
> xfstests against CephFS; I didn't try any other filesystem. I
> can try to reproduce with btrfs or xfs next week.
>
> Cheers,
Oh, I don't have a CephFS setup. Could you use
scripts/decode_stacktrace.sh to find which line contains the offending
statement? That will help in figuring out what has gone wrong.
Anyway, it seems like a structure that includes an rwsem is freed while
another CPU is still waiting to acquire the lock. It is probably a
hidden bug somewhere in the filesystem code that the recent changes in
rwsem behavior have made easier to trigger.
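For readers following the thread, the owner encoding described in the
quoted changelog (a task pointer and a few flag bits sharing one atomic
long) can be sketched in userspace C11 roughly as below. This is only an
illustration, not the kernel code: the demo_* names, the two flag bits and
the 8-byte alignment are assumptions made up for the example, and it
assumes long is as wide as a pointer.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Assumption: long can hold a pointer (true on Linux, not on LLP64). */
_Static_assert(sizeof(long) == sizeof(void *), "long must hold a pointer");

/* Hypothetical flag bits stored in the low bits of the owner word. */
#define DEMO_FLAG_A      (1UL << 0)
#define DEMO_FLAG_B      (1UL << 1)
#define DEMO_FLAGS_MASK  (DEMO_FLAG_A | DEMO_FLAG_B)

/* Stand-in for a task structure; aligned so the low bits are free. */
struct demo_task { _Alignas(8) int pid; };

/* Stand-in for the semaphore; only the owner word is modelled. */
struct demo_sem { atomic_long owner; };

/* Store the task pointer and the flags as a single atomic word. */
static void demo_set_owner(struct demo_sem *sem, struct demo_task *task,
                           unsigned long flags)
{
    uintptr_t word = (uintptr_t)task | (flags & DEMO_FLAGS_MASK);
    atomic_store(&sem->owner, (long)word);
}

/* Helper: extract the task pointer by masking off the flag bits. */
static struct demo_task *demo_owner_task(struct demo_sem *sem)
{
    uintptr_t word = (uintptr_t)atomic_load(&sem->owner);
    return (struct demo_task *)(word & ~(uintptr_t)DEMO_FLAGS_MASK);
}

/* Helper: extract only the flag bits. */
static unsigned long demo_owner_flags(struct demo_sem *sem)
{
    return (unsigned long)atomic_load(&sem->owner) & DEMO_FLAGS_MASK;
}

/* Atomically set one flag without disturbing the pointer bits. */
static void demo_set_flag(struct demo_sem *sem, unsigned long flag)
{
    atomic_fetch_or(&sem->owner, (long)(flag & DEMO_FLAGS_MASK));
}
```

Note that the helpers only read and write the owner word inside the
semaphore itself. If the structure embedding the semaphore has already
been freed, even these accesses touch freed memory.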
Cheers,
Longman