[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CACT4Y+YJYSo5GtSvMSqh4y_w7jWC1S-RYu+exuR3DU9NyrpJqA@mail.gmail.com>
Date: Thu, 10 Jan 2019 11:21:13 +0100
From: Dmitry Vyukov <dvyukov@...gle.com>
To: Waiman Long <longman@...hat.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Will Deacon <will.deacon@....com>,
LKML <linux-kernel@...r.kernel.org>,
Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
Subject: Re: [PATCH] locking/lockdep: Add debug_locks check in __lock_downgrade()
On Thu, Jan 10, 2019 at 5:04 AM Waiman Long <longman@...hat.com> wrote:
>
> Tetsuo Handa had reported he saw an incorrect "downgrading a read lock"
> warning right after a previous lockdep warning. It is likely that the
> previous warning turned off lock debugging causing the lockdep to have
> inconsistency states leading to the lock downgrade warning.
>
> Fix that by add a check for debug_locks at the beginning of
> __lock_downgrade().
>
> Signed-off-by: Waiman Long <longman@...hat.com>
> Reported-by: Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
Please also add:
Reported-by: syzbot+53383ae265fb161ef488@...kaller.appspotmail.com
for tracking purposes. But Tetsuo deserves lots of credit for debugging it.
> ---
> kernel/locking/lockdep.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 9593233..e805fe3 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -3535,6 +3535,9 @@ static int __lock_downgrade(struct lockdep_map *lock, unsigned long ip)
> unsigned int depth;
> int i;
>
> + if (unlikely(!debug_locks))
> + return 0;
> +
Are we sure this resolves the problem rather than makes the
inconsistency window smaller?
I don't understand all surrounding code, but looking just at this
function it looks like it may just pepper over the problem. Say, we
pass this check when lockdep was still turned on. Then this thread is
preempted for some time (e.g. a virtual CPU), then another thread
started reporting a warning, turned lockdep off, some information
wasn't collected, and this this task resumes and reports a false
warning.
Or we are holding the mutex here, and the fact that we are holding it
ensures that no other task will take it and no information will be
lost?
Quite a tricky moment, perhaps deserves a comment.
> depth = curr->lockdep_depth;
> /*
> * This function is about (re)setting the class of a held lock,
> --
> 1.8.3.1
>
Powered by blists - more mailing lists