linux-kernel - Re: [PATCH] lockdep: fix stack trace caching logic

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <56B39997.6040806@hurleysoftware.com>
Date:	Thu, 4 Feb 2016 10:33:59 -0800
From:	Peter Hurley <peter@...leysoftware.com>
To:	Dmitry Vyukov <dvyukov@...gle.com>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...hat.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Kostya Serebryany <kcc@...gle.com>,
	Alexander Potapenko <glider@...gle.com>,
	Sasha Levin <sasha.levin@...cle.com>
Subject: Re: [PATCH] lockdep: fix stack trace caching logic

On 02/04/2016 09:50 AM, Dmitry Vyukov wrote:
> On Thu, Feb 4, 2016 at 6:44 PM, Peter Hurley <peter@...leysoftware.com> wrote:
>> On 02/04/2016 05:40 AM, Dmitry Vyukov wrote:
>>> check_prev_add() caches saved stack trace in static trace variable
>>> to avoid duplicate save_trace() calls in dependencies involving trylocks.
>>> But that caching logic contains a bug.
>>
>>> We may not save trace on first iteration due to early return from check_prev_add().
>>
>> This commit log should identify the role test instrumentation plays in triggering
>> this bug: is it a recursive read lock dependency injected between existing lock
>> dependencies? What test component triggered this?
> 
> 
> I don't think that it has something to do with instrumentation.
> check_prev_add() has explicitly coded early exits paths which don't
> save stack trace.
> For example, if first pair of locks is: try-read-lock vs read-lock, we
> will not save stack trace; then next pair is: write-lock vs read-lock,
> we will use bogus stack trace.

Ok, but that isn't the case with the triggering stack traces we were getting.

But I see now that there's a existing lock dependency that check_prev_add()
is skipping. IOW, the other early-out path is being taken; lockdep has already seen:

 new dependency:  (&tty->ldisc_sem){++++..} =>  (&buf->lock){+.+...}

So yeah, not related to test instrumentation then.


>>> Then on the second iteration when we actually need the trace we don't save it
>>> because we think that we've already saved it.
>>>
>>> Let check_prev_add() itself control when stack is saved.
>>>
>>> There is another bug. Trace variable is protected by graph lock.
>>> But we can temporary release graph lock during printing.
>>>
>>> Fix this by invalidating cached stack trace when we release graph lock.
>>>
>>> Signed-off-by: Dmitry Vyukov <dvyukov@...gle.com>
>>> ---
>>>  kernel/locking/lockdep.c | 16 ++++++++++------
>>>  1 file changed, 10 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
>>> index 60ace56..c7710e4 100644
>>> --- a/kernel/locking/lockdep.c
>>> +++ b/kernel/locking/lockdep.c
>>> @@ -1822,7 +1822,7 @@ check_deadlock(struct task_struct *curr, struct held_lock *next,
>>>   */
>>>  static int
>>>  check_prev_add(struct task_struct *curr, struct held_lock *prev,
>>> -            struct held_lock *next, int distance, int trylock_loop)
>>> +            struct held_lock *next, int distance, int *stack_saved)
>>>  {
>>>       struct lock_list *entry;
>>>       int ret;
>>> @@ -1883,8 +1883,11 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
>>>               }
>>>       }
>>>
>>> -     if (!trylock_loop && !save_trace(&trace))
>>> -             return 0;
>>> +     if (!*stack_saved) {
>>> +             if (!save_trace(&trace))
>>> +                     return 0;
>>> +             *stack_saved = 1;
>>> +     }
>>>
>>>       /*
>>>        * Ok, all validations passed, add the new lock
>>> @@ -1907,6 +1910,8 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
>>>        * Debugging printouts:
>>>        */
>>>       if (verbose(hlock_class(prev)) || verbose(hlock_class(next))) {
>>> +             /* We drop graph lock, so another thread can overwrite trace. */
>>> +             *stack_saved = 0;
>>>               graph_unlock();
>>>               printk("\n new dependency: ");
>>>               print_lock_name(hlock_class(prev));
>>> @@ -1929,7 +1934,7 @@ static int
>>>  check_prevs_add(struct task_struct *curr, struct held_lock *next)
>>>  {
>>>       int depth = curr->lockdep_depth;
>>> -     int trylock_loop = 0;
>>> +     int stack_saved = 0;
>>>       struct held_lock *hlock;
>>>
>>>       /*
>>> @@ -1956,7 +1961,7 @@ check_prevs_add(struct task_struct *curr, struct held_lock *next)
>>>                */
>>>               if (hlock->read != 2 && hlock->check) {
>>>                       if (!check_prev_add(curr, hlock, next,
>>> -                                             distance, trylock_loop))
>>> +                                             distance, &stack_saved))
>>>                               return 0;
>>>                       /*
>>>                        * Stop after the first non-trylock entry,
>>> @@ -1979,7 +1984,6 @@ check_prevs_add(struct task_struct *curr, struct held_lock *next)
>>>               if (curr->held_locks[depth].irq_context !=
>>>                               curr->held_locks[depth-1].irq_context)
>>>                       break;
>>> -             trylock_loop = 1;
>>>       }
>>>       return 1;
>>>  out_bug:
>>>
>>