lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <56B25050.9070003@hurleysoftware.com>
Date:	Wed, 3 Feb 2016 11:09:04 -0800
From:	Peter Hurley <peter@...leysoftware.com>
To:	Dmitry Vyukov <dvyukov@...gle.com>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	Jiri Slaby <jslaby@...e.com>,
	One Thousand Gnomes <gnomes@...rguk.ukuu.org.uk>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	LKML <linux-kernel@...r.kernel.org>,
	J Freyensee <james_p_freyensee@...ux.intel.com>,
	syzkaller <syzkaller@...glegroups.com>,
	Kostya Serebryany <kcc@...gle.com>,
	Alexander Potapenko <glider@...gle.com>,
	Sasha Levin <sasha.levin@...cle.com>,
	Eric Dumazet <edumazet@...gle.com>
Subject: Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc

On 02/03/2016 09:32 AM, Dmitry Vyukov wrote:
> On Wed, Feb 3, 2016 at 5:24 AM, Peter Hurley <peter@...leysoftware.com> wrote:
>> Hi Dmitry,
>>
>> On 01/21/2016 09:43 AM, Peter Hurley wrote:
>>> On 01/21/2016 02:06 AM, Dmitry Vyukov wrote:
>>>> On Wed, Jan 20, 2016 at 5:08 PM, Peter Hurley <peter@...leysoftware.com> wrote:
>>>>> On 01/20/2016 05:02 AM, Peter Zijlstra wrote:
>>>>>> On Wed, Dec 30, 2015 at 11:44:01AM +0100, Dmitry Vyukov wrote:
>>>>>>> -> #3 (&buf->lock){+.+...}:
>>>>>>>        [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0 kernel/locking/lockdep.c:3585
>>>>>>>        [<     inline     >] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:112
>>>>>>>        [<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159
>>>>>>>        [<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502
>>>>>>
>>>>>> So in any recent code that I look at this function tries to acquire
>>>>>> tty->ctrl_lock, not buf->lock. Am I missing something ?!
>>>>>
>>>>> Yes.
>>>>>
>>>>> The tty locks were annotated with __lockfunc so were being elided from lockdep
>>>>> stacktraces. Greg has a patch in his queue from me that removes the __lockfunc
>>>>> annotation ("tty: Remove __lockfunc annotation from tty lock functions").
>>>>>
>>>>> Unfortunately, I think syzkaller's post-processing stack trace isn't helping
>>>>> either, giving the impression that the stack is still inside tty_get_pgrp().
>>>>>
>>>>> It's not.
>>>>
>>>> I've got a new report on commit
>>>> a200dcb34693084e56496960d855afdeaaf9578f  (Jan 18).
>>>> Here is unprocessed version:
>>>> https://gist.githubusercontent.com/dvyukov/428a0c9bfaa867d8ce84/raw/0754db31668602ad07947f9964238b2f9cf63315/gistfile1.txt
>>>> and here is processed one:
>>>> https://gist.githubusercontent.com/dvyukov/42b874213de82d94c35e/raw/2bbced252035821243678de0112e2ed3a766fb5d/gistfile1.txt
>>>>
>>>> Peter, what exactly is wrong with the post-processed version?
>>>
>>> Yeah, ok, I assumed the problem with this report was post-processing
>>> because of the other report that had mixed-up info.
>>>
>>> However, the #3 stacktrace is obviously wrong, as others have already noted.
>>> Plus, the #1 stacktrace is wrong as well.
>>>
>>>> I would be interested in fixing the processing script.
>>>
>>> Not that it's related (since the original, not-edited report has bogus
>>> stacktraces), but how are you doing debug symbol lookup?
>>>
>>> Because below is not correct. Should be kernel/kthread.c:177 (or thereabouts)
>>>
>>>        [<ffffffff813b423f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
>>>
>>>
>>>> As far as I see it contains the same stacks just with line numbers and
>>>> inlined frames.
>>>
>>> Agree, now that I see the original report.
>>>
>>>> I am using a significantly different compilation mode
>>>> (kasan + kcov + very recent gcc), so nobody except me won't be able to
>>>> figure out line numbers based on offsets.
>>>
>>> Weird. Maybe something to do with the compiler.
>>>
>>> Can you get me the dmesg output running the patch below?
>>
>> Wondering if this is still the priority it was not so long ago?
>> If not, that's fine and I'll drop this from my followup list.
> 
> 
> Yes, it is still the priority for me.
> I've tried to apply your debugging patch, but I noticed that it prints
> dependencies stacks as it discovers them.

Yeah, that's the point; I need to understand why lockdep doesn't
store the correct stack trace at dependency discovery.

Since the correct stack trace will be printed instead, it will help
debug the lockdep problem.

Hopefully, once the problem with the bad stacktraces are fixed, the
actual circular lock dependencies will be clear.

> But in my setup I don't have
> all output from machine start (there is just too many of it).

Kernel parameter:

	log_buf_len=1G


> And I don't have a localized reproducer for this.

I really just need the lockdep dependency stacks generated during boot,
and the ctrl+C in a terminal window to trigger one of the dependency
stacks.

> I will try again.

Ok.

> Do you want me to debug with your "tty: Fix lock inversion in
> N_TRACEROUTER" patch applied or not (I still see slightly different
> deadlock reports with it)?

Not.

I think that probably does fix at least one circular dependency, but
I want to figure out the bad stack trace problem first.

There's probably another circular dependency there, as indicated by
your other report.

Regards,
Peter Hurley


>>> Please first just get me the output shortly after boot (please do a
>>> ctrl+C in a terminal window to trigger the expected stacktrace
>>> corresponding to #3 stacktrace from your report).
>>>
>>> Then please run your test that generates the circular lockdep report.
>>>
>>> --- >% ---
>>> Subject: [PATCH] debug: dump stacktraces for tty lock dependencies
>>>
>>> ---
>>>  kernel/locking/lockdep.c | 12 +++++++-----
>>>  1 file changed, 7 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
>>> index 60ace56..b67cafb 100644
>>> --- a/kernel/locking/lockdep.c
>>> +++ b/kernel/locking/lockdep.c
>>> @@ -332,7 +332,7 @@ EXPORT_SYMBOL(lockdep_on);
>>>   * Debugging switches:
>>>   */
>>>
>>> -#define VERBOSE                      0
>>> +#define VERBOSE                      1
>>>  #define VERY_VERBOSE         0
>>>
>>>  #if VERBOSE
>>> @@ -351,13 +351,15 @@ EXPORT_SYMBOL(lockdep_on);
>>>   */
>>>  static int class_filter(struct lock_class *class)
>>>  {
>>> -#if 0
>>> -     /* Example */
>>> +#if 1
>>>       if (class->name_version == 1 &&
>>> -                     !strcmp(class->name, "lockname"))
>>> +                     !strcmp(class->name, "&buf->lock"))
>>>               return 1;
>>>       if (class->name_version == 1 &&
>>> -                     !strcmp(class->name, "&struct->lockfield"))
>>> +                     !strcmp(class->name, "&port->buf.lock/1"))
>>> +             return 1;
>>> +     if (class->name_version == 1 &&
>>> +                     !strcmp(class->name, "routelock"))
>>>               return 1;
>>>  #endif
>>>       /* Filter everything else. 1 would be to allow everything else */
>>>
>>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ