Message-ID: <20181003133704.43a58cf5@gandalf.local.home>
Date:   Wed, 3 Oct 2018 13:37:04 -0400
From:   Steven Rostedt <rostedt@...dmis.org>
To:     Daniel Wang <wonderfly@...gle.com>
Cc:     Petr Mladek <pmladek@...e.com>, stable@...r.kernel.org,
        Alexander.Levin@...rosoft.com, akpm@...ux-foundation.org,
        byungchul.park@....com, dave.hansen@...el.com, hannes@...xchg.org,
        jack@...e.cz, linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Mel Gorman <mgorman@...e.de>, mhocko@...nel.org, pavel@....cz,
        penguin-kernel@...ove.sakura.ne.jp, peterz@...radead.org,
        tj@...nel.org, torvalds@...ux-foundation.org, vbabka@...e.cz,
        Cong Wang <xiyou.wangcong@...il.com>,
        Peter Feiner <pfeiner@...gle.com>
Subject: Re: 4.14 backport request for dbdda842fe96f: "printk: Add console
 owner and waiter logic to load balance console writes"

On Wed, 3 Oct 2018 10:16:08 -0700
Daniel Wang <wonderfly@...gle.com> wrote:

> On Wed, Oct 3, 2018 at 2:14 AM Petr Mladek <pmladek@...e.com> wrote:
> >
> > On Tue 2018-10-02 21:23:27, Steven Rostedt wrote:  
> > > I don't see the big deal of backporting this. The biggest complaints
> > > about backports are from fixes that were added to late -rc releases
> > > where the fixes didn't get much testing. This commit was added in 4.16,
> > > and hasn't had any issues due to the design. Although a fix has been
> > > added:
> > >
> > > c14376de3a1 ("printk: Wake klogd when passing console_lock owner")  
> >
> > As I said, I am fine with backporting the console_lock owner stuff
> > into the stable release.
> >
> > I just wonder (like Sergey) what the real problem is. The console_lock
> > owner handshake is not fully reliable. It might be good enough

I'm not sure what you mean by 'not fully reliable'.

> > to prevent softlockup. But we should not rely on it to prevent
> > a deadlock.  
> 
> Yes. I myself was curious too. :)
> 
> >
> > My new theory ;-)
> >
> > printk_safe_flush() is called in nmi_trigger_cpumask_backtrace().  
> > => watchdog_timer_fn() is blocked until all backtraces are printed.  
> >
> > Now, the original report complained that the system rebooted before
> > all backtraces were printed. It means that panic() was called
> > on another CPU. My guess is that it is from the hardlockup detector.
> > And the panic() was not able to flush the console because it was
> > not able to take console_lock.
> >
> > IMHO, there was not a real deadlock. The console_lock owner
> > handshake just helped to get console_lock in panic() and
> > flush all messages before reboot => it is a reasonable
> > and acceptable fix.

Agreed.


> 
> I had the same speculation. Tried to capture a lockdep snippet with
> CONFIG_PROVE_LOCKING turned on but didn't get anything. But
> maybe I was doing it wrong.
> 
> >
> > Just to be sure. Daniel, could you please send a log with
> > the console_lock owner stuff backported? There we would see
> > who called the panic() and why it rebooted early.  
> 
> Sure. Here is one. It's a bit long but complete. I attached another log
> snippet below it which is what I got when `softlockup_panic` was turned
> off. The log was from the IRQ task that was flushing the printk buffer. I
> will be taking a closer look at it too but in case you'll find it helpful.

Just so I understand correctly: does the panic hit both with and without
the suggested backport patch? Is the only difference that you get the
full output with the patch and limited output without it?

-- Steve
