Message-ID: <CAFgQCTveoz0fOELrwUY5ZSG_iNKkjGJ32QW1POo-OfjvXM=YLQ@mail.gmail.com>
Date:   Sun, 25 Oct 2020 21:11:07 +0800
From:   Pingfan Liu <kernelfans@...il.com>
To:     "Oliver O'Halloran" <oohall@...il.com>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Maulik Shah <mkshah@...eaurora.org>,
        Petr Mladek <pmladek@...e.com>,
        Oliver Neukum <oneukum@...e.com>,
        Jonathan Corbet <corbet@....net>,
        "Gustavo A. R. Silva" <gustavo@...eddedor.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Marc Zyngier <maz@...nel.org>,
        Linus Walleij <linus.walleij@...aro.org>,
        "Guilherme G. Piccoli" <gpiccoli@...onical.com>,
        linux-doc@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
        Lina Iyer <ilina@...eaurora.org>,
        Jisheng Zhang <Jisheng.Zhang@...aptics.com>,
        Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
        Al Viro <viro@...iv.linux.org.uk>,
        Andrew Morton <akpm@...ux-foundation.org>,
        afzal mohammed <afzal.mohd.ma@...il.com>,
        Kexec Mailing List <kexec@...ts.infradead.org>,
        Mike Kravetz <mike.kravetz@...cle.com>
Subject: Re: [Skiboot] [PATCH 0/3] warn and suppress irqflood

On Sun, Oct 25, 2020 at 8:21 PM Oliver O'Halloran <oohall@...il.com> wrote:
>
> On Sun, Oct 25, 2020 at 10:22 PM Pingfan Liu <kernelfans@...il.com> wrote:
> >
> > On Thu, Oct 22, 2020 at 4:37 PM Thomas Gleixner <tglx@...utronix.de> wrote:
> > >
> > > On Thu, Oct 22 2020 at 13:56, Pingfan Liu wrote:
> > > > > I hit an irq flood bug on a powerpc platform and, two years ago, on an x86
> > > > > platform. When the bug happens, the kernel is totally occupied by irqs.
> > > > > Currently, there may be nothing, or just a soft lockup warning, shown on the
> > > > > console. It is better to warn users with irq flood info.
> > > >
> > > > In the kdump case, the kernel can move on by suppressing the irq flood.
> > >
> > > You're curing the symptom not the cause and the cure is just magic and
> > > can't work reliably.
> > > Yeah, it is magic. But at least it is better to printk something and
> > > alert users to what is happening. With the current code, it may show
> > > nothing when the system hangs.
> > >
> > > Where is that irq flood originated from and why is none of the
> > > mechanisms we have in place to shut it up working?
> > > The bug originates from the tpm_i2c_nuvoton driver, which calls into the
> > > i2c bus driver (i2c-opal.c). After i2c_opal_send_request(), the bug is
> > > triggered.
> >
> > > But things are complicated by the firmware layer in between: Skiboot.
> > > This software layer hides the details of manipulating the hardware from
> > > Linux.
> > >
> > > I guess the firmware logic cannot get back to a sane state when the kernel crashes.
> > >
> > > Cc'ing the Skiboot and ppc64 communities to see whether anyone has an idea about it.
>
> What system are you using?

Here is the info; if it is not enough, I will get more.
 Product Name          : OpenPOWER Firmware
 Product Version       : open-power-SUPERMICRO-P9DSU-V1.16-20180531-imp
 Product Extra         : op-build-e4b3eb5
 Product Extra         : skiboot-v6.0-p1da203b
 Product Extra         : hostboot-f911e5c-pda8239f
 Product Extra         : occ-77bb5e6-p623d1cd
 Product Extra         : linux-4.16.7-openpower2-pbc45895
 Product Extra         : petitboot-v1.7.1-pf773c0d
 Product Extra         : machine-xml-218a77a

>
> There's an external interrupt pin which is supposed to be wired to the
> TPM. I think we bounce that interrupt to FW by default since the
> external interrupt is sometimes used for other system-specific
> purposes. Odds are FW doesn't know what to do with it so you
> effectively have an always-on LSI. I fixed a similar bug a while ago
> by having skiboot mask any interrupts it doesn't have a handler for,
This sounds like the root cause. But here Skiboot should have a handler;
otherwise the first kernel could not run smoothly.
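
To make the "always-on LSI" point concrete, here is a minimal user-space
sketch of the idea of masking a level-triggered source that nobody handles.
It is not skiboot or kernel code: the table, the function names and the
mask_unhandled switch are all hypothetical and only model the behaviour
described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_IRQS 8	/* hypothetical number of interrupt sources */

typedef void (*irq_handler_t)(unsigned int irq);

/* Hypothetical model of one level-triggered (LSI) interrupt source. */
struct irq_src {
	irq_handler_t handler;	/* NULL if nobody claimed this source */
	bool asserted;		/* line is currently held active by the device */
	bool masked;		/* source is masked at the controller */
	unsigned long delivered;
};

static struct irq_src irq_table[NR_IRQS];

/* Policy switch: mask a source as soon as it fires without a handler. */
static bool mask_unhandled = true;

/* Example handler: quiesces the device, which drops the line. */
static void quiesce_device(unsigned int irq)
{
	irq_table[irq].asserted = false;
}

/* Try to deliver one interrupt. Returns true if one was delivered. */
static bool dispatch_irq(unsigned int irq)
{
	struct irq_src *src;

	assert(irq < NR_IRQS);
	src = &irq_table[irq];

	if (src->masked || !src->asserted)
		return false;

	src->delivered++;
	if (src->handler)
		src->handler(irq);	/* expected to deassert the line */
	else if (mask_unhandled)
		src->masked = true;	/* nobody can quiesce it, so shut it up */

	return true;
}

/* Keep delivering until the source goes quiet or the budget runs out. */
static unsigned long run_until_quiet(unsigned int irq, unsigned long budget)
{
	unsigned long n = 0;

	while (n < budget && dispatch_irq(irq))
		n++;
	return n;
}
```

With mask_unhandled set, an asserted source without a handler is delivered
once and then stays quiet. With it cleared, run_until_quiet() only stops when
the budget is used up, which is the flood in this model.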

Do you have any idea whether an unexpected re-initialization could leave
things in a non-sane state?

Thanks,
Pingfan
