Message-ID: <20230523151642.GA31298@wunner.de>
Date: Tue, 23 May 2023 17:16:42 +0200
From: Lukas Wunner <lukas@...ner.de>
To: Péter Ujfalusi <peter.ujfalusi@...ux.intel.com>
Cc: Lino Sanfilippo <LinoSanfilippo@....de>, peterhuewe@....de,
jarkko@...nel.org, jgg@...pe.ca, jsnitsel@...hat.com,
hdegoede@...hat.com, oe-lkp@...ts.linux.dev, lkp@...el.com,
peterz@...radead.org, linux@...ewoehner.de,
linux-integrity@...r.kernel.org, linux-kernel@...r.kernel.org,
l.sanfilippo@...bus.com, p.rosenberger@...bus.com
Subject: Re: [PATCH 1/2] tpm, tpm_tis: Handle interrupt storm
On Tue, May 23, 2023 at 12:14:28PM +0300, Péter Ujfalusi wrote:
> On 23/05/2023 10:44, Lukas Wunner wrote:
> > On Tue, May 23, 2023 at 09:48:23AM +0300, Péter Ujfalusi wrote:
> >> On 22/05/2023 17:31, Lino Sanfilippo wrote:
> > [...]
> >> This looked promising, however it looks like the UPX-i11 needs the DMI
> >> quirk.
> >
> > Why is that? Is there a fundamental problem with the patch or is it
> > a specific issue with that device?
>
> The flood is not detected (if there is a flood at all), interrupt stops
> working after about 200 interrupts - in the latest boot at 118th.

You've got a variant of the "never asserted interrupt".
That condition is currently tested only once on probe in tpm_tis_core_init().
The solution would be to disable interrupts whenever they're not (or no
longer) asserted.

However, that's a distinct issue from the one addressed by the present
patch, which deals with a "never *de*asserted interrupt".

> >>> + dev_err(&chip->dev, HW_ERR
> >>> + "TPM interrupt storm detected, polling instead\n");
> >>
> >> Should this be dev_warn or even dev_info level?
> >
> > The corresponding message emitted in tpm_tis_core_init() for
> > an interrupt that's *never* asserted uses dev_err(), so using
> > dev_err() here as well serves consistency:
> >
> > dev_err(&chip->dev, FW_BUG
> > "TPM interrupt not working, polling instead\n");
> >
> > That way the same severity is used both for the never asserted and
> > the never deasserted interrupt case.
>
> Oh, OK.
> Is there anything the user can do to have a ERROR less boot?

You're right that the user can't do anything about it and that
toning the message down to KERN_WARNING or even KERN_NOTICE severity
may be appropriate.

However, the above-quoted message for the never asserted interrupt
in tpm_tis_core_init() should then likewise be toned down to the
same severity.

I'm wondering why that message uses FW_BUG. That doesn't make any
sense to me. It's typically not a firmware bug, but a hardware issue,
e.g. an interrupt pin may erroneously not be connected or may be
connected to ground. Lino used HW_ERR, which seems more appropriate
to me.

> >>> rc = tpm_tis_write32(priv, TPM_INT_STATUS(priv->locality), interrupt);
> >>> tpm_tis_relinquish_locality(chip, 0);
> >>> if (rc < 0)
> >>> - return IRQ_NONE;
> >>> + goto unhandled;
> >>
> >> This is more like an error than just unhandled IRQ. Yes, it was ignored,
> >> probably because it is common?
> >
> > The interrupt may be shared and then it's not an error.
>
> but this is tpm_tis_write32() failing, no? If it is shared interrupt and
> we return IRQ_HANDLED unconditionally then I think the core will think
> that the interrupt was for this device and it was handled.
No. The IRQ_HANDLED versus IRQ_NONE return values are merely used
for book-keeping of spurious interrupts. If IRQ_HANDLED is returned,
the other handlers will still be invoked. It is not discernible whether
a shared interrupt was asserted by a single device or by multiple devices,
so all handlers need to be called.
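
As a rough illustration, a simplified userspace model of that dispatch
might look like the sketch below. It is not the genirq code: the names
dispatch_shared_irq(), handler_claims(), handler_declines() and the
"calls" and "unhandled" counters are hypothetical and exist only for this
example. It assumes nothing more than what is described above: every
handler on the shared line is called, and the return values only feed
the spurious-interrupt accounting.

```c
#include <stddef.h>

/* Userspace model only; the names mirror the kernel's, the code does not. */
enum irqreturn { IRQ_NONE = 0, IRQ_HANDLED = 1 };

typedef enum irqreturn (*irq_handler_fn)(void *dev_id);

struct irqaction {
	irq_handler_fn handler;
	void *dev_id;
	unsigned int calls;	/* hypothetical: how often the handler ran */
};

/* Hypothetical handler whose device did raise the interrupt. */
static enum irqreturn handler_claims(void *dev_id)
{
	(void)dev_id;
	return IRQ_HANDLED;
}

/* Hypothetical handler whose device did not raise the interrupt. */
static enum irqreturn handler_declines(void *dev_id)
{
	(void)dev_id;
	return IRQ_NONE;
}

/*
 * Invoke every handler registered on the shared line, regardless of what
 * the earlier ones returned.  The combined result is used only to count
 * interrupts that nobody claimed.
 */
static int dispatch_shared_irq(struct irqaction *actions, size_t n,
			       unsigned int *unhandled)
{
	int retval = IRQ_NONE;

	for (size_t i = 0; i < n; i++) {
		actions[i].calls++;
		retval |= actions[i].handler(actions[i].dev_id);
	}

	if (retval == IRQ_NONE)
		(*unhandled)++;

	return retval;
}
```

With one claiming and one declining handler on the line, both run once
and the unhandled counter stays at zero. It only increases when every
handler returns IRQ_NONE.
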
Thanks,
Lukas