linux-kernel - Re: [PATCH 1/2] tpm, tpm_tis: Handle interrupt storm

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20230523151642.GA31298@wunner.de>
Date:   Tue, 23 May 2023 17:16:42 +0200
From:   Lukas Wunner <lukas@...ner.de>
To:     Péter Ujfalusi <peter.ujfalusi@...ux.intel.com>
Cc:     Lino Sanfilippo <LinoSanfilippo@....de>, peterhuewe@....de,
        jarkko@...nel.org, jgg@...pe.ca, jsnitsel@...hat.com,
        hdegoede@...hat.com, oe-lkp@...ts.linux.dev, lkp@...el.com,
        peterz@...radead.org, linux@...ewoehner.de,
        linux-integrity@...r.kernel.org, linux-kernel@...r.kernel.org,
        l.sanfilippo@...bus.com, p.rosenberger@...bus.com
Subject: Re: [PATCH 1/2] tpm, tpm_tis: Handle interrupt storm

On Tue, May 23, 2023 at 12:14:28PM +0300, Péter Ujfalusi wrote:
> On 23/05/2023 10:44, Lukas Wunner wrote:
> > On Tue, May 23, 2023 at 09:48:23AM +0300, Péter Ujfalusi wrote:
> >> On 22/05/2023 17:31, Lino Sanfilippo wrote:
> > [...]
> >> This looked promising, however it looks like the UPX-i11 needs the DMI
> >> quirk.
> > 
> > Why is that?  Is there a fundamental problem with the patch or is it
> > a specific issue with that device?
> 
> The flood is not detected (if there is a flood at all), interrupt stops
> working after about 200 interrupts - in the latest boot at 118th.

You've got a variant of the "never asserted interrupt".

That condition is currently tested only once on probe in tpm_tis_core_init().
The solution would be to disable interrupts whenever they're not (or no
longer asserted).  

However, that's a distinct issue from the one addressed by the present
patch, which deals with a "never *de*asserted interrupt".

> >>> +	dev_err(&chip->dev, HW_ERR
> >>> +		"TPM interrupt storm detected, polling instead\n");
> >>
> >> Should this be dev_warn or even dev_info level?
> > 
> > The corresponding message emitted in tpm_tis_core_init() for
> > an interrupt that's *never* asserted uses dev_err(), so using
> > dev_err() here as well serves consistency:
> > 
> > 	dev_err(&chip->dev, FW_BUG
> > 		"TPM interrupt not working, polling instead\n");
> > 
> > That way the same severity is used both for the never asserted and
> > the never deasserted interrupt case.
> 
> Oh, OK.
> Is there anything the user can do to have a ERROR less boot?

You're right that the user can't do anything about it and that
toning the message down to KERN_WARN or even KERN_NOTICE severity
may be appropriate.

However the above-quoted message for the never asserted interrupt
in tpm_tis_core_init() should then likewise be toned down to the
same severity.

I'm wondering why that message uses FW_BUG.  That doesn't make any
sense to me.  It's typically not a firmware bug, but a hardware issue,
e.g. an interrupt pin may erroneously not be connected or may be
connected to ground.  Lino used HW_ERR, which seems more appropriate
to me.

> >>>  	rc = tpm_tis_write32(priv, TPM_INT_STATUS(priv->locality), interrupt);
> >>>  	tpm_tis_relinquish_locality(chip, 0);
> >>>  	if (rc < 0)
> >>> -		return IRQ_NONE;
> >>> +		goto unhandled;
> >>
> >> This is more like an error than just unhandled IRQ. Yes, it was ignored,
> >> probably because it is common?
> > 
> > The interrupt may be shared and then it's not an error.
> 
> but this is tpm_tis_write32() failing, no? If it is shared interrupt and
> we return IRQ_HANDLED unconditionally then I think the core will think
> that the interrupt was for this device and it was handled.

No.  The IRQ_HANDLED versus IRQ_NONE return values are merely used
for book-keeping of spurious interrupts.  If IRQ_HANDLED is returned,
the other handlers will still be invoked.  It is not discernible whether
a shared interrupt was asserted by a single device or by multiple devices,
so all handlers need to be called.

Thanks,

Lukas