lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 22 May 2020 15:52:44 -0500
From:   Alex Elder <elder@...e.org>
To:     Luis Chamberlain <mcgrof@...nel.org>
Cc:     jeyu@...nel.org, akpm@...ux-foundation.org, arnd@...db.de,
        rostedt@...dmis.org, mingo@...hat.com, aquini@...hat.com,
        cai@....pw, dyoung@...hat.com, bhe@...hat.com,
        peterz@...radead.org, tglx@...utronix.de, gpiccoli@...onical.com,
        pmladek@...e.com, tiwai@...e.de, schlad@...e.de,
        andriy.shevchenko@...ux.intel.com, keescook@...omium.org,
        daniel.vetter@...ll.ch, will@...nel.org,
        mchehab+samsung@...nel.org, kvalo@...eaurora.org,
        davem@...emloft.net, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org, Alex Elder <elder@...nel.org>
Subject: Re: [PATCH v2 10/15] soc: qcom: ipa: use new
 module_firmware_crashed()

On 5/22/20 12:28 AM, Luis Chamberlain wrote:
> On Tue, May 19, 2020 at 05:34:13PM -0500, Alex Elder wrote:
>> On 5/15/20 4:28 PM, Luis Chamberlain wrote:
>>> This makes use of the new module_firmware_crashed() to help
>>> annotate when firmware for device drivers crash. When firmware
>>> crashes devices can sometimes become unresponsive, and recovery
>>> sometimes requires a driver unload / reload and in the worst cases
>>> a reboot.
>>>
>>> Using a taint flag allows us to annotate when this happens clearly.
>>
>> I don't fully understand what this is meant to do, so I can't
>> fully assess whether it's the right thing to do.
> 
> It is meant to taint the kernel to ensure it is clear that something
> critically bad has happened with the device firmware, it crashed, and
> recovery may or may not happen, we are not 100% certain.
>>
>> But in this particular place in the IPA code, the *modem* has
>> crashed.  And the IPA driver is not responsible for modem
>> firmware, remoteproc is.
> 
> Oi vei. So the device it depends on has crashed.

Yes, more or less.  It could be considered a little more
nuanced than even that, but I won't get into it here.

>> The IPA driver *can* be responsible for loading some other
>> firmware, but even in that case, it only happens on initial
>> boot, and it's basically assumed to never crash.
> 
> OK is this an issue which we can recover from? If for the slightest bit
> this can affect users it is something we should inform them over.

If the IPA driver successfully loads this firmware, it should
be assumed to never crash.  So in that respect, there is no
recovery required.

Again, the modem (with which the IPA hardware and driver
interacts) can crash, or it can be shut down intentionally.
And in either case it can recover, automatically or manually.
But all of that (and the modem's firmware loading) is the
responsibility of the remoteproc subsystem, not IPA.

> This patch set is missing uevents for these issues, but I just added
> support for this.
> 
>> So regardless of whether this module_firmware_crashed() call is
>> appropriate in some places, I believe it should not be used here.
> 
> OK thanks. Can the user be affected by this crash? If so how? Can
> we recover ? Is that always guaranteed?

We can't guarantee anything about recovering from a crash of
an independent entity.  But by design, recovery from a modem
crash is possible and is supposed to leave Linux in a
consistent state.  A modem crash is likely to be observable
to the user.

I'll repeat that I don't believe the new call you inserted
in the IPA driver belongs there.

					-Alex

> 
>    Luis
> 

Powered by blists - more mailing lists