linux-kernel - Re: [PATCH driver-core/master] firmware: Correct handling of fw_state_wait


Message-ID: <CAJpBn1yDR+zZJR24F1+s_NrYvqZEHmL_J-DEx89rC4h4KrbNrA@mail.gmail.com>
Date:   Mon, 16 Jan 2017 11:13:33 -0800
From:   Jakub Kicinski <jakub.kicinski@...ronome.com>
To:     "Luis R. Rodriguez" <mcgrof@...nel.org>
Cc:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Bjorn Andersson <bjorn.andersson@...aro.org>,
        Daniel Wagner <daniel.wagner@...-carit.de>,
        Ming Lei <ming.lei@...onical.com>,
        LKML <linux-kernel@...r.kernel.org>, oss-drivers@...ronome.com
Subject: Re: [PATCH driver-core/master] firmware: Correct handling of
 fw_state_wait_timeout() return value

On Mon, Jan 16, 2017 at 10:29 AM, Luis R. Rodriguez <mcgrof@...nel.org> wrote:
> On Mon, Jan 16, 2017 at 02:57:06PM +0000, Jakub Kicinski wrote:
>> Commit 5d47ec02c37e ("firmware: Correct handling of fw_state_wait()
>> return value") made the assumption that any error returned from
>> fw_state_wait_timeout() means FW load has to be aborted.  This is
>> incorrect FW load only has to be aborted when load timed out or
>
> You want a comma before FW -- but also:

Thanks!

>> has been interrupted,
>
> __fw_state_wait_common() returns -ENOENT when:
>
> if (ret != 0 && fw_st->status == FW_STATUS_ABORTED)
>         return -ENOENT;
>
> Why not for when -ENOENT is returned ?

I'm just going back to the pre-5d47ec02c37e behavior; I don't
understand all the details of this code.  My understanding is that
before 5d47ec02c37e we only aborted on ret == 0 (i.e. timeout) or
-ERESTARTSYS.
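
To make the intended condition concrete, here is a minimal userspace
sketch of the check, not the kernel code itself.  The helper name
should_abort_load() is made up for illustration, and ERESTARTSYS is
defined locally on the assumption that userspace <errno.h> does not
export it (it is a kernel-internal errno).

```c
#include <errno.h>
#include <stdbool.h>

/*
 * ERESTARTSYS is kernel-internal and normally absent from userspace
 * headers, so define it here for the sketch (512 in the kernel).
 */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

/*
 * Hypothetical helper: decide whether the waiter itself must abort the
 * load.  Only a timeout or an interrupted wait qualifies; any other
 * value (including -ENOENT) is assumed to mean the waking side has
 * already dealt with the buffer.
 */
static bool should_abort_load(long retval)
{
	return retval == -ETIMEDOUT || retval == -ERESTARTSYS;
}
```

With this, -ENOENT and 0 fall through without a second abort, which is
the behavior the one-line change in the patch is after.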

>> otherwise the waking thread had already
>> cleaned up for us.
>
> What code in what waking thread would have done precisely what cleanup?

That is not clear to me.  The waking is done in
firmware_loading_store().  I don't follow why firmware_loading_store()
uses fw_load_abort() in the -1 case but fw_state_aborted() on an error
path of the 0 case (it's pre-git era stuff).  I assume fw_load_abort()
unlinks the buffer so that subsequent calls to store error out in the
check on line 716.  I was initially going to change that
fw_load_abort() to *_aborted(), but I'm afraid of the slight change in
user-visible behavior.

> And why can't fw_load_abort() handle being called twice and why not just
> instead allow for that?

Personal preference, I guess: I'd rather make sure the code is correct
than just make it able to handle errors.

>> Fixes: 5d47ec02c37e ("firmware: Correct handling of fw_state_wait() return value")
>
> What does this fix exactly? A fix should describe the impact, what
> issues are in place without the fix. What also happens after the fix
> and why. In this commit log none of this is clear.

Sorry :S  The bug report was here:
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1310204.html
I should've done a better job.  The tl;dr is that calling *_abort()
again when the user helper wrote -1 (FW not found) causes a NULL
dereference.
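
A toy model of the double-abort hazard, under the assumption stated
earlier in this mail that the abort unlinks the buffer.  All names
(toy_buf, toy_priv, toy_load_abort) are hypothetical stand-ins, not the
kernel structures.  The NULL guard is there only so the sketch can
report the second call instead of crashing; the patch takes the other
route and avoids making the second call at all.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the firmware buffer. */
struct toy_buf {
	bool aborted;
};

/* Hypothetical stand-in for the per-request private data. */
struct toy_priv {
	struct toy_buf *buf;	/* NULL once the buffer has been unlinked */
};

/*
 * Mark the buffer aborted and unlink it.  Returns false when there is
 * no buffer left, i.e. the point where an unguarded version would
 * dereference NULL on a second call.
 */
static bool toy_load_abort(struct toy_priv *priv)
{
	struct toy_buf *buf = priv->buf;

	if (!buf)
		return false;	/* already aborted by someone else */

	buf->aborted = true;
	priv->buf = NULL;
	return true;
}
```

The first call succeeds and clears priv->buf; a second call on the same
toy_priv finds nothing to abort, which is where the waiter's extra
abort goes wrong in this model.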

>> Signed-off-by: Jakub Kicinski <jakub.kicinski@...ronome.com>
>> ---
>>  drivers/base/firmware_class.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
>> index 4497d263209f..ce142e6b2c72 100644
>> --- a/drivers/base/firmware_class.c
>> +++ b/drivers/base/firmware_class.c
>> @@ -1020,7 +1020,7 @@ static int _request_firmware_load(struct firmware_priv *fw_priv,
>>       }
>>
>>       retval = fw_state_wait_timeout(&buf->fw_st, timeout);
>> -     if (retval < 0) {
>> +     if (retval == -ETIMEDOUT || retval == -ERESTARTSYS) {
>
> Also, if your change is correct I will also note fw_state_wait_timeout()
> is just a wrapper for __fw_state_wait_common(), but we also have
> another wrapper for __fw_state_wait_common() now:
>
> #define fw_state_wait(fw_st)                                    \
>         __fw_state_wait_common(fw_st, MAX_SCHEDULE_TIMEOUT)
>
> Do we need to fix anything for fw_state_wait() ?

I looked at it and I think it's fine.

> Clarifying all this would help review your proposed changes. If you
> consider them a fix please be very clear as to the exact issue and
> what is fixed with your patch.

Sorry again, I hope things are clearer now.