linux-kernel - Re: [PATCH] firmware: wake all waiters

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170626212036.GE21846@wotan.suse.de>
Date:   Mon, 26 Jun 2017 23:20:36 +0200
From:   "Luis R. Rodriguez" <mcgrof@...nel.org>
To:     Jakub Kicinski <jakub.kicinski@...ronome.com>,
        Ming Lei <ming.lei@...hat.com>, yi1.li@...ux.intel.com,
        takahiro.akashi@...aro.org, nbroeking@...com
Cc:     "Luis R. Rodriguez" <mcgrof@...nel.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        mfuzzey@...keon.com, ebiederm@...ssion.com,
        dmitry.torokhov@...il.com, wagi@...om.org, dwmw2@...radead.org,
        jewalt@...innovations.com, rafal@...ecki.pl,
        arend.vanspriel@...adcom.com, rjw@...ysocki.net, atull@...nel.org,
        moritz.fischer@...us.com, pmladek@...e.com,
        johannes.berg@...el.com, emmanuel.grumbach@...el.com,
        luciano.coelho@...el.com, luto@...nel.org,
        torvalds@...ux-foundation.org, keescook@...omium.org,
        dhowells@...hat.com, pjones@...hat.com, hdegoede@...hat.com,
        alan@...ux.intel.com, tytso@....edu, paul.gortmaker@...driver.com,
        mtosatti@...hat.com, mawilcox@...rosoft.com,
        stephen.boyd@...aro.org, markivx@...eaurora.org,
        linux-kernel@...r.kernel.org, oss-drivers@...ronome.com
Subject: Re: [PATCH] firmware: wake all waiters

Thank you for your patch!

On Fri, Jun 23, 2017 at 04:37:02PM -0700, Jakub Kicinski wrote:
> Multiple devices may be waiting for firmware with the same name.

This is due to a hidden and not-well understood feature of the firmware API.  I
can trace commit logs loosely documenting this as an intended feature by Ming
Lei since commit 1f2b79599ee8f5f ("firmware loader: always let firmware_buf own
the pages buffer") so at least it would seems this is intended functionality.

Unfortunately this feature also has quite a big of bugs which will need to be
addressed after you patch. I'll address first your patch, and then explain the
rest of the issues lingering.

To expedite things I'll re-submit your patch with a different commit log
describing this mechanism a bit better otherwise this fix will would be hard to
understand.  Understanding the impact is also key as we want this to be
evaluated for for stable as well! I'd prefer your patch to go in after the
pending stable signal fixes as well.

Proposed alternative commit log:
******
Subject: [PATCH] firmware: fix batched requests - wake all waiters

The firmware cache mechanism serves two purposes, the secondary purpose is
not well documented nor understood. This fixes a regression with the secondary
purpose of the firmware cache mechanism: batched requests.

The firmware cache is used for:

1) Addressing races with file lookups during the suspend/resume cycle
   by keeping firmware in memory during the cycle

2) Batched requests for the same file rely only on work from the first file
   lookup, which keeps the firmware in memory until the last release_firmware()
   is called

Batched requests *only* take effect if secondary requests come in prior to the
first user calling release_firmware(). The devres name used for the internal
firmware cache is used as a hint other pending requests are ongoing, the
firmware buffer data is kept in memory until the last user of the buffer
calls release_firmware(), therefore serializing requests and delaying the
release until all requests are done.

Batched requests wait for a wakup or signal (we only accept SIGKILL now) so we
can rely on the first file fetch to write to the pending secondary requests.
Commit 5b029624948d ("firmware: do not use fw_lock for fw_state protection")
ported the firmware API to use swait, and in doing so failed to convert
complete_all() to swake_up_all() -- it used swake_up(), loosing the ability
for *some* batched requests to take effect.

Without this fix it has been reported plugging in two Intel 6260 Wifi cards
on a system will end up enumerating the two devices only 50% of the time
[0]. The ported swake_up() should have actually two devices, however,
*if more than two cards are used* the swake_up() would not suffice. This
change is only part of the required fixes for batched requests. Subsequent
fixes will follow.

This particular change should fix the cases where more than three requests
with the same firmware name is used, otherwise batched requests will wait for
MAX_SCHEDULE_TIMEOUT and just timeout eventually.

[0] https://bugzilla.kernel.org/show_bug.cgi?id=195477

Fixes: 5b029624948d ("firmware: do not use fw_lock for fw_state protection")
CC: <stable@...r.kernel.org>    [4.10+]
******

This was merged on v4.10 and complete_all() was used before older kernels,
so older kernels should not be affected by this particular regression.

> In that case we will make them all use the same struct firmware_buf.
> When wake up happens make sure it's propagated to all of them.
> 
> Signed-off-by: Jakub Kicinski <jakub.kicinski@...ronome.com>

There's a slew of bugs lurking here though!

As noted the reported Intel driver issues still need other fixes, one was the
fw_state_done() on the direct filesystem lookup mechanism [1], and that may be
a regression since direct filesystem loading was added, and even secondary
requests would seem to just wait forever (MAX_SCHEDULE_TIMEOUT); the combination
of both fixes should fix your reported issue.

Do you intend on submitting those changes as well ? There's still *other* bugs
with this feature though... Knowing if you will follow up with further fixes
will be appreciated.

After this patch things we need then:

  0) addressing the remainder of the delta from kernel.org bug 195477 [1]
  1) addressing error paths on the 1st request to wake up waiters
  2) documenting this hidden feature
  3) a test case for this feature

This feature takes effect effect when fw_lookup_and_allocate_buf() returns 1,
when __fw_lookup_buf() finds the firmware requested on the firmware cache list
already.  This was designed to only take effect if release_firmware() was not
called before the secondary lookups, as otherwise kref_put() would be called with
the respective freeing of the buffer used for waiting and data.

If the 1st request did not free the buf with kref_put() and __fw_free_buf(),
that means its up to the batched requests to address the release. This should
be OK today if the 2nd request was successful, but on failure we have nothing
freeing the old buf currently. This fix is lower priority due to how rare it
could be, but given we currently always fail even if we were successful on
load on direct fs lookup this issue should have been more common on systems
with more than 2 cards then.

Yet another stable fix.

Also consider the case of the 1st request is still being processed, and batched
requests are in queue. As noted since fw_state_wait() is used we'll wait for
MAX_SCHEDULE_TIMEOUT, but if a failure on the 1st request happens there are
a slew of cases where we do not issue a wake up! So we're missing some sprinkled
fw_state_aborted() on error paths. Because of these issues a failed request
with batched requets pending will just wait for MAX_SCHEDULE_TIMEOUT.

Yet another stable fix.

[1] https://bugzilla.kernel.org/attachment.cgi?id=256493

  Luis

> ---
>  drivers/base/firmware_class.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
> index ac350c518e0c..c23b58e64b33 100644
> --- a/drivers/base/firmware_class.c
> +++ b/drivers/base/firmware_class.c
> @@ -148,7 +148,7 @@ static void __fw_state_set(struct fw_state *fw_st,
>  	WRITE_ONCE(fw_st->status, status);
>  
>  	if (status == FW_STATUS_DONE || status == FW_STATUS_ABORTED)
> -		swake_up(&fw_st->wq);
> +		swake_up_all(&fw_st->wq);
>  }
>  
>  #define fw_state_start(fw_st)					\