linux-kernel - Re: [PATCH v4] firmware_loader: fix use-after-free in firmware_fallback

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAB=NE6Uq_t9Mbs8zN30QQrwH190p5Oz4M7OwDxmQgUeRRWS_6g@mail.gmail.com>
Date:   Thu, 10 Jun 2021 15:32:53 -0700
From:   Luis Chamberlain <mcgrof@...nel.org>
To:     Anirudh Rayabharam <mail@...rudhrb.com>
Cc:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        Junyong Sun <sunjy516@...il.com>,
        linux-kernel-mentees@...ts.linuxfoundation.org,
        syzbot+de271708674e2093097b@...kaller.appspotmail.com,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v4] firmware_loader: fix use-after-free in firmware_fallback_sysfs

On Tue, May 18, 2021 at 8:20 PM Luis Chamberlain <mcgrof@...nel.org> wrote:
>
> On Tue, May 18, 2021 at 09:29:20PM +0530, Anirudh Rayabharam wrote:
> > This use-after-free happens when a fw_priv object has been freed but
> > hasn't been removed from the pending list (pending_fw_head). The next
> > time fw_load_sysfs_fallback tries to insert into the list, it ends up
> > accessing the pending_list member of the previoiusly freed fw_priv.
> >
> > The root cause here is that all code paths that abort the fw load
> > don't delete it from the pending list. For example:
> >
> >       _request_firmware()
> >         -> fw_abort_batch_reqs()
> >             -> fw_state_aborted()
> >
> > To fix this, delete the fw_priv from the list in __fw_set_state() if
> > the new state is DONE or ABORTED. This way, all aborts will remove
> > the fw_priv from the list. Accordingly, remove calls to list_del_init
> > that were being made before calling fw_state_(aborted|done)().
> >
> > Also, in fw_load_sysfs_fallback, don't add the fw_priv to the list
> > if it is already aborted. Instead, just jump out and return early.
> >
> > Fixes: bcfbd3523f3c ("firmware: fix a double abort case with fw_load_sysfs_fallback")
> > Reported-by: syzbot+de271708674e2093097b@...kaller.appspotmail.com
> > Tested-by: syzbot+de271708674e2093097b@...kaller.appspotmail.com
> > Signed-off-by: Anirudh Rayabharam <mail@...rudhrb.com>
> > ---
> >
> > Changes in v4:
> > Documented the reasons behind the error codes returned from
> > fw_sysfs_wait_timeout() as suggested by Luis Chamberlain.
> >
> > Changes in v3:
> > Modified the patch to incorporate suggestions by Luis Chamberlain in
> > order to fix the root cause instead of applying a "band-aid" kind of
> > fix.
> > https://lore.kernel.org/lkml/20210403013143.GV4332@42.do-not-panic.com/
> >
> > Changes in v2:
> > 1. Fixed 1 error and 1 warning (in the commit message) reported by
> > checkpatch.pl. The error was regarding the format for referring to
> > another commit "commit <sha> ("oneline")". The warning was for line
> > longer than 75 chars.
> >
> > ---
> >  drivers/base/firmware_loader/fallback.c | 46 ++++++++++++++++++-------
> >  drivers/base/firmware_loader/firmware.h |  6 +++-
> >  drivers/base/firmware_loader/main.c     |  2 ++
> >  3 files changed, 40 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/base/firmware_loader/fallback.c b/drivers/base/firmware_loader/fallback.c
> > index 91899d185e31..f244c7b89ba5 100644
> > --- a/drivers/base/firmware_loader/fallback.c
> > +++ b/drivers/base/firmware_loader/fallback.c
> > @@ -70,7 +70,31 @@ static inline bool fw_sysfs_loading(struct fw_priv *fw_priv)
> >
> >  static inline int fw_sysfs_wait_timeout(struct fw_priv *fw_priv,  long timeout)
> >  {
> > -     return __fw_state_wait_common(fw_priv, timeout);
> > +     int ret = __fw_state_wait_common(fw_priv, timeout);
> > +
> > +     /*
> > +      * A signal could be sent to abort a wait. Consider Android's init
> > +      * gettting a SIGCHLD, which in turn was the same process issuing the
> > +      * sysfs store call for the fallback. In such cases we want to be able
> > +      * to tell apart in userspace when a signal caused a failure on the
> > +      * wait. In such cases we'd get -ERESTARTSYS.
> > +      *
> > +      * Likewise though another race can happen and abort the load earlier.
> > +      *
> > +      * In either case the situation is interrupted so we just inform
> > +      * userspace of that and we end things right away.
> > +      *
> > +      * When we really time out just tell userspace it should try again,
> > +      * perhaps later.
> > +      */
> > +     if (ret == -ERESTARTSYS || fw_state_is_aborted(fw_priv))
> > +             ret = -EINTR;
> > +     else if (ret == -ETIMEDOUT)
> > +             ret = -EAGAIN;
>
>
> Shuah has explained to me that the only motivation on her part with
> using -EAGAIN on commit 0542ad88fbdd81bb ("firmware loader: Fix
> _request_firmware_load() return val for fw load abort") was to
> distinguish the error from -ENOMEM, and so there was no real
> reason to stick to -EAGAIN. Given -EAGAIN is used typically to
> ask user to retry, but it makes no sense in this case since the
> sysfs interface is ephemeral, I think we should do away with it
> and document this rationale.
>
> I think we should stick to use -ETIMEDOUT. Its more telling of what
> happened. And so I think just removing the check should do it, but
> augmenting the comment should suffice.
>
> Since this change is already big, it would be good for this other
> change to go in as a separate change. If you can test to ensure the
> -ETIMEDOUT does indeed get propagated that'd be appreciated.
>
> Otherwise looks good. Thanks for your patience!

Anirudh, did you get a chance to test?