linux-kernel - Re: [PATCH v6 1/2] firmware_loader: use -ETIMEDOUT instead of -EAGAIN in fw_load_sysfs

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20210723172649.t4b2hqlmhk3v7wn5@garbanzo>
Date:   Fri, 23 Jul 2021 10:26:49 -0700
From:   Luis Chamberlain <mcgrof@...nel.org>
To:     Anirudh Rayabharam <mail@...rudhrb.com>
Cc:     gregkh@...uxfoundation.org, rafael@...nel.org,
        skhan@...uxfoundation.org, linux-kernel@...r.kernel.org,
        linux-kernel-mentees@...ts.linuxfoundation.org
Subject: Re: [PATCH v6 1/2] firmware_loader: use -ETIMEDOUT instead of
 -EAGAIN in fw_load_sysfs_fallback

On Fri, Jul 23, 2021 at 07:28:59PM +0530, Anirudh Rayabharam wrote:
> On Thu, Jul 22, 2021 at 12:59:24PM -0700, Luis Chamberlain wrote:
> > On Thu, Jul 22, 2021 at 06:02:28PM +0530, Anirudh Rayabharam wrote:
> > > The only motivation for using -EAGAIN in commit 0542ad88fbdd81bb
> > > ("firmware loader: Fix _request_firmware_load() return val for fw load
> > > abort") was to distinguish the error from -ENOMEM, and so there is no
> > > real reason in keeping it. Keeping -ETIMEDOU is much telling of what the
> > 
> > Since you'll have to respin, a missing here   ^, also add that the
> > -ETIMEDOUT is what we'd get when we do time out on the wait, as its
> > not clear from the conext being changed.
> > 
> > > reason for a failure is, so just use that.
> > > 
> > > The rest is just trying to document a bit more of the motivations for the
> > > error codes, as otherwise we'd lose this information easily.
> > 
> > This is a separate change, and it actually does more than just that.
> > Moving code around should be done separately. The idea is to
> > first just remove the -EAGAIN so that the change is *easy* to review.
> > A remove of a return code *and* a move of code around makes it less
> > obvious for code review. And part of the comment is wrong now that we
> > removed -EAGAIN. When breaking patches up please review each change
> > going into each patch and consider if it makes sense, atomically.
> > 
> > > Suggested-by: Luis Chamberlain <mcgrof@...nel.org>
> > > Signed-off-by: Anirudh Rayabharam <mail@...rudhrb.com>
> > > ---
> > >  drivers/base/firmware_loader/fallback.c | 34 +++++++++++++++++--------
> > >  1 file changed, 24 insertions(+), 10 deletions(-)
> > > 
> > > diff --git a/drivers/base/firmware_loader/fallback.c b/drivers/base/firmware_loader/fallback.c
> > > index 91899d185e31..1db94165feaf 100644
> > > --- a/drivers/base/firmware_loader/fallback.c
> > > +++ b/drivers/base/firmware_loader/fallback.c
> > > @@ -70,7 +70,29 @@ static inline bool fw_sysfs_loading(struct fw_priv *fw_priv)
> > >  
> > >  static inline int fw_sysfs_wait_timeout(struct fw_priv *fw_priv,  long timeout)
> > >  {
> > > -	return __fw_state_wait_common(fw_priv, timeout);
> > > +	int ret = __fw_state_wait_common(fw_priv, timeout);
> > > +
> > > +	/*
> > > +	 * A signal could be sent to abort a wait. Consider Android's init
> > > +	 * gettting a SIGCHLD, which in turn was the same process issuing the
> > > +	 * sysfs store call for the fallback. In such cases we want to be able
> > > +	 * to tell apart in userspace when a signal caused a failure on the
> > > +	 * wait. In such cases we'd get -ERESTARTSYS.
> > > +	 *
> > > +	 * Likewise though another race can happen and abort the load earlier.
> > 
> > This comment is about the check for fw_load_abort() so since the move is
> > not going to happen when you remove -EAGAIN just leave it out. It can be
> > added once you do the move.
> > 
> > > +	 *
> > > +	 * In either case the situation is interrupted so we just inform
> > > +	 * userspace of that and we end things right away.
> > 
> > Be mindful that this is in context of both cases when re-writing the
> > patches.
> > 
> > > +	 *
> > > +	 * When we really time out just tell userspace it should try again,
> > > +	 * perhaps later.
> > 
> > That's the thing, we're getting rid of that -EAGAIN as it made no sense,
> > the goal was to just distinguish the error from -ENOMEM. That's it.
> > Since we are removing the -EAGAIN, this comment makes no sense as we
> > have clarified with Shuah that the goal of her patch was just to
> > distinguish the error.
> > 
> > So "tell userspace to try again" makes no sense since if a timeout
> > happened userspace can't really try again as we have aborted the whole
> > operation to allow firmware to be uploaded.
> > 
> > In fact, please add that to the commit log which removes the -EAGAIN,
> > something like:
> > 
> > "Using -EAGAIN is also not correct as this return code is typically used
> > to tell userspace to try something again, in this case re-using the
> > sysfs loading interface cannot be retried when a timeout happens, so
> > the return value is also bogus."
> > 
> > > +	 */
> > > +	if (ret == -ERESTARTSYS || fw_state_is_aborted(fw_priv))
> > > +		ret = -EINTR;
> > > +	else if (fw_priv->is_paged_buf && !fw_priv->data)
> > > +		ret = -ENOMEM;
> > > +
> > > +	return ret;
> > >  }
> > >  
> > >  struct fw_sysfs {
> > > @@ -526,20 +548,12 @@ static int fw_load_sysfs_fallback(struct fw_sysfs *fw_sysfs, long timeout)
> > >  	}
> > >  
> > >  	retval = fw_sysfs_wait_timeout(fw_priv, timeout);
> > > -	if (retval < 0 && retval != -ENOENT) {
> > > +	if (retval < 0) {
> > >  		mutex_lock(&fw_lock);
> > >  		fw_load_abort(fw_sysfs);
> > >  		mutex_unlock(&fw_lock);
> > >  	}
> > >  
> > > -	if (fw_state_is_aborted(fw_priv)) {
> > > -		if (retval == -ERESTARTSYS)
> > > -			retval = -EINTR;
> > > -		else
> > > -			retval = -EAGAIN;
> > 
> > All we want to do is remove this -EAGAIN line in one patch. We
> > don't want to move code to another place. We do this to make code
> 
> Is the move necessary or should I drop it from this series entirely?

The move is possible, sure. Maybe do that in a separate patch. But
just read each patch as you write it, and make sure they do just *one*
thing at a time. Re-read the patch once done and make sure each
patch makes sense on its own.

  Luis