[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <27240C0AC20F114CBF8149A2696CBE4A1BFA76@SHSMSX101.ccr.corp.intel.com>
Date: Thu, 8 Nov 2012 01:31:33 +0000
From: "Liu, Chuansheng" <chuansheng.liu@...el.com>
To: Ming Lei <ming.lei@...onical.com>
CC: "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH] firmware loader: Fix the race FW_STATUS_DONE is
followed by class_timeout
> -----Original Message-----
> From: Ming Lei [mailto:ming.lei@...onical.com]
> Sent: Wednesday, November 07, 2012 6:40 PM
> To: Liu, Chuansheng
> Cc: gregkh@...uxfoundation.org; linux-kernel@...r.kernel.org
> Subject: Re: [PATCH] firmware loader: Fix the race FW_STATUS_DONE is
> followed by class_timeout
>
> On Thu, Nov 8, 2012 at 12:45 AM, Chuansheng Liu
> <chuansheng.liu@...el.com> wrote:
> >
> > There is a race condition as below when calling request_firmware():
> >
> > CPU1 CPU2
> > write 0 > loading
> > mutex_lock(&fw_lock);
> > ...
> > set_bit FW_STATUS_DONE class_timeout is coming
> > set_bit
> FW_STATUS_ABORT
> > complete_all &completion
> > ...
> > mutex_unlock(&fw_lock)
>
> Good catch, but your fix isn't correct.
>
> >
> > In this time, the bit FW_STATUS_DONE and FW_STATUS_ABORT are set,
> > and request_firmware() will return failure due to condition in
> > _request_firmware_load():
> >
> > if (!buf->size || test_bit(FW_STATUS_ABORT, &buf->status))
> > retval = -ENOENT;
> >
> > But from the above scenerio, it should be a succcessful requesting.
> > So we need judge if the FW_STATUS_DONE is set before calling abort
> > in timeout function firmware_class_timeout().
> >
> > Signed-off-by: liu chuansheng <chuansheng.liu@...el.com>
> > ---
> > drivers/base/firmware_class.c | 7 +++++++
> > 1 files changed, 7 insertions(+), 0 deletions(-)
> >
> > diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
> > index 8945f4e..35fffd8 100644
> > --- a/drivers/base/firmware_class.c
> > +++ b/drivers/base/firmware_class.c
> > @@ -671,6 +671,13 @@ static void firmware_class_timeout(u_long data)
> > {
> > struct firmware_priv *fw_priv = (struct firmware_priv *) data;
> >
> > + mutex_lock(&fw_lock);
>
> mutex can't be used in interrupt context, so one candidate fix is
> to convert the timeout timer into delayed work, and hold the
> 'fw_lock' during the work function.
>
> > + if (test_bit(FW_STATUS_DONE, &(fw_priv->buf->status))) {
> > + mutex_unlock(&fw_lock);
> > + return;
> > + }
> > + mutex_unlock(&fw_lock);
> > +
> > fw_load_abort(fw_priv);
> > }
>
> Also the lock of 'fw_lock' should be held when setting 'FW_STATUS_DONE'
> in _request_firmware_load().
>
> Could you send out a new patch to fix the race?
Thanks your review and proposal, has sent the patch V2 to be reviewed.
>
> Thanks,
> --
> Ming Lei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists