[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJedcCx9s9HSbJEEb_RUmW7yjSMkYaoXhuH9qrqYo1smpL=iJw@mail.gmail.com>
Date: Thu, 23 Mar 2023 11:44:54 +0800
From: Zheng Hacker <hackerzheng666@...il.com>
To: Mike Christie <michael.christie@...cle.com>
Cc: Zheng Wang <zyytlz.wz@....com>, njavali@...vell.com,
mrangankar@...vell.com, GR-QLogic-Storage-Upstream@...vell.com,
jejb@...ux.ibm.com, martin.petersen@...cle.com,
linux-scsi@...r.kernel.org, linux-kernel@...r.kernel.org,
1395428693sheep@...il.com, alex000young@...il.com
Subject: Re: [PATCH RESEND] scsi: qedi: Fix use after free bug in qedi_remove
due to race condition
Mike Christie <michael.christie@...cle.com> 于2023年3月21日周二 00:11写道:
>
> On 3/18/23 3:13 AM, Zheng Wang wrote:
> > In qedi_probe, it calls __qedi_probe, which bound &qedi->recovery_work
> > with qedi_recovery_handler and bound &qedi->board_disable_work
> > with qedi_board_disable_work.
> >
> > When it calls qedi_schedule_recovery_handler, it will finally
> > call schedule_delayed_work to start the work.
> >
> > When we call qedi_remove to remove the driver, there
> > may be a sequence as follows:
> >
> > Fix it by finishing the work before cleanup in qedi_remove.
> >
> > CPU0 CPU1
> >
> > |qedi_recovery_handler
> > qedi_remove |
> > __qedi_remove |
> > iscsi_host_free |
> > scsi_host_put |
> > //free shost |
> > |iscsi_host_for_each_session
> > |//use qedi->shost
> >
> > Fixes: 4b1068f5d74b ("scsi: qedi: Add MFW error recovery process")
> > Signed-off-by: Zheng Wang <zyytlz.wz@....com>
> > ---
> > drivers/scsi/qedi/qedi_main.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/drivers/scsi/qedi/qedi_main.c b/drivers/scsi/qedi/qedi_main.c
> > index f2ee49756df8..25223f6f5344 100644
> > --- a/drivers/scsi/qedi/qedi_main.c
> > +++ b/drivers/scsi/qedi/qedi_main.c
> > @@ -2414,6 +2414,10 @@ static void __qedi_remove(struct pci_dev *pdev, int mode)
> > int rval;
> > u16 retry = 10;
> >
> > + /*cancel work*/
>
> This comment is not needed. The name of the functions you are calling have
> "cancel" and "work" in them so we know. If you want to add a comment explain
> why the cancel calls are needed here.
>
Hi,
Sorry for my late reply and thanks for your advice. Will remove it in
the next version of patch.
>
> > + cancel_delayed_work_sync(&qedi->recovery_work);
> > + cancel_delayed_work_sync(&qedi->board_disable_work);
>
>
> How do you know after you have called cancel_delayed_work_sync that
> schedule_recovery_handler or schedule_hw_err_handler can't be called?
> I don't know the qed driver well, but it looks like you could have
> operations still running, so after you cancel here one of those ops
> could lead to them scheduling the work again.
>
Sorry I didn't know how to make sure there's no more schedule. But I do think
this is important. Maybe there're someone else who can give us advice.
Best regards,
Zheng
>
> > +
> > if (mode == QEDI_MODE_NORMAL)
> > iscsi_host_remove(qedi->shost, false);
> > else if (mode == QEDI_MODE_SHUTDOWN)
>
Powered by blists - more mailing lists