Message-ID: <CAA5qM4ARCf-28eH6dME6pDc8WBNpMh2jGUfcj-Wp7vfwvVRz8Q@mail.gmail.com>
Date:   Fri, 28 Aug 2020 08:43:32 -0400
From:   Tong Zhang <ztong0001@...il.com>
To:     Keith Busch <kbusch@...nel.org>
Cc:     linux-kernel@...r.kernel.org, linux-nvme@...ts.infradead.org,
        axboe@...com, Christoph Hellwig <hch@....de>, sagi@...mberg.me
Subject: Re: [PATCH] nvme-pci: cancel nvme device request before disabling

Hi Keith,
Thanks for the confirmation. I will send another revision according to
your comments.
Best,
- Tong

On Thu, Aug 27, 2020 at 11:01 AM Keith Busch <kbusch@...nel.org> wrote:
>
> On Fri, Aug 14, 2020 at 12:11:56PM -0400, Tong Zhang wrote:
> > On Fri, Aug 14, 2020 at 11:42 AM Keith Busch <kbusch@...nel.org> wrote:
> > > > > On Fri, Aug 14, 2020 at 03:14:31AM -0400, Tong Zhang wrote:
> > > > > > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> > > > > > index ba725ae47305..c4f1ce0ee1e3 100644
> > > > > > --- a/drivers/nvme/host/pci.c
> > > > > > +++ b/drivers/nvme/host/pci.c
> > > > > > @@ -1249,8 +1249,8 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
> > > > > >               dev_warn_ratelimited(dev->ctrl.device,
> > > > > >                        "I/O %d QID %d timeout, disable controller\n",
> > > > > >                        req->tag, nvmeq->qid);
> > > > > > -             nvme_dev_disable(dev, true);
> > > > > >               nvme_req(req)->flags |= NVME_REQ_CANCELLED;
> > > > > > +             nvme_dev_disable(dev, true);
> > > > > >               return BLK_EH_DONE;
> >
> > > anymore. The driver is not reporting non-response back for all
> > > cancelled requests, and that is probably not what we should be doing.
> >
> > OK, thanks for the explanation. I think the bottom line here is to let the
> > probe function know and stop proceeding when there's an error.
> > I also don't see an obvious reason to set NVME_REQ_CANCELLED
> > after nvme_dev_disable(dev, true).
>
> The flag was set after disabling when it didn't happen to matter: the
> block layer had a complicated timeout scheme that didn't actually
> complete the request until the timeout handler returned, so the flag set
> where it is was 'ok'. That's clearly not the case anymore, so yes, I
> think we do need your patch.
>
> There is one case you are missing, though:
>
> ---
> @@ -1267,10 +1267,10 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>                 dev_warn(dev->ctrl.device,
>                          "I/O %d QID %d timeout, reset controller\n",
>                          req->tag, nvmeq->qid);
> +               nvme_req(req)->flags |= NVME_REQ_CANCELLED;
>                 nvme_dev_disable(dev, false);
>                 nvme_reset_ctrl(&dev->ctrl);
>
> -               nvme_req(req)->flags |= NVME_REQ_CANCELLED;
>                 return BLK_EH_DONE;
>         }
> --
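
The ordering point discussed above can be illustrated outside the kernel.
What follows is a minimal userspace sketch, not driver code: every sim_*
name is a hypothetical stand-in, and the way the completion path consumes
the flag is a simplifying assumption. It only models the behaviour Keith
describes, namely that disabling the device can now complete the timed-out
request before the timeout handler returns, so a flag set afterwards is
never seen by the completion.

```c
#include <assert.h>   /* only needed for the usage example below */
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT the kernel definitions. */
enum { SIM_REQ_CANCELLED = 1 << 0 };
enum sim_status { SIM_STS_OK = 0, SIM_STS_ABORTED = 1 };

struct sim_request {
	unsigned int flags;
	bool completed;
	enum sim_status status;	/* what the submitter ends up observing */
};

/*
 * Assumed completion path: the outcome is derived from the flags as they
 * are at the moment the request is completed.
 */
static void sim_complete(struct sim_request *req)
{
	req->status = (req->flags & SIM_REQ_CANCELLED) ?
			SIM_STS_ABORTED : SIM_STS_OK;
	req->completed = true;
}

/*
 * Models the disable step: an outstanding request is completed right
 * away, not deferred until the timeout handler has returned.
 */
static void sim_dev_disable(struct sim_request *outstanding)
{
	if (!outstanding->completed)
		sim_complete(outstanding);
}

/*
 * Models the timeout handler with both orderings.
 * flag_first == true  : flag set before disabling (patched order)
 * flag_first == false : flag set after disabling (old order)
 * Returns the status recorded when the request was completed.
 */
enum sim_status sim_timeout(struct sim_request *req, bool flag_first)
{
	if (flag_first)
		req->flags |= SIM_REQ_CANCELLED;

	sim_dev_disable(req);

	if (!flag_first)
		req->flags |= SIM_REQ_CANCELLED; /* too late: already completed */

	return req->status;
}
```

With a zero-initialized struct sim_request, sim_timeout(&req, true) yields
SIM_STS_ABORTED, while sim_timeout(&req, false) yields SIM_STS_OK even
though the flag ends up set. In this model the cancellation is lost under
the old ordering, which is why both hunks move the flag assignment ahead
of the disable call.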
