Message-ID: <e7f0ae6e-9a72-0640-12d3-1683f9950a13@grimberg.me>
Date: Wed, 3 Jul 2019 12:20:04 -0700
From: Sagi Grimberg <sagi@...mberg.me>
To: Logan Gunthorpe <logang@...tatee.com>,
linux-kernel@...r.kernel.org, linux-nvme@...ts.infradead.org,
Christoph Hellwig <hch@....de>
Cc: Stephen Bates <sbates@...thlin.com>
Subject: Re: [PATCH 0/2] Fix use-after-free bug when ports are removed
>> Can we handle this in the core instead (also so we'd be consistent
>> across transports)?
>
> Yes, that would be much better if we can sort out some other issues below...
>
>> How about this untested patch instead?
>
> I've found a couple of problems with the patch:
>
> 1) port->subsystems will always be empty before nvmet_disable_port() is
> called. We'd have to restructure things a little perhaps so that when a
> subsystem is removed from a port, all the active controllers for that
> subsys/port combo would be removed.
Yes, that is better.
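
Something like the below is what I had in mind (completely untested,
and the field and callback names -- ctrl->port, subsys->ctrls,
ops->delete_ctrl -- are from memory, so take it as a rough sketch
rather than a finished patch):

static void nvmet_port_del_ctrls(struct nvmet_port *port,
                                 struct nvmet_subsys *subsys)
{
        struct nvmet_ctrl *ctrl;

        /*
         * Walk the controllers created through this subsystem and tear
         * down the ones that came in via the port being removed, so the
         * transport can no longer touch the port after it is freed.
         * delete_ctrl only schedules async teardown, so holding
         * subsys->lock here should be fine.
         */
        mutex_lock(&subsys->lock);
        list_for_each_entry(ctrl, &subsys->ctrls, subsys_entry) {
                if (ctrl->port == port)
                        ctrl->ops->delete_ctrl(ctrl);
        }
        mutex_unlock(&subsys->lock);
}

The configfs unlink path would then call this for each subsystem on
the port before the port itself goes away.
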
> 2) loop needs to call flush_workqueue(nvme_delete_wq) somewhere,
> otherwise there's a small window after the port disappears in which
> commands can still be submitted. We can actually still hit the bug with
> a tight loop.
We could simply flush the workqueue in nvme_loop_delete_ctrl for
each controller?
Might be overkill though, and it risks circular locking if we go down
the fatal error path (a work context flushing a different workqueue
always annoys lockdep, even when it's perfectly safe).
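
Or we keep the flush out of the delete path entirely and do it from
remove_port, roughly like this (untested sketch, assuming the existing
port list handling in loop.c stays as it is):

static void nvme_loop_remove_port(struct nvmet_port *port)
{
        mutex_lock(&nvme_loop_ports_mutex);
        list_del_init(&port->entry);
        mutex_unlock(&nvme_loop_ports_mutex);

        /*
         * Close the window where a controller teardown is still in
         * flight after the port is gone, without flushing
         * nvme_delete_wq from inside one of its own work items.
         */
        flush_workqueue(nvme_delete_wq);
}
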
> Maybe there's other cleanup that could be done to solve this: it does
> seem like all three transports need to keep their own lists of open
> controllers and loop through them to delete them. In theory, that could
> be made common, so there's shared code to manage the per-transport list,
> which would remove some boilerplate. If we want to go this route,
> though, I'd suggest using my patches as-is for now and doing the cleanup
> in the next cycle.
Then please fix tcp as well.
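
If we do go for that cleanup, I'd picture the core offering something
roughly like this (purely hypothetical sketch, names made up, not
existing code), so the transports stop open-coding the same list +
mutex + teardown loop:

struct nvmet_ctrl_tracker {
        struct list_head        ctrls;
        struct mutex            lock;
};

static void nvmet_tracker_add(struct nvmet_ctrl_tracker *t,
                              struct list_head *entry)
{
        mutex_lock(&t->lock);
        list_add_tail(entry, &t->ctrls);
        mutex_unlock(&t->lock);
}

static void nvmet_tracker_del_all(struct nvmet_ctrl_tracker *t,
                                  void (*del)(struct list_head *entry))
{
        struct list_head *pos, *next;

        /*
         * del() must not take the tracker lock itself, e.g. it only
         * schedules async teardown of the controller.
         */
        mutex_lock(&t->lock);
        list_for_each_safe(pos, next, &t->ctrls)
                del(pos);
        mutex_unlock(&t->lock);
}
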