Date: Mon, 10 Jun 2024 13:33:53 -0600
From: Keith Busch <kbusch@...nel.org>
To: Sagi Grimberg <sagi@...mberg.me>
Cc: Venkat Rao Bagalkote <venkat88@...ux.vnet.ibm.com>,
	linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
	linux-nvme@...ts.infradead.org, sachinp@...ux.vnet.com
Subject: Re: Kernel OOPS while creating a NVMe Namespace

On Mon, Jun 10, 2024 at 10:17:42PM +0300, Sagi Grimberg wrote:
> On 10/06/2024 22:15, Keith Busch wrote:
> > On Mon, Jun 10, 2024 at 10:05:00PM +0300, Sagi Grimberg wrote:
> > > 
> > > On 10/06/2024 21:53, Keith Busch wrote:
> > > > On Mon, Jun 10, 2024 at 01:21:00PM +0530, Venkat Rao Bagalkote wrote:
> > > > > Issue is introduced by the patch: be647e2c76b27f409cdd520f66c95be888b553a3.
> > > > My mistake. The namespace remove list appears to be getting corrupted
> > > > because I'm using the wrong APIs to replace a "list_move_tail". This is
> > > > fixing the issue on my end:
> > > > 
> > > > ---
> > > > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> > > > index 7c9f91314d366..c667290de5133 100644
> > > > --- a/drivers/nvme/host/core.c
> > > > +++ b/drivers/nvme/host/core.c
> > > > @@ -3959,9 +3959,10 @@ static void nvme_remove_invalid_namespaces(struct nvme_ctrl *ctrl,
> > > >    	mutex_lock(&ctrl->namespaces_lock);
> > > >    	list_for_each_entry_safe(ns, next, &ctrl->namespaces, list) {
> > > > -		if (ns->head->ns_id > nsid)
> > > > -			list_splice_init_rcu(&ns->list, &rm_list,
> > > > -					     synchronize_rcu);
> > > > +		if (ns->head->ns_id > nsid) {
> > > > +			list_del_rcu(&ns->list);
> > > > +			list_add_tail_rcu(&ns->list, &rm_list);
> > > > +		}
> > > >    	}
> > > >    	mutex_unlock(&ctrl->namespaces_lock);
> > > >    	synchronize_srcu(&ctrl->srcu);
> > > > --
> > > > Can we add a reproducer for this in blktests? I'm assuming that we can
> > > > easily trigger this by adding/removing nvmet namespaces?
> > > I'm testing this with Namespace Management commands, which nvmet doesn't
> > support. You can recreate the issue by detaching the last namespace.
> > 
> 
> I think the same will happen in a test that creates two namespaces and then
> echo 0 > ns/enable.

Looks like nvme/016 tests this. It's reporting as "passed" on my end, but
I don't think it's actually testing the driver as intended. Still
messing with it.
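
For anyone following along, here is a rough userspace sketch of why the
splice call in the quoted patch corrupts the list. It assumes a simplified,
non-RCU stand-in for struct list_head; the helper names (list_init,
move_tail, splice_init, and the two check functions) are illustrative only
and are not the kernel APIs. The splice helper is modeled on how
list_splice_init_rcu treats its first argument as a list head, so passing a
single entry moves everything except that entry.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct list_head (no RCU). */
struct list_head { struct list_head *next, *prev; };

void list_init(struct list_head *h) { h->next = h; h->prev = h; }

void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

void list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* Move a single entry: unlink it, then append it to 'head' (what the fix does). */
void move_tail(struct list_head *entry, struct list_head *head)
{
	list_del(entry);
	list_add_tail(entry, head);
}

/* Splice: 'list' is assumed to be a list HEAD; its members move to the front of 'head'. */
void splice_init(struct list_head *list, struct list_head *head)
{
	struct list_head *first = list->next, *last = list->prev, *at = head->next;

	if (first == list)
		return;
	list_init(list);
	last->next = at;
	at->prev = last;
	first->prev = head;
	head->next = first;
}

/* Walk forward from 'head' and report whether 'entry' is reachable. */
int list_contains(struct list_head *head, struct list_head *entry)
{
	for (struct list_head *p = head->next; p != head; p = p->next)
		if (p == entry)
			return 1;
	return 0;
}

/* del + add_tail: the entry leaves the source list and lands on the remove list. */
int move_tail_ok(void)
{
	struct list_head ns, rm, a, b, c;

	list_init(&ns); list_init(&rm);
	list_add_tail(&a, &ns); list_add_tail(&b, &ns); list_add_tail(&c, &ns);
	move_tail(&c, &rm);
	return list_contains(&rm, &c) && !list_contains(&ns, &c) &&
	       list_contains(&ns, &a) && list_contains(&ns, &b);
}

/* Splice called on an entry: the entry is orphaned and the source head is moved instead. */
int splice_on_entry_corrupts(void)
{
	struct list_head ns, rm, a, b, c;

	list_init(&ns); list_init(&rm);
	list_add_tail(&a, &ns); list_add_tail(&b, &ns); list_add_tail(&c, &ns);
	splice_init(&c, &rm);	/* wrong: &c is an entry, not a list head */
	return !list_contains(&rm, &c) && list_contains(&rm, &a) &&
	       list_contains(&rm, &ns);
}

void run_checks(void)
{
	assert(move_tail_ok());
	assert(splice_on_entry_corrupts());
}
```

In this model the misused splice leaves the entry that should have been
removed on no list at all, while the remove list ends up holding the source
list's head and the entries that were meant to stay. That is only an
approximation of the kernel behavior, since the sketch ignores RCU
publication and grace periods.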
