linux-kernel - Re: Re: [PATCH] vhost/scsi: Fix improper cleanup in vhost_scsi_set

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <77e61a56.24e04.19463c158e1.Coremail.wh1sper@zju.edu.cn>
Date: Tue, 14 Jan 2025 15:40:03 +0800 (GMT+08:00)
From: 张浩然 <wh1sper@....edu.cn>
To: michael.christie@...cle.com
Cc: mst@...hat.com, jasowang@...hat.com, pbonzini@...hat.com,
	stefanha@...hat.com, eperezma@...hat.com,
	virtualization@...ts.linux.dev, kvm@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: Re: [PATCH] vhost/scsi: Fix improper cleanup in
 vhost_scsi_set_endpoint()

On 2025-01-13 01:35:20, Michael Christie wrote:

> 
> On 1/10/25 9:34 PM, Haoran Zhang wrote:
> > Since commit 3f8ca2e115e55 ("vhost scsi: alloc cmds per vq instead of session"), a bug can be triggered when the host sends a duplicate VHOST_SCSI_SET_ENDPOINT ioctl command.
> 
> I don't think that git commit is correct. It should be:
> 
> 25b98b64e284 ("vhost scsi: alloc cmds per vq instead of session")
> 

Yes, the correct commit ID is 25b98b64e284. My apologize for the oversight.

> 
> > diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
> > index 718fa4e0b31e..b994138837f2 100644
> > --- a/drivers/vhost/scsi.c
> > +++ b/drivers/vhost/scsi.c
> > @@ -1726,7 +1726,7 @@ vhost_scsi_set_endpoint(struct vhost_scsi *vs,
> >  				mutex_unlock(&tpg->tv_tpg_mutex);
> >  				mutex_unlock(&vhost_scsi_mutex);
> >  				ret = -EEXIST;
> > -				goto undepend;
> > +				goto free_vs_tpg;
> 
> I agree you found a bug, but I'm not sure how you hit it from here. A
> couple lines before this, we do:
> 
> if (tpg->tv_tpg_vhost_count != 0) {
> 	mutex_unlock(&tpg->tv_tpg_mutex);
> 	continue;
> }
> 
> If the tpg was already in the vs_tpg, then tv_tpg_vhost_count would be
> non-zero and we would hit the check above.
> 

Thanks for point that out, I wasn't aware of it before. However, I don't believe I identified the wrong root cause. I tested my patch again, which only affects this specific path, it does resolve the issue. That said, maybe the triggering path somehow bypasses the aforementioned check, and my patch might not be sufficient.

> Could you describe the target and tpg mapping for how you hit this?
> 

After reevaluating the PoC, I realized that my initial claim was incorrect. The target WWN in the second vhost_scsi_set_endpoint() call is not the same as in the first one. Below is my targetcli status:

o- vhost ......................................... [Targets: 3]
  | o- naa.500140501e23be28 ......................... [TPGs: 1]
  | | o- tpg1 ............. [naa.50014058f7da10b7, no-gen-acls]
  | |   o- acls ..................................... [ACLs: 0]
  | |   o- luns ..................................... [LUNs: 0]
  | o- naa.500140562c8936fa ......................... [TPGs: 2]
  | | o- tpg1 ............. [naa.50014058d133f962, no-gen-acls]
  | | | o- acls ..................................... [ACLs: 0]
  | | | o- luns ..................................... [LUNs: 3]
  | | |   o- lun0 ... [block/disk0 (/dev/disk/...) (default_tg_pt_gp)]
  | | |   o- lun1 .... [fileio/vhost-fileio (/root/fileio-vhost) (default_tg_pt_gp)]
  | | |   o- lun2 ............. [ramdisk/rd (default_tg_pt_gp)]
  | | o- tpg2 ............. [naa.50014055c6fb4182, no-gen-acls]
  | |   o- acls ..................................... [ACLs: 0]
  | |   o- luns ..................................... [LUNs: 0]

The bug occurs when `naa.500140562c8936fa` has already been set as an endpoint, and I send a VHOST_SCSI_SET_ENDPOINT ioctl command with `naa.500140501e23be28`. The ioctl returns -1 EEXIST (File exists), and the kernel logs a BUG message in dmesg.

> 
> >  			}
> >  			/*
> >  			 * In order to ensure individual vhost-scsi configfs
> 
> 
> However, I was able to replicate the bug by hitting the chunk below this
> comment where we do:
> 
> ret = target_depend_item(&se_tpg->tpg_group.cg_item);
> if (ret) {
> 	pr_warn("target_depend_item() failed: %d\n", ret);

I suspected this part of code to be the issue, but I did not observe the WARN message when the bug occurred. 

> 	mutex_unlock(&tpg->tv_tpg_mutex);
> 	mutex_unlock(&vhost_scsi_mutex);
> 	goto undepend;
> 
> 
> > @@ -1802,6 +1802,7 @@ vhost_scsi_set_endpoint(struct vhost_scsi *vs,
> >  			target_undepend_item(&tpg->se_tpg.tpg_group.cg_item);
> >  		}
> >  	}
> > +free_vs_tpg:
> >  	kfree(vs_tpg);
> 
> To fix the bug, I don't think we can just free the vs_tpg. There's 2
> cases we can hit the error path:
> 
> 1. First time calling vhost_scsi_set_endpoint.
> 
> If we have a target with 2 tpgs, and for tpg1 we did a successful
> target_depend_item call, but then we looped and for tpg2
> target_depend_item failed, if we just did a kfree then we would leave a
> refcount on tpg1.
> 
> So for this case, we need to do the "goto undepend".
> 
> 2. N > 1 time calling vhost_scsi_set_endpoint.
> 
> This one is more complicated because let's say we started with 1 tpg and
> on the first call to vhost_scsi_set_endpoint we successfully did
> target_depend_item on it. Before the 2nd call to vhost_scsi_set_endpoint
> we added tpg2 and tpg3. We then do vhost_scsi_set_endpoint for the 2nd
> time, we successfully do target_depend_item on tpg2, but it fails for tpg3.
> 
> In this case, we want to unwind what we did on this 2nd call, so we want
> to do target_undepend_item on tpg2. And, we don't want to call it for
> tpg1 or we will hit the bug you found.
> 
> So I think to fix the issue, we would want to:
> 
> 1. move the
> 
> memcpy(vs_tpg, vs->vs_tpg, len);
> 
> to the end of the function after we do the vhost_scsi_flush. This will
> be more complicated than the current memcpy though. We will want to
> merge the local vs_tpg and the vs->vs_tpg like:
> 
> for (i = 0; i < VHOST_SCSI_MAX_TARGET; i++) {
> 	if (vs_tpg[i])
> 		vs->vs_tpg[i] = vs_tpg[i])
> }
> 
> 2. We want to leave the "goto undepend" calls as is. For the the
> undepend goto handling we also want to leave the code as is. We want to
> continue to loop over the local vs_tpg because after we moved the memcpy
> for #1 it now only contains the tpgs we updated on the current
> vhost_scsi_set_endpoint call.

I think so.