Message-ID: <ZICs+WYCPYdu2yoI@itl-email>
Date:   Wed, 7 Jun 2023 12:14:46 -0400
From:   Demi Marie Obenour <demi@...isiblethingslab.com>
To:     Roger Pau Monné <roger.pau@...rix.com>
Cc:     Jens Axboe <axboe@...nel.dk>, Alasdair Kergon <agk@...hat.com>,
        Mike Snitzer <snitzer@...nel.org>, dm-devel@...hat.com,
        Marek Marczykowski-Górecki 
        <marmarek@...isiblethingslab.com>, linux-block@...r.kernel.org,
        linux-kernel@...r.kernel.org, xen-devel@...ts.xenproject.org
Subject: Re: [PATCH v2 13/16] xen-blkback: Implement diskseq checks

On Wed, Jun 07, 2023 at 10:20:08AM +0200, Roger Pau Monné wrote:
> On Tue, Jun 06, 2023 at 01:01:20PM -0400, Demi Marie Obenour wrote:
> > On Tue, Jun 06, 2023 at 10:25:47AM +0200, Roger Pau Monné wrote:
> > > On Tue, May 30, 2023 at 04:31:13PM -0400, Demi Marie Obenour wrote:
> > > > This allows specifying a disk sequence number in XenStore.  If it does
> > > > not match the disk sequence number of the underlying device, the device
> > > > will not be exported and a warning will be logged.  Userspace can use
> > > > this to eliminate race conditions due to major/minor number reuse.
> > > > Old kernels do not support the new syntax, but a later patch will allow
> > > > userspace to discover that the new syntax is supported.
> > > > 
> > > > Signed-off-by: Demi Marie Obenour <demi@...isiblethingslab.com>
> > > > ---
> > > >  drivers/block/xen-blkback/xenbus.c | 112 +++++++++++++++++++++++------
> > > >  1 file changed, 89 insertions(+), 23 deletions(-)
> > > > 
> > > > diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
> > > > index 4807af1d58059394d7a992335dabaf2bc3901721..9c3eb148fbd802c74e626c3d7bcd69dcb09bd921 100644
> > > > --- a/drivers/block/xen-blkback/xenbus.c
> > > > +++ b/drivers/block/xen-blkback/xenbus.c
> > > > @@ -24,6 +24,7 @@ struct backend_info {
> > > >  	struct xenbus_watch	backend_watch;
> > > >  	unsigned		major;
> > > >  	unsigned		minor;
> > > > +	unsigned long long	diskseq;
> > > 
> > > Since diskseq is declared as u64 in gendisk, better use the same type
> > > here too?
> > 
> > simple_strtoull() returns an unsigned long long, and C permits unsigned
> > long long to be larger than 64 bits.
> 
> Right, but the type of gendisk is u64.  It's fine if you want to store
> the result of simple_strtoull() into an unsigned long long and do
> whatever checks to assert it matches the format expected by gendisk,
> but ultimately the field type would better use u64 for consistency IMO.

I changed my mind on this, not least because the 16-byte length limit
means that the value is limited to UINT64_MAX anyway.
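
The arithmetic behind that bound can be sketched as follows. This is an illustrative userspace snippet, not part of the patch; max_hex_value() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Two hex digits per byte, so 16 digits cover exactly one 64-bit value. */
static_assert(sizeof(uint64_t) * 2 == 16, "16 hex digits must span a u64");

/* Hypothetical helper: largest value expressible with `digits` hex digits. */
static uint64_t max_hex_value(unsigned digits)
{
	/* Each hex digit carries 4 bits; 16 or more digits fill all 64 bits. */
	if (digits >= 16)
		return UINT64_MAX;
	return ((uint64_t)1 << (4 * digits)) - 1;
}
```

With at most 16 digits accepted, the parsed value can never exceed UINT64_MAX, so storing it in a u64 loses nothing.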

> > > > @@ -725,10 +749,46 @@ static void backend_changed(struct xenbus_watch *watch,
> > > >  		return;
> > > >  	}
> > > >  
> > > > -	if (be->major | be->minor) {
> > > > -		if (be->major != major || be->minor != minor)
> > > > -			pr_warn("changing physical device (from %x:%x to %x:%x) not supported.\n",
> > > > -				be->major, be->minor, major, minor);
> > > > +	diskseq_str = xenbus_read(XBT_NIL, dev->nodename, "diskseq", &diskseq_len);
> > > > +	if (IS_ERR(diskseq_str)) {
> > > > +		int err = PTR_ERR(diskseq_str);
> > > > +		diskseq_str = NULL;
> > > > +
> > > > +		/*
> > > > +		 * If this does not exist, it means legacy userspace that does not
> > > > +		 * support diskseq.
> > > > +		 */
> > > > +		if (unlikely(!XENBUS_EXIST_ERR(err))) {
> > > > +			xenbus_dev_fatal(dev, err, "reading diskseq");
> > > > +			return;
> > > > +		}
> > > > +		diskseq = 0;
> > > > +	} else if (diskseq_len <= 0) {
> > > > +		xenbus_dev_fatal(dev, -EFAULT, "diskseq must not be empty");
> > > > +		goto fail;
> > > > +	} else if (diskseq_len > 16) {
> > > > +		xenbus_dev_fatal(dev, -ERANGE, "diskseq too long: got %d but limit is 16",
> > > > +				 diskseq_len);
> > > > +		goto fail;
> > > > +	} else if (diskseq_str[0] == '0') {
> > > > +		xenbus_dev_fatal(dev, -ERANGE, "diskseq must not start with '0'");
> > > > +		goto fail;
> > > > +	} else {
> > > > +		char *diskseq_end;
> > > > +		diskseq = simple_strtoull(diskseq_str, &diskseq_end, 16);
> > > > +		if (diskseq_end != diskseq_str + diskseq_len) {
> > > > +			xenbus_dev_fatal(dev, -EINVAL, "invalid diskseq");
> > > > +			goto fail;
> > > > +		}
> > > > +		kfree(diskseq_str);
> > > > +		diskseq_str = NULL;
> > > > +	}
> > > 
> > > Won't it be simpler to use xenbus_scanf() with %llx formatter?
> > 
> > xenbus_scanf() doesn’t check for overflow and accepts lots of junk it
> > really should not.  Should this be fixed in xenbus_scanf()?
> 
> That would be my preference, so that you can use it here instead of
> kind of open-coding it.

This winds up being a much more invasive patch as it requires changing
sscanf().  It also has a risk (probably mostly theoretical) of breaking
buggy userspace that passes garbage values here.
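
For reference, the checks the patch open-codes can be sketched in plain userspace C roughly as below. This is only a sketch under assumptions: parse_diskseq() and hex_digit() are hypothetical names, the error codes mirror the ones in the quoted hunk, and the digit loop stands in for simple_strtoull() plus the end-pointer comparison.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: value of one hex digit, or -1 if it is not one. */
static int hex_digit(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Hypothetical helper: strictly parse a diskseq string of `len` bytes.
 * Returns 0 and stores the value in *out, or a negative errno.
 */
static int parse_diskseq(const char *s, size_t len, uint64_t *out)
{
	uint64_t value = 0;

	assert(s != NULL && out != NULL);
	if (len == 0)
		return -EFAULT;		/* must not be empty */
	if (len > 16)
		return -ERANGE;		/* more than 16 hex digits */
	if (s[0] == '0')
		return -ERANGE;		/* no leading zero, which also rules out "0x" */
	for (size_t i = 0; i < len; i++) {
		int d = hex_digit(s[i]);

		if (d < 0)
			return -EINVAL;	/* junk anywhere in the string */
		/* Cannot overflow: at most 16 digits of 4 bits each. */
		value = (value << 4) | (uint64_t)d;
	}
	*out = value;
	return 0;
}
```

Under these rules "1a" parses to 0x1a, while "", "01", "12zz" and any 17-character string are all rejected.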

> > > Also, we might want to fetch "physical-device" and "diskseq" inside
> > > the same xenstore transaction.
> > 
> > Should the rest of the xenstore reads be included in the same
> > transaction?
> 
> I guess it would make the code simpler to indeed fetch everything
> inside the same transaction.

Okay, will change in v3.

> > > Also, you tie this logic to the "physical-device" watch, which
> > > strictly implies that the "diskseq" node must be written to xenstore
> > > before the "physical-device" node.  This seems fragile, but I don't
> > > see much better option since the "diskseq" is optional.
> > 
> > What about including the diskseq in the "physical-device" node?  Perhaps
> > use diskseq@...or:minor syntax?
> 
> Hm, how would you know whether the blkback instance in the kernel
> supports the diskseq syntax in physical-device?

That’s what the next patch is for 🙂.

> Can you fetch a disk using a diskseq identifier?

Not yet, although I have considered adding this ability.  It would be
one step towards a “diskseqfs” that userspace could use to open a device
by diskseq.

> While I understand that this is an extra safety check in order to assert
> blkback is opening the intended device, is this attempting to fix some
> existing issue?

Yes, it is.  I have a block script (written in C) that validates the
device it has opened before passing the information to blkback.  It uses
the diskseq to do this, but for that protection to be complete, blkback
must also be aware of it.

> I'm not sure I see how the major:minor numbers would point to a
> different device than the one specified by the toolstack unless the
> admin explicitly messes with the devices before blkback has got time
> to open them.  But then the admin can already do pretty much
> everything it wants with the system.

Admins typically refer to e.g. device-mapper devices by name, not by
major:minor number.  If a device is destroyed and recreated right as the
block script is running, this race condition can occur.
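
A toy model of that race, purely for illustration: device numbers get reused when a device is recreated, but the sequence number keeps increasing, so a stale reference can be detected. Nothing here is kernel code; toy_disk, toy_create() and backend_accepts() are hypothetical names, and treating an expected diskseq of 0 as "legacy userspace, skip the check" is an assumption based on the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a block device: reusable numbers plus a unique sequence. */
struct toy_disk {
	unsigned major;
	unsigned minor;
	uint64_t diskseq;
};

/* Toy global counter: every created disk gets a fresh, never-reused value. */
static uint64_t next_diskseq;

static struct toy_disk toy_create(unsigned major, unsigned minor)
{
	struct toy_disk d;

	assert(next_diskseq != UINT64_MAX);
	d.major = major;
	d.minor = minor;
	d.diskseq = ++next_diskseq;
	return d;
}

/*
 * Would a backend export `current` for a request naming major:minor and
 * an expected diskseq?  An expected value of 0 models userspace that
 * wrote no diskseq, so only the device numbers are compared.
 */
static bool backend_accepts(const struct toy_disk *current,
			    unsigned major, unsigned minor,
			    uint64_t expected_diskseq)
{
	if (current->major != major || current->minor != minor)
		return false;
	if (expected_diskseq == 0)
		return true;
	return current->diskseq == expected_diskseq;
}
```

If the block script records the diskseq of the device it validated and the device is then destroyed and recreated with the same major:minor, the comparison fails for the new device, whereas the major:minor check alone would have accepted it.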
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab
