[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240729170553.GE3625856@nvidia.com>
Date: Mon, 29 Jul 2024 14:05:53 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Jonathan Cameron <Jonathan.Cameron@...wei.com>
Cc: Jonathan Corbet <corbet@....net>, Itay Avraham <itayavr@...dia.com>,
Jakub Kicinski <kuba@...nel.org>, Leon Romanovsky <leon@...nel.org>,
linux-doc@...r.kernel.org, linux-rdma@...r.kernel.org,
netdev@...r.kernel.org, Paolo Abeni <pabeni@...hat.com>,
Saeed Mahameed <saeedm@...dia.com>,
Tariq Toukan <tariqt@...dia.com>,
Andy Gospodarek <andrew.gospodarek@...adcom.com>,
Aron Silverton <aron.silverton@...cle.com>,
Dan Williams <dan.j.williams@...el.com>,
David Ahern <dsahern@...nel.org>,
Christoph Hellwig <hch@...radead.org>, Jiri Pirko <jiri@...dia.com>,
Leonid Bloch <lbloch@...dia.com>,
Leon Romanovsky <leonro@...dia.com>, linux-cxl@...r.kernel.org,
patches@...ts.linux.dev
Subject: Re: [PATCH v2 2/8] fwctl: Basic ioctl dispatch for the character
device
On Fri, Jul 26, 2024 at 04:01:57PM +0100, Jonathan Cameron wrote:
> > +struct fwctl_ioctl_op {
> > + unsigned int size;
> > + unsigned int min_size;
> > + unsigned int ioctl_num;
> > + int (*execute)(struct fwctl_ucmd *ucmd);
> > +};
> > +
> > +#define IOCTL_OP(_ioctl, _fn, _struct, _last) \
> > + [_IOC_NR(_ioctl) - FWCTL_CMD_BASE] = { \
>
> If this is always zero indexed, maybe just drop the - FWCTL_CMD_BASE here
> and elsewhere? Maybe through in a BUILD_BUG to confirm it is always 0.
I left it like this in case someone had different ideas for the number
space (iommufd uses a non 0 base also). I think either is fine, and I
slightly prefer keeping it rather than a static_assert.
> > +static long fwctl_fops_ioctl(struct file *filp, unsigned int cmd,
> > + unsigned long arg)
> > +{
> > + struct fwctl_uctx *uctx = filp->private_data;
> > + const struct fwctl_ioctl_op *op;
> > + struct fwctl_ucmd ucmd = {};
> > + union ucmd_buffer buf;
> > + unsigned int nr;
> > + int ret;
> > +
> > + nr = _IOC_NR(cmd);
> > + if ((nr - FWCTL_CMD_BASE) >= ARRAY_SIZE(fwctl_ioctl_ops))
> > + return -ENOIOCTLCMD;
>
> I'd add a blank line here as two unconnected set and error check
> blocks.
Done
> > static int fwctl_fops_open(struct inode *inode, struct file *filp)
> > {
> > struct fwctl_device *fwctl =
> > container_of(inode->i_cdev, struct fwctl_device, cdev);
> > + int ret;
> > +
> > + guard(rwsem_read)(&fwctl->registration_lock);
> > + if (!fwctl->ops)
> > + return -ENODEV;
> > +
> > + struct fwctl_uctx *uctx __free(kfree) =
> > + kzalloc(fwctl->ops->uctx_size, GFP_KERNEL | GFP_KERNEL_ACCOUNT);
>
> GFP_KERNEL_ACCOUNT seems to include GFP_KERNEL already.
> Did I miss some racing change?
I'm sure I copy and pasted this carelessly from someplace else
> > + if (!uctx)
> > + return -ENOMEM;
> > +
> > + uctx->fwctl = fwctl;
> > + ret = fwctl->ops->open_uctx(uctx);
> > + if (ret)
> > + return ret;
> > +
> > + scoped_guard(mutex, &fwctl->uctx_list_lock) {
> > + list_add_tail(&uctx->uctx_list_entry, &fwctl->uctx_list);
> > + }
>
> I guess more may come later but do we need {}?
I guessed the extra {} would be style guide for this construct?
> > static int fwctl_fops_release(struct inode *inode, struct file *filp)
> > {
> > - struct fwctl_device *fwctl = filp->private_data;
> > + struct fwctl_uctx *uctx = filp->private_data;
> > + struct fwctl_device *fwctl = uctx->fwctl;
> >
> > + scoped_guard(rwsem_read, &fwctl->registration_lock) {
> > + if (fwctl->ops) {
>
> Maybe a comment on when this path happens to help the reader
> along. (when the file is closed and device is still alive).
> Otherwise was cleaned up already in fwctl_unregister()
scoped_guard(rwsem_read, &fwctl->registration_lock) {
/*
* fwctl_unregister() has already removed the driver and
* destroyed the uctx.
*/
if (fwctl->ops) {
> > void fwctl_unregister(struct fwctl_device *fwctl)
> > {
> > + struct fwctl_uctx *uctx;
> > +
> > cdev_device_del(&fwctl->cdev, &fwctl->dev);
> >
> > + /* Disable and free the driver's resources for any still open FDs. */
> > + guard(rwsem_write)(&fwctl->registration_lock);
> > + guard(mutex)(&fwctl->uctx_list_lock);
> > + while ((uctx = list_first_entry_or_null(&fwctl->uctx_list,
> > + struct fwctl_uctx,
> > + uctx_list_entry)))
> > + fwctl_destroy_uctx(uctx);
> > +
>
> Obviously it's a little more heavy weight but I'd just use
> list_for_each_entry_safe()
>
> Less effort for reviewers than consider the custom iteration
> you are doing instead.
For these constructs the goal is the make the list empty, it is a
tinsy bit safer/clearer to drive the list to empty purposefully rather
than iterate over it and hope it is empty once done.
However there is no possible way that list_for_each_entry_safe() would
be an unsafe construct here. I can change it if you feel strongly
> > @@ -26,6 +39,10 @@ struct fwctl_device {
> > struct device dev;
> > /* private: */
> > struct cdev cdev;
> > +
> > + struct rw_semaphore registration_lock;
> > + struct mutex uctx_list_lock;
>
> Even for private locks, a scope statement would
> be good to have.
Like so?
/*
* Protect ops, held for write when ops becomes NULL during unregister,
* held for read whenver ops is loaded or an ops function is running.
*/
struct rw_semaphore registration_lock;
/* Protect uctx_list */
struct mutex uctx_list_lock;
> > +#ifndef _UAPI_FWCTL_H
> > +#define _UAPI_FWCTL_H
> > +
> > +#include <linux/types.h>
>
> Not used yet.
>
> > +#include <linux/ioctl.h>
>
> Arguably nor is this, but at least this related to the code
> here.
Sure, lets move them
> > +/**
> > + * DOC: General ioctl format
> > + *
> > + * The ioctl interface follows a general format to allow for extensibility. Each
> > + * ioctl is passed in a structure pointer as the argument providing the size of
> > + * the structure in the first u32. The kernel checks that any structure space
> > + * beyond what it understands is 0. This allows userspace to use the backward
> > + * compatible portion while consistently using the newer, larger, structures.
>
> Is that particularly helpful? Userspace needs to know not to put anything in
> those fields, not hard for it to also know what the size it should send is?
> The two will change together.
It is very helpful for a practical userspace.
Lets say we have an ioctl struct:
struct fwctl_info {
__u32 size;
__u32 flags;
__u32 out_device_type;
__u32 device_data_len;
__aligned_u64 out_device_data;
};
And the basic userspace pattern is:
struct fwctl_info info = {.size = sizeof(info), ...);
ioctl(fd, FWCTL_INFO, &info);
This works today and generates the 24 byte command.
Tomorrow the kernel adds a new member:
struct fwctl_info {
__u32 size;
__u32 flags;
__u32 out_device_type;
__u32 device_data_len;
__aligned_u64 out_device_data;
__aligned_u64 new_thing;
};
Current builds of the userpace use a 24 byte command. A new kernel
will see the 24 bytes and behave as before.
When I recompile the userspace with the updated header it will issue a
32 byte command with no source change.
Old kernel will see a 32 byte command with the trailing bytes it
doesn't understand as 0 and keep working.
The new kernel will see the new_thing bytes are zero and behave the
same as before.
If then the userspace decides to set new_thing the old kernel will
stop working. Userspace can use some 'try and fail' approach to try
again with new_thing = 0.
It gives a whole bunch of easy paths for userspace, otherwise
userspace has to be very careful to match the size of the struct to
the ABI it is targetting. Realistically nobody will do that right.
Thanks,
Jason
Powered by blists - more mailing lists