[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210331121929.GX2710221@ziepe.ca>
Date: Wed, 31 Mar 2021 09:19:29 -0300
From: Jason Gunthorpe <jgg@...pe.ca>
To: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc: Bjorn Helgaas <helgaas@...nel.org>,
Alexander Duyck <alexander.duyck@...il.com>,
Keith Busch <kbusch@...nel.org>,
Leon Romanovsky <leon@...nel.org>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Saeed Mahameed <saeedm@...dia.com>,
Jakub Kicinski <kuba@...nel.org>,
linux-pci <linux-pci@...r.kernel.org>,
linux-rdma@...r.kernel.org, Netdev <netdev@...r.kernel.org>,
Don Dutile <ddutile@...hat.com>,
Alex Williamson <alex.williamson@...hat.com>,
"David S . Miller" <davem@...emloft.net>
Subject: Re: [PATCH mlx5-next v7 0/4] Dynamically assign MSI-X vectors count
On Wed, Mar 31, 2021 at 08:38:39AM +0200, Greg Kroah-Hartman wrote:
> On Tue, Mar 30, 2021 at 07:43:41PM -0300, Jason Gunthorpe wrote:
> > > With 0000:01:00.0/sriov/BB:DD.F/vf_msix_count, sriov/ will contain
> > > 1 file and 1K subdirectories.
> >
> > The smallest directory sizes is with the current patch since it
> > re-uses the existing VF directory. Do we care about directory size at
> > the sysfs level?
>
> No, that should not matter.
>
> The "issue" here is that you "broke" the device chain here by adding a
> random kobject to the directory tree: "BB:DD.F"
>
> Again, devices are allowed to have attributes associated with it to be
> _ONE_ subdirectory level deep.
>
> So, to use your path above, this is allowed:
> 0000:01:00.0/sriov/vf_msix_count
>
> as these are sriov attributes for the 0000:01:00.0 device, but this is
> not:
> 0000:01:00.0/sriov/BB:DD.F/vf_msix_count
> as you "threw" a random kobject called BB:DD.F into the middle.
>
> If you want to have "BB:DD.F" in there, then it needs to be a real
> struct device and _THEN_ it needs to point its parent to "0000:01:00.0",
> another struct device, as "sriov" is NOT ANYTHING in the heirachy here
> at all.
It isn't a struct device object at all though, it just organizing
attributes.
> Does that help? The rules are:
> - Once you use a 'struct device', all subdirs below that device
> are either an attribute group for that device or a child
> device.
> - A struct device can NOT have an attribute group as a parent,
> it can ONLY have another struct device as a parent.
>
> If you break those rules, the kernel has the ability to get really
> confused unless you are very careful, and userspace will be totally lost
> as you can not do anything special there.
The kernel gets confused?
I'm not sure I understand why userspace gets confused. I can guess
udev has some issue, but everything else seems OK, it is just a path.
> > > I'm dense and don't fully understand Greg's subdirectory comment.
> >
> > I also don't know udev well enough. I've certainly seen drivers
> > creating extra subdirectories using kobjects.
>
> And those drivers are broken. Please point them out to me and I will be
> glad to go fix them. Or tell their authors why they are broken :)
How do you fix them? It is uAPI at this point so we can't change the
directory names. Can't make them struct devices (userspace would get
confused if we add *more* sysfs files)
Grep for kobject_init_and_add() under drivers/ and I think you get a
pretty good overview of the places.
Since it seems like kind of a big problem can we make this allowed
somehow?
> > > But it doesn't seem like that level of control would be in a udev rule
> > > anyway. A PF udev rule might *start* a program to manage MSI-X
> > > vectors, but such a program should be able to deal with whatever
> > > directory structure we want.
> >
> > Yes, I can't really see this being used from udev either.
>
> It doesn't matter if you think it could be used, it _will_ be used as
> you are exposing this stuff to userspace.
Well, from what I understand, it wont be used because udev can't do
three level deep attributes, and if that hasn't been a problem in that
last 10 years for the existing places, it might not ever be needed in
udev at all.
> > I assume there is also the usual race about triggering the uevent
> > before the subdirectories are created, but we have the
> > dev_set_uevent_suppress() thing now for that..
>
> Unless you are "pci bus code" you shouldn't be using that :)
There are over 40 users now.
Jason
Powered by blists - more mailing lists