[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <38bba039-aa2e-42fc-aae1-26d131cf081b@nvidia.com>
Date: Thu, 20 Jun 2024 08:47:51 +0300
From: Shay Drori <shayd@...dia.com>
To: Greg KH <gregkh@...uxfoundation.org>
CC: <netdev@...r.kernel.org>, <pabeni@...hat.com>, <davem@...emloft.net>,
<kuba@...nel.org>, <edumazet@...gle.com>, <david.m.ertman@...el.com>,
<rafael@...nel.org>, <ira.weiny@...el.com>, <linux-rdma@...r.kernel.org>,
<leon@...nel.org>, <tariqt@...dia.com>, Parav Pandit <parav@...dia.com>
Subject: Re: [PATCH net-next v7 1/2] driver core: auxiliary bus: show
auxiliary device IRQs
On 19/06/2024 9:45, Greg KH wrote:
> External email: Use caution opening links or attachments
>
>
> On Wed, Jun 19, 2024 at 09:33:12AM +0300, Shay Drori wrote:
>>
>>
>> On 18/06/2024 19:13, Greg KH wrote:
>>> External email: Use caution opening links or attachments
>>>
>>>
>>> On Tue, Jun 18, 2024 at 06:09:01PM +0300, Shay Drory wrote:
>>>> diff --git a/drivers/base/auxiliary_sysfs.c b/drivers/base/auxiliary_sysfs.c
>>>> new file mode 100644
>>>> index 000000000000..3f112fd26e72
>>>> --- /dev/null
>>>> +++ b/drivers/base/auxiliary_sysfs.c
>>>> @@ -0,0 +1,110 @@
>>>> +// SPDX-License-Identifier: GPL-2.0
>>>> +/*
>>>> + * Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES
>>>> + */
>>>> +
>>>> +#include <linux/auxiliary_bus.h>
>>>> +#include <linux/slab.h>
>>>> +
>>>> +struct auxiliary_irq_info {
>>>> + struct device_attribute sysfs_attr;
>>>> +};
>>>> +
>>>> +static struct attribute *auxiliary_irq_attrs[] = {
>>>> + NULL
>>>> +};
>>>> +
>>>> +static const struct attribute_group auxiliary_irqs_group = {
>>>> + .name = "irqs",
>>>> + .attrs = auxiliary_irq_attrs,
>>>> +};
>>>> +
>>>> +static int auxiliary_irq_dir_prepare(struct auxiliary_device *auxdev)
>>>> +{
>>>> + int ret = 0;
>>>> +
>>>> + mutex_lock(&auxdev->lock);
>>>> + if (auxdev->dir_exists)
>>>> + goto unlock;
>>>
>>> You do know about cleanup.h, right? Please use it.
>>>
>>> But what exactly are you trying to protect here? How will you race and
>>> add two irqs at the same time? Driver probe is always single threaded,
>>> so what would be calling this at the same time from multiple places?
>>
>>
>> mlx5 driver requests IRQs on demand for PCI PF, VF, SFs.
>> And it occurs from multiple threads, hence we need to protect it.
>
> How are irqs asked for, for the same device, from multiple threads?
> What threads exactly? What is causing these irqs to be asked for?
>
> But ok, that's fine, if you want to do this, then properly protect the
> allocation, don't just half-protect it like you did here :(
Thanks for the comment, will protect all the allocations
>
>>>> +
>>>> + xa_init(&auxdev->irqs);
>>>> + ret = devm_device_add_group(&auxdev->dev, &auxiliary_irqs_group);
>>>> + if (!ret)
>>>> + auxdev->dir_exists = 1;
>>>> +
>>>> +unlock:
>>>> + mutex_unlock(&auxdev->lock);
>>>> + return ret;
>>>> +}
>>>> +
>>>> +/**
>>>> + * auxiliary_device_sysfs_irq_add - add a sysfs entry for the given IRQ
>>>> + * @auxdev: auxiliary bus device to add the sysfs entry.
>>>> + * @irq: The associated interrupt number.
>>>> + *
>>>> + * This function should be called after auxiliary device have successfully
>>>> + * received the irq.
>>>> + *
>>>> + * Return: zero on success or an error code on failure.
>>>> + */
>>>> +int auxiliary_device_sysfs_irq_add(struct auxiliary_device *auxdev, int irq)
>>>> +{
>>>> + struct device *dev = &auxdev->dev;
>>>> + struct auxiliary_irq_info *info;
>>>> + int ret;
>>>> +
>>>> + ret = auxiliary_irq_dir_prepare(auxdev);
>>>> + if (ret)
>>>> + return ret;
>>>> +
>>>> + info = kzalloc(sizeof(*info), GFP_KERNEL);
>>>> + if (!info)
>>>> + return -ENOMEM;
>>>> +
>>>> + sysfs_attr_init(&info->sysfs_attr.attr);
>>>> + info->sysfs_attr.attr.name = kasprintf(GFP_KERNEL, "%d", irq);
>>>> + if (!info->sysfs_attr.attr.name) {
>>>> + ret = -ENOMEM;
>>>> + goto name_err;
>>>> + }
>>>> +
>>>> + ret = xa_insert(&auxdev->irqs, irq, info, GFP_KERNEL);
>>>
>>> So no lock happening here, either use it always, or not at all?
>>
>>
>> the lock is only needed to protect the group (directory) creation, which
>> will be used by all the IRQs of this auxdev.
>> parallel calls to this API will always be with different IRQs, which
>> means each IRQ have a unique index.
>
> You are inserting into the sysfs group at the same time? You are
> calling xa_insert() at the same time? Is that protected with some
> internal lock? If so, this needs to be documented a bunch here.
>
> Allocating irqs is NOT a fast path, just grab a lock and do it right
> please, don't make us constantly have to stare at the code to ensure it
> is correct.
like I said above, I will protect all the allocations
>
>>>> + if (ret)
>>>> + goto auxdev_xa_err;
>>>> +
>>>> + ret = sysfs_add_file_to_group(&dev->kobj, &info->sysfs_attr.attr,
>>>> + auxiliary_irqs_group.name);
>>>
>>> You do know that you are never going to see these files from the
>>> userspace library tools that watch sysfs, right? libudev will never see
>>> them as you are adding them AFTER the device is created.
>>>
>>> So, because of that, who is really going to use these files?
>>
>> To learn about the interrupt mapping of the SF IRQs.
>
> Who is going to "learn"? Again, you are creating files that our
> userspace tools will miss, so what userspace tools are going to be able
> to learn anything here?
>
> This is strongly implying that all of this is just a debugging aid. So
> please, put this in debugfs where that type of thing belongs.
It is certainly a debugging aid as I described in the commit log.
But it is one of the purpose.
The motivation was clear but probably I should have better written.
The irq affinity setting code [1] needs to read the irqs number of the
device.
Tools like irqbalance [1] are using the sysfs.
And one should be able to do the same for the PCI SF too.
They cannot rely on the debugfs.
[1]
https://github.com/Irqbalance/irqbalance/blob/ba44a683cdfaa688e89e0d887952032766fb89aa/classify.c#L631
>
>>>> + if (ret)
>>>> + goto sysfs_add_err;
>>>> +
>>>> + return 0;
>>>> +
>>>> +sysfs_add_err:
>>>> + xa_erase(&auxdev->irqs, irq);
>>>> +auxdev_xa_err:
>>>> + kfree(info->sysfs_attr.attr.name);
>>>> +name_err:
>>>> + kfree(info);
>>>
>>> Again, cleanup.h is your friend.
>>>
>>>> + return ret;
>>>> +}
>>>> +EXPORT_SYMBOL_GPL(auxiliary_device_sysfs_irq_add);
>>>> +
>>>> +/**
>>>> + * auxiliary_device_sysfs_irq_remove - remove a sysfs entry for the given IRQ
>>>> + * @auxdev: auxiliary bus device to add the sysfs entry.
>>>> + * @irq: the IRQ to remove.
>>>> + *
>>>> + * This function should be called to remove an IRQ sysfs entry.
>>>> + */
>>>> +void auxiliary_device_sysfs_irq_remove(struct auxiliary_device *auxdev, int irq)
>>>> +{
>>>> + struct auxiliary_irq_info *info = xa_load(&auxdev->irqs, irq);
>>>> + struct device *dev = &auxdev->dev;
>>>> +
>>>> + sysfs_remove_file_from_group(&dev->kobj, &info->sysfs_attr.attr,
>>>> + auxiliary_irqs_group.name);
>>>> + xa_erase(&auxdev->irqs, irq);
>>>> + kfree(info->sysfs_attr.attr.name);
>>>> + kfree(info);
>>>> +}
>>>> +EXPORT_SYMBOL_GPL(auxiliary_device_sysfs_irq_remove);
>>>> diff --git a/include/linux/auxiliary_bus.h b/include/linux/auxiliary_bus.h
>>>> index de21d9d24a95..96be140bd1ff 100644
>>>> --- a/include/linux/auxiliary_bus.h
>>>> +++ b/include/linux/auxiliary_bus.h
>>>> @@ -58,6 +58,7 @@
>>>> * in
>>>> * @name: Match name found by the auxiliary device driver,
>>>> * @id: unique identitier if multiple devices of the same name are exported,
>>>> + * @irqs: irqs xarray contains irq indices which are used by the device,
>>>> *
>>>> * An auxiliary_device represents a part of its parent device's functionality.
>>>> * It is given a name that, combined with the registering drivers
>>>> @@ -138,7 +139,10 @@
>>>> struct auxiliary_device {
>>>> struct device dev;
>>>> const char *name;
>>>> + struct xarray irqs;
>>>> + struct mutex lock; /* Protects "irqs" directory creation */
>>>
>>> Protects it from what?
>>
>> please look the answer above
>
> You need to document it here. Or somewhere. Don't rely on an email
> thread from 10 years ago for when you look at this in 10 years and
> wonder what is going on...
Thanks, I will document it better in next version
>
>>>> u32 id;
>>>> + u8 dir_exists:1;
>>>
>>> I don't think this is needed, but if it really is, just use a bool.
>>
>>
>> If you know of an API that query whether a specific group is exists on
>> some device, can you please share it with me?
>> I came out empty when I looked for one :(
>
> Normally sysfs groups are NOT created this way at all. Oh wait, they
> can be now, why not use the new feature where a group is created by the
> core but only exposed if an attribute is added there?
>
> Will that work here? See commit d87c295f599c ("sysfs: Introduce a
> mechanism to hide static attribute_groups") for details. That should
> solve the issue of trying to figure out if the directory is present or
> not logic.
thank for the suggestion:)
will give it a shoot
>
> thanks,
>
> greg k-h
Powered by blists - more mailing lists