lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <38bba039-aa2e-42fc-aae1-26d131cf081b@nvidia.com>
Date: Thu, 20 Jun 2024 08:47:51 +0300
From: Shay Drori <shayd@...dia.com>
To: Greg KH <gregkh@...uxfoundation.org>
CC: <netdev@...r.kernel.org>, <pabeni@...hat.com>, <davem@...emloft.net>,
	<kuba@...nel.org>, <edumazet@...gle.com>, <david.m.ertman@...el.com>,
	<rafael@...nel.org>, <ira.weiny@...el.com>, <linux-rdma@...r.kernel.org>,
	<leon@...nel.org>, <tariqt@...dia.com>, Parav Pandit <parav@...dia.com>
Subject: Re: [PATCH net-next v7 1/2] driver core: auxiliary bus: show
 auxiliary device IRQs



On 19/06/2024 9:45, Greg KH wrote:
> External email: Use caution opening links or attachments
> 
> 
> On Wed, Jun 19, 2024 at 09:33:12AM +0300, Shay Drori wrote:
>>
>>
>> On 18/06/2024 19:13, Greg KH wrote:
>>> External email: Use caution opening links or attachments
>>>
>>>
>>> On Tue, Jun 18, 2024 at 06:09:01PM +0300, Shay Drory wrote:
>>>> diff --git a/drivers/base/auxiliary_sysfs.c b/drivers/base/auxiliary_sysfs.c
>>>> new file mode 100644
>>>> index 000000000000..3f112fd26e72
>>>> --- /dev/null
>>>> +++ b/drivers/base/auxiliary_sysfs.c
>>>> @@ -0,0 +1,110 @@
>>>> +// SPDX-License-Identifier: GPL-2.0
>>>> +/*
>>>> + * Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES
>>>> + */
>>>> +
>>>> +#include <linux/auxiliary_bus.h>
>>>> +#include <linux/slab.h>
>>>> +
>>>> +struct auxiliary_irq_info {
>>>> +     struct device_attribute sysfs_attr;
>>>> +};
>>>> +
>>>> +static struct attribute *auxiliary_irq_attrs[] = {
>>>> +     NULL
>>>> +};
>>>> +
>>>> +static const struct attribute_group auxiliary_irqs_group = {
>>>> +     .name = "irqs",
>>>> +     .attrs = auxiliary_irq_attrs,
>>>> +};
>>>> +
>>>> +static int auxiliary_irq_dir_prepare(struct auxiliary_device *auxdev)
>>>> +{
>>>> +     int ret = 0;
>>>> +
>>>> +     mutex_lock(&auxdev->lock);
>>>> +     if (auxdev->dir_exists)
>>>> +             goto unlock;
>>>
>>> You do know about cleanup.h, right?  Please use it.
>>>
>>> But what exactly are you trying to protect here?  How will you race and
>>> add two irqs at the same time?  Driver probe is always single threaded,
>>> so what would be calling this at the same time from multiple places?
>>
>>
>> mlx5 driver requests IRQs on demand for PCI PF, VF, SFs.
>> And it occurs from multiple threads, hence we need to protect it.
> 
> How are irqs asked for, for the same device, from multiple threads?
> What threads exactly?  What is causing these irqs to be asked for?
> 
> But ok, that's fine, if you want to do this, then properly protect the
> allocation, don't just half-protect it like you did here :(


Thanks for the comment, will protect all the allocations

> 
>>>> +
>>>> +     xa_init(&auxdev->irqs);
>>>> +     ret = devm_device_add_group(&auxdev->dev, &auxiliary_irqs_group);
>>>> +     if (!ret)
>>>> +             auxdev->dir_exists = 1;
>>>> +
>>>> +unlock:
>>>> +     mutex_unlock(&auxdev->lock);
>>>> +     return ret;
>>>> +}
>>>> +
>>>> +/**
>>>> + * auxiliary_device_sysfs_irq_add - add a sysfs entry for the given IRQ
>>>> + * @auxdev: auxiliary bus device to add the sysfs entry.
>>>> + * @irq: The associated interrupt number.
>>>> + *
>>>> + * This function should be called after auxiliary device have successfully
>>>> + * received the irq.
>>>> + *
>>>> + * Return: zero on success or an error code on failure.
>>>> + */
>>>> +int auxiliary_device_sysfs_irq_add(struct auxiliary_device *auxdev, int irq)
>>>> +{
>>>> +     struct device *dev = &auxdev->dev;
>>>> +     struct auxiliary_irq_info *info;
>>>> +     int ret;
>>>> +
>>>> +     ret = auxiliary_irq_dir_prepare(auxdev);
>>>> +     if (ret)
>>>> +             return ret;
>>>> +
>>>> +     info = kzalloc(sizeof(*info), GFP_KERNEL);
>>>> +     if (!info)
>>>> +             return -ENOMEM;
>>>> +
>>>> +     sysfs_attr_init(&info->sysfs_attr.attr);
>>>> +     info->sysfs_attr.attr.name = kasprintf(GFP_KERNEL, "%d", irq);
>>>> +     if (!info->sysfs_attr.attr.name) {
>>>> +             ret = -ENOMEM;
>>>> +             goto name_err;
>>>> +     }
>>>> +
>>>> +     ret = xa_insert(&auxdev->irqs, irq, info, GFP_KERNEL);
>>>
>>> So no lock happening here, either use it always, or not at all?
>>
>>
>> the lock is only needed to protect the group (directory) creation, which
>> will be used by all the IRQs of this auxdev.
>> parallel calls to this API will always be with different IRQs, which
>> means each IRQ have a unique index.
> 
> You are inserting into the sysfs group at the same time?  You are
> calling xa_insert() at the same time?  Is that protected with some
> internal lock?  If so, this needs to be documented a bunch here.
> 
> Allocating irqs is NOT a fast path, just grab a lock and do it right
> please, don't make us constantly have to stare at the code to ensure it
> is correct.


like I said above, I will protect all the allocations


> 
>>>> +     if (ret)
>>>> +             goto auxdev_xa_err;
>>>> +
>>>> +     ret = sysfs_add_file_to_group(&dev->kobj, &info->sysfs_attr.attr,
>>>> +                                   auxiliary_irqs_group.name);
>>>
>>> You do know that you are never going to see these files from the
>>> userspace library tools that watch sysfs, right?  libudev will never see
>>> them as you are adding them AFTER the device is created.
>>>
>>> So, because of that, who is really going to use these files?
>>
>> To learn about the interrupt mapping of the SF IRQs.
> 
> Who is going to "learn"?  Again, you are creating files that our
> userspace tools will miss, so what userspace tools are going to be able
> to learn anything here?
> 
> This is strongly implying that all of this is just a debugging aid.  So
> please, put this in debugfs where that type of thing belongs.


It is certainly a debugging aid as I described in the commit log.
But it is one of the purpose.
The motivation was clear but probably I should have better written.
The irq affinity setting code [1] needs to read the irqs number of the
device.
Tools like irqbalance [1] are using the sysfs.
And one should be able to do the same for the PCI SF too.
They cannot rely on the debugfs.

[1] 
https://github.com/Irqbalance/irqbalance/blob/ba44a683cdfaa688e89e0d887952032766fb89aa/classify.c#L631


> 
>>>> +     if (ret)
>>>> +             goto sysfs_add_err;
>>>> +
>>>> +     return 0;
>>>> +
>>>> +sysfs_add_err:
>>>> +     xa_erase(&auxdev->irqs, irq);
>>>> +auxdev_xa_err:
>>>> +     kfree(info->sysfs_attr.attr.name);
>>>> +name_err:
>>>> +     kfree(info);
>>>
>>> Again, cleanup.h is your friend.
>>>
>>>> +     return ret;
>>>> +}
>>>> +EXPORT_SYMBOL_GPL(auxiliary_device_sysfs_irq_add);
>>>> +
>>>> +/**
>>>> + * auxiliary_device_sysfs_irq_remove - remove a sysfs entry for the given IRQ
>>>> + * @auxdev: auxiliary bus device to add the sysfs entry.
>>>> + * @irq: the IRQ to remove.
>>>> + *
>>>> + * This function should be called to remove an IRQ sysfs entry.
>>>> + */
>>>> +void auxiliary_device_sysfs_irq_remove(struct auxiliary_device *auxdev, int irq)
>>>> +{
>>>> +     struct auxiliary_irq_info *info = xa_load(&auxdev->irqs, irq);
>>>> +     struct device *dev = &auxdev->dev;
>>>> +
>>>> +     sysfs_remove_file_from_group(&dev->kobj, &info->sysfs_attr.attr,
>>>> +                                  auxiliary_irqs_group.name);
>>>> +     xa_erase(&auxdev->irqs, irq);
>>>> +     kfree(info->sysfs_attr.attr.name);
>>>> +     kfree(info);
>>>> +}
>>>> +EXPORT_SYMBOL_GPL(auxiliary_device_sysfs_irq_remove);
>>>> diff --git a/include/linux/auxiliary_bus.h b/include/linux/auxiliary_bus.h
>>>> index de21d9d24a95..96be140bd1ff 100644
>>>> --- a/include/linux/auxiliary_bus.h
>>>> +++ b/include/linux/auxiliary_bus.h
>>>> @@ -58,6 +58,7 @@
>>>>     *       in
>>>>     * @name: Match name found by the auxiliary device driver,
>>>>     * @id: unique identitier if multiple devices of the same name are exported,
>>>> + * @irqs: irqs xarray contains irq indices which are used by the device,
>>>>     *
>>>>     * An auxiliary_device represents a part of its parent device's functionality.
>>>>     * It is given a name that, combined with the registering drivers
>>>> @@ -138,7 +139,10 @@
>>>>    struct auxiliary_device {
>>>>         struct device dev;
>>>>         const char *name;
>>>> +     struct xarray irqs;
>>>> +     struct mutex lock; /* Protects "irqs" directory creation */
>>>
>>> Protects it from what?
>>
>> please look the answer above
> 
> You need to document it here.  Or somewhere.  Don't rely on an email
> thread from 10 years ago for when you look at this in 10 years and
> wonder what is going on...

Thanks, I will document it better in next version

> 
>>>>         u32 id;
>>>> +     u8 dir_exists:1;
>>>
>>> I don't think this is needed, but if it really is, just use a bool.
>>
>>
>> If you know of an API that query whether a specific group is exists on
>> some device, can you please share it with me?
>> I came out empty when I looked for one :(
> 
> Normally sysfs groups are NOT created this way at all.  Oh wait, they
> can be now, why not use the new feature where a group is created by the
> core but only exposed if an attribute is added there?
> 
> Will that work here?  See commit d87c295f599c ("sysfs: Introduce a
> mechanism to hide static attribute_groups") for details.  That should
> solve the issue of trying to figure out if the directory is present or
> not logic.


thank for the suggestion:)
will give it a shoot

> 
> thanks,
> 
> greg k-h

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ