[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <054b84a1-fd70-4abe-b1aa-18c0882c05f5@molgen.mpg.de>
Date: Wed, 16 Oct 2024 07:28:03 +0200
From: Paul Menzel <pmenzel@...gen.mpg.de>
To: Aleksandr Loktionov <aleksandr.loktionov@...el.com>
Cc: Anthony L Nguyen <anthony.l.nguyen@...el.com>,
intel-wired-lan@...ts.osuosl.org, netdev@...r.kernel.org
Subject: Re: [Intel-wired-lan] [PATCH iwl-net v1] i40e: fix "Error
I40E_AQ_RC_ENOSPC adding RX filters on VF" issue
Dear Aleksandr,
Am 15.10.24 um 13:05 schrieb Aleksandr Loktionov:
>> -----Original Message-----
>> From: Paul Menzel <pmenzel@...gen.mpg.de>
>> Sent: Tuesday, October 15, 2024 12:33 PM
[…]
>> Am 15.10.24 um 11:45 schrieb Loktionov, Aleksandr:
>>
>>>> -----Original Message-----
>>>> From: Paul Menzel <pmenzel@...gen.mpg.de>
>>>> Sent: Tuesday, October 15, 2024 10:24 AM
>>
>>>> Thank you for your patch. For the summary I’d make it more about the
>>>> action of the patch like “Add intermediate filter to …”.
>>>
>>> Sorry, I don't get your point. This patch fixes bug that can stop
>>> vfs to receive any traffic making them useless. The first and most
>>> visible effect of the bug is a lot of "Error I40E_AQ_RC_ENOSPC
>>> adding RX filters on VF XX,..." errors from F/W complaining to add
>>> MAC/VLAN filter. So I've mentioned it in the title to be easy found.
>>> I don't add any filter in the driver, we have to add one more
>>> intermediate state of the filter to avoid the race condition.
>>
>> In my opinion, having the log in the body is good enough for search
>> engines and the summary should be optimized for `git log --oneline`
>> consumption. I am sorry about the confusion with my example. Maybe:
>>
>> Add intermediate sync state to fix race condition
>
> I just wonder why do you insist on "ADD" which usually means
> implementing a new feature. When this patch FIXes the bug?
Some projects use this interpretation/structure, but to my knowledge not
the Linux kernel.
> If I'd want to add a new feature I'd rather send my patch to net-next,
> isn't it?
*Add*, how I used it, does not imply a new feature addition.
> For me 'fix race condition by adding filter's intermediate sync
> state'
> Can you explain your strong argument for the Add?
I am fine with your suggestion, and do not oppose to start with *Fix*.
Also, thank you for your elaborate answer, so I could understand the
point of view you came from.
>>>> Am 15.10.24 um 09:04 schrieb Aleksandr Loktionov:
>>>>> Fix a race condition in the i40e driver that leads to MAC/VLAN filters
>>>>> becoming corrupted and leaking. Address the issue that occurs under
>>>>> heavy load when multiple threads are concurrently modifying MAC/VLAN
>>>>> filters by setting mac and port VLAN.
>>>>>
>>>>> 1. Thread T0 allocates a filter in i40e_add_filter() within
>>>>> i40e_ndo_set_vf_port_vlan().
>>>>> 2. Thread T1 concurrently frees the filter in __i40e_del_filter() within
>>>>> i40e_ndo_set_vf_mac().
>>>>> 3. Subsequently, i40e_service_task() calls i40e_sync_vsi_filters(), which
>>>>> refers to the already freed filter memory, causing corruption.
>>>>>
>>>>> Reproduction steps:
>>>>> 1. Spawn multiple VFs.
>>>>> 2. Apply a concurrent heavy load by running parallel operations to change
>>>>> MAC addresses on the VFs and change port VLANs on the host.
>>>>
>>>> It’d be great if you shared your commands.
>>
>>> Sorry, I'm pretty sure it's quite impossible to reproduce the issue
>>> with bash scripts /*I've tried really hard*/. Reproduction is
>>> related to user-space/kernel code which might be not open-sourced.
>>> So as I've explained in the commit title the race condition
>>> possibility that was introduced from the very beginning.
>>
>> Could you please ask to get clearance to publish it. My naive view
>> wonders, why legal(?) should oppose publication.
>
> Simply the defect can come from development tools or from external
> customer. And being tract as a commercial secret, which even doesn't
> belong to Intel sometimes.
I know, that these are general problems. But in this specific case you
should know the circumstances, and be able to document it in the commit
message, why a test case cannot be published.
Kind regards,
Paul
Powered by blists - more mailing lists