lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4435dade-5c41-43a1-aeab-58e2d262545f@molgen.mpg.de>
Date: Tue, 15 Oct 2024 12:32:38 +0200
From: Paul Menzel <pmenzel@...gen.mpg.de>
To: Aleksandr Loktionov <aleksandr.loktionov@...el.com>
Cc: "intel-wired-lan@...ts.osuosl.org" <intel-wired-lan@...ts.osuosl.org>,
 Anthony L Nguyen <anthony.l.nguyen@...el.com>, netdev@...r.kernel.org
Subject: Re: [Intel-wired-lan] [PATCH iwl-net v1] i40e: fix "Error
 I40E_AQ_RC_ENOSPC adding RX filters on VF" issue

Dear Aleksandr,


Thank you for your reply. Just a note at the beginning, that your mailer 
seems to wrap lines without preserving the right citation level. It’d be 
great if you used a mailer following standards. (I know, it’s hard in a 
corporate environment, but other Intel developers seem to have found 
solutions for this.)


Am 15.10.24 um 11:45 schrieb Loktionov, Aleksandr:

>> -----Original Message-----
>> From: Paul Menzel <pmenzel@...gen.mpg.de>
>> Sent: Tuesday, October 15, 2024 10:24 AM

>> Thank you for your patch. For the summary I’d make it more about the
>> action of the patch like “Add intermediate filter to …”.
> 
> Sorry, I don't get your point. This patch fixes bug that can stop
> vfs to receive any traffic making them useless. The first and most
> visible effect of the bug is a lot of "Error I40E_AQ_RC_ENOSPC
> adding RX filters on VF XX,..." errors from F/W complaining to add
> MAC/VLAN filter. So I've mentioned it in the title to be easy found. 
> I don't add any filter in the driver, we have to add one more
> intermediate state of the filter to avoid the race condition.

In my opinion, having the log in the body is good enough for search 
engines and the summary should be optimized for `git log --oneline` 
consumption. I am sorry about the confusion with my example. Maybe:

Add intermediate sync state to fix race condition

>> Am 15.10.24 um 09:04 schrieb Aleksandr Loktionov:
>>> Fix a race condition in the i40e driver that leads to MAC/VLAN filters
>>> becoming corrupted and leaking. Address the issue that occurs under
>>> heavy load when multiple threads are concurrently modifying MAC/VLAN
>>> filters by setting mac and port VLAN.
>>>
>>> 1. Thread T0 allocates a filter in i40e_add_filter() within
>>>           i40e_ndo_set_vf_port_vlan().
>>> 2. Thread T1 concurrently frees the filter in __i40e_del_filter() within
>>>           i40e_ndo_set_vf_mac().
>>> 3. Subsequently, i40e_service_task() calls i40e_sync_vsi_filters(), which
>>>           refers to the already freed filter memory, causing corruption.
>>>
>>> Reproduction steps:
>>> 1. Spawn multiple VFs.
>>> 2. Apply a concurrent heavy load by running parallel operations to change
>>>           MAC addresses on the VFs and change port VLANs on the host.
>>
>> It’d be great if you shared your commands.

> Sorry, I'm pretty sure it's quite impossible to reproduce the issue
> with bash scripts /*I've tried really hard*/. Reproduction is
> related to user-space/kernel code which might be not open-sourced.
> So as I've explained in the commit title the race condition
> possibility that was introduced from the very beginning.

Could you please ask to get clearance to publish it. My naive view 
wonders, why legal(?) should oppose publication.

>>> 3. Observe errors in dmesg:
>>> "Error I40E_AQ_RC_ENOSPC adding RX filters on VF XX,
>>>    please set promiscuous on manually for VF XX".
>>
>> I’d indent it by eight spaces and put it in one line.
> Ok, I'll fix it in v2
> 
>>> The fix involves implementing a new intermediate filter state,
>>> I40E_FILTER_NEW_SYNC, for the time when a filter is on a tmp_add_list.
>>> These filters cannot be deleted from the hash list directly but
>>> must be removed using the full process.
>>
>> Please excuse my ignorance. Where is that done in the diff? For me it
>> looks like you only replace `I40E_FILTER_NEW` by `I40E_FILTER_NEW_SYNC`
>> in certain places, but no new condition for this case.
>
> Here are below the code which adds new I40E_FILTER_NEW_SYNC enum. 
> And additional conditions for this I40E_FILTER_NEW_SYNC state. All
> other places in the driver just tract I40E_FILTER_NEW_SYNC as not
> just I40E_FILTER_NEW by default.
Thank you. For me it’s not so obvious from the diff, and indeed, it’s 
done in `i40e_sync_vsi_filters()`. Thank you again.

>>> Fixes: 278e7d0b9d68 ("i40e: store MAC/VLAN filters in a hash with the MAC Address as key")
>>> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@...el.com>
>>> ---
>>>    drivers/net/ethernet/intel/i40e/i40e.h         |  2 ++
>>>    drivers/net/ethernet/intel/i40e/i40e_debugfs.c |  2 ++
>>>    drivers/net/ethernet/intel/i40e/i40e_main.c    | 12 ++++++++++--
>>>    3 files changed, 14 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/intel/i40e/i40e.h
>> b/drivers/net/ethernet/intel/i40e/i40e.h
>>> index 2089a0e..a1842dd 100644
>>> --- a/drivers/net/ethernet/intel/i40e/i40e.h
>>> +++ b/drivers/net/ethernet/intel/i40e/i40e.h
>>> @@ -755,6 +755,8 @@ enum i40e_filter_state {
>>>    	I40E_FILTER_ACTIVE,		/* Added to switch by FW */
>>>    	I40E_FILTER_FAILED,		/* Rejected by FW */
>>>    	I40E_FILTER_REMOVE,		/* To be removed */
>>> +	/* RESERVED */
>>
>> Why the reserved comment? Please elaborate in the commit message.
> 
> This is for not breaking compatibility with different driver
> versions. Between OOT, net-next and plain old net. Isn't it obvious
> from the comment, it's "RESERVERD"?

Apparently not, otherwise I wouldn’t ask. ;-)

> Can you provide me example commit message what I should follow?

There are people reading the code not familiar with the ecosystem, that 
there is an out of tree driver fore example. So the code or the commit 
message should have an explanation why `I40E_FILTER_NEW_SYNC = 6` and 
what the reserved is used for.

>>> +	I40E_FILTER_NEW_SYNC = 6,	/* New, not sent, in sync task */

Also mention the hash list in the comment?

>>>    /* There is no 'removed' state; the filter struct is freed */
>>>    };
>>>    struct i40e_mac_filter {
>>> diff --git a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
>>> index abf624d..1c439b1 100644
>>> --- a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
>>> +++ b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
>>> @@ -89,6 +89,8 @@ static char *i40e_filter_state_string[] = {
>>>    	"ACTIVE",
>>>    	"FAILED",
>>>    	"REMOVE",
>>> +	"<RESERVED>",
>>> +	"NEW_SYNC",
>>>    };
>>>
>>>    /**
>>> diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
>>> index 25295ae..55fb362 100644
>>> --- a/drivers/net/ethernet/intel/i40e/i40e_main.c
>>> +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
>>> @@ -1255,6 +1255,7 @@ int i40e_count_filters(struct i40e_vsi *vsi)
>>>
>>>    	hash_for_each_safe(vsi->mac_filter_hash, bkt, h, f, hlist) {
>>>    		if (f->state == I40E_FILTER_NEW ||
>>> +		    f->state == I40E_FILTER_NEW_SYNC ||
>>>    		    f->state == I40E_FILTER_ACTIVE)
>>>    			++cnt;
>>>    	}
>>> @@ -1441,6 +1442,8 @@ static int i40e_correct_mac_vlan_filters(struct i40e_vsi *vsi,
>>>
>>>    			new->f = add_head;
>>>    			new->state = add_head->state;
>>> +			if (add_head->state == I40E_FILTER_NEW)
>>> +				add_head->state = I40E_FILTER_NEW_SYNC;
>>>
>>>    			/* Add the new filter to the tmp list */
>>>    			hlist_add_head(&new->hlist, tmp_add_list);
>>> @@ -1550,6 +1553,8 @@ static int i40e_correct_vf_mac_vlan_filters(struct i40e_vsi *vsi,
>>>    				return -ENOMEM;
>>>    			new_mac->f = add_head;
>>>    			new_mac->state = add_head->state;
>>> +			if (add_head->state == I40E_FILTER_NEW)
>>> +				add_head->state = I40E_FILTER_NEW_SYNC;
>>>
>>>    			/* Add the new filter to the tmp list */
>>>    			hlist_add_head(&new_mac->hlist, tmp_add_list);
>>> @@ -2437,7 +2442,8 @@ static int
>>>    i40e_aqc_broadcast_filter(struct i40e_vsi *vsi, const char *vsi_name,
>>>    			  struct i40e_mac_filter *f)
>>>    {
>>> -	bool enable = f->state == I40E_FILTER_NEW;
>>> +	bool enable = f->state == I40E_FILTER_NEW ||
>>> +		      f->state == I40E_FILTER_NEW_SYNC;
>>>    	struct i40e_hw *hw = &vsi->back->hw;
>>>    	int aq_ret;
>>>
>>> @@ -2611,6 +2617,7 @@ int i40e_sync_vsi_filters(struct i40e_vsi *vsi)
>>>
>>>    				/* Add it to the hash list */
>>>    				hlist_add_head(&new->hlist, &tmp_add_list);
>>> +				f->state = I40E_FILTER_NEW_SYNC;
>>>    			}
>>>
>>>    			/* Count the number of active (current and new) VLAN
>>> @@ -2762,7 +2769,8 @@ int i40e_sync_vsi_filters(struct i40e_vsi *vsi)
>>>    		spin_lock_bh(&vsi->mac_filter_hash_lock);
>>>    		hlist_for_each_entry_safe(new, h, &tmp_add_list, hlist) {
>>>    			/* Only update the state if we're still NEW */
>>> -			if (new->f->state == I40E_FILTER_NEW)
>>> +			if (new->f->state == I40E_FILTER_NEW ||
>>> +			    new->f->state == I40E_FILTER_NEW_SYNC)
>>>    				new->f->state = new->state;
>>>    			hlist_del(&new->hlist);
>>>    			netdev_hw_addr_refcnt(new->f, vsi->netdev, -1);
Thank you again for your work and explanations.


Kind regards,

Paul

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ