[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CADEbmW2NRmYZvx7+yki8MR0tX+OzZovAYO-u+o6adHdYyaFn9w@mail.gmail.com>
Date: Wed, 23 Oct 2024 18:34:21 +0200
From: Michal Schmidt <mschmidt@...hat.com>
To: Aleksandr Loktionov <aleksandr.loktionov@...el.com>
Cc: intel-wired-lan@...ts.osuosl.org, anthony.l.nguyen@...el.com,
netdev@...r.kernel.org
Subject: Re: [Intel-wired-lan] [PATCH iwl-net v2] i40e: fix race condition by
adding filter's intermediate sync state
On Wed, Oct 16, 2024 at 11:30 AM Aleksandr Loktionov
<aleksandr.loktionov@...el.com> wrote:
>
> Fix a race condition in the i40e driver that leads to MAC/VLAN filters
> becoming corrupted and leaking. Address the issue that occurs under
> heavy load when multiple threads are concurrently modifying MAC/VLAN
> filters by setting mac and port VLAN.
>
> 1. Thread T0 allocates a filter in i40e_add_filter() within
> i40e_ndo_set_vf_port_vlan().
> 2. Thread T1 concurrently frees the filter in __i40e_del_filter() within
> i40e_ndo_set_vf_mac().
> 3. Subsequently, i40e_service_task() calls i40e_sync_vsi_filters(), which
> refers to the already freed filter memory, causing corruption.
I think an important detail is missing from the race description. I am
guessing it could happen like this:
1. A thread allocates a filter with i40e_add_filter().
2. i40e_service_task() calls i40e_sync_vsi_filters(), which adds an
entry to its local tmp_add_list referencing the filter. It releases
vsi->mac_filter_hash_lock before it processes tmp_add_list.
3. A thread frees the filter in __i40e_del_filter(). This is while
holding vsi->mac_filter_hash_lock.
4. The service task processes tmp_add_list and dereferences the
already freed filter.
Do I understand the race right?
Michal
> Reproduction steps:
> 1. Spawn multiple VFs.
> 2. Apply a concurrent heavy load by running parallel operations to change
> MAC addresses on the VFs and change port VLANs on the host.
> 3. Observe errors in dmesg:
> "Error I40E_AQ_RC_ENOSPC adding RX filters on VF XX,
> please set promiscuous on manually for VF XX".
>
> Exact code for stable reproduction Intel can't open-source now.
>
> The fix involves implementing a new intermediate filter state,
> I40E_FILTER_NEW_SYNC, for the time when a filter is on a tmp_add_list.
> These filters cannot be deleted from the hash list directly but
> must be removed using the full process.
>
> Fixes: 278e7d0b9d68 ("i40e: store MAC/VLAN filters in a hash with the MAC Address as key")
> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@...el.com>
> ---
> v1->v2 change commit title, removed RESERVED state byt request in review
> ---
> drivers/net/ethernet/intel/i40e/i40e.h | 2 ++
> drivers/net/ethernet/intel/i40e/i40e_debugfs.c | 1 +
> drivers/net/ethernet/intel/i40e/i40e_main.c | 12 ++++++++++--
> 3 files changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
> index 2089a0e..2e205218 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e.h
> +++ b/drivers/net/ethernet/intel/i40e/i40e.h
> @@ -755,6 +755,8 @@ enum i40e_filter_state {
> I40E_FILTER_ACTIVE, /* Added to switch by FW */
> I40E_FILTER_FAILED, /* Rejected by FW */
> I40E_FILTER_REMOVE, /* To be removed */
> + I40E_FILTER_NEW_SYNC, /* New, not sent yet, is in
> + i40e_sync_vsi_filters() */
> /* There is no 'removed' state; the filter struct is freed */
> };
> struct i40e_mac_filter {
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
> index abf624d..208c2f0 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
> @@ -89,6 +89,7 @@ static char *i40e_filter_state_string[] = {
> "ACTIVE",
> "FAILED",
> "REMOVE",
> + "NEW_SYNC",
> };
>
> /**
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
> index 25295ae..55fb362 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_main.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
> @@ -1255,6 +1255,7 @@ int i40e_count_filters(struct i40e_vsi *vsi)
>
> hash_for_each_safe(vsi->mac_filter_hash, bkt, h, f, hlist) {
> if (f->state == I40E_FILTER_NEW ||
> + f->state == I40E_FILTER_NEW_SYNC ||
> f->state == I40E_FILTER_ACTIVE)
> ++cnt;
> }
> @@ -1441,6 +1442,8 @@ static int i40e_correct_mac_vlan_filters(struct i40e_vsi *vsi,
>
> new->f = add_head;
> new->state = add_head->state;
> + if (add_head->state == I40E_FILTER_NEW)
> + add_head->state = I40E_FILTER_NEW_SYNC;
>
> /* Add the new filter to the tmp list */
> hlist_add_head(&new->hlist, tmp_add_list);
> @@ -1550,6 +1553,8 @@ static int i40e_correct_vf_mac_vlan_filters(struct i40e_vsi *vsi,
> return -ENOMEM;
> new_mac->f = add_head;
> new_mac->state = add_head->state;
> + if (add_head->state == I40E_FILTER_NEW)
> + add_head->state = I40E_FILTER_NEW_SYNC;
>
> /* Add the new filter to the tmp list */
> hlist_add_head(&new_mac->hlist, tmp_add_list);
> @@ -2437,7 +2442,8 @@ static int
> i40e_aqc_broadcast_filter(struct i40e_vsi *vsi, const char *vsi_name,
> struct i40e_mac_filter *f)
> {
> - bool enable = f->state == I40E_FILTER_NEW;
> + bool enable = f->state == I40E_FILTER_NEW ||
> + f->state == I40E_FILTER_NEW_SYNC;
> struct i40e_hw *hw = &vsi->back->hw;
> int aq_ret;
>
> @@ -2611,6 +2617,7 @@ int i40e_sync_vsi_filters(struct i40e_vsi *vsi)
>
> /* Add it to the hash list */
> hlist_add_head(&new->hlist, &tmp_add_list);
> + f->state = I40E_FILTER_NEW_SYNC;
> }
>
> /* Count the number of active (current and new) VLAN
> @@ -2762,7 +2769,8 @@ int i40e_sync_vsi_filters(struct i40e_vsi *vsi)
> spin_lock_bh(&vsi->mac_filter_hash_lock);
> hlist_for_each_entry_safe(new, h, &tmp_add_list, hlist) {
> /* Only update the state if we're still NEW */
> - if (new->f->state == I40E_FILTER_NEW)
> + if (new->f->state == I40E_FILTER_NEW ||
> + new->f->state == I40E_FILTER_NEW_SYNC)
> new->f->state = new->state;
> hlist_del(&new->hlist);
> netdev_hw_addr_refcnt(new->f, vsi->netdev, -1);
> --
> 2.25.1
>
Powered by blists - more mailing lists