lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <SA2PR11MB488943E04BE15FCE55DE1881F56B9@SA2PR11MB4889.namprd11.prod.outlook.com>
Date:   Fri, 28 Apr 2023 09:12:56 +0000
From:   "Maziarz, Kamil" <kamil.maziarz@...el.com>
To:     Paolo Abeni <pabeni@...hat.com>,
        "Keller, Jacob E" <jacob.e.keller@...el.com>,
        Leon Romanovsky <leon@...nel.org>,
        "Nguyen, Anthony L" <anthony.l.nguyen@...el.com>
CC:     "davem@...emloft.net" <davem@...emloft.net>,
        "kuba@...nel.org" <kuba@...nel.org>,
        "edumazet@...gle.com" <edumazet@...gle.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "Wesierski, DawidX" <dawidx.wesierski@...el.com>,
        "Romanowski, Rafal" <rafal.romanowski@...el.com>
Subject: RE: [PATCH net 2/3] ice: Fix ice VF reset during iavf initialization

On Wed, 2023-04-26 at 16:22 +0000, Keller, Jacob E wrote:
> > > From: Leon Romanovsky <leon@...nel.org>
> >
> > > But what I see is that ICE_VF_STATE_ACTIVE bit check is racy and you 
> > > don't really fix the root cause of calling to reset without proper 
> > > locking.
> > > 
>  > 
> > I think there's some confusing re-use of words going on in the commit 
> > message. It describes what the VF does while recovering and re- 
> > initializing from a reset. I think the goal is to prevent starting 
> > another reset until the first one has recovered.

> Uhmm... it looks like the current patch does not prevent two concurrent resets, I think the goal of this patch is let other vf related ndo restart gracefully when a VF reset is running.

> > I am not sure we can use a standard lock here because we likely do 
> > want to be able to recover if the VF driver doesn't respond in a 
> > sufficient time.
> > 
> > I don't know exactly what problem this commit claims to fix.

> I think this patch could benefit from at least a more descriptive/clearer commit message.

> Thanks,

> Paolo

Dear All,

Forwarding message from the author:

"Hi, this patch addresses an issue where during concurrent resets from the pf-side iavf driver was unable to finish the reset because the pf started the reset procedure again unaware that the iavf reset completion is not done. 

the pf is setting the hw registers responsible for adminq length among other things which causes the iavf flow to throw errors when the iavf is unable to setup adminq's. 

this issue looks something like this

you can see the iavf reset flow on the 28 core 

28) ! 550.242 us  |      iavf_free_rx_resources [iavf]();
 34)               |  /* WE TAKE RTNL_LOCK HERE */
 34)   1.385 us    |  i40e_get_phys_port_id [i40e]();
 34)               |  i40e_get_netdev_stats_struct [i40e]() {
 34)   0.410 us    |    __rcu_read_lock();
 34)   0.346 us    |    __rcu_read_unlock();
 34)   3.821 us    |  }
 34)               |  i40e_ndo_get_vf_config [i40e]() {
 34)               |    i40e_validate_vf [i40e]() {
 34)   0.738 us    |      i40e_find_vsi_from_id [i40e]();
 34)   1.654 us    |    }
 34)   2.755 us    |  }
 34)               |  i40e_get_vf_stats [i40e]() {
 34)               |    i40e_validate_vf [i40e]() {
 34)   0.546 us    |      i40e_find_vsi_from_id [i40e]();
 34)   1.194 us    |    }
 34)               |    i40e_update_eth_stats [i40e]() {
 34)   1.456 us    |      i40e_stat_update48 [i40e]();
 34)   1.238 us    |      i40e_stat_update48 [i40e]();
 34)   1.231 us    |      i40e_stat_update48 [i40e]();
 34)   1.157 us    |      i40e_stat_update48 [i40e]();
 34)   1.375 us    |      i40e_stat_update48 [i40e]();
 34)   1.311 us    |      i40e_stat_update48 [i40e]();
 34)   1.167 us    |      i40e_stat_update48 [i40e]();
 34)   1.139 us    |      i40e_stat_update48 [i40e]();
 34) + 19.299 us   |    }
 34) + 22.060 us   |  }
 34)               |  /* WE UNLOCK RTNL_UNLOCK HERE */
 34)               |  /* WE TAKE RTNL_LOCK HERE */
 34)               |  i40e_ndo_set_vf_trust [i40e]() {
 34)               |    i40e_vc_reset_vf [i40e]() {
 34)               |      i40e_vc_notify_vf_reset [i40e]() {
 34)               |        /* i40e_vc_notify_vf_reset PASS 
test  &vf->vf_states == 1  */
 34) ! 104.964 us  |      }
 34)               |      /* we are i40e_vc_reset_vf with state== 0 */
 34)               |      i40e_reset_vf [i40e]() {
 34)               |        /* i40e_reset_vf -- test_and_set_bit(I40E_VF_STATE_RESETTING, &vf->vf_states)  == false */
 34)               |        /* i40e_trigger_vf_reset unsets I40E_VF_STATE_ACTIVE flag */
 ------------------------------------------
 33) kworker-22304  =>    <idle>-0   
 ------------------------------------------

 33)               |  i40e_intr [i40e]() {
 33)               |    i40e_service_event_schedule [i40e]() {
 33) + 14.017 us   |      queue_work_on();
 33) + 15.544 us   |    }
 33) + 22.424 us   |  }
 ------------------------------------------
 33)    <idle>-0    => kworker-22304 
 ------------------------------------------

 33)               |  i40e_service_task [i40e]() {
 33)   5.570 us    |    i40e_detect_recover_hung [i40e]();
 33)               |    i40e_sync_filters_subtask [i40e]() {
 33) + 11.219 us   |      i40e_sync_vsi_filters [i40e]();
 33) + 13.328 us   |    }
 33)   0.616 us    |    i40e_reset_subtask [i40e]();
 33)               |    i40e_vc_process_vflr_event [i40e]() {
 33)               |      /* i40e_vc_process_vflr_event -> writes into hw reg 0x64f90000 */
 33)   4.683 us    |    }
 33)   1.522 us    |    i40e_fdir_check_and_reenable [i40e]();
 33)   0.516 us    |    i40e_client_subtask [i40e]();
 33)   0.511 us    |    i40e_sync_filters_subtask [i40e]();
 33)               |    kmalloc_trace() {
 33)   1.564 us    |      __kmem_cache_alloc_node();
 28) ! 750.262 us  |      iavf_free_rx_resources [iavf]();
 33)   2.913 us    |    }
 33)               |    i40e_clean_arq_element [i40e]() {
 33)   0.644 us    |      mutex_lock();
 33)   0.353 us    |      mutex_unlock();
 33)   2.914 us    |    }
 33)               |    kfree() {
 33)   0.571 us    |      __kmem_cache_free();
 33)   3.267 us    |    }
 33) + 45.828 us   |  }
 28) ! 628.357 us  |      iavf_free_rx_resources [iavf]();
 28) ! 618.235 us  |      iavf_free_rx_resources [iavf]();
 28) # 2549.539 us |    }
 28)               |    iavf_free_all_tx_resources.part.49 [iavf]() {
 28) + 60.257 us   |      iavf_free_tx_resources [iavf]();
 28) + 59.489 us   |      iavf_free_tx_resources [iavf]();
 28) + 59.960 us   |      iavf_free_tx_resources [iavf]();
 28) + 60.755 us   |      iavf_free_tx_resources [iavf]();
 28) ! 242.568 us  |    }
 28)               |    iavf_shutdown_adminq [iavf]() {
 28)   1.450 us    |      iavf_check_asq_alive [iavf]();
 28)   2.275 us    |      iavf_aq_queue_shutdown [iavf]();
 28) ! 137.843 us  |      iavf_shutdown_asq [iavf]();
 28)   0.597 us    |      mutex_lock();
 28)   4.232 us    |      iavf_free_dma_mem_d [iavf]();
 28)   2.562 us    |      iavf_free_dma_mem_d [iavf]();
 28)   2.553 us    |      iavf_free_dma_mem_d [iavf]();
 28)   2.521 us    |      iavf_free_dma_mem_d [iavf]();
 28)   2.498 us    |      iavf_free_dma_mem_d [iavf]();
 28)   2.514 us    |      iavf_free_dma_mem_d [iavf]();
 28)   2.602 us    |      iavf_free_dma_mem_d [iavf]();
 28)   2.455 us    |      iavf_free_dma_mem_d [iavf]();
 28)   2.564 us    |      iavf_free_dma_mem_d [iavf]();
 28)   2.543 us    |      iavf_free_dma_mem_d [iavf]();
 28)   2.574 us    |      iavf_free_dma_mem_d [iavf]();
 28)   2.701 us    |      iavf_free_dma_mem_d [iavf]();
 28)   2.998 us    |      iavf_free_dma_mem_d [iavf]();
 28)   2.482 us    |      iavf_free_dma_mem_d [iavf]();
 28)   2.542 us    |      iavf_free_dma_mem_d [iavf]();
 28)   2.522 us    |      iavf_free_dma_mem_d [iavf]();
 28)   2.472 us    |      iavf_free_dma_mem_d [iavf]();
 28)   2.580 us    |      iavf_free_dma_mem_d [iavf]();
 28)   3.143 us    |      iavf_free_dma_mem_d [iavf]();
 28)   3.015 us    |      iavf_free_dma_mem_d [iavf]();
 28)   3.072 us    |      iavf_free_dma_mem_d [iavf]();
 28)   3.104 us    |      iavf_free_dma_mem_d [iavf]();
 28)   3.033 us    |      iavf_free_dma_mem_d [iavf]();
 28)   2.458 us    |      iavf_free_dma_mem_d [iavf]();
 28)   2.486 us    |      iavf_free_dma_mem_d [iavf]();
 28)   2.518 us    |      iavf_free_dma_mem_d [iavf]();
 28)   2.482 us    |      iavf_free_dma_mem_d [iavf]();
 28)   2.564 us    |      iavf_free_dma_mem_d [iavf]();
 28)   2.906 us    |      iavf_free_dma_mem_d [iavf]();
 28)   2.843 us    |      iavf_free_dma_mem_d [iavf]();
 28)   2.828 us    |      iavf_free_dma_mem_d [iavf]();
 28)   2.911 us    |      iavf_free_dma_mem_d [iavf]();
 28)   2.879 us    |      iavf_free_dma_mem_d [iavf]();
 28)   0.637 us    |      iavf_free_virt_mem_d [iavf]();
 28)   0.393 us    |      mutex_unlock();
 28) ! 250.148 us  |    }
 28)               |    iavf_init_adminq [iavf]() {
 28)               |      /* iavf_init_adminq(hw->hw->aq.asq.bal == 0xdeadbeef */
 28) * 30059.69 us |      usleep_range_state();
 34)               |        /* i40e_reset_vf we read the VFRSTAT and got 0x1  */
 28)               |      /* iavf_init_asq(hw->hw->aq.asq.bal == 0xdeadbeef */
 28)   6.043 us    |      iavf_allocate_dma_mem_d [iavf]();
 28) + 37.481 us   |      iavf_allocate_virt_mem_d [iavf]();
 28)               |      /* allocate the ring memory(hw->hw->aq.asq.bal == 0xdeadbeef */
 28) + 11.787 us   |      _printk();
 28)               |      /* iavf_init_asq_1(hw->hw->aq.asq.bal == 0xdeadbeef */
 28)   1.243 us    |      iavf_allocate_virt_mem_d [iavf]();

thread 28 - is doing the iavf restet
while the 33 thread is cutting in and starts another pf triggerd vf reset 

As you can see this means that the pf isn't able to recognize when the vf reset is done it thinks that the vf reset already finished initialization. 
Im using existing messaging architecture to address that.

the whole thought process behind addressing this could be seen inside the attached pdf.
The flows are quite complicated and there are 2 major challenges here.

Resets induced by the iavf as well as i40e driver.

As I understand the wording of the patch is confusing and should be corrected ?

Thanks for your input "

Thanks,
Kamil

Download attachment "reset_pf_reset_change_and_bug.pdf" of type "application/pdf" (166749 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ