[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <SJ0PR11MB586649F8417F261993A547CFE5242@SJ0PR11MB5866.namprd11.prod.outlook.com>
Date: Mon, 11 Mar 2024 14:30:02 +0000
From: "Loktionov, Aleksandr" <aleksandr.loktionov@...el.com>
To: Paul Menzel <pmenzel@...gen.mpg.de>
CC: "intel-wired-lan@...ts.osuosl.org" <intel-wired-lan@...ts.osuosl.org>,
"Nguyen, Anthony L" <anthony.l.nguyen@...el.com>, "netdev@...r.kernel.org"
<netdev@...r.kernel.org>
Subject: RE: [Intel-wired-lan] [PATCH iwl-net] i40e: fix vf may be used
uninitialized in this function warning
> -----Original Message-----
> From: Paul Menzel <pmenzel@...gen.mpg.de>
> Sent: Monday, March 11, 2024 3:15 PM
> To: Loktionov, Aleksandr <aleksandr.loktionov@...el.com>
> Cc: intel-wired-lan@...ts.osuosl.org; Nguyen, Anthony L
> <anthony.l.nguyen@...el.com>; netdev@...r.kernel.org
> Subject: Re: [Intel-wired-lan] [PATCH iwl-net] i40e: fix vf may be
> used uninitialized in this function warning
>
> Dear Aleksandr,
>
>
> Thank you for the patch.
>
>
> Am 11.03.24 um 12:25 schrieb Aleksandr Loktionov:
> > To fix the regression introduced by 52424f974bc5 commit, wchich
> causes
>
> 1. by commit 52424f974bc5
> 2. s/wchich/which/
>
> > servers hang in very hard to reproduce conditions with resets
> races.
>
> Is there a public report for this?
No public reports, sorry.
> > Remove redundant "v" variable and iterate via single VF pointer
> across
> > whole function instead to guarantee VF pointer validity.
>
> Could you please elaborate how the VF pointer currently gets
> invalid?
>
>
> Kind regards,
>
> Paul
>
Using two sources for the information is the root cause.
In this function before the fix bumping v didn't mean bumping vf pointer. But the code used this variables interchangeably, so staled vf could point to different/not intended vf.
>
> > Fixes: 52424f974bc5 ("i40e: Fix VF hang when reset is triggered
> on
> > another VF")
> > Signed-off-by: Aleksandr Loktionov
> <aleksandr.loktionov@...el.com>
> > ---
> > .../ethernet/intel/i40e/i40e_virtchnl_pf.c | 34 +++++++++---
> -------
> > 1 file changed, 16 insertions(+), 18 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
> > b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
> > index b34c717..f7c4654 100644
> > --- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
> > +++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
> > @@ -1628,105 +1628,103 @@ bool i40e_reset_all_vfs(struct i40e_pf
> *pf, bool flr)
> > {
> > struct i40e_hw *hw = &pf->hw;
> > struct i40e_vf *vf;
> > - int i, v;
> > u32 reg;
> > + int i;
> >
> > /* If we don't have any VFs, then there is nothing to reset
> */
> > if (!pf->num_alloc_vfs)
> > return false;
> >
> > /* If VFs have been disabled, there is no need to reset */
> > if (test_and_set_bit(__I40E_VF_DISABLE, pf->state))
> > return false;
> >
> > /* Begin reset on all VFs at once */
> > - for (v = 0; v < pf->num_alloc_vfs; v++) {
> > - vf = &pf->vf[v];
> > + for (vf = &pf->vf[0]; vf < &pf->vf[pf->num_alloc_vfs]; ++vf)
> {
>
> Shouldn’t pointer arithmetic be avoided?
>
> > /* If VF is being reset no need to trigger reset again
> */
> > if (!test_bit(I40E_VF_STATE_RESETTING, &vf->vf_states))
> > - i40e_trigger_vf_reset(&pf->vf[v], flr);
> > + i40e_trigger_vf_reset(vf, flr);
> > }
> >
> > /* HW requires some time to make sure it can flush the FIFO
> for a VF
> > * when it resets it. Poll the VPGEN_VFRSTAT register for
> each VF in
> > * sequence to make sure that it has completed. We'll keep
> track of
> > * the VFs using a simple iterator that increments once that
> VF has
> > * finished resetting.
> > */
> > - for (i = 0, v = 0; i < 10 && v < pf->num_alloc_vfs; i++) {
> > + for (i = 0, vf = &pf->vf[0]; i < 10 && vf <
> > +&pf->vf[pf->num_alloc_vfs]; ++i) {
> > usleep_range(10000, 20000);
> >
> > /* Check each VF in sequence, beginning with the VF to
> fail
> > * the previous check.
> > */
> > - while (v < pf->num_alloc_vfs) {
> > - vf = &pf->vf[v];
> > + while (vf < &pf->vf[pf->num_alloc_vfs]) {
> > if (!test_bit(I40E_VF_STATE_RESETTING, &vf-
> >vf_states)) {
> > reg = rd32(hw, I40E_VPGEN_VFRSTAT(vf-
> >vf_id));
> > if (!(reg & I40E_VPGEN_VFRSTAT_VFRD_MASK))
> > break;
> > }
> >
> > /* If the current VF has finished resetting, move
> on
> > * to the next VF in sequence.
> > */
> > - v++;
> > + ++vf;
> > }
> > }
> >
> > if (flr)
> > usleep_range(10000, 20000);
> >
> > /* Display a warning if at least one VF didn't manage to
> reset in
> > * time, but continue on with the operation.
> > */
> > - if (v < pf->num_alloc_vfs)
> > + if (vf < &pf->vf[pf->num_alloc_vfs])
> > dev_err(&pf->pdev->dev, "VF reset check timeout on VF
> %d\n",
> > - pf->vf[v].vf_id);
> > + vf->vf_id);
> > usleep_range(10000, 20000);
> >
> > /* Begin disabling all the rings associated with VFs, but do
> not wait
> > * between each VF.
> > */
> > - for (v = 0; v < pf->num_alloc_vfs; v++) {
> > + for (vf = &pf->vf[0]; vf < &pf->vf[pf->num_alloc_vfs]; ++vf)
> {
> > /* On initial reset, we don't have any queues to
> disable */
> > - if (pf->vf[v].lan_vsi_idx == 0)
> > + if (vf->lan_vsi_idx == 0)
> > continue;
> >
> > /* If VF is reset in another thread just continue */
> > if (test_bit(I40E_VF_STATE_RESETTING, &vf->vf_states))
> > continue;
> >
> > - i40e_vsi_stop_rings_no_wait(pf->vsi[pf-
> >vf[v].lan_vsi_idx]);
> > + i40e_vsi_stop_rings_no_wait(pf->vsi[vf->lan_vsi_idx]);
> > }
> >
> > /* Now that we've notified HW to disable all of the VF rings,
> wait
> > * until they finish.
> > */
> > - for (v = 0; v < pf->num_alloc_vfs; v++) {
> > + for (vf = &pf->vf[0]; vf < &pf->vf[pf->num_alloc_vfs]; ++vf)
> {
> > /* On initial reset, we don't have any queues to
> disable */
> > - if (pf->vf[v].lan_vsi_idx == 0)
> > + if (vf->lan_vsi_idx == 0)
> > continue;
> >
> > /* If VF is reset in another thread just continue */
> > if (test_bit(I40E_VF_STATE_RESETTING, &vf->vf_states))
> > continue;
> >
> > - i40e_vsi_wait_queues_disabled(pf->vsi[pf-
> >vf[v].lan_vsi_idx]);
> > + i40e_vsi_wait_queues_disabled(pf->vsi[vf-
> >lan_vsi_idx]);
> > }
> >
> > /* Hw may need up to 50ms to finish disabling the RX queues.
> We
> > * minimize the wait by delaying only once for all VFs.
> > */
> > mdelay(50);
> >
> > /* Finish the reset on each VF */
> > - for (v = 0; v < pf->num_alloc_vfs; v++) {
> > + for (vf = &pf->vf[0]; vf < &pf->vf[pf->num_alloc_vfs]; ++vf)
> {
> > /* If VF is reset in another thread just continue */
> > if (test_bit(I40E_VF_STATE_RESETTING, &vf->vf_states))
> > continue;
> >
> > - i40e_cleanup_reset_vf(&pf->vf[v]);
> > + i40e_cleanup_reset_vf(vf);
> > }
> >
> > i40e_flush(hw);
Powered by blists - more mailing lists