[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <SJ0PR11MB58666BD324254B3B39024A22E53D2@SJ0PR11MB5866.namprd11.prod.outlook.com>
Date: Wed, 3 Apr 2024 09:27:17 +0000
From: "Loktionov, Aleksandr" <aleksandr.loktionov@...el.com>
To: Marcin Szycik <marcin.szycik@...ux.intel.com>,
"intel-wired-lan@...ts.osuosl.org" <intel-wired-lan@...ts.osuosl.org>
CC: "Drewek, Wojciech" <wojciech.drewek@...el.com>, "Wang, Liang-min"
<liang-min.wang@...el.com>, "netdev@...r.kernel.org"
<netdev@...r.kernel.org>, "Chmielewski, Pawel" <pawel.chmielewski@...el.com>,
"Nguyen, Anthony L" <anthony.l.nguyen@...el.com>, "horms@...nel.org"
<horms@...nel.org>, "Kitszel, Przemyslaw" <przemyslaw.kitszel@...el.com>
Subject: RE: [Intel-wired-lan] [PATCH iwl-next v4] ice: Reset VF on Tx MDD
event
> -----Original Message-----
> From: Marcin Szycik <marcin.szycik@...ux.intel.com>
> Sent: Wednesday, April 3, 2024 11:21 AM
> To: Loktionov, Aleksandr <aleksandr.loktionov@...el.com>; intel-
> wired-lan@...ts.osuosl.org
> Cc: Drewek, Wojciech <wojciech.drewek@...el.com>; Wang, Liang-min
> <liang-min.wang@...el.com>; netdev@...r.kernel.org; Chmielewski,
> Pawel <pawel.chmielewski@...el.com>; Nguyen, Anthony L
> <anthony.l.nguyen@...el.com>; horms@...nel.org; Kitszel, Przemyslaw
> <przemyslaw.kitszel@...el.com>
> Subject: Re: [Intel-wired-lan] [PATCH iwl-next v4] ice: Reset VF on
> Tx MDD event
>
>
>
> On 03.04.2024 08:37, Loktionov, Aleksandr wrote:
> >
> >
> >> -----Original Message-----
> >> From: Intel-wired-lan <intel-wired-lan-bounces@...osl.org> On
> Behalf
> >> Of Marcin Szycik
> >> Sent: Tuesday, April 2, 2024 6:52 PM
> >> To: intel-wired-lan@...ts.osuosl.org
> >> Cc: Drewek, Wojciech <wojciech.drewek@...el.com>; Wang, Liang-
> min
> >> <liang-min.wang@...el.com>; netdev@...r.kernel.org; Chmielewski,
> >> Pawel <pawel.chmielewski@...el.com>; Marcin Szycik
> >> <marcin.szycik@...ux.intel.com>; Nguyen, Anthony L
> >> <anthony.l.nguyen@...el.com>; horms@...nel.org; Kitszel,
> Przemyslaw
> >> <przemyslaw.kitszel@...el.com>
> >> Subject: [Intel-wired-lan] [PATCH iwl-next v4] ice: Reset VF on
> Tx
> >> MDD event
> >>
> > Please state in the title explicitly the purpose of the patch:
> > Do you fix a bug? say fix
> > Do you add functionality? say add
> > Do you refactor? say refactor
>
> IMO it's neither of those.
>
> It's not a bugfix because the driver itself doesn't do anything
> wrong; the patch provides the user with a kind of optional
> workaround for an userspace app that's behaving incorrectly.
>
> It's not really new functionality, since the mdd-auto-reset-vf flag
> was already implemented and auto reset was already happening for Rx
> events. I'm just extending this to Tx as well.
For me it looks like adding new feature, but I'm not sure yet.
>
> And it's not strictly a refactor because I'm changing behaviour.
Please describe explicitly what behavior do you change, and the motivation, especially if it's not a bugfix.
> Please also note that I've picked up a patch not originally made by
> me [2], and kept the original title and most of the description.
> I'm open to suggestions to title though.
>
> >> In cases when VF sends malformed packets that are classified as
> >> malicious, sometimes it causes Tx queue to freeze. This frozen
> queue
> >> can be stuck for several minutes being unusable. This behavior
> can be
> >> reproduced with a faulty userspace app running on VF.
> >>
> >> When any Malicious Driver Detection event occurs and the mdd-
> auto-
> >> reset-vf private flag is set, perform a graceful VF reset to
> quickly
> >> bring VF back to operational state. Add a log message to notify
> about
> >> the cause of the reset. Add a helper for this to be reused for
> both
> >> TX and RX events.
> > Please describe your changes explicitly:
> > Do you just add a helper function without adding a new
> functionality?
> > Do you add functionality?
> > Do you fix?
> > Please explain what are the changes in driver behavior and the
> changes in the driver sources you make.
> > Thank you
>
> I'll try to make the description more explicit.
Thank you
>
> Thanks,
> Marcin
>
> >> Reviewed-by: Wojciech Drewek <wojciech.drewek@...el.com>
> >> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@...el.com>
> >> Co-developed-by: Liang-Min Wang <liang-min.wang@...el.com>
> >> Signed-off-by: Liang-Min Wang <liang-min.wang@...el.com>
> >> Signed-off-by: Marcin Szycik <marcin.szycik@...ux.intel.com>
> >> ---
> >> v4: Only perform auto-reset once per VF
> >> v3 [1]: Only auto reset VF if the mdd-auto-reset-vf flag is set
> >> v2 [2]: Revert an unneeded formatting change, fix commit
> message, fix
> >> a log
> >> message with a correct event name
> >>
> >> [1] https://lore.kernel.org/intel-wired-
> lan/20240326164455.735739-
> >> 1-marcin.szycik@...ux.intel.com
> >> [2] https://lore.kernel.org/netdev/20231102155149.2574209-1-
> >> pawel.chmielewski@...el.com
> >> ---
> >> drivers/net/ethernet/intel/ice/ice_main.c | 57
> +++++++++++++++++-
> >> ---- drivers/net/ethernet/intel/ice/ice_sriov.c | 25 +++++++---
> >> drivers/net/ethernet/intel/ice/ice_sriov.h | 2 +
> >> 3 files changed, 67 insertions(+), 17 deletions(-)
> >>
> >> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c
> >> b/drivers/net/ethernet/intel/ice/ice_main.c
> >> index 185c9b13efcf..80bc83f6e1ab 100644
> >> --- a/drivers/net/ethernet/intel/ice/ice_main.c
> >> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> >> @@ -1745,6 +1745,39 @@ static void ice_service_timer(struct
> >> timer_list *t)
> >> ice_service_task_schedule(pf);
> >> }
> >>
> >> +/**
> >> + * ice_mdd_maybe_reset_vf - reset VF after MDD event
> >> + * @pf: pointer to the PF structure
> >> + * @vf: pointer to the VF structure
> >> + * @reset_vf_tx: whether Tx MDD has occurred
> >> + * @reset_vf_rx: whether Rx MDD has occurred
> >> + *
> >> + * Since the queue can get stuck on VF MDD events, the PF can
> be
> >> +configured to
> >> + * automatically reset the VF by enabling the private ethtool
> flag
> >> + * mdd-auto-reset-vf.
> >> + */
> >> +static void ice_mdd_maybe_reset_vf(struct ice_pf *pf, struct
> >> ice_vf *vf,
> >> + bool reset_vf_tx, bool reset_vf_rx) {
> >> + struct device *dev = ice_pf_to_dev(pf);
> >> +
> >> + if (!test_bit(ICE_FLAG_MDD_AUTO_RESET_VF, pf->flags))
> >> + return;
> >> +
> >> + /* VF MDD event counters will be cleared by reset, so print
> >> the event
> >> + * prior to reset.
> >> + */
> >> + if (reset_vf_tx)
> >> + ice_print_vf_tx_mdd_event(vf);
> >> +
> >> + if (reset_vf_rx)
> >> + ice_print_vf_rx_mdd_event(vf);
> >> +
> >> + dev_info(dev, "PF-to-VF reset on PF %d VF %d due to MDD
> >> event\n",
> >> + pf->hw.pf_id, vf->vf_id);
> >> + ice_reset_vf(vf, ICE_VF_RESET_NOTIFY | ICE_VF_RESET_LOCK); }
> >> +
> >> /**
> >> * ice_handle_mdd_event - handle malicious driver detect event
> >> * @pf: pointer to the PF structure
> >> @@ -1838,6 +1871,8 @@ static void ice_handle_mdd_event(struct
> ice_pf
> >> *pf)
> >> */
> >> mutex_lock(&pf->vfs.table_lock);
> >> ice_for_each_vf(pf, bkt, vf) {
> >> + bool reset_vf_tx = false, reset_vf_rx = false;
> >> +
> >> reg = rd32(hw, VP_MDET_TX_PQM(vf->vf_id));
> >> if (reg & VP_MDET_TX_PQM_VALID_M) {
> >> wr32(hw, VP_MDET_TX_PQM(vf->vf_id), 0xFFFF); @@ -
> >> 1846,6 +1881,8 @@ static void ice_handle_mdd_event(struct ice_pf
> >> *pf)
> >> if (netif_msg_tx_err(pf))
> >> dev_info(dev, "Malicious Driver Detection
> event TX_PQM detected
> >> on VF %d\n",
> >> vf->vf_id);
> >> +
> >> + reset_vf_tx = true;
> >> }
> >>
> >> reg = rd32(hw, VP_MDET_TX_TCLAN(vf->vf_id)); @@ -1856,6
> >> +1893,8 @@ static void ice_handle_mdd_event(struct ice_pf *pf)
> >> if (netif_msg_tx_err(pf))
> >> dev_info(dev, "Malicious Driver Detection
> event TX_TCLAN
> >> detected on VF %d\n",
> >> vf->vf_id);
> >> +
> >> + reset_vf_tx = true;
> >> }
> >>
> >> reg = rd32(hw, VP_MDET_TX_TDPU(vf->vf_id)); @@ -1866,6
> >> +1905,8 @@ static void ice_handle_mdd_event(struct ice_pf *pf)
> >> if (netif_msg_tx_err(pf))
> >> dev_info(dev, "Malicious Driver Detection
> event TX_TDPU detected
> >> on VF %d\n",
> >> vf->vf_id);
> >> +
> >> + reset_vf_tx = true;
> >> }
> >>
> >> reg = rd32(hw, VP_MDET_RX(vf->vf_id)); @@ -1877,18
> >> +1918,12 @@ static void ice_handle_mdd_event(struct ice_pf *pf)
> >> dev_info(dev, "Malicious Driver Detection
> event RX detected on
> >> VF %d\n",
> >> vf->vf_id);
> >>
> >> - /* Since the queue is disabled on VF Rx MDD
> >> events, the
> >> - * PF can be configured to reset the VF through
> >> ethtool
> >> - * private flag mdd-auto-reset-vf.
> >> - */
> >> - if (test_bit(ICE_FLAG_MDD_AUTO_RESET_VF, pf-
> >>> flags)) {
> >> - /* VF MDD event counters will be cleared by
> >> - * reset, so print the event prior to
> >> reset.
> >> - */
> >> - ice_print_vf_rx_mdd_event(vf);
> >> - ice_reset_vf(vf, ICE_VF_RESET_LOCK);
> >> - }
> >> + reset_vf_rx = true;
> >> }
> >> +
> >> + if (reset_vf_tx || reset_vf_rx)
> >> + ice_mdd_maybe_reset_vf(pf, vf, reset_vf_tx,
> >> + reset_vf_rx);
> >> }
> >> mutex_unlock(&pf->vfs.table_lock);
> >>
> >> diff --git a/drivers/net/ethernet/intel/ice/ice_sriov.c
> >> b/drivers/net/ethernet/intel/ice/ice_sriov.c
> >> index fb2e96db647e..a60dacf8942a 100644
> >> --- a/drivers/net/ethernet/intel/ice/ice_sriov.c
> >> +++ b/drivers/net/ethernet/intel/ice/ice_sriov.c
> >> @@ -1861,6 +1861,24 @@ void ice_print_vf_rx_mdd_event(struct
> ice_vf
> >> *vf)
> >> ? "on" : "off");
> >> }
> >>
> >> +/**
> >> + * ice_print_vf_tx_mdd_event - print VF Tx malicious driver
> detect
> >> +event
> >> + * @vf: pointer to the VF structure
> >> + */
> >> +void ice_print_vf_tx_mdd_event(struct ice_vf *vf) {
> >> + struct ice_pf *pf = vf->pf;
> >> + struct device *dev;
> >> +
> >> + dev = ice_pf_to_dev(pf);
> >> +
> >> + dev_info(dev, "%d Tx Malicious Driver Detection events
> >> detected on PF %d VF %d MAC %pM. mdd-auto-reset-vfs=%s\n",
> >> + vf->mdd_tx_events.count, pf->hw.pf_id, vf->vf_id,
> >> + vf->dev_lan_addr,
> >> + test_bit(ICE_FLAG_MDD_AUTO_RESET_VF, pf->flags)
> >> + ? "on" : "off");
> >> +}
> >> +
> >> /**
> >> * ice_print_vfs_mdd_events - print VFs malicious driver detect
> >> event
> >> * @pf: pointer to the PF structure
> >> @@ -1869,8 +1887,6 @@ void ice_print_vf_rx_mdd_event(struct
> ice_vf
> >> *vf)
> >> */
> >> void ice_print_vfs_mdd_events(struct ice_pf *pf) {
> >> - struct device *dev = ice_pf_to_dev(pf);
> >> - struct ice_hw *hw = &pf->hw;
> >> struct ice_vf *vf;
> >> unsigned int bkt;
> >>
> >> @@ -1897,10 +1913,7 @@ void ice_print_vfs_mdd_events(struct
> ice_pf
> >> *pf)
> >> if (vf->mdd_tx_events.count != vf-
> >>> mdd_tx_events.last_printed) {
> >> vf->mdd_tx_events.last_printed =
> >> vf->mdd_tx_events.count;
> >> -
> >> - dev_info(dev, "%d Tx Malicious Driver Detection
> >> events detected on PF %d VF %d MAC %pM.\n",
> >> - vf->mdd_tx_events.count, hw->pf_id, vf-
> >>> vf_id,
> >> - vf->dev_lan_addr);
> >> + ice_print_vf_tx_mdd_event(vf);
> >> }
> >> }
> >> mutex_unlock(&pf->vfs.table_lock);
> >> diff --git a/drivers/net/ethernet/intel/ice/ice_sriov.h
> >> b/drivers/net/ethernet/intel/ice/ice_sriov.h
> >> index 4ba8fb53aea1..8f22313474d6 100644
> >> --- a/drivers/net/ethernet/intel/ice/ice_sriov.h
> >> +++ b/drivers/net/ethernet/intel/ice/ice_sriov.h
> >> @@ -58,6 +58,7 @@ void
> >> ice_vf_lan_overflow_event(struct ice_pf *pf, struct
> >> ice_rq_event_info *event); void ice_print_vfs_mdd_events(struct
> >> ice_pf *pf); void ice_print_vf_rx_mdd_event(struct ice_vf *vf);
> >> +void ice_print_vf_tx_mdd_event(struct ice_vf *vf);
> >> bool
> >> ice_vc_validate_pattern(struct ice_vf *vf, struct
> >> virtchnl_proto_hdrs *proto);
> >> u32 ice_sriov_get_vf_total_msix(struct pci_dev *pdev); @@ -69,6
> >> +70,7 @@ static inline void ice_vf_lan_overflow_event(struct
> >> ice_pf *pf, struct ice_rq_event_info *event) { } static inline
> void
> >> ice_print_vfs_mdd_events(struct ice_pf *pf) { } static inline
> void
> >> ice_print_vf_rx_mdd_event(struct ice_vf *vf) { }
> >> +static inline void ice_print_vf_tx_mdd_event(struct ice_vf *vf)
> {
> >> }
> >> static inline void ice_restore_all_vfs_msi_state(struct ice_pf
> >> *pf) { }
> >>
> >> static inline int
> >> --
> >> 2.41.0
> >
Powered by blists - more mailing lists