Message-ID: <20240729200716.681496-1-anthony.l.nguyen@intel.com>
Date: Mon, 29 Jul 2024 13:07:06 -0700
From: Tony Nguyen <anthony.l.nguyen@...el.com>
To: davem@...emloft.net,
	kuba@...nel.org,
	pabeni@...hat.com,
	edumazet@...gle.com,
	netdev@...r.kernel.org
Cc: Tony Nguyen <anthony.l.nguyen@...el.com>,
	maciej.fijalkowski@...el.com,
	magnus.karlsson@...el.com,
	aleksander.lobakin@...el.com,
	ast@...nel.org,
	daniel@...earbox.net,
	hawk@...nel.org,
	john.fastabend@...il.com,
	bpf@...r.kernel.org
Subject: [PATCH net v2 0/8][pull request] ice: fix AF_XDP ZC timeout and concurrency issues

Maciej Fijalkowski says:

Changes included in this patchset address an issue that a customer has
been facing when AF_XDP ZC Tx sockets were used in combination with flow
control and regular Tx traffic.

After executing:
ethtool --set-priv-flags $dev link-down-on-close on
ethtool -A $dev rx on tx on

launching multiple ZC Tx sockets on $dev, pinging a remote interface (so
that regular Tx traffic is present), and then taking $dev down and back
up, a Tx timeout occurred and most of the time the ice driver was unable
to recover from that state.

Combined, these patches solve the issue described above on the customer
side. The main focus is to forbid producing Tx descriptors when either
the carrier is not yet initialized or the process of bringing the
interface down has already started.
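
To illustrate the readiness rule, here is a minimal userspace sketch. It
is not the driver code: struct model_netdev, struct model_tx_ring and the
model_* functions are hypothetical stand-ins, with the two booleans
playing the role of netif_running() and netif_carrier_ok(). The only
point shown is that the Tx path produces no descriptors unless both
conditions hold.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the netdev state the check depends on. */
struct model_netdev {
	bool running;    /* models netif_running(): __LINK_STATE_START is set */
	bool carrier_ok; /* models netif_carrier_ok(): carrier has been set up */
};

/* Hypothetical stand-in for a ZC Tx ring. */
struct model_tx_ring {
	const struct model_netdev *netdev;
	size_t next_to_use; /* total number of descriptors produced so far */
};

/* The interface is ready only when it is running AND the carrier is on. */
static bool model_netif_ready(const struct model_netdev *dev)
{
	return dev->running && dev->carrier_ok;
}

/*
 * Produce up to 'budget' Tx descriptors and return how many were produced.
 * When the interface is not ready, nothing is produced at all.
 */
static size_t model_xmit_zc(struct model_tx_ring *ring, size_t budget)
{
	if (!model_netif_ready(ring->netdev))
		return 0;

	ring->next_to_use += budget;
	return budget;
}
```

In this model, clearing either flag (carrier not yet up, or interface
teardown already started) is enough to stop descriptor production.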

v2:
* in patch 6, use a single READ_ONCE against xsk_pool within napi [Jakub]

v1: https://lore.kernel.org/netdev/20240708221416.625850-1-anthony.l.nguyen@intel.com/
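
For readers unfamiliar with the v2 change, a rough userspace sketch of
the "single read" pattern follows. It is an illustration under
assumptions, not the patch itself: the model_* names are hypothetical,
and a relaxed C11 atomic load is used as a stand-in for the kernel's
READ_ONCE(). The idea is that the poll routine takes one snapshot of
xsk_pool and uses that local copy for every later decision, so a
concurrent update cannot make two checks within one poll disagree.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for an XSK buffer pool. */
struct model_xsk_pool {
	int id;
};

/* Hypothetical stand-in for a ring whose pool pointer may change concurrently. */
struct model_rx_ring {
	_Atomic(struct model_xsk_pool *) xsk_pool;
};

/* Zero-copy path: only ever called with a non-NULL pool snapshot. */
static int model_clean_rx_zc(const struct model_xsk_pool *pool)
{
	return pool->id;
}

/* Regular (non-ZC) path. */
static int model_clean_rx(void)
{
	return -1;
}

static int model_napi_poll(struct model_rx_ring *ring)
{
	/* Read the shared pointer exactly once; use only the local copy below. */
	struct model_xsk_pool *pool =
		atomic_load_explicit(&ring->xsk_pool, memory_order_relaxed);

	if (pool)
		return model_clean_rx_zc(pool);

	return model_clean_rx();
}
```

Re-reading ring->xsk_pool after the NULL check instead of using the
snapshot would reopen the window in which the pointer changes between
the check and the use.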
---
Olek,
we decided not to check IFF_UP as you initially suggested. The reason is
that when the link goes down, netif_running() has a broader scope than
IFF_UP being set, as the former (the __LINK_STATE_START bit) is cleared
earlier in the core.

The following are changes since commit 039564d2fd37b122ec0d268e2ee6334e7169e225:
  Merge branch 'mptcp-endpoint-readd-fixes' into main
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue 100GbE

Maciej Fijalkowski (7):
  ice: don't busy wait for Rx queue disable in ice_qp_dis()
  ice: replace synchronize_rcu with synchronize_net
  ice: modify error handling when setting XSK pool in ndo_bpf
  ice: toggle netif_carrier when setting up XSK pool
  ice: improve updating ice_{t,r}x_ring::xsk_pool
  ice: add missing WRITE_ONCE when clearing ice_rx_ring::xdp_prog
  ice: xsk: fix txq interrupt mapping

Michal Kubiak (1):
  ice: respect netif readiness in AF_XDP ZC related ndo's

 drivers/net/ethernet/intel/ice/ice.h      |  11 +-
 drivers/net/ethernet/intel/ice/ice_base.c |   4 +-
 drivers/net/ethernet/intel/ice/ice_main.c |   2 +-
 drivers/net/ethernet/intel/ice/ice_txrx.c |  10 +-
 drivers/net/ethernet/intel/ice/ice_xsk.c  | 184 +++++++++++++---------
 drivers/net/ethernet/intel/ice/ice_xsk.h  |  14 +-
 6 files changed, 135 insertions(+), 90 deletions(-)

-- 
2.42.0

