lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Message-ID: <481f7799-0f1c-efa3-bf2c-e22961e5f376@intel.com> Date: Thu, 6 Oct 2022 09:57:19 -0500 From: "Samudrala, Sridhar" <sridhar.samudrala@...el.com> To: Maciej Fijalkowski <maciej.fijalkowski@...el.com>, Joe Damato <jdamato@...tly.com> CC: <intel-wired-lan@...ts.osuosl.org>, <netdev@...r.kernel.org>, <kuba@...nel.org>, <davem@...emloft.net>, <anthony.l.nguyen@...el.com>, <jesse.brandeburg@...el.com> Subject: Re: [next-queue v2 2/4] i40e: Record number TXes cleaned during NAPI On 10/6/2022 8:03 AM, Maciej Fijalkowski wrote: > On Wed, Oct 05, 2022 at 06:00:24PM -0700, Joe Damato wrote: >> On Wed, Oct 05, 2022 at 05:31:04PM -0700, Joe Damato wrote: >>> On Wed, Oct 05, 2022 at 07:16:56PM -0500, Samudrala, Sridhar wrote: >>>> On 10/5/2022 4:21 PM, Joe Damato wrote: >>>>> Update i40e_clean_tx_irq to take an out parameter (tx_cleaned) which stores >>>>> the number TXs cleaned. >>>>> >>>>> Likewise, update i40e_clean_xdp_tx_irq and i40e_xmit_zc to do the same. >>>>> >>>>> Care has been taken to avoid changing the control flow of any functions >>>>> involved. >>>>> >>>>> Signed-off-by: Joe Damato <jdamato@...tly.com> >>>>> --- >>>>> drivers/net/ethernet/intel/i40e/i40e_txrx.c | 16 +++++++++++----- >>>>> drivers/net/ethernet/intel/i40e/i40e_xsk.c | 15 +++++++++++---- >>>>> drivers/net/ethernet/intel/i40e/i40e_xsk.h | 3 ++- >>>>> 3 files changed, 24 insertions(+), 10 deletions(-) >>>>> >>>>> diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c >>>>> index b97c95f..a2cc98e 100644 >>>>> --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c >>>>> +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c >>>>> @@ -923,11 +923,13 @@ void i40e_detect_recover_hung(struct i40e_vsi *vsi) >>>>> * @vsi: the VSI we care about >>>>> * @tx_ring: Tx ring to clean >>>>> * @napi_budget: Used to determine if we are in netpoll >>>>> + * @tx_cleaned: Out parameter set to the number of TXes cleaned >>>>> * >>>>> * Returns true if there's any budget left (e.g. the clean is finished) >>>>> **/ >>>>> static bool i40e_clean_tx_irq(struct i40e_vsi *vsi, >>>>> - struct i40e_ring *tx_ring, int napi_budget) >>>>> + struct i40e_ring *tx_ring, int napi_budget, >>>>> + unsigned int *tx_cleaned) >>>>> { >>>>> int i = tx_ring->next_to_clean; >>>>> struct i40e_tx_buffer *tx_buf; >>>>> @@ -1026,7 +1028,7 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi, >>>>> i40e_arm_wb(tx_ring, vsi, budget); >>>>> if (ring_is_xdp(tx_ring)) >>>>> - return !!budget; >>>>> + goto out; >>>>> /* notify netdev of completed buffers */ >>>>> netdev_tx_completed_queue(txring_txq(tx_ring), >>>>> @@ -1048,6 +1050,8 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi, >>>>> } >>>>> } >>>>> +out: >>>>> + *tx_cleaned = total_packets; >>>>> return !!budget; >>>>> } >>>>> @@ -2689,10 +2693,12 @@ int i40e_napi_poll(struct napi_struct *napi, int budget) >>>>> container_of(napi, struct i40e_q_vector, napi); >>>>> struct i40e_vsi *vsi = q_vector->vsi; >>>>> struct i40e_ring *ring; >>>>> + bool tx_clean_complete = true; >>>>> bool clean_complete = true; >>>>> bool arm_wb = false; >>>>> int budget_per_ring; >>>>> int work_done = 0; >>>>> + unsigned int tx_cleaned = 0; >>>>> if (test_bit(__I40E_VSI_DOWN, vsi->state)) { >>>>> napi_complete(napi); >>>>> @@ -2704,11 +2710,11 @@ int i40e_napi_poll(struct napi_struct *napi, int budget) >>>>> */ >>>>> i40e_for_each_ring(ring, q_vector->tx) { >>>>> bool wd = ring->xsk_pool ? >>>>> - i40e_clean_xdp_tx_irq(vsi, ring) : >>>>> - i40e_clean_tx_irq(vsi, ring, budget); >>>>> + i40e_clean_xdp_tx_irq(vsi, ring, &tx_cleaned) : >>>>> + i40e_clean_tx_irq(vsi, ring, budget, &tx_cleaned); >>>>> if (!wd) { >>>>> - clean_complete = false; >>>>> + clean_complete = tx_clean_complete = false; >>>>> continue; >>>>> } >>>>> arm_wb |= ring->arm_wb; >>>>> diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c >>>>> index 790aaeff..f98ce7e4 100644 >>>>> --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c >>>>> +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c >>>>> @@ -530,18 +530,22 @@ static void i40e_set_rs_bit(struct i40e_ring *xdp_ring) >>>>> * i40e_xmit_zc - Performs zero-copy Tx AF_XDP >>>>> * @xdp_ring: XDP Tx ring >>>>> * @budget: NAPI budget >>>>> + * @tx_cleaned: Out parameter of the TX packets processed >>>>> * >>>>> * Returns true if the work is finished. >>>>> **/ >>>>> -static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget) >>>>> +static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget, >>>>> + unsigned int *tx_cleaned) >>>>> { >>>>> struct xdp_desc *descs = xdp_ring->xsk_pool->tx_descs; >>>>> u32 nb_pkts, nb_processed = 0; >>>>> unsigned int total_bytes = 0; >>>>> nb_pkts = xsk_tx_peek_release_desc_batch(xdp_ring->xsk_pool, budget); >>>>> - if (!nb_pkts) >>>>> + if (!nb_pkts) { >>>>> + *tx_cleaned = 0; >>>>> return true; >>>>> + } >>>>> if (xdp_ring->next_to_use + nb_pkts >= xdp_ring->count) { >>>>> nb_processed = xdp_ring->count - xdp_ring->next_to_use; >>>>> @@ -558,6 +562,7 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget) >>>>> i40e_update_tx_stats(xdp_ring, nb_pkts, total_bytes); >>>>> + *tx_cleaned = nb_pkts; >>>> With XDP, I don't think we should count these as tx_cleaned packets. These are transmitted >>>> packets. The tx_cleaned would be the xsk_frames counter in i40e_clean_xdp_tx_irq >>>> May be we need 2 counters for xdp. >>> I think there's two issues you are describing, which are separate in my >>> mind. >>> >>> 1.) The name "tx_cleaned", and >>> 2.) Whether nb_pkts is the right thing to write as the out param. >>> >>> For #1: I'm OK to change the name if that's the blocker here; please >>> suggest a suitable alternative that you'll accept. >>> >>> For #2: nb_pkts is, IMO, the right value to bubble up to the tracepoint because >>> nb_pkts affects clean_complete in i40e_napi_poll which in turn determines >>> whether or not polling mode is entered. >>> >>> The purpose of the tracepoint is to determine when/why/how you are entering >>> polling mode, so if nb_pkts plays a role in that calculation, it's the >>> right number to output. >> I suppose the alternative is to only fire the tracepoint when *not* in XDP. >> Then the changes to the XDP stuff can be dropped and a separate set of >> tracepoints for XDP can be created in the future. > Let's be clear that it's the AF_XDP quirk that we have in here that actual > xmit happens within NAPI polling routine. > > Sridhar is right with having xsk_frames as tx_cleaned but you're also > right that nb_pkts affects napi polling. But then if you look at Rx side > there is an analogous case with buffer allocation affecting napi polling. To be correct, I would suggest 2 out parameters to i40e_clean_xdp_tx_irq() tx_cleaned and xdp_transmitted. tx_cleaned should be filled in with xsk_frames. Add xdp_transmitted as an out parameter to i40e_xmit_zc() and fill it with nb_pkts. I am not completely clear on the reasoning behind setting clean_complete based on number of packets transmitted in case of XDP. > >> That might reduce the complexity a bit, and will probably still be pretty >> useful for people tuning their non-XDP workloads. This option is fine too.
Powered by blists - more mailing lists