Date:   Tue, 17 Jan 2023 11:29:15 -0800
From:   Jacob Keller <jacob.e.keller@...el.com>
To:     Jesper Dangaard Brouer <brouer@...hat.com>,
        <netdev@...r.kernel.org>
CC:     Jakub Kicinski <kuba@...nel.org>,
        "David S. Miller" <davem@...emloft.net>, <edumazet@...gle.com>,
        <pabeni@...hat.com>
Subject: Re: [PATCH net-next] net: avoid irqsave in skb_defer_free_flush



On 1/17/2023 4:29 AM, Jesper Dangaard Brouer wrote:
> The spin_lock irqsave/restore API variant in skb_defer_free_flush can
> be replaced with the faster spin_lock irq variant, which doesn't need
> to read and restore the CPU flags.
> 
> Using the unconditional irq "disable/enable" API variant is safe,
> because the skb_defer_free_flush() function is only called during
> NAPI-RX processing in net_rx_action(), where it is known the IRQs
> are enabled.
> 

Did you mean disabled here? If IRQs are enabled, that would mean the
interrupt could be triggered and we would need irqsave, no?
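
For reference, the two variants roughly reduce to the sketch below
(simplified pseudo-kernel code with made-up sketch_* names, not the
real implementations, which also handle lockdep and preemption):

	/* Simplified sketch, not the actual kernel code: */
	static inline void sketch_spin_lock_irq(spinlock_t *lock)
	{
		local_irq_disable();	/* unconditionally disable IRQs */
		spin_lock(lock);
	}

	static inline void sketch_spin_unlock_irq(spinlock_t *lock)
	{
		spin_unlock(lock);
		local_irq_enable();	/* unconditionally RE-ENABLE IRQs;
					 * only correct if IRQs were enabled
					 * when the lock was taken */
	}

	/* The irqsave variant saves the prior IRQ state and restores it,
	 * so it is also safe when the caller already has IRQs disabled:
	 */
	#define sketch_spin_lock_irqsave(lock, flags)	\
		do {					\
			local_irq_save(flags);		\
			spin_lock(lock);		\
		} while (0)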

> Expected gain is 14 cycles from avoiding reading and restoring CPU
> flags in a spin_lock_irqsave/restore operation, measured via a
> microbenchmark kernel module[1] on CPU E5-1650 v4 @ 3.60GHz.
> 
> Microbenchmark overhead of spin_lock+unlock:
>  - spin_lock_unlock_irq     cost: 34 cycles(tsc)  9.486 ns
>  - spin_lock_unlock_irqsave cost: 48 cycles(tsc) 13.567 ns
> 

Fairly minor change in perf, and...
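
The numbers look plausible; for context, a measurement loop in the
spirit of the linked module might look roughly like this (hypothetical
sketch, not the actual time_bench_sample.c code; bench_lock and the
loop count are made up):

	static DEFINE_SPINLOCK(bench_lock);

	/* Hypothetical sketch: average TSC cycles per lock+unlock pair. */
	static u64 bench_lock_unlock_irq(unsigned int loops)
	{
		cycles_t start, stop;
		unsigned int i;

		start = get_cycles();
		for (i = 0; i < loops; i++) {
			spin_lock_irq(&bench_lock);
			spin_unlock_irq(&bench_lock);
		}
		stop = get_cycles();

		return div_u64(stop - start, loops);
	}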

> We don't expect to see a measurable packet performance gain, as
> skb_defer_free_flush() is called infrequently once per NIC device NAPI
> bulk cycle and conditionally only if SKBs have been deferred by other
> CPUs via skb_attempt_defer_free().
> 

Not really measurable, as it's not called often enough, but...
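
For anyone following along: the producer side, skb_attempt_defer_free(),
still uses the irqsave variant, presumably because its calling context
varies. If I'm reading it right, the core of it is roughly this
(simplified sketch; the queue-capacity limit and the IPI kick that
triggers the remote flush are elided):

	/* Simplified sketch of the producer side, not the exact code: */
	void sketch_skb_attempt_defer_free(struct sk_buff *skb)
	{
		struct softnet_data *sd;
		unsigned long flags;

		/* Queue the skb back to the CPU that allocated it */
		sd = &per_cpu(softnet_data, skb->alloc_cpu);

		spin_lock_irqsave(&sd->defer_lock, flags);
		skb->next = sd->defer_list;
		/* Paired with READ_ONCE() in skb_defer_free_flush() */
		WRITE_ONCE(sd->defer_list, skb);
		sd->defer_count++;
		spin_unlock_irqrestore(&sd->defer_lock, flags);
	}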

> [1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/time_bench_sample.c
> 
> Signed-off-by: Jesper Dangaard Brouer <brouer@...hat.com>
> ---
>  net/core/dev.c |    5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index cf78f35bc0b9..9c60190fe352 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -6616,17 +6616,16 @@ static int napi_threaded_poll(void *data)
>  static void skb_defer_free_flush(struct softnet_data *sd)
>  {
>  	struct sk_buff *skb, *next;
> -	unsigned long flags;
>  
>  	/* Paired with WRITE_ONCE() in skb_attempt_defer_free() */
>  	if (!READ_ONCE(sd->defer_list))
>  		return;
>  
> -	spin_lock_irqsave(&sd->defer_lock, flags);
> +	spin_lock_irq(&sd->defer_lock);
>  	skb = sd->defer_list;
>  	sd->defer_list = NULL;
>  	sd->defer_count = 0;
> -	spin_unlock_irqrestore(&sd->defer_lock, flags);
> +	spin_unlock_irq(&sd->defer_lock);
>  

It's also less code, and it makes clearer what dependency this section
has (IRQs enabled on entry).
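
If we ever wanted to make that dependency self-documenting, one option
(just a thought, not something I'd block on) would be asserting it
before taking the lock:

	/* Sketch only: documents the IRQs-enabled assumption; the check
	 * compiles away unless lockdep is enabled.
	 */
	lockdep_assert_irqs_enabled();
	spin_lock_irq(&sd->defer_lock);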

Seems OK to me, aside from the minor nit in the commit message above:

Reviewed-by: Jacob Keller <jacob.e.keller@...el.com>

>  	while (skb != NULL) {
>  		next = skb->next;
> 
> 
