Message-ID: <1421350095.11734.88.camel@edumazet-glaptop2.roam.corp.google.com>
Date: Thu, 15 Jan 2015 11:28:15 -0800
From: Eric Dumazet <eric.dumazet@...il.com>
To: subashab@...eaurora.org
Cc: netdev@...r.kernel.org
Subject: Re: [PATCH] net: core: Fix race by protecting process_queues at CPU hotplug

On Thu, 2015-01-15 at 19:03 +0000, subashab@...eaurora.org wrote:
> When a CPU is hotplugged while processing incoming packets,
> dev_cpu_callback() will copy the poll list from the offline
> CPU and raise the softirq. In the same context, it will also
> drain the process_queue of the offline CPU by de-queueing each
> skb and calling netif_rx(). Because of this, there is a potential
> race between process_backlog() and dev_cpu_callback() accessing
> the same process_queue. This patch protects against the
> concurrent access by locking.
>
> Signed-off-by: Prasad Sodagudi <psodagud@...eaurora.org>
> Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@...eaurora.org>
> ---
> net/core/dev.c | 22 ++++++++++++++++------
> 1 file changed, 16 insertions(+), 6 deletions(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index df0b522..aa8f503 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3640,6 +3640,7 @@ static void flush_backlog(void *arg)
> struct net_device *dev = arg;
> struct softnet_data *sd = &__get_cpu_var(softnet_data);
> struct sk_buff *skb, *tmp;
> + unsigned long flags;
>
> rps_lock(sd);
> skb_queue_walk_safe(&sd->input_pkt_queue, skb, tmp) {
> @@ -3651,6 +3652,7 @@ static void flush_backlog(void *arg)
> }
> rps_unlock(sd);
>
> + spin_lock_irqsave(&sd->process_queue.lock, flags);
> skb_queue_walk_safe(&sd->process_queue, skb, tmp) {
> if (skb->dev == dev) {
> __skb_unlink(skb, &sd->process_queue);
> @@ -3658,6 +3660,7 @@ static void flush_backlog(void *arg)
> input_queue_head_incr(sd);
> }
> }
> + spin_unlock_irqrestore(&sd->process_queue.lock, flags);
> }
>
> static int napi_gro_complete(struct sk_buff *skb)
> @@ -4021,7 +4024,7 @@ static int process_backlog(struct napi_struct *napi, int quota)
> {
> int work = 0;
> struct softnet_data *sd = container_of(napi, struct softnet_data, backlog);
> -
> + unsigned long flags;
> #ifdef CONFIG_RPS
> /* Check if we have pending ipi, its better to send them now,
> * not waiting net_rx_action() end.
> @@ -4032,18 +4035,19 @@ static int process_backlog(struct napi_struct *napi, int quota)
> }
> #endif
> napi->weight = weight_p;
> - local_irq_disable();
> + spin_lock_irqsave(&sd->process_queue.lock, flags);
> while (work < quota) {
> struct sk_buff *skb;
> unsigned int qlen;
>
> while ((skb = __skb_dequeue(&sd->process_queue))) {
> - local_irq_enable();
> + spin_unlock_irqrestore(&sd->process_queue.lock, flags);
> __netif_receive_skb(skb);
> - local_irq_disable();
> + spin_lock_irqsave(&sd->process_queue.lock, flags);
> input_queue_head_incr(sd);
> if (++work >= quota) {
> - local_irq_enable();
> + spin_unlock_irqrestore(&sd->process_queue.lock,
> + flags);
> return work;
> }
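
For context, here is a paraphrased sketch of the CPU_DEAD drain path the
changelog above describes, assuming the net/core/dev.c context (struct
softnet_data, netif_rx()); the helper name drain_dead_cpu_backlog is
invented here and the real dev_cpu_callback() may differ in detail:

/* Hypothetical paraphrase of the hotplug drain described above:
 * re-inject packets that the now-offline CPU never got to process.
 */
static void drain_dead_cpu_backlog(struct softnet_data *oldsd)
{
	struct sk_buff *skb;

	while ((skb = __skb_dequeue(&oldsd->process_queue)))
		netif_rx(skb);
	while ((skb = __skb_dequeue(&oldsd->input_pkt_queue)))
		netif_rx(skb);
}

The claimed race is that such a drain can run while process_backlog() is
still dequeueing from the same process_queue.
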
This would be a terrible change in terms of performance.

There is something wrong here: process_queue is local to the CPU.

A CPU does not have to use a spin lock to protect this queue.

Can you reproduce the bug on x86?
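
Since process_queue is local to the CPU, the existing
local_irq_disable()/local_irq_enable() pair is enough to keep the same-CPU
enqueue path (interrupt context) out of the queue while it is drained,
without paying for an atomic spin lock on every packet. Below is a minimal,
hypothetical sketch of that per-CPU pattern; the example_* names are
invented and this is not the real net/core/dev.c code:

#include <linux/irqflags.h>
#include <linux/percpu.h>
#include <linux/skbuff.h>

/* Hypothetical stand-in for the per-CPU softnet_data backlog queue;
 * skb_queue_head_init() of process_queue at boot time is omitted.
 */
struct example_sd {
	struct sk_buff_head process_queue;
};

static DEFINE_PER_CPU(struct example_sd, example_sd);

/* Consumer side: runs only in the NET_RX softirq of the owning CPU,
 * so disabling local interrupts is enough to exclude the same-CPU
 * producer while skbs are popped off the queue.
 */
static void example_process_backlog(void)
{
	struct example_sd *sd = this_cpu_ptr(&example_sd);
	struct sk_buff *skb;

	local_irq_disable();
	while ((skb = __skb_dequeue(&sd->process_queue))) {
		local_irq_enable();
		/* the real code hands the skb to __netif_receive_skb() */
		kfree_skb(skb);
		local_irq_disable();
	}
	local_irq_enable();
}

The proposed patch replaces that pattern with spin_lock_irqsave() /
spin_unlock_irqrestore() around every dequeue, which is where the
per-packet cost objected to above comes from.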