lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1271321358.16881.2240.camel@edumazet-laptop>
Date:	Thu, 15 Apr 2010 10:49:18 +0200
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Changli Gao <xiaosuo@...il.com>
Cc:	David Miller <davem@...emloft.net>, therbert@...gle.com,
	eparis@...hat.com, netdev@...r.kernel.org
Subject: Re: BUG: using smp_processor_id() in preemptible [00000000] code: 
 avahi-daemon: caller is netif_rx

Le jeudi 15 avril 2010 à 15:47 +0800, Changli Gao a écrit :
> On Thu, Apr 15, 2010 at 3:37 PM, David Miller <davem@...emloft.net> wrote:
> > From: Changli Gao <xiaosuo@...il.com>
> > Date: Thu, 15 Apr 2010 15:30:44 +0800
> >
> >> Should netif_rx() be used only when preemption is disabled? If not,
> >> netif_rx_ni() should be used instead.?
> >
> > netif_rx() must be invoked from a hardware or software interrupt,
> > which implies preemption disabled.
> >
> > In netif_rx_ni(), the "ni" means "not interrupt".
> >
> 
> yea, I know netif_rx_ni()'s meaning. It means that the following
> changes aren't necessary.
> 
>  #else
> -       cpu = smp_processor_id();
> +       ret = enqueue_to_backlog(skb, get_cpu());
> +       put_cpu();
> 
> ret = enqueue_to_backlog(skb, smp_processor_id()); should be OK.
> 
>  #endif
> -
> -       return enqueue_to_backlog(skb, cpu);
> +       return ret;
>  }

netif_rx is meant to be called from interrupts because it doesn't wake
up ksoftirqd.  For calling from outside interrupts, netif_rx_ni exists,
to make _sure_ do_softirq() is called.

However, netif_rx() _could_ be called from process context, it was safe,
but sofirq was a bit delayed.

Now, after RPS changes this can trigger this :

[   14.203970] BUG: using smp_processor_id() in preemptible [00000000]
code: avahi-daemon/2093
[   14.204025] caller is netif_rx+0xfa/0x110
[   14.204032] Pid: 2093, comm: avahi-daemon Tainted: G        W
2.6.34-rc3-next-20100412+ #65
[   14.204035] Call Trace:
[   14.204064]  [<ffffffff81278fe5>] debug_smp_processor_id+0x105/0x110
[   14.204070]  [<ffffffff8142163a>] netif_rx+0xfa/0x110
[   14.204090]  [<ffffffff8145b631>] ip_dev_loopback_xmit+0x71/0xa0
[   14.204095]  [<ffffffff8145b892>] ip_mc_output+0x192/0x2c0
[   14.204099]  [<ffffffff8145d610>] ip_local_out+0x20/0x30
[   14.204105]  [<ffffffff8145d8ad>] ip_push_pending_frames+0x28d/0x3d0
[   14.204119]  [<ffffffff8147f1cc>] udp_push_pending_frames+0x14c/0x400
[   14.204125]  [<ffffffff814803fc>] udp_sendmsg+0x39c/0x790
[   14.204137]  [<ffffffff814891d5>] inet_sendmsg+0x45/0x80
[   14.204149]  [<ffffffff8140af91>] sock_sendmsg+0xf1/0x110
[   14.204177]  [<ffffffff810e3a89>] ? might_fault+0xb9/0xd0
[   14.204184]  [<ffffffff810e3a3e>] ? might_fault+0x6e/0xd0
[   14.204189]  [<ffffffff8140dc6c>] sys_sendmsg+0x20c/0x380
[   14.204205]  [<ffffffff811107f1>] ? do_sync_write+0xd1/0x110
[   14.204211]  [<ffffffff810e3a3e>] ? might_fault+0x6e/0xd0
[   14.204233]  [<ffffffff8100ad82>] system_call_fastpath+0x16/0x1b


We have two possibilities :

1) Make sure no netif_rx() caller is in process context, preemption
enabled.

2) Change netif_rx() to meet its initial behavior : It _can_ be called
from process context, preemption enabled. Frame might be delayed a bit
as before.

We chose 2), but David, I am not sure this is OK given git history, some
calling points were changed to avoid "'NOHZ: local_softirq_pending 08' "
messages...

----------------------------------------------------------------------------------------

commit 481a8199142c050b72bff8a1956a49fd0a75bbe0
Author: Oliver Hartkopp <oliver@...tkopp.net>
Date:   Tue Sep 15 01:31:34 2009 -0700

    can: fix NOHZ local_softirq_pending 08 warning
    
    When using nanosleep() in an userspace application we get a
ratelimit warning
    
    NOHZ: local_softirq_pending 08
    
    for 10 times.
    
    The echo of CAN frames is done from process context and softirq
context only.
    Therefore the usage of netif_rx() was wrong (for years).
    
    This patch replaces netif_rx() with netif_rx_ni() which has to be
used from
    process/softirq context. It also adds a missing comment that
can_send() must
    no be used from hardirq context.
    
    Signed-off-by: Oliver Hartkopp <oliver@...tkopp.net>
    Signed-off-by: Urs Thuermann <urs@...ogud.escape.de>
    Signed-off-by: David S. Miller <davem@...emloft.net>

diff --git a/drivers/net/can/vcan.c b/drivers/net/can/vcan.c
index 6971f6c..80ac563 100644
--- a/drivers/net/can/vcan.c
+++ b/drivers/net/can/vcan.c
@@ -80,7 +80,7 @@ static void vcan_rx(struct sk_buff *skb, struct
net_device *dev)
        skb->dev       = dev;
        skb->ip_summed = CHECKSUM_UNNECESSARY;
 
-       netif_rx(skb);
+       netif_rx_ni(skb);
 }
 
 static netdev_tx_t vcan_tx(struct sk_buff *skb, struct net_device *dev)
diff --git a/net/can/af_can.c b/net/can/af_can.c
index ef1c43a..6068321 100644
--- a/net/can/af_can.c
+++ b/net/can/af_can.c
@@ -199,6 +199,8 @@ static int can_create(struct net *net, struct socket
*sock, int protocol)
  * @skb: pointer to socket buffer with CAN frame in data section
  * @loop: loopback for listeners on local CAN sockets (recommended
default!)
  *
+ * Due to the loopback this routine must not be called from hardirq
context.
+ *
  * Return:
  *  0 on success
  *  -ENETDOWN when the selected interface is down
@@ -278,7 +280,7 @@ int can_send(struct sk_buff *skb, int loop)
        }
 
        if (newskb)
-               netif_rx(newskb);
+               netif_rx_ni(newskb);
 
        /* update statistics */
        can_stats.tx_frames++;


commit ae3e0fcf901e4b7df87aef7ab39093e142a8de8b
Author: Holger Schurig <hs4233@...l.mn-solutions.de>
Date:   Wed Jan 16 15:48:44 2008 +0100

    libertas cs/sdio: fix 'NOHZ: local_softirq_pending 08' message
    
    netif_rx should be called only from interrupt context. if_cs and
if_sdio receive
    packets from other contexts, and thus should call netif_rx_ni.
    
    Signed-off-by: Marc Pignat <marc.pignat@...s.ch>
    Acked-by: Holger Schurig <hs4233@...l.mn-solutions.de>
    Signed-off-by: John W. Linville <linville@...driver.com>

diff --git a/drivers/net/wireless/libertas/rx.c
b/drivers/net/wireless/libertas/rx.c
index 6332fd4..149557a 100644
--- a/drivers/net/wireless/libertas/rx.c
+++ b/drivers/net/wireless/libertas/rx.c
@@ -247,7 +247,10 @@ int lbs_process_rxed_packet(struct lbs_private
*priv, struct sk_buff *skb)
        priv->stats.rx_packets++;
 
        skb->protocol = eth_type_trans(skb, dev);
-       netif_rx(skb);
+       if (in_interrupt())
+               netif_rx(skb);
+       else
+               netif_rx_ni(skb);
 
        ret = 0;
 done:

-----------------------------------------------------------------------------------------
Maybe we should add a new function after all...

int netif_rx_any(struct sk_buff *skb) 
{
       if (in_interrupt())
               return netif_rx(skb);

	return netif_rx_ni(skb);
}



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ