Date:	Wed, 05 Mar 2014 11:24:33 -0800
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	David Miller <davem@...emloft.net>
Cc:	netdev@...r.kernel.org, xiyou.wangcong@...il.com, mpm@...enic.com,
	satyam.sharma@...il.com
Subject: Re: [PATCH] netpoll: Don't call driver methods from interrupt context.

David Miller <davem@...emloft.net> writes:

> From: ebiederm@...ssion.com (Eric W. Biederman)
> Date: Tue, 04 Mar 2014 16:03:43 -0800
>
>> So I would like some clear guidance.  Will you accept patches to make
>> it safe to call the napi poll routines from hard irq context, or should
>> we simply defer messages printed with netconsole in hard irq context
>> into another context where we can run the napi code?
>> 
>> If there is not a clear way to fix the problems that crop up we should
>> just delete all of the netpoll code altogether, as it seems deadly in
>> its current form.
>
> Clearly to make netconsole most useful we should synchronously emit
> log messages.
>
> Because what if the system hangs right after this event, but before
> we get back to a "safe" context?
>
> That's one bug that will be a billion times harder to diagnose if
> we defer.

In general I agree.  

The gripping hand for me is kernel/rcu/tree.c:print_cpu_stall() that
generates a warning from irq context on every cpu simultaneously.

Without netpoll I can debug that by just logging into the machine and
dumping dmesg, but with netpoll the machine dies when the warning is
generated, because after the first few messages each additional
message generates a new message.

Now that I have looked closer, the problem of a printk generating
another printk seems to be best solved at the printk level.  So if
you will accept the patches I will proceed to shore up the existing
netpoll implementations.

I am thinking pretty seriously about forcing hard irq context during
netconsole's use of netpoll to ensure that the hard irq context case
actually gets tested.  I need to do some audits to see if that would
cause any side effects beyond leaving irqs disabled.

diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index ba2f5e710af1..aaa9062061c8 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -734,6 +734,7 @@ static void write_msg(struct console *con, const char *msg, unsigned int len)
        unsigned long flags;
        struct netconsole_target *nt;
        const char *tmp;
+       bool hard_irq;
 
        if (oops_only && !oops_in_progress)
                return;
@@ -742,6 +743,9 @@ static void write_msg(struct console *con, const char *msg, unsigned int len)
                return;
 
        spin_lock_irqsave(&target_list_lock, flags);
+       hard_irq = in_irq();
+       if (!hard_irq)
+               irq_enter();
        list_for_each_entry(nt, &target_list, list) {
                netconsole_target_get(nt);
                if (nt->enabled && netif_running(nt->np.dev)) {
@@ -761,6 +765,8 @@ static void write_msg(struct console *con, const char *msg, unsigned int len)
                }
                netconsole_target_put(nt);
        }
+       if (!hard_irq)
+               irq_exit();
        spin_unlock_irqrestore(&target_list_lock, flags);
 }


Eric


