lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Message-Id: <20151007153408.5c66b7468dee3b2dac16930b@linux-foundation.org> Date: Wed, 7 Oct 2015 15:34:08 -0700 From: Andrew Morton <akpm@...ux-foundation.org> To: Vitaly Kuznetsov <vkuznets@...hat.com> Cc: HATAYAMA Daisuke <d.hatayama@...fujitsu.com>, Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>, Jiri Kosina <jkosina@...e.cz>, Baoquan He <bhe@...hat.com>, Prarit Bhargava <prarit@...hat.com>, Xie XiuQi <xiexiuqi@...wei.com>, Seth Jennings <sjenning@...hat.com>, linux-kernel@...r.kernel.org, "K. Y. Srinivasan" <kys@...rosoft.com>, Jan Kara <jack@...e.cz> Subject: Re: [PATCH] panic: release stale console lock to always get the logbuf printed out (cc Jan) On Wed, 7 Oct 2015 19:02:22 +0200 Vitaly Kuznetsov <vkuznets@...hat.com> wrote: > In some cases we may end up killing the CPU holding the console lock > while still having valuable data in logbuf. E.g. I'm observing the > following: > - A crash is happening on one CPU and console_unlock() is being called on > some other. > - console_unlock() tries to print out the buffer before releasing the lock > and on slow console it takes time. > - in the meanwhile crashing CPU does lots of printk()-s with valuable data > (which go to the logbuf) and sends IPIs to all other CPUs. > - console_unlock() finishes printing previous chunk and enables interrupts > before trying to print out the rest, the CPU catches the IPI and never > releases console lock. Why doesn't the lock-owning CPU release the console lock? Because it was stopped by smp_send_stop() in panic()? I don't recall why we stop CPUs in panic(), and of course we didn't document the reason. I guess it makes sense from the "what else can we do" point of view, but I wonder if we can just do it later on - that would fix this problem? (dumb aside: why doesn't smp_send_stop() stop the calling CPU?) > This is not the only possible case: in VT/fb subsystems we have many other > console_lock()/console_unlock() users. Non-masked interrupts (or receiving > NMI in case of extreme slowness) will have the same result. Getting the > whole console buffer printed out on crash should be top priority. Yes, this is a pretty big hole in the logic. > --- a/kernel/panic.c > +++ b/kernel/panic.c > @@ -23,6 +23,7 @@ > #include <linux/sysrq.h> > #include <linux/init.h> > #include <linux/nmi.h> > +#include <linux/console.h> > > #define PANIC_TIMER_STEP 100 > #define PANIC_BLINK_SPD 18 > @@ -147,6 +148,15 @@ void panic(const char *fmt, ...) > > bust_spinlocks(0); > > + /* > + * We may have ended up killing the CPU holding the lock and still have > + * some valuable data in console buffer. Try to acquire the lock and > + * release it regardless of the result. The release will also print the > + * buffers out. > + */ > + console_trylock(); > + console_unlock(); > + "killing the CPU" is a bit vague. How's this look? --- a/kernel/panic.c~panic-release-stale-console-lock-to-always-get-the-logbuf-printed-out-fix +++ a/kernel/panic.c @@ -149,10 +149,10 @@ void panic(const char *fmt, ...) bust_spinlocks(0); /* - * We may have ended up killing the CPU holding the lock and still have - * some valuable data in console buffer. Try to acquire the lock and - * release it regardless of the result. The release will also print the - * buffers out. + * We may have ended up stopping the CPU holding the lock (in + * smp_send_stop()) while still having some valuable data in the console + * buffer. Try to acquire the lock then release it regardless of the + * result. The release will also print the buffers out. */ console_trylock(); console_unlock(); _ Does the console_trylock() guarantee that the console lock is now held? If the console_lock-holding CPU is still running then there's a window where the above code could enter console_unlock() when nobody's holding console_lock. If smp_send_stop() always works (synchronously) then that won't happen. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@...r.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists