lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20160301155504.GB22107@localhost.localdomain>
Date:	Tue, 1 Mar 2016 15:55:07 +0000
From:	Catalin Marinas <catalin.marinas@....com>
To:	Dmitry Vyukov <dvyukov@...gle.com>
Cc:	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Jiri Slaby <jslaby@...e.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Peter Hurley <peter@...leysoftware.com>,
	One Thousand Gnomes <gnomes@...rguk.ukuu.org.uk>,
	J Freyensee <james_p_freyensee@...ux.intel.com>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	Paul Bolle <pebolle@...cali.nl>,
	Alexander Potapenko <glider@...gle.com>,
	Kostya Serebryany <kcc@...gle.com>,
	Sasha Levin <sasha.levin@...cle.com>,
	syzkaller <syzkaller@...glegroups.com>
Subject: Re: tty: memory leak in tty_register_driver

On Tue, Mar 01, 2016 at 04:27:28PM +0100, Dmitry Vyukov wrote:
> On Mon, Feb 29, 2016 at 12:34 PM, Catalin Marinas
> <catalin.marinas@....com> wrote:
> > On Mon, Feb 29, 2016 at 11:22:58AM +0100, Dmitry Vyukov wrote:
> >> Regarding stopping all threads and doing proper scan, why is not it
> >> feasible? Will kernel break if we stall all CPUs for seconds? In
> >> automatic testing scenarios a stalled for several seconds machine is
> >> not a problem. But on the other hand, absence of false positives is a
> >> must. And it would improve testing bandwidth, because we don't need
> >> sleep and second scan.
> >
> > Scanning time is the main issue with it taking minutes on some slow ARM
> > machines (my primary testing target). Such timing was significantly
> > improved with commit 93ada579b0ee ("mm: kmemleak: optimise kmemleak_lock
> > acquiring during kmemleak_scan") but even if it is few seconds, it is
> > not suitable for a live, interactive system.
> >
> > What we could do though, since you already trigger the scanning
> > manually, is to add a "stopscan" command that you echo into
> > /sys/kernel/debug/kmemleak and performs a stop_machine() during memory
> > scanning. If you have time, please feel free to give it a try ;).
> 
> Stopscan would be useful for me, but I don't feel like I am ready to
> tackle it.

It's not that hard ;). Anyway, when I get a bit of time I'll try to look
into it.

> To be absolutely sure that we don't miss pointers we would also need
> to scan all registers from stopped CPUs, and I don't know how to
> obtain that.

With stop_machine(), we probably wouldn't need to. This mechanism causes
the other CPUs to go take an IPI and execute a certain function (or wait
for the completion of a function call on another CPU). We can assume
that the functions stop_machine() is calling wouldn't manipulate
allocated objects/lists/etc., so the register file content is not
relevant to kmemleak. The previous context interrupted by the IPI would
be stored on the IRQ stack and that's one area the kmemleak does not
scan (it's architecture specific).

On the CPU issuing the stop_machine(), this would be done as a result of
debugfs write and I don't think we have any object
allocation/manipulation on this path (and it's only the callee-saved
registers that we would miss when calling kmemleak's scan_object()).

So yes, in addition to stop_machine(), we would have to scan the IRQ
stack if the architecture uses a separate one (vs just the current
thread stack).

-- 
Catalin

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ