lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 25 Oct 2012 04:57:11 -0700
From:	Dave Hansen <dave@...ux.vnet.ibm.com>
To:	Borislav Petkov <bp@...en8.de>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Michal Hocko <mhocko@...e.cz>, linux-mm@...ck.org,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] add some drop_caches documentation and info messsge

On 10/25/2012 02:24 AM, Borislav Petkov wrote:
> But let's discuss this a bit further. So, for the benchmarking aspect,
> you're either going to have to always require dmesg along with
> benchmarking results or /proc/vmstat, depending on where the drop_caches
> stats end up.
> 
> Is this how you envision it?
> 
> And then there are the VM bug cases, where you might not always get
> full dmesg from a panicked system. In that case, you'd want the kernel
> tainting thing too, so that it at least appears in the oops backtrace.
> 
> Although the tainting thing might not be enough - a user could
> drop_caches at some point in time and the oops happening much later
> could be unrelated but that can't be expressed in taint flags.

Here's the problem: Joe Kernel Developer gets a bug report, usually
something like "the kernel is slow", or "the kernel is eating up all my
memory".  We then start going and digging in to the problem with the
usual tools.  We almost *ALWAYS* get dmesg, and it's reasonably common,
but less likely, that we get things like vmstat along with such a bug
report.

Joe Kernel Developer digs in the statistics or the dmesg and tries to
figure out what happened.  I've run in to a couple of cases in practice
(and I assume Michal has too) where the bug reporter was using
drop_caches _heavily_ and did not realize the implications.  It was
quite hard to track down exactly how the page cache and dentries/inodes
were getting purged.

There are rarely oopses involved in these scenarios.

The primary goal of this patch is to make debugging those scenarios
easier so that we can quickly realize that drop_caches is the reason our
caches went away, not some anomalous VM activity.  A secondary goal is
to tell the user: "Hey, maybe this isn't something you want to be doing
all the time."

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ