Message-Id: <20170418133136.GS3956@linux.vnet.ibm.com>
Date:   Tue, 18 Apr 2017 06:31:36 -0700
From:   "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:     Johannes Berg <johannes@...solutions.net>
Cc:     Nicolai Stange <nicstange@...il.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 9/9] debugfs: free debugfs_fsdata instances

On Tue, Apr 18, 2017 at 11:39:27AM +0200, Johannes Berg wrote:
> On Mon, 2017-04-17 at 09:01 -0700, Paul E. McKenney wrote:
> 
> > If you have not already done so, please run this with debug enabled,
> > especially CONFIG_PROVE_LOCKING=y (which implies CONFIG_PROVE_RCU=y).
> > This is important because there are configurations for which the
> > deadlocks you saw with SRCU turn into silent failure, including
> > memory corruption.
> > CONFIG_PROVE_RCU=y will catch many of those situations.
> 
> Can you elaborate on that? I think we may have had CONFIG_PROVE_RCU
> enabled in the builds where we saw the problem, but I'm not sure.

CONFIG_PROVE_RCU=y will reliably catch things like this:

1.	rcu_read_lock();
	synchronize_rcu();
	rcu_read_unlock();

	With CONFIG_PROVE_RCU=n and CONFIG_PREEMPT=n, this will result in
	too-short grace periods, which can free things out from under the
	read-side critical section, which in turn can result in arbitrary
	memory corruption.  You might not even get a "scheduling while
	atomic", though CONFIG_PREEMPT_COUNT=y will produce this message.

	With CONFIG_PREEMPT=y, on the other hand, this should
	deadlock in a manner similar to the earlier SRCU deadlocks
	seen in debugfs.

2.	rcu_read_lock();
	schedule_timeout_interruptible(HZ);
	rcu_read_unlock();

	With CONFIG_PROVE_RCU=y and CONFIG_PREEMPT=y, this will just
	work, more or less.  Until someone runs with CONFIG_PREEMPT=n,
	which will produce "scheduling while atomic".  (I have a
	fix for this queued for 4.13, FWIW, so that in the future
	CONFIG_PROVE_RCU=y and CONFIG_PREEMPT=y will complain about
	this.  But for now, silent bug.)

There are more, but this should give you a flavor of the types
of bugs CONFIG_PROVE_RCU=y can locate for you.
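
For concreteness, a debug build along these lines might use a .config
fragment like the one below.  This is only a sketch.  The first two
options are the ones named above.  CONFIG_DEBUG_ATOMIC_SLEEP=y is an
assumption added for illustration: it selects CONFIG_PREEMPT_COUNT and
turns on sleep-inside-atomic-section checks.

```text
# Lockdep-based checking, as recommended above.
CONFIG_PROVE_LOCKING=y
# Listed explicitly here; the text above says PROVE_LOCKING implies it.
CONFIG_PROVE_RCU=y
# Assumption, not from the discussion above: selects CONFIG_PREEMPT_COUNT.
CONFIG_DEBUG_ATOMIC_SLEEP=y
```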

> Can you say which configurations you're thinking of? And perhaps what
> kind of corruption you're thinking of also? I'm having a hard time
> imagining any corruption that should happen?

#1 is the silent corruption case given CONFIG_PROVE_RCU=n,
CONFIG_PREEMPT=n, and CONFIG_PREEMPT_COUNT=n.
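
To make the failure mode in #1 more concrete, here is a userspace toy
model in C.  It is not kernel code: every toy_* name is hypothetical,
and the model assumes a single CPU where, as with CONFIG_PREEMPT=n and
no debugging, the read-side markers do nothing and a context switch
counts as a quiescent state.  It shows only why the grace period ends
too early when the reader itself blocks in synchronize_rcu().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-CPU model; none of this is the kernel's implementation. */
static bool object_freed;            /* has the updater freed the object? */
static unsigned long quiescent_states; /* context switches seen so far */

/* Modeled as no-ops: nothing records that a reader is active. */
static void toy_rcu_read_lock(void)   { }
static void toy_rcu_read_unlock(void) { }

/* Blocking implies a context switch, which this model counts as a
 * quiescent state, so the "grace period" ends immediately. */
static void toy_synchronize_rcu(void)
{
	unsigned long snap = quiescent_states;

	quiescent_states++;              /* the caller itself blocks */
	assert(quiescent_states != snap); /* grace period looks complete */
}

static void toy_free_object(void)
{
	assert(!object_freed);           /* guard against double free */
	object_freed = true;
}

/* Returns true if the object was freed while the reader still held it. */
static bool reader_sees_freed_object(void)
{
	bool use_after_free;

	object_freed = false;
	toy_rcu_read_lock();
	toy_synchronize_rcu();           /* illegal inside a reader */
	toy_free_object();               /* updater thinks no readers remain */
	use_after_free = object_freed;   /* reader is still in its section */
	toy_rcu_read_unlock();
	return use_after_free;
}
```

In this model, reader_sees_freed_object() returns true: nothing ever
recorded that a reader was active, so nothing could complain.  In a real
kernel the freed memory could then be reused, hence the corruption.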

> Nicolai probably never even ran into this problem, though it should be
> easy to reproduce.

I am just worried that the situation resulting in the earlier SRCU
deadlocks might be hiding behind CONFIG_PROVE_RCU=n, CONFIG_PREEMPT=n,
and CONFIG_PREEMPT_COUNT=n.  Or some other bug hiding behind some
other set of Kconfig options.

							Thanx, Paul