linux-kernel - Re: [PATCH v6 00/16] locking/lockdep: Add support for dynamic keys

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20190118094808.GA27931@hirez.programming.kicks-ass.net>
Date:   Fri, 18 Jan 2019 10:48:08 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Bart Van Assche <bvanassche@....org>
Cc:     mingo@...hat.com, tj@...nel.org, longman@...hat.com,
        johannes.berg@...el.com, linux-kernel@...r.kernel.org,
        Paul McKenney <paulmck@...ux.vnet.ibm.com>
Subject: Re: [PATCH v6 00/16] locking/lockdep: Add support for dynamic keys

On Mon, Jan 14, 2019 at 08:52:33AM -0800, Bart Van Assche wrote:
> On Mon, 2019-01-14 at 13:52 +0100, Peter Zijlstra wrote:
> > On Fri, Jan 11, 2019 at 09:01:41AM -0800, Bart Van Assche wrote:
> > > The list_del_rcu() call must only happen once. 
> > 
> > Yes; obviously. But if we need to check all @pf's, that means the entry
> > is still reachable after a single reset_lock()/free_key_range(), which
> > is a bug.
> > 
> > > I ran into complaints reporting that
> > > the list_del_rcu() call triggered list corruption. This change made these complaints
> > > disappear.
> > 
> > I'm saying this solution buggy, because that means the entry is still
> > reachable after we do call_rcu() (which is a straight up UAF).
> > 
> > Also put it differently, what guarantees checking those two @pf's is
> > sufficient. Suppose your earlier @pf already did the RCU callback and
> > freed stuff while the second is in progress. Then you're poking into
> > dead space.
> 
> zap_class() only examines elements of the list_entries[] array for which the
> corresponding bit in list_entries_in_use has been set. The RCU callback clears 
> the bits in the list_entries_in_use that correspond to elements that have been
> freed. The graph lock serializes zap_class() calls and the code inside the
> RCU callback. So it's not clear to me why you are claiming that zap_class()
> would trigger a use-after-free?

The scenario is like:


CPU0					CPU1					CPU2

lockdep_reset_lock_reg()
  pf = get_pending_free_lock() // pf[0]
  __lockdep_reset_lock(pf)
    zap_class()
  schedule_free_zapped_classes(pf)
    call_rcu()


  // here is wbere the objects 'freed' in zap_class()
  // can still be used through references obtained
  // __before__ we did call_rcu().


					lockdep_reset_lock_reg()
					  pf = get_pending_free_lock() // pf[1]
					  __lockdep_reset_lock(pf)
					    zap_class()
					      list_entry_being_freed()
						// checks: pf[0]

						// this is a problem, it
						// should _NEVER_ match
						// anything from pf[0]

						// those entries should
						// be unreachable,
						// otherwise:


										rcu_read_lock()
										entry = rcu_dereference()

<rcu-callback>
  free_zapped_classes()

										entry->class // UAF, just freed by rcu-callback

										rcu_read_unlock()




Now, arguably, I'm having a really hard time actually finding the RCU user of
lock_list::entry, the comment in add_lock_to_list() seems to mention
look_up_lock_class(), but the only RCU usage there is the
lock_class::hash_entry, not lock_list::entry.

If lock_class is not indeed RCU used, that would simplify things. Please
double check.

But in any case, the normal RCU pattern is:

lock()
add-to-data-structure()
unlock()

				rcu_read_lock()
				obj = obtain-from-data-structure();

lock()
remove-from-data-structure()
  call_rcu()
unlock();

				use(obj);
				rcu_read_unlock();


<rcu-callback>
  actually-free-obj()



Fundamentally RCU delays the callback to the point where the last observer
that started before call_rcu() has finished and no later (in practise it often
is much later, but no guarantees there). So being able to reach an object
after you did call_rcu() on it is a fundamental fail.