Message-ID: <20131018072420.GA13111@hpx.cz>
Date:	Fri, 18 Oct 2013 09:24:25 +0200
From:	Radim Krčmář <rkrcmar@...hat.com>
To:	Paolo Bonzini <pbonzini@...hat.com>
Cc:	linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...nel.org>,
	Andrew Jones <drjones@...hat.com>,
	"H. Peter Anvin" <hpa@...ux.intel.com>,
	Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
Subject: Re: [PATCH 1/7] static_key: flush rate limit timer on rmmod

2013-10-17 12:35+0200, Paolo Bonzini:
> Il 17/10/2013 12:10, Radim Krčmář ha scritto:
> > Fix a bug when we free module memory while timer is pending by marking
> > deferred static keys and flushing the timer on module unload.
> > 
> > Also make static_key_rate_limit() useable more than once.
> > 
> > Reproducer: (host crasher)
> >   modprobe kvm_intel
> >   (sleep 1; echo quit) \
> >     | qemu-kvm -kernel /dev/null -monitor stdio &
> >   sleep 0.5
> >   until modprobe -rv kvm_intel 2>/dev/null; do true; done
> >   modprobe -v kvm_intel
> > 
> > Signed-off-by: Radim Krčmář <rkrcmar@...hat.com>
> > ---
> > Very hacky; I've already queued generalizing ratelimit and applying it
> > here, but there is still a lot to do on static keys ...
> > 
> >  include/linux/jump_label.h |  1 +
> >  kernel/jump_label.c        | 17 ++++++++++++++++-
> >  2 files changed, 17 insertions(+), 1 deletion(-)
> > 
> > diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
> > index a507907..848bd15 100644
> > --- a/include/linux/jump_label.h
> > +++ b/include/linux/jump_label.h
> > @@ -58,6 +58,7 @@ struct static_key {
> >  #ifdef CONFIG_MODULES
> >  	struct static_key_mod *next;
> >  #endif
> > +	atomic_t deferred;
> >  };
> >  
> >  # include <asm/jump_label.h>
> > diff --git a/kernel/jump_label.c b/kernel/jump_label.c
> > index 297a924..7018042 100644
> > --- a/kernel/jump_label.c
> > +++ b/kernel/jump_label.c
> > @@ -116,8 +116,9 @@ EXPORT_SYMBOL_GPL(static_key_slow_dec_deferred);
> >  void jump_label_rate_limit(struct static_key_deferred *key,
> >  		unsigned long rl)
> >  {
> > +	if (!atomic_xchg(&key->key.deferred, 1))
> > +		INIT_DELAYED_WORK(&key->work, jump_label_update_timeout);
> 
> Can it actually happen that jump_label_rate_limit is called multiple
> times?  If so, this hunk alone would be a separate bugfix.  I don't
> think all the concurrency that you're protecting against can actually
> happen, but in any case I'd just take the jump_label_lock() instead of
> using atomics.

It can't happen in the current code, and it is highly unlikely to happen
in the future either.

There was no reason to take the lock, so I didn't, but with the lock we
could use a plain bool in the struct instead of an atomic_t.  I'll do
that: it takes more lines of code, but it is probably easier to
understand.
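
A minimal userspace sketch of the two options, for comparison only.  It
assumes C11 atomic_exchange() in place of the kernel's atomic_xchg() and
a toy spinlock in place of jump_label_lock(); struct dkey, init_work()
and both rate_limit_*() helpers are hypothetical names, not kernel API:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct static_key_deferred. */
struct dkey {
	atomic_int deferred_atomic;	/* variant 1: atomic marker */
	bool deferred;			/* variant 2: plain bool, guarded by the lock */
	int work_inits;			/* how many times the work was initialized */
};

/* Toy stand-in for jump_label_lock()/jump_label_unlock(). */
static atomic_flag toy_lock = ATOMIC_FLAG_INIT;

void toy_lock_acquire(void)
{
	/* Spin until the flag was clear before our test-and-set. */
	while (atomic_flag_test_and_set(&toy_lock))
		;
}

void toy_lock_release(void)
{
	atomic_flag_clear(&toy_lock);
}

/* Stand-in for INIT_DELAYED_WORK(); only counts the calls. */
void init_work(struct dkey *k)
{
	assert(k);
	k->work_inits++;
}

/* Variant 1: only the caller that sees the old value 0 initializes. */
void rate_limit_xchg(struct dkey *k)
{
	if (!atomic_exchange(&k->deferred_atomic, 1))
		init_work(k);
}

/* Variant 2: same effect with a plain bool checked under the lock. */
void rate_limit_locked(struct dkey *k)
{
	toy_lock_acquire();
	if (!k->deferred) {
		k->deferred = true;
		init_work(k);
	}
	toy_lock_release();
}
```

Calling either helper twice on the same key initializes the work once;
the locked variant is longer but keeps the marker a plain bool.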

> It's also not necessary to use a new field, since you can just check
> key->timeout.

The flush is done automatically on module unload, and at that point we
don't know whether a jump_entry belongs to a deferred key, so we
shouldn't just blindly try.
(Another bit in the jump_entry flags would supply enough information,
 but we haven't decided whether to optimize the flags into pointers,
 there isn't much space in them, and they are only introduced in
 patch [5/7].)

> All this gives something like this for static_key_rate_limit_flush:
> 
> 	if (key->timeout) {
> 		jump_label_lock();
> 		if (key->enabled) {
> 			jump_label_unlock();
> 			flush_delayed_work(&dkey->work);
> 		} else
> 			jump_label_unlock();
> 	}

Ugh, I see a problem in the original patch now: shortly before posting,
I changed it from calling cancel_delayed_work() in the module that owns
this key, because that could still hit the bug, and I forgot that the
flush would take jump_label_lock() a second time, which isn't good.
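
A toy single-threaded model of that double locking, as a sketch only.
It assumes the delayed work takes the same lock the unload path already
holds; struct toy_lock and every function below are hypothetical
stand-ins, and toy_trylock() returns false where real code would block
forever:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock standing in for jump_label_lock(). */
struct toy_lock {
	bool held;
};

/* Returns false instead of blocking when the lock is already held. */
bool toy_trylock(struct toy_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

void toy_unlock(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Stand-in for the delayed work, which needs the lock to do its job. */
bool work_fn(struct toy_lock *l)
{
	if (!toy_trylock(l))
		return false;	/* real code would wait here forever */
	/* ... the key update would happen here ... */
	toy_unlock(l);
	return true;
}

/* Stand-in for flush_delayed_work(): run pending work to completion. */
bool flush_work_model(struct toy_lock *l)
{
	return work_fn(l);
}

/* Unload path that flushes while still holding the lock. */
bool unload_flush_locked(struct toy_lock *l)
{
	bool ok;

	if (!toy_trylock(l))
		return false;
	ok = flush_work_model(l);	/* work cannot get the lock */
	toy_unlock(l);
	return ok;
}

/* Unload path that drops the lock before flushing. */
bool unload_flush_unlocked(struct toy_lock *l)
{
	if (!toy_trylock(l))
		return false;
	toy_unlock(l);
	return flush_work_model(l);
}
```

In this model unload_flush_locked() fails and unload_flush_unlocked()
succeeds, but the model says nothing about what happens to the key in
the window after the lock is dropped.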

This needs to be solved by checking whether we are the last module that
uses this key and issuing a cancel() in that case, and I'm not sure even
that would be free of the bug: the work could already be running, just
waiting for jump_label_lock(), and we could then somehow manage to free
the memory first.

(leaving it to the programmer starts to look sane ...)

> Paolo