[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170526200340.GX8951@wotan.suse.de>
Date:   Fri, 26 May 2017 22:03:40 +0200
From:   "Luis R. Rodriguez" <mcgrof@...nel.org>
To:     Dmitry Torokhov <dmitry.torokhov@...il.com>
Cc:     "Luis R. Rodriguez" <mcgrof@...nel.org>, akpm@...ux-foundation.org,
        jeyu@...hat.com, shuah@...nel.org, rusty@...tcorp.com.au,
        ebiederm@...ssion.com, acme@...hat.com, corbet@....net,
        martin.wilck@...e.com, mmarek@...e.com, pmladek@...e.com,
        hare@...e.com, rwright@....com, jeffm@...e.com, DSterba@...e.com,
        fdmanana@...e.com, neilb@...e.com, linux@...ck-us.net,
        rgoldwyn@...e.com, subashab@...eaurora.org, xypron.glpk@....de,
        keescook@...omium.org, atomlin@...hat.com, mbenes@...e.cz,
        paulmck@...ux.vnet.ibm.com, dan.j.williams@...el.com,
        jpoimboe@...hat.com, davem@...emloft.net, mingo@...hat.com,
        alan@...ux.intel.com, tytso@....edu, gregkh@...uxfoundation.org,
        torvalds@...ux-foundation.org, linux-kselftest@...r.kernel.org,
        linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 2/5] kmod: reduce atomic operations on kmod_concurrent
On Thu, May 25, 2017 at 06:11:00PM -0700, Dmitry Torokhov wrote:
> On Thu, May 25, 2017 at 05:16:27PM -0700, Luis R. Rodriguez wrote:
> > When checking if we want to allow a kmod thread to kick off we increment,
> > then read to see if we should enable a thread. If we were over the allowed
> > limit limit we decrement. Splitting the increment far apart from decrement
> > means there could be a time where two increments happen potentially
> > giving a false failure on a thread which should have been allowed.
> > 
> > CPU1			CPU2
> > atomic_inc()
> > 			atomic_inc()
> > atomic_read()
> > 			atomic_read()
> > atomic_dec()
> > 			atomic_dec()
> > 
> > In this case a read on CPU1 gets the atomic_inc()'s and we could negate
> > it from getting a kmod thread. We could try to prevent this with a lock
> > or preemption but that is overkill. We can fix by reducing the number of
> > atomic operations. We do this by inverting the logic of of the enabler,
> > instead of incrementing kmod_concurrent as we get new kmod users, define the
> > variable kmod_concurrent_max as the max number of currently allowed kmod
> > users and as we get new kmod users just decrement it if its still positive.
> > This combines the dec and read in one atomic operation.
> > 
> > In this case we no longer get the same false failure:
> > 
> > CPU1			CPU2
> > atomic_dec_if_positive()
> > 			atomic_dec_if_positive()
> > atomic_inc()
> > 			atomic_inc()
> > 
> > Suggested-by: Petr Mladek <pmladek@...e.com>
> > Signed-off-by: Luis R. Rodriguez <mcgrof@...nel.org>
> > ---
> >  include/linux/kmod.h |  2 ++
> >  init/main.c          |  1 +
> >  kernel/kmod.c        | 44 +++++++++++++++++++++++++-------------------
> >  3 files changed, 28 insertions(+), 19 deletions(-)
> > 
> > diff --git a/include/linux/kmod.h b/include/linux/kmod.h
> > index c4e441e00db5..8e2f302b214a 100644
> > --- a/include/linux/kmod.h
> > +++ b/include/linux/kmod.h
> > @@ -38,10 +38,12 @@ int __request_module(bool wait, const char *name, ...);
> >  #define request_module_nowait(mod...) __request_module(false, mod)
> >  #define try_then_request_module(x, mod...) \
> >  	((x) ?: (__request_module(true, mod), (x)))
> > +void init_kmod_umh(void);
> >  #else
> >  static inline int request_module(const char *name, ...) { return -ENOSYS; }
> >  static inline int request_module_nowait(const char *name, ...) { return -ENOSYS; }
> >  #define try_then_request_module(x, mod...) (x)
> > +static inline void init_kmod_umh(void) { }
> >  #endif
> >  
> >  
> > diff --git a/init/main.c b/init/main.c
> > index 9ec09ff8a930..9b20be716cf7 100644
> > --- a/init/main.c
> > +++ b/init/main.c
> > @@ -650,6 +650,7 @@ asmlinkage __visible void __init start_kernel(void)
> >  	thread_stack_cache_init();
> >  	cred_init();
> >  	fork_init();
> > +	init_kmod_umh();
> >  	proc_caches_init();
> >  	buffer_init();
> >  	key_init();
> > diff --git a/kernel/kmod.c b/kernel/kmod.c
> > index 563f97e2be36..cafd27b92d19 100644
> > --- a/kernel/kmod.c
> > +++ b/kernel/kmod.c
> > @@ -46,6 +46,7 @@
> >  #include <trace/events/module.h>
> >  
> >  extern int max_threads;
> > +unsigned int max_modprobes;
> >  
> >  #define CAP_BSET	(void *)1
> >  #define CAP_PI		(void *)2
> > @@ -56,6 +57,8 @@ static DEFINE_SPINLOCK(umh_sysctl_lock);
> >  static DECLARE_RWSEM(umhelper_sem);
> >  
> >  #ifdef CONFIG_MODULES
> > +static atomic_t kmod_concurrent_max = ATOMIC_INIT(0);
> > +#define MAX_KMOD_CONCURRENT 50	/* Completely arbitrary value - KAO */
> >  
> >  /*
> >  	modprobe_path is set via /proc/sys.
> > @@ -127,10 +130,7 @@ int __request_module(bool wait, const char *fmt, ...)
> >  {
> >  	va_list args;
> >  	char module_name[MODULE_NAME_LEN];
> > -	unsigned int max_modprobes;
> >  	int ret;
> > -	static atomic_t kmod_concurrent = ATOMIC_INIT(0);
> > -#define MAX_KMOD_CONCURRENT 50	/* Completely arbitrary value - KAO */
> >  	static int kmod_loop_msg;
> >  
> >  	/*
> > @@ -154,21 +154,7 @@ int __request_module(bool wait, const char *fmt, ...)
> >  	if (ret)
> >  		return ret;
> >  
> > -	/* If modprobe needs a service that is in a module, we get a recursive
> > -	 * loop.  Limit the number of running kmod threads to max_threads/2 or
> > -	 * MAX_KMOD_CONCURRENT, whichever is the smaller.  A cleaner method
> > -	 * would be to run the parents of this process, counting how many times
> > -	 * kmod was invoked.  That would mean accessing the internals of the
> > -	 * process tables to get the command line, proc_pid_cmdline is static
> > -	 * and it is not worth changing the proc code just to handle this case. 
> > -	 * KAO.
> > -	 *
> > -	 * "trace the ppid" is simple, but will fail if someone's
> > -	 * parent exits.  I think this is as good as it gets. --RR
> > -	 */
> > -	max_modprobes = min(max_threads/2, MAX_KMOD_CONCURRENT);
> > -	atomic_inc(&kmod_concurrent);
> > -	if (atomic_read(&kmod_concurrent) > max_modprobes) {
> > +	if (atomic_dec_if_positive(&kmod_concurrent_max) < 0) {
> >  		/* We may be blaming an innocent here, but unlikely */
> >  		if (kmod_loop_msg < 5) {
> >  			printk(KERN_ERR
> > @@ -184,10 +170,30 @@ int __request_module(bool wait, const char *fmt, ...)
> >  
> >  	ret = call_modprobe(module_name, wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC);
> >  
> > -	atomic_dec(&kmod_concurrent);
> > +	atomic_inc(&kmod_concurrent_max);
> > +
> >  	return ret;
> >  }
> >  EXPORT_SYMBOL(__request_module);
> > +
> > +/*
> > + * If modprobe needs a service that is in a module, we get a recursive
> > + * loop.  Limit the number of running kmod threads to max_threads/2 or
> > + * MAX_KMOD_CONCURRENT, whichever is the smaller.  A cleaner method
> > + * would be to run the parents of this process, counting how many times
> > + * kmod was invoked.  That would mean accessing the internals of the
> > + * process tables to get the command line, proc_pid_cmdline is static
> > + * and it is not worth changing the proc code just to handle this case.
> > + *
> > + * "trace the ppid" is simple, but will fail if someone's
> > + * parent exits.  I think this is as good as it gets.
> > + */
> > +void __init init_kmod_umh(void)
> > +{
> > +	max_modprobes = min(max_threads/2, MAX_KMOD_CONCURRENT);
> > +	atomic_set(&kmod_concurrent_max, max_modprobes);
> 
> I would love if we could initialize atomic statically. So the trouble we
> are trying to solve here is we create more threads than kernel supports,
> with thread count being calculated as:
> 
> 	threads = div64_u64((u64) totalram_pages * (u64) PAGE_SIZE,
> 			    (u64) THREAD_SIZE * 8UL);
> 
> So to not being serve 50 threads we need to deal with system smaller
> than 3200 pages, or ~13M memory (assume thread size is 8 pages - 64 bit
> with kasan, smaller page sizes reduce memory even more). Can you run
> 4.12 with modules support on machine with such memory?
> 
> So maybe we shoudl simply say:
> 
> static atomic_t kmod_concurrent_max = ATOMIC_INIT(MAX_KMOD_CONCURRENT_MAX);
> 
> and call it a day? So we do not need init_kmod_umh() and don't need to
> call it from init/main.c.
I like this very much. If shit blows up we can simply use the kconfig thing
I had proposed earlier, however it would have to accept less than 50 threads
given we'd be computing this at build time, not at run time.
  Luis
Powered by blists - more mailing lists
 
