Message-ID: <20170526200340.GX8951@wotan.suse.de>
Date: Fri, 26 May 2017 22:03:40 +0200
From: "Luis R. Rodriguez" <mcgrof@...nel.org>
To: Dmitry Torokhov <dmitry.torokhov@...il.com>
Cc: "Luis R. Rodriguez" <mcgrof@...nel.org>, akpm@...ux-foundation.org,
jeyu@...hat.com, shuah@...nel.org, rusty@...tcorp.com.au,
ebiederm@...ssion.com, acme@...hat.com, corbet@....net,
martin.wilck@...e.com, mmarek@...e.com, pmladek@...e.com,
hare@...e.com, rwright@....com, jeffm@...e.com, DSterba@...e.com,
fdmanana@...e.com, neilb@...e.com, linux@...ck-us.net,
rgoldwyn@...e.com, subashab@...eaurora.org, xypron.glpk@....de,
keescook@...omium.org, atomlin@...hat.com, mbenes@...e.cz,
paulmck@...ux.vnet.ibm.com, dan.j.williams@...el.com,
jpoimboe@...hat.com, davem@...emloft.net, mingo@...hat.com,
alan@...ux.intel.com, tytso@....edu, gregkh@...uxfoundation.org,
torvalds@...ux-foundation.org, linux-kselftest@...r.kernel.org,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 2/5] kmod: reduce atomic operations on kmod_concurrent
On Thu, May 25, 2017 at 06:11:00PM -0700, Dmitry Torokhov wrote:
> On Thu, May 25, 2017 at 05:16:27PM -0700, Luis R. Rodriguez wrote:
> > When checking if we want to allow a kmod thread to kick off we increment,
> > then read to see if we should enable a thread. If we were over the allowed
> > limit we decrement. Splitting the increment far apart from the decrement
> > means there is a window where two increments can happen, potentially
> > giving a false failure on a thread which should have been allowed.
> >
> > CPU1 CPU2
> > atomic_inc()
> > atomic_inc()
> > atomic_read()
> > atomic_read()
> > atomic_dec()
> > atomic_dec()
> >
> > In this case the read on CPU1 sees both atomic_inc()s and we could wrongly
> > deny it a kmod thread. We could try to prevent this with a lock or
> > preemption, but that is overkill. We can fix it by reducing the number of
> > atomic operations. We do this by inverting the logic of the enabler:
> > instead of incrementing kmod_concurrent as we get new kmod users, define
> > the variable kmod_concurrent_max as the max number of currently allowed
> > kmod users, and as we get new kmod users just decrement it if it's still
> > positive. This combines the dec and read in one atomic operation.
> >
> > In this case we no longer get the same false failure:
> >
> > CPU1 CPU2
> > atomic_dec_if_positive()
> > atomic_dec_if_positive()
> > atomic_inc()
> > atomic_inc()
> >
> > Suggested-by: Petr Mladek <pmladek@...e.com>
> > Signed-off-by: Luis R. Rodriguez <mcgrof@...nel.org>
> > ---
> > include/linux/kmod.h | 2 ++
> > init/main.c | 1 +
> > kernel/kmod.c | 44 +++++++++++++++++++++++++-------------------
> > 3 files changed, 28 insertions(+), 19 deletions(-)
> >
> > diff --git a/include/linux/kmod.h b/include/linux/kmod.h
> > index c4e441e00db5..8e2f302b214a 100644
> > --- a/include/linux/kmod.h
> > +++ b/include/linux/kmod.h
> > @@ -38,10 +38,12 @@ int __request_module(bool wait, const char *name, ...);
> > #define request_module_nowait(mod...) __request_module(false, mod)
> > #define try_then_request_module(x, mod...) \
> > ((x) ?: (__request_module(true, mod), (x)))
> > +void init_kmod_umh(void);
> > #else
> > static inline int request_module(const char *name, ...) { return -ENOSYS; }
> > static inline int request_module_nowait(const char *name, ...) { return -ENOSYS; }
> > #define try_then_request_module(x, mod...) (x)
> > +static inline void init_kmod_umh(void) { }
> > #endif
> >
> >
> > diff --git a/init/main.c b/init/main.c
> > index 9ec09ff8a930..9b20be716cf7 100644
> > --- a/init/main.c
> > +++ b/init/main.c
> > @@ -650,6 +650,7 @@ asmlinkage __visible void __init start_kernel(void)
> > thread_stack_cache_init();
> > cred_init();
> > fork_init();
> > + init_kmod_umh();
> > proc_caches_init();
> > buffer_init();
> > key_init();
> > diff --git a/kernel/kmod.c b/kernel/kmod.c
> > index 563f97e2be36..cafd27b92d19 100644
> > --- a/kernel/kmod.c
> > +++ b/kernel/kmod.c
> > @@ -46,6 +46,7 @@
> > #include <trace/events/module.h>
> >
> > extern int max_threads;
> > +unsigned int max_modprobes;
> >
> > #define CAP_BSET (void *)1
> > #define CAP_PI (void *)2
> > @@ -56,6 +57,8 @@ static DEFINE_SPINLOCK(umh_sysctl_lock);
> > static DECLARE_RWSEM(umhelper_sem);
> >
> > #ifdef CONFIG_MODULES
> > +static atomic_t kmod_concurrent_max = ATOMIC_INIT(0);
> > +#define MAX_KMOD_CONCURRENT 50 /* Completely arbitrary value - KAO */
> >
> > /*
> > modprobe_path is set via /proc/sys.
> > @@ -127,10 +130,7 @@ int __request_module(bool wait, const char *fmt, ...)
> > {
> > va_list args;
> > char module_name[MODULE_NAME_LEN];
> > - unsigned int max_modprobes;
> > int ret;
> > - static atomic_t kmod_concurrent = ATOMIC_INIT(0);
> > -#define MAX_KMOD_CONCURRENT 50 /* Completely arbitrary value - KAO */
> > static int kmod_loop_msg;
> >
> > /*
> > @@ -154,21 +154,7 @@ int __request_module(bool wait, const char *fmt, ...)
> > if (ret)
> > return ret;
> >
> > - /* If modprobe needs a service that is in a module, we get a recursive
> > - * loop. Limit the number of running kmod threads to max_threads/2 or
> > - * MAX_KMOD_CONCURRENT, whichever is the smaller. A cleaner method
> > - * would be to run the parents of this process, counting how many times
> > - * kmod was invoked. That would mean accessing the internals of the
> > - * process tables to get the command line, proc_pid_cmdline is static
> > - * and it is not worth changing the proc code just to handle this case.
> > - * KAO.
> > - *
> > - * "trace the ppid" is simple, but will fail if someone's
> > - * parent exits. I think this is as good as it gets. --RR
> > - */
> > - max_modprobes = min(max_threads/2, MAX_KMOD_CONCURRENT);
> > - atomic_inc(&kmod_concurrent);
> > - if (atomic_read(&kmod_concurrent) > max_modprobes) {
> > + if (atomic_dec_if_positive(&kmod_concurrent_max) < 0) {
> > /* We may be blaming an innocent here, but unlikely */
> > if (kmod_loop_msg < 5) {
> > printk(KERN_ERR
> > @@ -184,10 +170,30 @@ int __request_module(bool wait, const char *fmt, ...)
> >
> > ret = call_modprobe(module_name, wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC);
> >
> > - atomic_dec(&kmod_concurrent);
> > + atomic_inc(&kmod_concurrent_max);
> > +
> > return ret;
> > }
> > EXPORT_SYMBOL(__request_module);
> > +
> > +/*
> > + * If modprobe needs a service that is in a module, we get a recursive
> > + * loop. Limit the number of running kmod threads to max_threads/2 or
> > + * MAX_KMOD_CONCURRENT, whichever is the smaller. A cleaner method
> > + * would be to run the parents of this process, counting how many times
> > + * kmod was invoked. That would mean accessing the internals of the
> > + * process tables to get the command line, proc_pid_cmdline is static
> > + * and it is not worth changing the proc code just to handle this case.
> > + *
> > + * "trace the ppid" is simple, but will fail if someone's
> > + * parent exits. I think this is as good as it gets.
> > + */
> > +void __init init_kmod_umh(void)
> > +{
> > + max_modprobes = min(max_threads/2, MAX_KMOD_CONCURRENT);
> > + atomic_set(&kmod_concurrent_max, max_modprobes);
>
> I would love it if we could initialize the atomic statically. The trouble
> we are trying to solve here is that we create more threads than the kernel
> supports, with the thread count being calculated as:
>
> threads = div64_u64((u64) totalram_pages * (u64) PAGE_SIZE,
> (u64) THREAD_SIZE * 8UL);
>
> So for us to be unable to serve 50 threads we need to deal with a system
> smaller than 3200 pages, or ~13M of memory (assuming a thread size of 8
> pages - 64 bit with KASAN; smaller page sizes reduce the memory needed
> even more). Can you run 4.12 with module support on a machine with that
> little memory?
>
> So maybe we should simply say:
>
> static atomic_t kmod_concurrent_max = ATOMIC_INIT(MAX_KMOD_CONCURRENT);
>
> and call it a day? So we do not need init_kmod_umh() and don't need to
> call it from init/main.c.
I like this very much. If shit blows up we can simply use the kconfig thing
I had proposed earlier; however, it would have to accept fewer than 50
threads, given we'd be computing the limit at build time, not at run time.
Luis