linux-kernel - Re: [PATCH 1/6] kmod: add dynamic max concurrent thread count

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170525173002.GC29859@dtor-ws>
Date:   Thu, 25 May 2017 10:30:02 -0700
From:   Dmitry Torokhov <dmitry.torokhov@...il.com>
To:     "Luis R. Rodriguez" <mcgrof@...nel.org>
Cc:     Tom Gundersen <teg@...m.no>, Filipe Manana <fdmanana@...e.com>,
        "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        "linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
        rgoldwyn@...e.com, hare <hare@...e.com>,
        Jonathan Corbet <corbet@....net>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        linux-kselftest@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        Dan Williams <dan.j.williams@...el.com>,
        Aaron Tomlin <atomlin@...hat.com>, rwright@....com,
        Heinrich Schuchardt <xypron.glpk@....de>,
        Michal Marek <mmarek@...e.com>, martin.wilck@...e.com,
        Rusty Russell <rusty@...tcorp.com.au>,
        Jeff Mahoney <jeffm@...e.com>, Ingo Molnar <mingo@...hat.com>,
        Petr Mladek <pmladek@...e.com>,
        Guenter Roeck <linux@...ck-us.net>,
        "Eric W. Biederman" <ebiederm@...ssion.com>, shuah@...nel.org,
        DSterba@...e.com, Kees Cook <keescook@...omium.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Arnaldo Carvalho de Melo <acme@...hat.com>,
        Miroslav Benes <mbenes@...e.cz>, NeilBrown <neilb@...e.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        David Miller <davem@...emloft.net>,
        Jessica Yu <jeyu@...hat.com>,
        Subash Abhinov Kasiviswanathan <subashab@...eaurora.org>
Subject: Re: [PATCH 1/6] kmod: add dynamic max concurrent thread count

On Thu, May 25, 2017 at 09:50:15AM -0700, Luis R. Rodriguez wrote:
> On Thu, May 25, 2017 at 9:38 AM, Dmitry Torokhov
> <dmitry.torokhov@...il.com> wrote:
> > On Thu, May 25, 2017 at 06:22:01PM +0200, Luis R. Rodriguez wrote:
> >> On Fri, May 19, 2017 at 02:58:29PM -0700, Dmitry Torokhov wrote:
> >> > On Fri, May 19, 2017 at 02:45:29PM -0700, Luis R. Rodriguez wrote:
> >> > > On May 19, 2017 1:45 PM, "Dmitry Torokhov" <dmitry.torokhov@...il.com>
> >> > > wrote:
> >> > >
> >> > > On Thu, May 18, 2017 at 08:24:39PM -0700, Luis R. Rodriguez wrote:
> >> > > > We currently statically limit the number of modprobe threads which
> >> > > > we allow to run concurrently to 50. As per Keith Owens, this was a
> >> > > > completely arbitrary value, and it was set in the 2.3.38 days [0]
> >> > > > over 16 years ago in year 2000.
> >> > > >
> >> > > > Although we haven't yet hit our lower limits, experimentation [1]
> >> > > > shows that when and if we hit this limit in the worst case, will be
> >> > > > fatal -- consider get_fs_type() failures upon mount on a system which
> >> > > > has many partitions, some of which might even be with the same
> >> > > > filesystem. Its best to be prudent and increase and set this
> >> > > > value to something more sensible which ensures we're far from hitting
> >> > > > the limit and also allows default build/user run time override.
> >> > > >
> >> > > > The worst case is fatal given that once a module fails to load there
> >> > > > is a period of time during which subsequent request for the same module
> >> > > > will fail, so in the case of partitions its not just one request that
> >> > > > could fail, but whole series of partitions. This later issue of a
> >> > > > module request failure domino effect can be addressed later, but
> >> > > > increasing the limit to something more meaninful should at least give us
> >> > > > enough cushion to avoid this for a while.
> >> > > >
> >> > > > Set this value up with a bit more meaninful modern limits:
> >> > > >
> >> > > > Bump this up to 64  max for small systems (CONFIG_BASE_SMALL)
> >> > > > Bump this up to 128 max for larger systems (!CONFIG_BASE_SMALL)
> >> > > >
> >> > > > Also allow the default max limit to be further fine tuned at compile
> >> > > > time and at initialization at run time at boot up using the kernel
> >> > > > parameter: max_modprobes.
> >> > > >
> >> > > > [0] https://git.kernel.org/cgit/linux/kernel/git/history/
> >> > > history.git/commit/?id=ab1c4ec7410f6ec64e1511d1a7d850fc99c09b44
> >> > > > [1] https://github.com/mcgrof/test_request_module
> >> > >
> >> > > If we actually run into this issue, instead of slamming the system with
> >> > > bazillion concurrent requests, can we wait for the other modprobes to
> >> > > finish and then continue?
> >> > >
> >> > >
> >> > > Yes ! That I have a patch that does precisely that ! That is actually still
> >> > > *not enough* to not fail fatally but this would be subject of another
> >> > > series with more debatable approaches.
> >> > >
> >> >
> >> > Then please post it.
> >>
> >> Will do.
> >>
> >> > > This at least pushes us to closer safer limits for now while also making it
> >> > > configurable.
> >> >
> >> > Making it configurable depending on how big/little box is makes no
> >> > sense,
> >>
> >> If we set a hard limit then we need to patch a system if we need to increment
> >> it. This is rather stupid given we have no current heuristics to make kmod
> >> loading deterministic from userspace, and in the worst case this can be fatal.
> >> General system size is a good first guess, but making it configurable is
> >> really key given current limitations. I'll post further patches which reveals
> >> some of these issues more clearly.
> >>
> >> > especially if the above is implemented, as depth of modprobe
> >> > invocations depends on configuration and not computing power of the
> >> > hardware the system is running on.
> >>
> >> You seem to agree making it configurable is sensible , but not depending on
> >> the system size ?
> >
> > No, I am saying that making it configurable based on system size makes
> > no sense at all, and making it configurable given you already have
> > patches removing hard failures gives no benefit.
> 
> Ah no, the problem is that hard failures are not yet removed in this
> patch set at all! This series only contains the things I thought were
> non-radical really.

I know they are not removed in this patch set.

> 
> In fact even with the subsequent patches from my 2nd series I'll
> eventually post post -- these fatal issues are not cured at all unless
> we dance with userspace a bit, or unless as you suggest we have *all*
> pending theads wait without killing any.

Well, that is too bad, I understood you already implemented what I
suggested.

> 
> *This* small patch should enable folks to move the needle to a more
> *fair* limit, its also useful to backport a simple fix even if the
> other stuff is not merged, *but* it *also* provides a way for systems
> to move away from the slippery slope if they know what they are doing.

Look, you are trying to push a band-aid solution for a problem that is
purely theoretical (as you say in your patch description we are not
hitting this problem in practice, only your test does). There is
no slippery slope for systems to move away, no need to backport
anything. We seem to agree that a better solution is possible (throttle
number of concurrently running modprobes without killing requesters),
and with that solution the band-aid will no longer be needed.

So please implement and post the proper fix for the issue.

Thanks.

-- 
Dmitry