Message-ID: <20161216074142.GC13946@wotan.suse.de>
Date:   Fri, 16 Dec 2016 08:41:42 +0100
From:   "Luis R. Rodriguez" <mcgrof@...nel.org>
To:     "Luis R. Rodriguez" <mcgrof@...nel.org>
Cc:     Kees Cook <keescook@...omium.org>, shuah@...nel.org,
        Jessica Yu <jeyu@...hat.com>,
        Rusty Russell <rusty@...tcorp.com.au>,
        Arnd Bergmann <arnd@...db.de>,
        "Eric W. Biederman" <ebiederm@...ssion.com>,
        Dmitry Torokhov <dmitry.torokhov@...il.com>,
        Arnaldo Carvalho de Melo <acme@...hat.com>,
        Jonathan Corbet <corbet@....net>, martin.wilck@...e.com,
        Michal Marek <mmarek@...e.com>, Petr Mladek <pmladek@...e.com>,
        hare@...e.com, rwright@....com, Jeff Mahoney <jeffm@...e.com>,
        DSterba@...e.com, fdmanana@...e.com, neilb@...e.com,
        rgoldwyn@...e.com, subashab@...eaurora.org,
        Heinrich Schuchardt <xypron.glpk@....de>,
        Aaron Tomlin <atomlin@...hat.com>, mbenes@...e.cz,
        "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        Dan Williams <dan.j.williams@...el.com>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        "David S. Miller" <davem@...emloft.net>,
        Ingo Molnar <mingo@...hat.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        linux-kselftest@...r.kernel.org,
        "linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC 01/10] kmod: add test driver to stress test the module
 loader

On Tue, Dec 13, 2016 at 10:10:41PM +0100, Luis R. Rodriguez wrote:
> On Thu, Dec 08, 2016 at 12:24:35PM -0800, Kees Cook wrote:
> > On Thu, Dec 8, 2016 at 10:47 AM, Luis R. Rodriguez <mcgrof@...nel.org> wrote:
> > > 3) finit_module() consumes quite a bit of memory.
> > 
> > Is this due to reading the module into kernel memory or something else?
> 
> Very likely yes, but to be honest I have not had chance to instrument too
> carefully, its TODO work :)

I've checked, and the issue is that since get_fs_type() does not check for
aliases we end up hammering the module loader with tons of requests, which
in turn led me to analyze load_module(). Within it, layout_and_allocate()
first works on a local copy of the passed user data, mapping it into
a struct module; after a few sanity checks it finally allocates a
copy for us. So the cost is the size of struct module * however many
requests were allowed to get into load_module(). We could simply avoid the
allocation if the module is already present. I have this as another
optimization now, but I am running many other tests to compare performance.
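
As a rough userspace illustration of the arithmetic above (not the actual
kernel change), the sketch below models the allocation cost with and without
an early "already present" check. Everything in it is hypothetical: struct
loaded_set, is_loaded(), alloc_cost() and the module_size parameter only
stand in for the kernel's own data structures.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_LOADED 16

/* Hypothetical stand-in for the kernel's list of loaded modules. */
struct loaded_set {
	const char *names[MAX_LOADED];
	size_t count;
};

/* Return true if a module with this name is already in the set. */
static bool is_loaded(const struct loaded_set *set, const char *name)
{
	for (size_t i = 0; i < set->count; i++)
		if (strcmp(set->names[i], name) == 0)
			return true;
	return false;
}

/*
 * Bytes allocated for nr_requests concurrent load requests of one module.
 * Without the early check every request pays for its own copy; with it,
 * requests for a module that is already present allocate nothing.
 */
static size_t alloc_cost(const struct loaded_set *set, const char *name,
			 size_t module_size, size_t nr_requests,
			 bool check_first)
{
	if (check_first && is_loaded(set, name))
		return 0;
	return module_size * nr_requests;
}
```

With a set that already contains "xfs", 50 requests at an assumed 1024 bytes
each cost 51200 bytes without the check and nothing with it.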

> > > +# Once tese are enabled please leave them as-is. Write your own test,
> > > +# we have tons of space.
> > > +kmod_test_0001
> > > +kmod_test_0002
> > > +kmod_test_0003
> > > +kmod_test_0004
> > > +kmod_test_0005
> > > +kmod_test_0006
> > > +kmod_test_0007
> > > +
> > > +#kmod_test_0008
> > > +#kmod_test_0009
> > 
> > While it's documented in the commit log, I think a short note for each
> > disabled test should be added here too.
> 
> Will do, thanks so much for the review!

As I added the reason why I think test 0008 fails, I realized that the reason it
can sometimes fail is very different from that of test 0009, which is for
get_fs_type(). You see, get_fs_type() hammers the kmod concurrent counter, since
we don't have an alias check and modprobe being called for fs-xfs, for instance,
does not catch that the module is already loaded. So it delays the get_fs_type()
call, and with it the __request_module() call, hogging up its kmod concurrent
increment.

For direct request_module() calls we don't have the alias issue, but since
we don't check if a module is loaded prior to calling userspace (I now have a fix
for this, and reducing this latency does help), there is often a good chance
that we will pour in tons of requests without them getting processed and
go over the concurrent limit.
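
To make the "go over the concurrent limit" failure concrete, here is a
minimal userspace model of a concurrency counter with a hard limit. It is
only a sketch: MAX_CONCURRENT, in_flight, request_begin() and request_end()
are hypothetical names, the limit of 50 is assumed, and -ENOMEM is an
arbitrary choice of error code for the illustration.

```c
#include <errno.h>
#include <stdatomic.h>

#define MAX_CONCURRENT 50	/* assumed limit, for illustration only */

/* Number of requests currently being serviced. */
static atomic_int in_flight;

/* Take a slot for one request; fail as soon as the limit is exceeded. */
static int request_begin(void)
{
	if (atomic_fetch_add(&in_flight, 1) + 1 > MAX_CONCURRENT) {
		/* Over the limit: give the slot back and report failure. */
		atomic_fetch_sub(&in_flight, 1);
		return -ENOMEM;
	}
	return 0;
}

/* Release the slot once the request has been processed. */
static void request_end(void)
{
	atomic_fetch_sub(&in_flight, 1);
}
```

The longer each request holds its slot between request_begin() and
request_end(), the more likely a burst of callers is to hit the failure
path, which is why reducing the latency helps.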

I've then added a clutch into __request_module(), so instead of just failing
we first check if we're at a threshold (say about 1/4 away from the limit) and
if so we let a few threads breathe until they are done. This fixes *both*
test cases without many code changes; however, as I've noted in other threads,
this is not the only issue to address.
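
The threshold check could look roughly like the sketch below. This is not
the actual patch: clutch_threshold(), clutch_check() and the enum are
hypothetical names, and the decision is shown as a pure function so that the
"1/4 away from the limit" arithmetic is easy to see. A real implementation
would have to sleep and be woken up when in-flight requests complete.

```c
/* What a new request should do given the current load. */
enum clutch_action {
	CLUTCH_PROCEED,	/* below the threshold, go ahead */
	CLUTCH_WAIT,	/* at or above it, let in-flight requests finish */
};

/* Threshold sits one quarter of the limit below the limit itself. */
static unsigned int clutch_threshold(unsigned int limit)
{
	return limit - limit / 4;
}

/* Decide whether a new request may proceed or should wait first. */
static enum clutch_action clutch_check(unsigned int in_flight,
				       unsigned int limit)
{
	if (in_flight >= clutch_threshold(limit))
		return CLUTCH_WAIT;
	return CLUTCH_PROCEED;
}
```

For an assumed limit of 50 the threshold is 38, so the 38th concurrent
request is the first one asked to wait rather than being failed outright at
the limit.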

  Luis
