Message-ID: <7640a2d9-a32d-2fd7-8f64-586edb9b781e@virtuozzo.com>
Date:   Tue, 27 Sep 2022 10:44:20 +0300
From:   Alexander Atanasov <alexander.atanasov@...tuozzo.com>
To:     Hyeonggon Yoo <42.hyeyoo@...il.com>
Cc:     Jonathan Corbet <corbet@....net>, Christoph Lameter <cl@...ux.com>,
        Pekka Enberg <penberg@...nel.org>,
        David Rientjes <rientjes@...gle.com>,
        Joonsoo Kim <iamjoonsoo.kim@....com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        Roman Gushchin <roman.gushchin@...ux.dev>, kernel@...nvz.org,
        Kees Cook <keescook@...omium.org>,
        Roman Gushchin <guro@...com>, Jann Horn <jannh@...gle.com>,
        Vijayanand Jitta <vjitta@...eaurora.org>,
        linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org
Subject: Re: [PATCH v2] mm: Make failslab writable again

Hello,

On 27.09.22 3:49, Hyeonggon Yoo wrote:
> On Fri, Sep 23, 2022 at 10:34:28AM +0300, Alexander Atanasov wrote:
>> Hello,
>>
>> On 21.09.22 14:30, Hyeonggon Yoo wrote:
>>> On Tue, Sep 20, 2022 at 03:11:11PM +0300, Alexander Atanasov wrote:
>>>> In (060807f841ac mm, slub: make remaining slub_debug related attributes
>>>> read-only) failslab was made read-only.
>>>> I think it became a collateral victim to the two other options for which
>>>> the reasons are perfectly valid.
>>>> Here is why:
>>>>    - sanity_checks and trace are slab internal debug options,
>>>>      failslab is used for fault injection.
>>>>    - for fault injections, which by presumption are random, it
>>>>      does not matter if it is not set atomically. And you need to
>>>>      set at least one more option to trigger fault injection.
>>>>    - in a testing scenario you may need to change it at runtime
>>>>      example: module loading - you test all allocations limited
>>>>      by the space option. Then you move to test only your module's
>>>>      own slabs.
>>>>    - when set by command line flags it effectively disables all
>>>>      cache merges.
>>>
>>> Maybe we can make failslab= boot parameter to consider cache filtering?
>>>
>>> With that, just pass something like this:
>>> 	failslab=X,X,X,X,cache_filter slub_debug=A,<cache-name>
>>
>>> Users should pass slub_debug=A,<cache-name> anyway to prevent cache merging.
>>
>> It will be good to have this in case you want to test a cache that is used
>> early. But why push something to a command line option only, when it can be
>> changed at runtime?
> 
> Hmm okay. I'm not against changing it writable. (it looks okay to me.)

Okay. Good to know that.

> Just wanted to understand your use case!
> Can you please elaborate why booting with slub_debug=A,<your cache name>
> and enabling cache_filter after boot does not work?

I didn't say it does not work - it does work, but it requires a reboot. You 
may want to test several caches in turn, for example Cache A, Cache B, 
C and so on, one by one. Reboots might be fast these days with VMs, but 
you may not be able to test everything in a VM. And ... reboots used to 
be the signature move of one Other OS.

> Or is it trying to change these steps,
> 
> FROM
> 	1. booting with slub_debug=A,<cache name>
> 	2. write to cache_filter to enable cache filtering
> 	3. setup probability, interval, times, size
> 
> TO
> 
> 	1. write to failslab attribute of <cache name> (may fail if it has an alias)
> 	2. write to cache_filter to enable cache filtering
> 	3. setup probability, interval, times, size
> ?
> 
> as you may know, SLAB_FAILSLAB does nothing when
> cache_filter is disabled, and you should pass slub_debug=A,<cache name> anyway

Okay, I think another problem awaits there:
bool __should_failslab(struct kmem_cache *s, gfp_t gfpflags)
{
	...

	if (failslab.cache_filter && !(s->flags & SLAB_FAILSLAB))
		return false;
	...
	return should_fail(&failslab.attr, s->object_size);
}

So if you do not have cache_filter set ... you go to should_fail for all 
slabs.
I've been hit by that and spent a lot of time trying to understand why I 
got crashes at random places. The reason was that I had read old 
documentation that said cache_filter is writable, and I blindly wrote 1 
to it. If the intent is to only work with cache_filter set, then I will 
update the patch to do so. This is the only place where SLAB_FAILSLAB is 
explicitly tested; other places check it as part of SLAB_NEVER_MERGE.
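
To make the effect of that check easier to see, here is a minimal 
userspace sketch of the filter decision only. It is not the kernel code: 
struct model_cache, struct model_failslab and model_reaches_should_fail() 
are hypothetical names made up for this illustration, the flag value is 
used only inside the model, and the parts elided with "..." above are 
ignored.

```c
#include <stdbool.h>

/* Name mirrors the kernel flag; the value is only meaningful in this model. */
#define SLAB_FAILSLAB 0x02000000u

/* Hypothetical stand-in for the parts of struct kmem_cache used here. */
struct model_cache {
	const char *name;
	unsigned int flags;
};

/* Hypothetical stand-in for the failslab state; only cache_filter is modelled. */
struct model_failslab {
	bool cache_filter;
};

/*
 * Returns true when the cache is a candidate for fault injection, i.e.
 * when the filter check quoted above would fall through to should_fail().
 * The probability/interval/times/space decisions made by should_fail()
 * are deliberately left out.
 */
static bool model_reaches_should_fail(const struct model_failslab *fs,
				      const struct model_cache *s)
{
	/* With the filter on, only caches marked SLAB_FAILSLAB pass. */
	if (fs->cache_filter && !(s->flags & SLAB_FAILSLAB))
		return false;

	/* With the filter off, every cache passes, marked or not. */
	return true;
}
```

In this model the flag only makes a difference when cache_filter is on: 
with the filter off, both a marked and an unmarked cache reach 
should_fail().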

But even for all caches it is kind of possible to test with size (space), 
which is in turn useful because you need to figure out how you handle 
failures from external caches - external to your code under test - and you 
don't want to keep track of all of them (the same goes for too many options 
on the command line).


> to prevent doing cache merging with <cache name>.

Or you can pass SLAB_FAILSLAB from your module when creating the cache 
to prevent merging while it is under test.


-- 
Regards,
Alexander Atanasov
