Message-ID: <20210729134939.iulryxjarhjmpugz@linutronix.de>
Date:   Thu, 29 Jul 2021 15:49:39 +0200
From:   Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To:     Vlastimil Babka <vbabka@...e.cz>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        Christoph Lameter <cl@...ux.com>,
        David Rientjes <rientjes@...gle.com>,
        Pekka Enberg <penberg@...nel.org>,
        Joonsoo Kim <iamjoonsoo.kim@....com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Jann Horn <jannh@...gle.com>
Subject: Re: [RFC v2 00/34] SLUB: reduce irq disabled scope and make it RT
 compatible

now that I'm slowly catching up…

On 2021-07-02 22:25:05 [+0200], Vlastimil Babka wrote:
> > - perf_5.10 stat -r 10 hackbench -g200 -s 4096 -l500
> > Old:
> > |         464.967,20 msec task-clock                #   27,220 CPUs utilized            ( +-  0,16% )
> > New:
> > |         422.865,71 msec task-clock                #    4,782 CPUs utilized            ( +-  0,34% )
> 
> The series shouldn't significantly change the memory allocator
> interaction, though.
> Seems there's less cycles, but more time elapsed, thus more sleeping -
> is it locks becoming mutexes on RT?

yes, most likely since the !RT parts are mostly unchanged.

> My second guess - list_lock remains spinlock with my series, thus RT
> mutex, but the current RT tree converts it to raw_spinlock. I'd hope
> leaving that one as non-raw spinlock would still be much better for RT
> goals, even if hackbench (which is AFAIK very slab intensive) throughput
> regresses - hopefully not that much.

Yes, the list_lock seems to be the cause. I picked your
slub-local-lock-v3r0, changed the list_lock (+slab_lock()) to use
raw_spinlock_t with interrupts disabled, and CPU utilisation went up to
~23 CPUs (plus a bunch of warnings which probably made it a little
slower again).
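
Roughly, the experiment boils down to something like this (illustrative
sketch only; the struct and macro names below are made up and this is
not the actual patch):

#include <linux/spinlock.h>

/* Minimal stand-in for the SLUB per-node structure, for illustration.
 * DEMO_RAW_LIST_LOCK is an invented switch. */
struct demo_node {
#ifdef DEMO_RAW_LIST_LOCK
	raw_spinlock_t list_lock;	/* always spins with IRQs off, also on RT */
#else
	spinlock_t list_lock;		/* becomes a sleeping rtmutex-based lock on RT */
#endif
};

static void demo_lock_node(struct demo_node *n)
{
	unsigned long flags;

#ifdef DEMO_RAW_LIST_LOCK
	raw_spin_lock_irqsave(&n->list_lock, flags);	/* hard IRQ-off section */
	raw_spin_unlock_irqrestore(&n->list_lock, flags);
#else
	spin_lock_irqsave(&n->list_lock, flags);	/* on RT: may sleep, IRQs stay on */
	spin_unlock_irqrestore(&n->list_lock, flags);
#endif
}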
The difference between a sleeping lock (spinlock_t) and a mutex is
that we attempt not to preempt a task that acquired a spinlock_t, even
if it has been running for some time and the scheduler would otherwise
preempt it (as it would if the task held a mutex). These are the "lazy
preempt" bits in the RT patch.
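
As a rough illustration (made-up example, not the actual RT patch
code):

#include <linux/mutex.h>
#include <linux/spinlock.h>

/* On PREEMPT_RT both locks can sleep, but the "lazy preempt" logic
 * defers an ordinary preemption request while a spinlock_t is held;
 * a mutex holder is preempted like any other task. */
static DEFINE_SPINLOCK(demo_spin);
static DEFINE_MUTEX(demo_mutex);

static void demo_spinlock_section(void)
{
	spin_lock(&demo_spin);
	/* the scheduler tries not to preempt us until spin_unlock() */
	spin_unlock(&demo_spin);
}

static void demo_mutex_section(void)
{
	mutex_lock(&demo_mutex);
	/* we can be preempted here at any time */
	mutex_unlock(&demo_mutex);
}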

By making the list_lock a raw_spinlock_t, a lot of IRQ-flags dancing
needs to be done because the page allocator must be entered with
interrupts enabled. And then there is the possibility that you may need
to free some memory even while allocating memory, which requires some
extra steps on RT due to the IRQ-off part. All of this vanishes by
keeping list_lock a spinlock_t.
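
A sketch of what that dance looks like with a raw list_lock
(hypothetical names, not the real SLUB code):

#include <linux/gfp.h>
#include <linux/spinlock.h>

/* Invented stand-in for the per-node structure, raw-lock variant. */
struct demo_node {
	raw_spinlock_t list_lock;
};

static struct page *demo_grow_slab(struct demo_node *n, gfp_t gfp)
{
	unsigned long flags;
	struct page *page;

	raw_spin_lock_irqsave(&n->list_lock, flags);
	/* ... try to satisfy the request from an existing partial slab ... */
	raw_spin_unlock_irqrestore(&n->list_lock, flags);

	/* the page allocator must be entered with interrupts enabled */
	page = alloc_pages(gfp, 0);

	raw_spin_lock_irqsave(&n->list_lock, flags);
	/* ... put the newly allocated slab on the node's partial list ... */
	raw_spin_unlock_irqrestore(&n->list_lock, flags);

	return page;
}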
The kernel-build test on /dev/shm remained unchanged, so that is good.
Unless there is a real-world use case that gets worse, I don't mind
keeping the spinlock_t here. I haven't seen tglx complaining so far.

Sebastian
