Date:   Wed, 8 Jun 2022 12:05:50 +0200
From:   Uladzislau Rezki <urezki@...il.com>
To:     Andrew Morton <akpm@...ux-foundation.org>
Cc:     linux-mm@...ck.org, LKML <linux-kernel@...r.kernel.org>,
        Christoph Hellwig <hch@...radead.org>,
        Matthew Wilcox <willy@...radead.org>,
        Nicholas Piggin <npiggin@...il.com>,
        Oleksiy Avramchenko <oleksiy.avramchenko@...y.com>
Subject: Re: [PATCH 0/5] Reduce a vmalloc internal lock contention preparation work

>
> I can toss it in for some runtime testing, but...
>
> What lock are we talking about here, what is the magnitude of the
> performance issues it is causing and what is the status of the patch
> which uses all this preparation?
>
1.
vmalloc still uses a global lock to access the global vmap space. As
for the magnitude, it depends on the number of CPUs: the more CPUs, the
higher the contention. The dependence is linear.

2.
I am not aware of any performance issues that I have run into on my
setup. On the other hand, there is a "Per cpu kva allocator" built on
top of vmalloc, see vm_map_ram() and vm_unmap_ram(). With a per-CPU
vmalloc we can get rid of it.

It is used by XFS, f2fs and some drivers. The reason is that vmalloc
is costly due to its internal global lock. That is why those users
go with the "Per cpu kva allocator" to accelerate their workloads.
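
To illustrate the difference between one global lock and per-CPU state,
here is a minimal userspace sketch. It is only a model: it is not kernel
code and is not taken from any of the patches. All names in it
(global_alloc, zone_alloc, NR_ZONES and so on) are hypothetical, and a
trivial bump allocator stands in for the real vmap space management.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
#define NR_ZONES   4
#define SPACE_SIZE (1UL << 20)
#define ZONE_SIZE  (SPACE_SIZE / NR_ZONES)
#define ALLOC_FAIL ((unsigned long)-1)

/* Variant A: one lock serializes every caller, whatever CPU it runs on. */
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long global_next;	/* next free offset in the space */

static unsigned long global_alloc(unsigned long size)
{
	unsigned long addr = ALLOC_FAIL;

	pthread_mutex_lock(&global_lock);
	if (size <= SPACE_SIZE - global_next) {
		addr = global_next;
		global_next += size;
	}
	pthread_mutex_unlock(&global_lock);
	return addr;
}

/* Variant B: the space is split into zones, each with its own lock. */
struct zone {
	pthread_mutex_t lock;
	unsigned long next;	/* next free offset inside this zone */
};

static struct zone zones[NR_ZONES] = {
	{ PTHREAD_MUTEX_INITIALIZER, 0 },
	{ PTHREAD_MUTEX_INITIALIZER, 0 },
	{ PTHREAD_MUTEX_INITIALIZER, 0 },
	{ PTHREAD_MUTEX_INITIALIZER, 0 },
};

/* Callers on different CPUs take different locks and do not contend. */
static unsigned long zone_alloc(unsigned int cpu, unsigned long size)
{
	unsigned int idx = cpu % NR_ZONES;
	struct zone *z = &zones[idx];
	unsigned long addr = ALLOC_FAIL;

	pthread_mutex_lock(&z->lock);
	assert(z->next <= ZONE_SIZE);
	if (size <= ZONE_SIZE - z->next) {
		addr = idx * ZONE_SIZE + z->next;
		z->next += size;
	}
	pthread_mutex_unlock(&z->lock);
	return addr;
}
```

In this model a request larger than ZONE_SIZE fails in variant B even
when the space as a whole is free, which is one example of the kind of
constraint a split address space brings with it.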

3.
My synthetic test shows a big difference between the per-CPU vmalloc
patches and the default variant. I have different prototypes based on
various ways of making it per-CPU. I still do not have a full solution
that satisfies all the needs, and I do not think one is possible due to
the many constraints.

4.
This series is not tied to the future per-cpu-vmalloc patches. Rather,
it makes the vmalloc code more generic, and with such common code it
would be easier to extend it to a per-CPU variant.

It means that if the per-CPU work is not merged, this series does not
need to be reverted.

That is the status.

-- 
Uladzislau Rezki
