lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 7 Mar 2022 18:04:43 +0100
From:   Michal Hocko <mhocko@...e.com>
To:     Suren Baghdasaryan <surenb@...gle.com>
Cc:     akpm@...ux-foundation.org, hannes@...xchg.org, pmladek@...e.com,
        peterz@...radead.org, guro@...com, shakeelb@...gle.com,
        minchan@...nel.org, timmurray@...gle.com, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, kernel-team@...roid.com
Subject: Re: [RFC 1/1] mm: page_alloc: replace mm_percpu_wq with kthreads in
 drain_all_pages

On Thu 24-02-22 17:28:19, Suren Baghdasaryan wrote:
> Sending as an RFC to confirm if this is the right direction and to
> clarify if other tasks currently executed on mm_percpu_wq should be
> also moved to kthreads. The patch seems stable in testing but I want
> to collect more performance data before submitting a non-RFC version.
> 
> 
> Currently drain_all_pages uses mm_percpu_wq to drain pages from pcp
> list during direct reclaim. The tasks on a workqueue can be delayed
> by other tasks in the workqueues using the same per-cpu worker pool.
> This results in sizable delays in drain_all_pages when cpus are highly
> contended.

This is not about cpus being highly contended. It is about too much work
on the WQ context.

> Memory management operations designed to relieve memory pressure should
> not be allowed to block by other tasks, especially if the task in direct
> reclaim has higher priority than the blocking tasks.

Agreed here.

> Replace the usage of mm_percpu_wq with per-cpu low priority FIFO
> kthreads to execute draining tasks.

This looks like a natural thing to do when WQ context is not suitable
but I am not sure the additional resources is really justified. Large
machines with a lot of cpus would create a lot of kernel threads. Can we
do better than that?

Would it be possible to have fewer workers (e.g. 1 or one per numa node)
and it would perform the work on a dedicated cpu by changing its
affinity? Or would that introduce an unacceptable overhead?

Or would it be possible to update the existing WQ code to use rescuer
well before the WQ is completely clogged?
-- 
Michal Hocko
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ