linux-kernel - Re: [PATCH] cgroup/cpuset: Avoid memory migration when nodemasks match

Message-ID: <b404f50a-6a35-92d5-1500-613296d0807f@redhat.com>
Date:   Wed, 25 Aug 2021 15:18:57 -0400
From:   Waiman Long <llong@...hat.com>
To:     Nicolas Saenz Julienne <nsaenzju@...hat.com>,
        cgroups@...r.kernel.org, linux-kernel@...r.kernel.org
Cc:     tj@...nel.org, lizefan.x@...edance.com, hannes@...xchg.org,
        mtosatti@...hat.com, nilal@...hat.com, frederic@...nel.org
Subject: Re: [PATCH] cgroup/cpuset: Avoid memory migration when nodemasks
 match

On 8/25/21 6:54 AM, Nicolas Saenz Julienne wrote:
> With the introduction of ee9707e8593d ("cgroup/cpuset: Enable memory
> migration for cpuset v2") attaching a process to a different cgroup will
> trigger a memory migration regardless of whether it's really needed.
> Memory migration is an expensive operation, so bypass it if the
> nodemasks passed to cpuset_migrate_mm() are equal.
>
> Note that we're not only avoiding the migration work itself, but also a
> call to lru_cache_disable(), which triggers and flushes an LRU drain
> work on every online CPU.
>
> Signed-off-by: Nicolas Saenz Julienne <nsaenzju@...hat.com>
>
> ---
>
> NOTE: This also alleviates hangs I stumbled upon while testing
> linux-next on systems with nohz_full CPUs (running latency sensitive
> loads). ee9707e8593d's newly imposed memory migration never finishes, as
> the LRU drain is never scheduled on isolated CPUs.
>
> I tried to follow the user-space call trace, it's something like this:
>
>    Create new tmux pane, which triggers hostname operation, hangs...
>      -> systemd (pid 1) creates new hostnamed process (using clone())
>        -> hostnamed process attaches itself to:
>    	 "system.slice/systemd-hostnamed.service/cgroup.procs"
>          -> hangs... Waiting for LRU drain to finish on nohz_full CPUs.
>
> As far as CPU isolation is concerned, this calls for better
> understanding of the underlying issues. For example, should LRU be made
> CPU isolation aware or should we deal with it at cgroup/cpuset level? In
> the meantime, I figured this small optimization is worthwhile on its
> own.
>
>   kernel/cgroup/cpuset.c | 5 +++++
>   1 file changed, 5 insertions(+)
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index 44d234b0df5e..d497a65c4f04 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -1634,6 +1634,11 @@ static void cpuset_migrate_mm(struct mm_struct *mm, const nodemask_t *from,
>   {
>   	struct cpuset_migrate_mm_work *mwork;
>   
> +	if (nodes_equal(*from, *to)) {
> +		mmput(mm);
> +		return;
> +	}
> +
>   	mwork = kzalloc(sizeof(*mwork), GFP_KERNEL);
>   	if (mwork) {
>   		mwork->mm = mm;

Thanks for the fix. So cpuset v1 with the memory_migrate flag set will
have the same problem then.

Acked-by: Waiman Long <longman@...hat.com>

Cheers,
Longman
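
For readers following the thread, the early-return idea in the quoted patch can be modelled in user space. This is only a minimal sketch under stated assumptions, not the kernel implementation: every `sketch_*` name, the 128-node limit, and the counter are hypothetical stand-ins for `nodemask_t`, `nodes_equal()` and `cpuset_migrate_mm()`. It shows the expensive path being skipped when the source and destination node sets are identical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical user-space stand-in for the kernel's nodemask_t. */
#define SKETCH_MAX_NODES 128
#define SKETCH_WORDS     (SKETCH_MAX_NODES / 64)

typedef struct {
	uint64_t bits[SKETCH_WORDS];
} sketch_nodemask_t;

/* Counts how often the (pretend) expensive migration path was taken. */
static int sketch_migrations_queued;

/* Mark one NUMA node as a member of the mask. */
static void sketch_node_set(sketch_nodemask_t *mask, unsigned int node)
{
	assert(node < SKETCH_MAX_NODES);
	mask->bits[node / 64] |= UINT64_C(1) << (node % 64);
}

/* Stand-in for nodes_equal(): true when both masks hold the same nodes. */
static bool sketch_nodes_equal(const sketch_nodemask_t *a,
			       const sketch_nodemask_t *b)
{
	return memcmp(a->bits, b->bits, sizeof(a->bits)) == 0;
}

/*
 * Stand-in for cpuset_migrate_mm(). Returns true if a migration was
 * queued and false if it was skipped. The quoted patch additionally
 * calls mmput(mm) before returning early; this sketch has no mm
 * reference to drop, so that step is omitted.
 */
static bool sketch_migrate_mm(const sketch_nodemask_t *from,
			      const sketch_nodemask_t *to)
{
	if (sketch_nodes_equal(from, to))
		return false;	/* same nodes: nothing to move */

	sketch_migrations_queued++;	/* placeholder for the real work */
	return true;
}
```

Calling `sketch_migrate_mm()` with two masks that hold the same nodes leaves the counter untouched; adding one extra node to the destination mask makes the next call take the expensive path.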