Message-ID: <CACSyD1Ny8OFxZkVPaskpnTDXgWZLBNK04GwjynT2a0ahUwKcAw@mail.gmail.com>
Date: Thu, 29 Aug 2024 22:30:09 +0800
From: Zhongkun He <hezhongkun.hzk@...edance.com>
To: Michal Hocko <mhocko@...e.com>
Cc: akpm@...ux-foundation.org, hannes@...xchg.org, roman.gushchin@...ux.dev, 
	shakeel.butt@...ux.dev, muchun.song@...ux.dev, lizefan.x@...edance.com, 
	linux-mm@...ck.org, linux-kernel@...r.kernel.org, cgroups@...r.kernel.org
Subject: Re: [External] Re: [RFC PATCH 0/2] Add disable_unmap_file arg to memory.reclaim

On Thu, Aug 29, 2024 at 9:36 PM Michal Hocko <mhocko@...e.com> wrote:
>
> On Thu 29-08-24 21:15:50, Zhongkun He wrote:
> > On Thu, Aug 29, 2024 at 7:51 PM Michal Hocko <mhocko@...e.com> wrote:
> [...]
> > > Is this some artificial workload or something real world?
> > >
> >
> > This is an artificial workload to show the detail of this case more
> > easily. But we have encountered this problem on our servers.
>
> This is always good to mention in the changelog. If you can observe this
> in real workloads it is good to get numbers from those because
> artificial workloads tend to overshoot the underlying problem and we can
> potentially miss the practical contributors to the problem.

That sounds reasonable. I will try it.

>
> Seeing this my main question is whether we should focus on swappiness
> behavior more than adding a very strange and very targeted reclaim
> mode. After all we have a mapped memory and executables protection in
> place. So in the end this is more about balance between anon vs. file
> LRUs.
>

I have a question about swappiness. If we set swappiness=0, we reclaim
only file pages, but there is no option for the opposite case, i.e. to
disable reclaim from file pages. That would be useful because there are
faster swap backends that need no IO, like zram and zswap. I wonder if
we can give this direction a try.
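
For reference, the existing knob can be exercised from userspace like this.
It is only a minimal sketch, assuming a cgroup v2 kernel whose memory.reclaim
accepts the nested "swappiness=" key (range 0..200). The helper names and the
cgroup directory passed in are hypothetical, not something from this series.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Build a memory.reclaim request such as "1048576 swappiness=0".
 * Returns the string length, or -1 on invalid input or a short buffer.
 * Assumption: swappiness is limited to 0..200, as for memory.swappiness.
 */
static int format_reclaim_request(char *buf, size_t len,
                                  unsigned long long bytes, int swappiness)
{
    if (swappiness < 0 || swappiness > 200)
        return -1;

    int n = snprintf(buf, len, "%llu swappiness=%d", bytes, swappiness);

    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/*
 * Write the request to <cgroup_dir>/memory.reclaim.
 * cgroup_dir is a caller-supplied (hypothetical) cgroup v2 directory.
 * Returns 0 on success, -1 on any error.
 */
static int request_reclaim(const char *cgroup_dir,
                           unsigned long long bytes, int swappiness)
{
    char path[4096];
    char req[64];

    if (format_reclaim_request(req, sizeof(req), bytes, swappiness) < 0)
        return -1;

    int n = snprintf(path, sizeof(path), "%s/memory.reclaim", cgroup_dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    FILE *f = fopen(path, "w");
    if (!f)
        return -1;

    int rc = (fputs(req, f) < 0) ? -1 : 0;

    /* The kernel reports reclaim failures when the write is flushed. */
    if (fclose(f) != 0)
        rc = -1;

    return rc;
}
```

With swappiness=0 such a request asks for file pages only. The string has no
counterpart that asks for anon pages only, which is the gap described above.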

> > If the performance of the disk is poor, like HDD, the situation will
> > become even worse.
>
> Doesn't that impact swapin/out as well? Or do you happen to have a
> faster storage for the swap?

Yes, we use ZRAM as the swap storage.

>
> > The delay of the task becomes more serious because reading data will
> > be slower.  Hot pages will thrash repeatedly between the memory and
> > the disk.
>
> Don't the refault stats and the IO cost aspect of reclaim already deal
> with this situation when balancing LRUs? Why doesn't it work in your
> case? Have you tried to investigate that?

OK, I'll try to reproduce the problem again. But IIUC, we cannot reclaim
pages from only one side (anon or file). Please see commit d483a5dd009
("mm: vmscan: limit the range of LRU type balancing") [1].

Unless this condition is met:
sc->file_is_tiny =
            file + free <= total_high_wmark &&
            !(sc->may_deactivate & DEACTIVATE_ANON) &&
            anon >> sc->priority;

[1]: https://lore.kernel.org/all/20200520232525.798933-15-hannes@cmpxchg.org/T/#u
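
To make the condition above easier to reason about, here it is as a
standalone userspace predicate. This is only a sketch: the parameters mirror
the names in the quoted snippet, all counts are assumed to be in pages, and
the DEACTIVATE_ANON value is a local stand-in rather than the kernel header
definition.

```c
#include <stdbool.h>

/* Local stand-in for the kernel flag; assumed value, for illustration only. */
#define DEACTIVATE_ANON 1u

/*
 * Userspace restatement of the quoted file_is_tiny expression.
 * All three clauses must hold:
 *   1. file + free pages do not exceed the summed high watermarks,
 *   2. anon deactivation has not been requested for this reclaim pass,
 *   3. there is still anon memory left to scan at this priority.
 */
static bool file_is_tiny(unsigned long file, unsigned long free_pages,
                         unsigned long anon, unsigned long total_high_wmark,
                         unsigned int may_deactivate, int priority)
{
    return file + free_pages <= total_high_wmark &&
           !(may_deactivate & DEACTIVATE_ANON) &&
           (anon >> priority) != 0;
}
```

For example, with file=100, free=50, total_high_wmark=200 and priority=12,
the predicate is true for anon=4096 (4096 >> 12 == 1) and false for anon=4095
(4095 >> 12 == 0), so a small anon LRU alone is enough to keep it off.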

> --
> Michal Hocko
> SUSE Labs
