Date:   Wed, 16 Aug 2023 15:08:23 +0800
From:   "Huang, Ying" <ying.huang@...el.com>
To:     Michal Hocko <mhocko@...e.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, Christoph Lameter <cl@...ux.com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Vlastimil Babka <vbabka@...e.cz>
Subject: Re: [PATCH] mm: fix draining remote pageset

Michal Hocko <mhocko@...e.com> writes:

> On Mon 14-08-23 09:59:51, Huang, Ying wrote:
>> Hi, Michal,
>> 
>> Michal Hocko <mhocko@...e.com> writes:
>> 
>> > On Fri 11-08-23 17:08:19, Huang Ying wrote:
>> >> If there is no memory allocation/freeing in the remote pageset after
>> >> some time (3 seconds for now), the remote pageset will be drained to
>> >> avoid memory wastage.
>> >> 
>> >> But in the current implementation, vmstat updater worker may not be
>> >> re-queued when we are waiting for the timeout (pcp->expire != 0) if
>> >> there are no vmstat changes, for example, when CPU goes idle.
>> >
>> > Why is that a problem?
>> 
>> The pages of the remote zone may be kept in the local per-CPU pageset
>> for a long time, as long as there is no page allocation/freeing on the
>> logical CPU.  Besides the logical CPU going idle, this is also
>> possible if the logical CPU is busy in user space.
>
> But why is this a problem? Is the scale of the problem sufficient to
> trigger out of memory situations or be otherwise harmful?

This may trigger premature page reclaim.  The pages held in the PCP for
the remote zone could otherwise have been freed back to satisfy page
allocations from that zone, avoiding reclaim.  It is quite possible that
the local CPU only allocates from or frees to the remote zone
temporarily.  So we should free the PCP pages of the remote zone if there
has been no page allocation/freeing from/to the remote zone for 3 seconds.
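
As a rough illustration of the timeout above, here is a small user-space
sketch.  It is not the kernel code: struct pcp_model, pcp_touch(),
pcp_tick(), simulate() and EXPIRE_TICKS are hypothetical names, and it
assumes the updater runs once per second.  It only models the idea that a
remote pageset is drained once its expire counter runs out, and that
nothing is drained while the updater is not re-queued.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a per-CPU pageset. */
struct pcp_model {
	int count;	/* pages currently cached in the pageset */
	int expire;	/* updater ticks left before draining; 0 = not armed */
	bool remote;	/* true if the pages belong to a remote zone */
};

#define EXPIRE_TICKS 3	/* assumption: one tick per second, 3 s timeout */

/* Allocation (delta < 0) or free (delta > 0) through the pageset. */
static void pcp_touch(struct pcp_model *pcp, int delta)
{
	assert(pcp->count + delta >= 0);
	pcp->count += delta;
	if (pcp->remote)
		pcp->expire = EXPIRE_TICKS;	/* activity re-arms the timer */
}

/* One pass of the periodic updater; returns the number of pages drained. */
static int pcp_tick(struct pcp_model *pcp)
{
	int drained;

	if (!pcp->expire || !pcp->count)
		return 0;		/* nothing armed or nothing cached */
	if (!pcp->remote) {
		pcp->expire = 0;	/* local pagesets are never expired */
		return 0;
	}
	if (--pcp->expire)
		return 0;		/* timeout not reached yet */

	drained = pcp->count;		/* timeout reached: drain everything */
	pcp->count = 0;
	return drained;
}

/*
 * Let `ticks` seconds pass.  If the updater is not re-queued (the case
 * discussed in this thread), pcp_tick() never runs and nothing is drained.
 */
static int simulate(struct pcp_model *pcp, int ticks, bool updater_requeued)
{
	int drained = 0;

	for (int i = 0; i < ticks; i++) {
		if (updater_requeued)
			drained += pcp_tick(pcp);
	}
	return drained;
}
```

In this model, a remote pageset holding 8 pages keeps all of them for any
number of ticks while the updater is not re-queued, and gives them back
after 3 ticks once it is.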

This will not trigger OOM, because all PCPs will be drained if an
allocation still fails after direct reclaim.

--
Best Regards,
Huang, Ying
