Date: Fri, 19 Jan 2024 20:59:22 +0800
From: Kefeng Wang <wangkefeng.wang@...wei.com>
To: Michal Hocko <mhocko@...e.com>
CC: Andrew Morton <akpm@...ux-foundation.org>, <linux-mm@...ck.org>,
	<linux-kernel@...r.kernel.org>, <ryan.roberts@....com>, Matthew Wilcox
	<willy@...radead.org>, David Hildenbrand <david@...hat.com>
Subject: Re: [PATCH v2] mm: memory: move mem_cgroup_charge() into
 alloc_anon_folio()



On 2024/1/19 16:00, Michal Hocko wrote:
> On Fri 19-01-24 10:05:15, Kefeng Wang wrote:
>>
>>
>> On 2024/1/18 23:59, Michal Hocko wrote:
>>> On Wed 17-01-24 18:39:54, Kefeng Wang wrote:
>>>> mem_cgroup_charge() uses the GFP flags in a fairly sophisticated way.
>>>> In addition to checking gfpflags_allow_blocking(), it pays attention
>>>> to __GFP_NORETRY and __GFP_RETRY_MAYFAIL to ensure that processes within
>>>> this memcg do not exceed their quotas. Using the same GFP flags ensures
>>>> that we handle large anonymous folios correctly, including falling back
>>>> to smaller orders when there is plenty of memory available in the system
>>>> but this memcg is close to its limits.
>>>
>>> The changelog is not really clear on the actual problem you are trying
>>> to fix. Is this a pure consistency fix, or have you actually seen any
>>> misbehavior? From the patch I suspect you are interested in THPs much
>>> more than regular order-0 pages because those are GFP_KERNEL like when
>>> it comes to charging. THPs have a variety of options on how aggressive
>>> the allocation should try. From that perspective NORETRY and
>>> RETRY_MAYFAIL are not all that interesting because costly allocations
>>> (which THPs are) already do imply MAYFAIL and NORETRY.
>>
>> I haven't hit an actual issue; this was found by code inspection.
>>
>> mTHP was introduced by Ryan (19eaf44954df "mm: thp: support allocation of
>> anonymous multi-size THP"), so alloc_anon_folio() now has a check for mTHP
>> similar to the one for PMD THP: it tries to allocate a large-order folio
>> below PMD_ORDER and falls back to an order-0 folio if that fails. Meanwhile,
>> it gets its GFP flags from vma_thp_gfp_mask() according to the user
>> configuration, as PMD THP allocation does, so
>>
>> 1) the memory charge failure check should be moved into the fallback
>> logic, so that we allocate the largest folio order possible even when
>> the memcg's memory usage is close to its limit.
>>
>> 2) the same GFP flags should be used for allocation and memcg charging.
>> First, this is consistent with PMD THP. In addition, given the GFP flags
>> returned by vma_thp_gfp_mask(), GFP_TRANSHUGE_LIGHT lets us skip direct
>> reclaim, and __GFP_NORETRY lets us skip mem_cgroup_oom(), so charging a
>> large-order folio won't kill any process.
> 
> OK, makes sense. Please turn that into the changelog.

Sure.
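
To illustrate points 1) and 2) quoted above, here is a rough userspace
sketch of the intended fallback behaviour. It is only a toy model, not the
patch itself: toy_memcg, toy_alloc(), toy_charge(), toy_alloc_anon_folio()
and TOY_MAX_ORDER are hypothetical names, the allocator is assumed to always
succeed (plenty of free memory in the system), and GFP flags are left out.

```c
#include <stdbool.h>

/* Toy userspace model. None of these names are kernel APIs. */
#define TOY_MAX_ORDER 8 /* assumed highest mTHP order, below PMD_ORDER */

struct toy_memcg {
	unsigned long usage; /* pages currently charged */
	unsigned long limit; /* hard limit, in pages */
};

/* Charge 2^order pages; fail if that would exceed the limit. */
static bool toy_charge(struct toy_memcg *cg, unsigned int order)
{
	unsigned long nr = 1UL << order;

	if (cg->usage + nr > cg->limit)
		return false;
	cg->usage += nr;
	return true;
}

/* Assumption: the system has plenty of memory, so allocation succeeds. */
static bool toy_alloc(unsigned int order)
{
	(void)order;
	return true;
}

/*
 * Try each enabled order from highest to lowest. A charge failure is
 * handled like an allocation failure: fall back to the next smaller
 * order, and finally to order 0.
 * Returns the order that was allocated and charged, or -1.
 */
static int toy_alloc_anon_folio(struct toy_memcg *cg, unsigned long orders)
{
	for (int order = TOY_MAX_ORDER; order > 0; order--) {
		if (!(orders & (1UL << order)))
			continue;
		if (!toy_alloc((unsigned int)order))
			continue;
		if (toy_charge(cg, (unsigned int)order))
			return order;
		/* Charge failed: the folio would be freed here, then fall back. */
	}

	if (toy_alloc(0) && toy_charge(cg, 0))
		return 0;
	return -1;
}
```

With the charge inside the loop, a memcg that is close to its limit still
gets the largest order that fits instead of dropping straight to order 0.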

> 
>>> GFP_TRANSHUGE_LIGHT is more interesting though because those do not dive
>>> into the direct reclaim at all. With the current code they will reclaim
>>> charges to free up the space for the allocated THP page and that defeats
>>> the light mode. I have a vague recollection of preparing a patch to
>>
>> We are interested in GFP_TRANSHUGE_LIGHT and __GFP_NORETRY, as mentioned
>> above.
> 
> If mTHP can be smaller than COSTLY_ORDER then you are correct and
> NORETRY makes a difference. Please mention that in the changelog as
> well.
> 

For the memory cgroup charge, __GFP_NORETRY is checked so that we skip
mem_cgroup_oom() directly. The __GFP_NORETRY check in try_charge_memcg()
does not depend on the folio order or COSTLY_ORDER, so I think NORETRY
should always make a difference for all large-order folios.
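
As a rough illustration of that claim only, here is a toy userspace model
of the decision. It is not the real try_charge_memcg(): TOY_GFP_NORETRY,
enum toy_result and toy_try_charge() are hypothetical stand-ins, and
reclaim and retries are collapsed into a single over_limit flag.

```c
#include <stdbool.h>

/* Stand-in for __GFP_NORETRY; the value is arbitrary in this model. */
#define TOY_GFP_NORETRY 0x1u

enum toy_result {
	TOY_CHARGED,     /* charge fit under the limit */
	TOY_NOMEM,       /* charge failed, OOM path skipped */
	TOY_OOM_INVOKED, /* charge failed, OOM path entered */
};

/*
 * Toy model of the charge decision. The order argument is accepted but
 * deliberately never consulted by the NORETRY check, which is the point
 * being made: the flag alone decides whether the OOM path is skipped.
 */
static enum toy_result toy_try_charge(unsigned int gfp, unsigned int order,
				      bool over_limit)
{
	(void)order;

	if (!over_limit)
		return TOY_CHARGED;
	if (gfp & TOY_GFP_NORETRY)
		return TOY_NOMEM;
	return TOY_OOM_INVOKED;
}
```

In this model an order-2 and an order-8 charge with TOY_GFP_NORETRY set
both return TOY_NOMEM when the memcg is over its limit.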

	
