Message-ID: <ZaosK59cRa27K9zW@tiehlicka>
Date: Fri, 19 Jan 2024 09:00:43 +0100
From: Michal Hocko <mhocko@...e.com>
To: Kefeng Wang <wangkefeng.wang@...wei.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, ryan.roberts@....com,
	Matthew Wilcox <willy@...radead.org>,
	David Hildenbrand <david@...hat.com>
Subject: Re: [PATCH v2] mm: memory: move mem_cgroup_charge() into
 alloc_anon_folio()

On Fri 19-01-24 10:05:15, Kefeng Wang wrote:
> 
> 
> On 2024/1/18 23:59, Michal Hocko wrote:
> > On Wed 17-01-24 18:39:54, Kefeng Wang wrote:
> > > mem_cgroup_charge() uses the GFP flags in a fairly sophisticated way.
> > > In addition to checking gfpflags_allow_blocking(), it pays attention
> > > to __GFP_NORETRY and __GFP_RETRY_MAYFAIL to ensure that processes within
> > > this memcg do not exceed their quotas. Using the same GFP flags ensures
> > > that we handle large anonymous folios correctly, including falling back
> > > to smaller orders when there is plenty of memory available in the system
> > > but this memcg is close to its limits.
> > 
> > The changelog is not really clear about the actual problem you are trying
> > to fix. Is this a pure consistency fix, or have you actually seen any
> > misbehavior? From the patch I suspect you are interested in THPs much
> > more than regular order-0 pages, because the latter are GFP_KERNEL-like
> > when it comes to charging. THPs have a variety of options for how
> > aggressively the allocation should try. From that perspective NORETRY and
> > RETRY_MAYFAIL are not all that interesting, because costly allocations
> > (which THPs are) already imply MAYFAIL and NORETRY.
> 
> I haven't hit an actual issue; this was found by code inspection.
> 
> mTHP was introduced by Ryan (19eaf44954df "mm: thp: support allocation of
> anonymous multi-size THP"), so alloc_anon_folio() has a check for mTHP
> similar to the one for PMD THP: it tries to allocate a large-order folio
> below PMD_ORDER and falls back to an order-0 folio if that fails. Meanwhile,
> it gets its GFP flags from vma_thp_gfp_mask() according to user
> configuration, like PMD THP allocation, so:
> 
> 1) The memory charge failure check should be moved into the fallback
> logic, because that lets us allocate the largest folio order possible
> even when the memcg's memory usage is close to its limit.
> 
> 2) Use the same GFP flags for allocation and memcg charging. First, this
> is consistent with PMD THP. In addition, depending on the GFP flags
> returned by vma_thp_gfp_mask(), GFP_TRANSHUGE_LIGHT lets us skip direct
> reclaim, and __GFP_NORETRY lets us skip mem_cgroup_oom(), so we won't kill
> any process because of large-order folio charging.

OK, makes sense. Please turn that into the changelog.
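
The fallback described in point 1 can be sketched as a small userspace
model. This is only an illustration, not the patch itself: `memcg_model`,
`model_try_charge()` and `model_alloc_anon_folio()` are hypothetical names,
page allocation is assumed to always succeed (plenty of free memory in the
system), and the charge is assumed to fail immediately instead of
reclaiming or invoking the OOM killer. Order 0 is treated like any other
order here, which is a simplification.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a memory cgroup; not a kernel structure. */
struct memcg_model {
	long usage;	/* pages currently charged */
	long limit;	/* limit in pages */
};

/*
 * Models a charge that neither reclaims nor OOM-kills: it simply fails
 * when the folio would push the group over its limit.
 */
static bool model_try_charge(struct memcg_model *cg, int order)
{
	long nr_pages = 1L << order;

	if (cg->usage + nr_pages > cg->limit)
		return false;
	cg->usage += nr_pages;
	return true;
}

/*
 * Try orders from highest to lowest. A charge failure is handled like an
 * allocation failure: drop to the next smaller order instead of giving up
 * on large folios altogether. Returns the charged order, or -1.
 */
static int model_alloc_anon_folio(struct memcg_model *cg, int highest_order)
{
	assert(highest_order >= 0 && highest_order < 31);

	for (int order = highest_order; order >= 0; order--) {
		/* Allocation itself is assumed to succeed in this model. */
		if (model_try_charge(cg, order))
			return order;
		/* Charge failed: the folio would be freed, try a smaller one. */
	}
	return -1;
}
```

With a group at 1016 of 1024 pages, a request starting at order 4 ends up
charged at order 3 in this model, because the charge failure is part of
the per-order fallback loop.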

> > GFP_TRANSHUGE_LIGHT is more interesting though because those do not dive
> > into the direct reclaim at all. With the current code they will reclaim
> > charges to free up the space for the allocated THP page and that defeats
> > the light mode. I have a vague recollection of preparing a patch to
> 
> We are interested in GFP_TRANSHUGE_LIGHT and __GFP_NORETRY, as mentioned
> above.

If mTHP can be smaller than COSTLY_ORDER, then you are correct and
NORETRY makes a difference. Please mention that in the changelog as
well.
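
A rough sketch of the order check behind that remark. It assumes the
kernel's PAGE_ALLOC_COSTLY_ORDER value of 3 and treats any order above it
as costly; `MODEL_PMD_ORDER`, `order_is_costly()` and
`noretry_flag_matters()` are hypothetical names, and the PMD order of 9 is
an assumption (4K base pages with 2M PMDs).

```c
#include <stdbool.h>

#define PAGE_ALLOC_COSTLY_ORDER	3	/* value used by the Linux kernel */
#define MODEL_PMD_ORDER		9	/* assumption: 4K pages, 2M PMD */

/* Orders above PAGE_ALLOC_COSTLY_ORDER count as costly. */
static bool order_is_costly(int order)
{
	return order > PAGE_ALLOC_COSTLY_ORDER;
}

/*
 * Costly allocations already behave as if MAYFAIL/NORETRY were set, so an
 * explicit __GFP_NORETRY only changes behavior for non-costly orders.
 */
static bool noretry_flag_matters(int order)
{
	return !order_is_costly(order);
}
```

In this model a PMD-order THP is costly and gains nothing from an explicit
NORETRY, while a large folio of order 2 or 3 is not costly, so the flag
does change how its charge is handled.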

Thanks!
-- 
Michal Hocko
SUSE Labs
