Message-ID: <ZyJNizBQ-h4feuJe@tiehlicka>
Date: Wed, 30 Oct 2024 16:15:23 +0100
From: Michal Hocko <mhocko@...e.com>
To: Gutierrez Asier <gutierrez.asier@...wei-partners.com>
Cc: akpm@...ux-foundation.org, david@...hat.com, ryan.roberts@....com,
baohua@...nel.org, willy@...radead.org, peterx@...hat.com,
hannes@...xchg.org, hocko@...nel.org, roman.gushchin@...ux.dev,
shakeel.butt@...ux.dev, muchun.song@...ux.dev,
cgroups@...r.kernel.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, stepanov.anatoly@...wei.com,
alexander.kozhevnikov@...wei-partners.com, guohanjun@...wei.com,
weiyongjun1@...wei.com, wangkefeng.wang@...wei.com,
judy.chenhui@...wei.com, yusongping@...wei.com,
artem.kuzin@...wei.com, kang.sun@...wei.com
Subject: Re: [RFC PATCH 0/3] Cgroup-based THP control

On Wed 30-10-24 17:58:04, Gutierrez Asier wrote:
>
>
> On 10/30/2024 4:27 PM, Michal Hocko wrote:
> > On Wed 30-10-24 15:51:00, Gutierrez Asier wrote:
> >>
> >>
> >> On 10/30/2024 11:38 AM, Michal Hocko wrote:
> >>> On Wed 30-10-24 16:33:08, gutierrez.asier@...wei-partners.com wrote:
> >>>> From: Asier Gutierrez <gutierrez.asier@...wei-partners.com>
> >>>>
> >>>> Currently THP modes are set globally. It can be an overkill if only some
> >>>> specific app/set of apps need to get benefits from THP usage. Moreover, various
> >>>> apps might need different THP settings. Here we propose a cgroup-based THP
> >>>> control mechanism.
> >>>>
> >>>> THP interface is added to memory cgroup subsystem. Existing global THP control
> >>>> semantics is supported for backward compatibility. When THP modes are set
> >>>> globally all the changes are propagated to memory cgroups. However, when a
> >>>> particular cgroup changes its THP policy, the global THP policy in sysfs remains
> >>>> the same.
> >>>
> >>> Do you have any specific examples where this would be beneficial?
> >>
> >> Now we're mostly focused on database scenarios (MySQL, Redis).
> >
> > That seems to be more process than workload oriented. Why doesn't the
> > existing per-process tuning work?
> >
> > [...]
>
> 1st Point
>
> We're trying to provide a transparent mechanism, but all the existing per-process
> methods require modifying the app itself (MADV_HUGEPAGE, MADV_COLLAPSE, hugetlbfs)

There is also prctl to define a per-process policy. We currently have
means to disable THP for the process to override the default behavior.
That would be mostly transparent for the application.
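
To make the prctl route concrete, here is a minimal sketch, assuming the
PR_SET_THP_DISABLE / PR_GET_THP_DISABLE operations documented in prctl(2).
The helper names (thp_disable_self, thp_is_disabled, exec_without_thp) are
hypothetical and only for illustration. The flag is inherited across fork
and preserved across execve, so a small launcher can set it without the
target application being modified:

```c
#include <stdio.h>
#include <sys/prctl.h>
#include <unistd.h>

/* Set the per-process "THP disabled" flag for the calling process.
 * Returns 0 on success, -1 on error (errno is set by prctl). */
int thp_disable_self(void)
{
    return prctl(PR_SET_THP_DISABLE, 1, 0, 0, 0);
}

/* Query the flag: returns 1 if set, 0 if not set, -1 on error. */
int thp_is_disabled(void)
{
    return prctl(PR_GET_THP_DISABLE, 0, 0, 0, 0);
}

/* Hypothetical launcher helper: disable THP for this process, then
 * replace it with the program named in argv[0]. The flag survives
 * execve, so the new program starts with THP disabled.
 * Only returns (with -1) if something failed. */
int exec_without_thp(char *const argv[])
{
    if (thp_disable_self() != 0) {
        perror("prctl(PR_SET_THP_DISABLE)");
        return -1;
    }
    execvp(argv[0], argv);
    perror("execvp"); /* reached only when execvp fails */
    return -1;
}
```

A wrapper whose main() calls exec_without_thp(argv + 1) could then be put
in front of the application's usual command line. Note this only covers the
"disable" direction mentioned above.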

You have not really answered a more fundamental question though. Why
should the THP behavior be at the cgroup scope? From a practical POV that
would represent containers, which are a mixed bag of applications to
support the workload. Why should the same THP policy apply to all of them?
Doesn't this just reproduce the sub-optimal global behavior at the cgroup
level, where some parts will benefit while others will not?

> Moreover, we're using file-backed THPs too (mostly for .text), which makes it
> even more complicated for user-space developers.
>
> >>>> Child cgroups inherit THP settings from parent cgroup upon creation. Particular
> >>>> cgroup mode changes aren't propagated to child cgroups.
> >>>
> >>> So this breaks hierarchical property, doesn't it? In other words if a
> >>> parent cgroup would like to enforce a certain policy to all descendants
> >>> then this is not really possible.
> >>
> >> The first idea was to have some flexibility when changing THP policies.
> >>
> >> I will submit a new patch set which will enforce the cgroup hierarchy and change all
> >> the children recursively.
> >
> > What is the expected semantics then?
>
> 2nd point (on semantics)
> 1. Children inherit the THP policy upon creation
> 2. Parent's policy changes are propagated to all the children
> 3. Children can set the policy independently

So if the parent decides that none of the children should be using THP,
the children can override that, so the tuning at the parent has no
imperative control. This breaks the hierarchical property that is expected
from cgroup control files.
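
The difference can be shown with a toy model. This is only a sketch, not
kernel code: struct cg, enum thp_mode and both functions are hypothetical,
and "hierarchical" is taken here to mean that an ancestor's more restrictive
setting caps its descendants, which is one possible reading that the thread
does not spell out.

```c
#include <stddef.h>

/* Illustrative THP modes, ordered from most to least restrictive. */
enum thp_mode { THP_NEVER = 0, THP_MADVISE = 1, THP_ALWAYS = 2 };

/* Toy cgroup node: a link to the parent and the locally written mode. */
struct cg {
    const struct cg *parent;
    enum thp_mode mode;
};

/* Semantics as proposed in point 3: the child's own setting wins,
 * whatever the ancestors say. */
enum thp_mode effective_proposed(const struct cg *c)
{
    return c->mode;
}

/* Assumed hierarchical semantics: the effective mode is the most
 * restrictive one found on the path from the cgroup up to the root. */
enum thp_mode effective_hierarchical(const struct cg *c)
{
    enum thp_mode eff = c->mode;

    for (const struct cg *p = c->parent; p != NULL; p = p->parent) {
        if (p->mode < eff)
            eff = p->mode;
    }
    return eff;
}
```

With a parent set to THP_NEVER and a child that writes THP_ALWAYS,
effective_proposed() yields THP_ALWAYS for the child, while
effective_hierarchical() yields THP_NEVER.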
--
Michal Hocko
SUSE Labs