linux-kernel - Re: [PATCH v2 1/3] mm: Fix dropped memcg from mem cgroup soft limit tree


Message-ID: <YC68QRVsCONXscCl@dhcp22.suse.cz>
Date:   Thu, 18 Feb 2021 20:13:05 +0100
From:   Michal Hocko <mhocko@...e.com>
To:     Tim Chen <tim.c.chen@...ux.intel.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        Dave Hansen <dave.hansen@...el.com>,
        Ying Huang <ying.huang@...el.com>, linux-mm@...ck.org,
        cgroups@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/3] mm: Fix dropped memcg from mem cgroup soft limit
 tree

On Thu 18-02-21 10:30:20, Tim Chen wrote:
> 
> 
> On 2/18/21 12:24 AM, Michal Hocko wrote:
> 
> > 
> > I have already acked this patch in the previous version along with Fixes
> > tag. It seems that my review feedback has been completely ignored also
> > for other patches in this series.
> 
> Michal,
> 
> My apologies.  Our mail system malfunctioned and dropped some messages,
> so I completely missed your mail.
> I only saw it now, after looking at lore.kernel.org.

I see. My apologies for suspecting you of ignoring my review.
 
> Responding to your comment:
> 
> >Have you observed this happening in real life? I do agree that the
> >threshold based updates of the tree are not ideal but the whole soft
> >reclaim code is far from optimal. So why do we care only now? The
> >feature is essentially dead and fine tuning it sounds like a step back
> >to me.
> 
> Yes, I did see the issue mentioned in patch 2 breaking soft limit
> reclaim for cgroup v1.  Some of our customers are still using
> cgroup v1, so we would like to fix this if possible.

It would be great to see more details.

> For patch 3 regarding the uncharge_batch, it
> is more of an observation that we should uncharge in batches per node,
> and it was not prompted by an actual workload.
> Thinking more about this, the worst that could happen
> is that some entries in the soft limit tree overestimate
> the memory used, which would lead to an unnecessary soft page reclaim
> on that cgroup.  The overhead from extra memcg event updates could
> be more than a soft page reclaim pass.  So let's drop patch 3
> for now.

I would still prefer to handle that in the soft limit reclaim path and
check each memcg for the soft limit reclaim excess before the reclaim.
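
To illustrate the idea, here is a minimal userspace sketch, not kernel
code and not taken from any patch in this series. The names
memcg_model, soft_limit_excess_model() and pick_reclaim_targets() are
hypothetical stand-ins. It assumes the excess is simply usage minus soft
limit, recomputed at reclaim time, so that a stale tree entry which
overestimates usage is skipped rather than reclaimed from.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a memcg; not struct mem_cgroup. */
struct memcg_model {
	unsigned long usage_pages;      /* current charged pages */
	unsigned long soft_limit_pages; /* configured soft limit */
};

/* Pages above the soft limit right now, or 0 if at or below it. */
static unsigned long soft_limit_excess_model(const struct memcg_model *m)
{
	if (m->usage_pages > m->soft_limit_pages)
		return m->usage_pages - m->soft_limit_pages;
	return 0;
}

/*
 * Walk the candidates taken from the (possibly stale) soft limit tree
 * and keep only those that still exceed their soft limit when checked.
 * Returns the number of entries written to out[].
 */
static size_t pick_reclaim_targets(const struct memcg_model *tree, size_t n,
				   const struct memcg_model **out)
{
	size_t picked = 0;

	assert(tree != NULL && out != NULL);

	for (size_t i = 0; i < n; i++) {
		/* Stale entry: no excess any more, so skip the reclaim. */
		if (soft_limit_excess_model(&tree[i]) == 0)
			continue;
		out[picked++] = &tree[i];
	}
	return picked;
}
```

With such a check an overestimated entry costs one comparison at
reclaim time instead of a full soft reclaim pass on that cgroup.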
 
> Let me know if you would like me to resend patch 1 with the Fixes tag
> for commit 4e41695356fb ("memory controller: soft limit reclaim on contention")
> and if there are any changes I should make for patch 2.

I will ack and suggest Fixes.

> 
> Thanks.
> 
> Tim

-- 
Michal Hocko
SUSE Labs