Message-ID: <20231130194912.GB543908@cmpxchg.org>
Date: Thu, 30 Nov 2023 14:49:12 -0500
From: Johannes Weiner <hannes@...xchg.org>
To: Shakeel Butt <shakeelb@...gle.com>
Cc: Dan Schatzberg <schatzberg.dan@...il.com>,
Roman Gushchin <roman.gushchin@...ux.dev>,
Yosry Ahmed <yosryahmed@...gle.com>, Huan Yang <link@...o.com>,
linux-kernel@...r.kernel.org, cgroups@...r.kernel.org,
linux-mm@...ck.org, Michal Hocko <mhocko@...nel.org>,
Muchun Song <muchun.song@...ux.dev>,
Andrew Morton <akpm@...ux-foundation.org>,
David Hildenbrand <david@...hat.com>,
Matthew Wilcox <willy@...radead.org>,
Huang Ying <ying.huang@...el.com>,
Kefeng Wang <wangkefeng.wang@...wei.com>,
Peter Xu <peterx@...hat.com>,
"Vishal Moola (Oracle)" <vishal.moola@...il.com>,
Yue Zhao <findns94@...il.com>, Hugh Dickins <hughd@...gle.com>
Subject: Re: [PATCH 0/1] Add swappiness argument to memory.reclaim

On Thu, Nov 30, 2023 at 06:44:24PM +0000, Shakeel Butt wrote:
> On Thu, Nov 30, 2023 at 07:36:53AM -0800, Dan Schatzberg wrote:
> > (Sorry for the resend - forgot to cc the mailing lists)
> >
> > This patch proposes augmenting the memory.reclaim interface with a
> > swappiness=<val> argument that overrides the swappiness value for that instance
> > of proactive reclaim.
> >
> > Userspace proactive reclaimers use the memory.reclaim interface to trigger
> > reclaim. The memory.reclaim interface does not allow for any way to affect the
> > balance of file vs anon during proactive reclaim. The only approach is to adjust
> > the vm.swappiness setting. However, there are a few reasons we look to control
> > the balance of file vs anon during proactive reclaim, separately from reactive
> > reclaim:
> >
> > * Swapout should be limited to manage SSD write endurance. In near-OOM
>
> Is this about swapout to SSD only?
>
> > situations we are fine with lots of swap-out to avoid OOMs. As these are
> > typically rare events, they have relatively little impact on write endurance.
> > However, proactive reclaim runs continuously and so its impact on SSD write
> > endurance is more significant. Therefore it is desirable to control swap-out
> > for proactive reclaim separately from reactive reclaim
>
> This is understandable but swapout to zswap should be fine, right?
> (Sorry I am not following the discussion on zswap patches from Nhat. Is
> the answer there?)

Memory compression alone would be fine, yes.

However, we don't use zswap in all cgroups. Lower priority things are
forced directly to disk. Some workloads compress poorly and also go
directly to disk for better memory efficiency. On such cgroups, it's
important for proactive reclaim to manage swap rates to avoid burning
out the flash.

Note that zswap also does SSD writes during writeback. I know this
doesn't apply to Google because of the ghost files, but we have SSD
swapfiles behind zswap. And this part will become more relevant with
Nhat's enhanced writeback patches.
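
To make the per-cgroup usage concrete, here is a minimal sketch of how a
userspace proactive reclaimer might issue a request with the proposed
argument. It assumes the argument is written after the byte count,
separated by a space ("<bytes> swappiness=<val>"), and that the accepted
range matches vm.swappiness (0..200). The function names and the example
cgroup path are hypothetical, not taken from the patch.

```python
import os
from typing import Optional


def reclaim_request(nbytes: int, swappiness: Optional[int] = None) -> str:
    """Build the string written to memory.reclaim.

    Assumption: the proposed argument follows the size, separated by a
    space. The 0..200 bound mirrors vm.swappiness and is also an assumption.
    """
    if nbytes <= 0:
        raise ValueError("nbytes must be positive")
    if swappiness is None:
        # No override: reclaim uses the regular swappiness setting.
        return str(nbytes)
    if not 0 <= swappiness <= 200:
        raise ValueError("swappiness must be in 0..200")
    return f"{nbytes} swappiness={swappiness}"


def proactive_reclaim(cgroup_dir: str, nbytes: int,
                      swappiness: Optional[int] = None) -> None:
    """Write one reclaim request to <cgroup_dir>/memory.reclaim.

    On a real cgroup2 mount the write can fail with OSError, for example
    EAGAIN when fewer bytes were reclaimed than requested.
    """
    path = os.path.join(cgroup_dir, "memory.reclaim")
    with open(path, "w") as f:
        f.write(reclaim_request(nbytes, swappiness))
```

A reclaimer could then use a low value for cgroups that swap straight to
SSD and a higher one where zswap absorbs the swap-out, e.g.
proactive_reclaim("/sys/fs/cgroup/lowprio", 64 << 20, swappiness=10),
where the path is only an example.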