[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200414163658.GB136578@cmpxchg.org>
Date: Tue, 14 Apr 2020 12:36:58 -0400
From: Johannes Weiner <hannes@...xchg.org>
To: Alex Shi <alex.shi@...ux.alibaba.com>
Cc: cgroups@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, akpm@...ux-foundation.org,
mgorman@...hsingularity.net, tj@...nel.org, hughd@...gle.com,
khlebnikov@...dex-team.ru, daniel.m.jordan@...cle.com,
yang.shi@...ux.alibaba.com, willy@...radead.org,
shakeelb@...gle.com, Michal Hocko <mhocko@...nel.org>,
Vladimir Davydov <vdavydov.dev@...il.com>,
Roman Gushchin <guro@...com>,
Chris Down <chris@...isdown.name>,
Thomas Gleixner <tglx@...utronix.de>,
Vlastimil Babka <vbabka@...e.cz>, Qian Cai <cai@....pw>,
Andrey Ryabinin <aryabinin@...tuozzo.com>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Jérôme Glisse <jglisse@...hat.com>,
Andrea Arcangeli <aarcange@...hat.com>,
David Rientjes <rientjes@...gle.com>,
"Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com>,
swkhack <swkhack@...il.com>,
"Potyra, Stefan" <Stefan.Potyra@...ktrobit.com>,
Mike Rapoport <rppt@...ux.vnet.ibm.com>,
Stephen Rothwell <sfr@...b.auug.org.au>,
Colin Ian King <colin.king@...onical.com>,
Jason Gunthorpe <jgg@...pe.ca>,
Mauro Carvalho Chehab <mchehab+samsung@...nel.org>,
Peng Fan <peng.fan@....com>,
Nikolay Borisov <nborisov@...e.com>,
Ira Weiny <ira.weiny@...el.com>,
Kirill Tkhai <ktkhai@...tuozzo.com>,
Yafang Shao <laoar.shao@...il.com>,
Wei Yang <richard.weiyang@...ux.alibaba.com>
Subject: Re: [PATCH v8 03/10] mm/lru: replace pgdat lru_lock with lruvec lock
On Tue, Apr 14, 2020 at 04:19:01PM +0800, Alex Shi wrote:
>
>
> 在 2020/4/14 上午2:07, Johannes Weiner 写道:
> > But isolation actually needs to lock out charging, or it would operate
> > on the wrong list:
> >
> > isolation: commit_charge:
> > if (TestClearPageLRU(page))
> > page->mem_cgroup = new
> > // page is still physically on
> > // the root_mem_cgroup's LRU. We're
> > // updating the wrong list:
> > memcg = page->mem_cgroup
> > spin_lock(memcg->lru_lock)
> > del_page_from_lru_list(page, memcg)
> > spin_unlock(memcg->lru_lock)
> >
> > lrucare really is a mess. Even before this patch series, it makes
> > things tricky and subtle and error prone.
> >
> > The only reason we're doing it is for when there is swapping without
> > swap tracking, in which case swap reahadead needs to put pages on the
> > LRU but cannot charge them until we have a faulting vma later.
> >
> > But it's not clear how practical such a configuration is. Both memory
> > and swap are shared resources, and isolation isn't really effective
> > when you restrict access to memory but then let workloads swap freely.
> >
> > Plus, the overhead of tracking is tiny - 512k per G of swap (0.04%).
> >
> > Maybe we should just delete MEMCG_SWAP and unconditionally track swap
> > entry ownership when the memory controller is enabled. I don't see a
> > good reason not to, and it would simplify the entire swapin path, the
> > LRU locking, and the page->mem_cgroup stabilization rules.
>
> Hi Johannes,
>
> I think what you mean here is to keep swap_cgroup id even it was swaped,
> then we read back the page from swap disk, we don't need to charge it.
> So all other memcg charge are just happens on non lru list, thus we have
> no isolation required in above awkward scenario.
We don't need to change how swap recording works, we just need to
always do it when CONFIG_MEMCG && CONFIG_SWAP.
We can uncharge the page once it's swapped out. The only difference is
that with a swap record, we know who owned the page and can charge
readahead pages right awya, before setting PageLRU; whereas without a
record, we read pages onto the LRU, and then wait until we hit a page
fault with an mm to charge. That's why we have this lrucare mess.
Powered by blists - more mailing lists