Message-ID: <20180416130936.GC26022@bombadil.infradead.org>
Date: Mon, 16 Apr 2018 06:09:36 -0700
From: Matthew Wilcox <willy@...radead.org>
To: Vlastimil Babka <vbabka@...e.cz>
Cc: Michal Hocko <mhocko@...nel.org>,
Johannes Weiner <hannes@...xchg.org>,
Minchan Kim <minchan@...nel.org>, Roman Gushchin <guro@...com>,
linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
Alexander Viro <viro@...iv.linux.org.uk>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
kernel-team@...com
Subject: Re: [PATCH 3/3] dcache: account external names as indirectly reclaimable memory

On Mon, Apr 16, 2018 at 02:06:21PM +0200, Vlastimil Babka wrote:
> On 04/16/2018 01:41 PM, Michal Hocko wrote:
> > On Fri 13-04-18 10:37:16, Johannes Weiner wrote:
> >> On Fri, Apr 13, 2018 at 04:28:21PM +0200, Michal Hocko wrote:
> >>> On Fri 13-04-18 16:20:00, Vlastimil Babka wrote:
> >>>> We would need kmalloc-reclaimable-X variants. It could be worth it,
> >>>> especially if we find more similar usages. I suspect they would be more
> >>>> useful than the existing dma-kmalloc-X :)
> >>>
> >>> I am still not sure why __GFP_RECLAIMABLE cannot be made to work as
> >>> expected and account slab pages as SLAB_RECLAIMABLE
> >>
> >> Can you outline how this would work without separate caches?
> >
> > I thought that the cache would only maintain two sets of slab pages
> > depending on the allocation requests. I am pretty sure there will be
> > other details to iron out and
>
> For example the percpu (and other) array caches...
>
> > maybe it will turn out that such a large
> > portion of the cache would need to duplicate the state that a
> > completely new cache would be more reasonable.
>
> I'm afraid that's the case, yes.

I'm not sure it'll be so bad, at least for SLUB ... I think everything
we need to duplicate is already percpu, and if we combine GFP_DMA
and GFP_RECLAIMABLE into this, we might even get more savings. Also,
we only need to do this for the kmalloc slabs; currently 13 of them.
So we eliminate 13 caches and in return allocate 13 * 2 * NR_CPU pointers.
That'll be a win on some machines and a loss on others, but the machines
where it's consuming more memory should have more memory to begin with,
so I'd count it as a win.
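
To put a rough number on the "13 * 2 * NR_CPU pointers" side of that trade,
here is a small userspace sketch. It is only an illustration of the
arithmetic: the function and macro names are made up for this example, and
reading the factor of 2 as "two additional memory types per kmalloc cache"
is an assumption. It does not estimate the size of the caches that would be
eliminated, since that depends on the configuration.

```c
#include <stddef.h>

/* Number of kmalloc size classes, as stated in the message. */
#define SKETCH_NR_KMALLOC_CACHES 13
/* Assumed meaning of the factor 2: extra per-CPU slots for DMA and reclaimable. */
#define SKETCH_EXTRA_TYPES 2

/*
 * Hypothetical helper: bytes spent on the additional per-CPU pointers
 * for a machine with nr_cpus CPUs.
 */
static size_t sketch_extra_pointer_bytes(size_t nr_cpus)
{
	return (size_t)SKETCH_NR_KMALLOC_CACHES * SKETCH_EXTRA_TYPES *
	       nr_cpus * sizeof(void *);
}
```

With 8-byte pointers and 64 CPUs this comes to 1664 pointers, or 13312
bytes, so the cost grows linearly with the CPU count.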

The node partial list probably wants to be trebled in size to have one
list per memory type. But I think the allocation path only changes
like this:

@@ -2663,10 +2663,13 @@ static __always_inline void *slab_alloc_node(struct kmem_cache *s,
        struct kmem_cache_cpu *c;
        struct page *page;
        unsigned long tid;
+       unsigned int offset = 0;

        s = slab_pre_alloc_hook(s, gfpflags);
        if (!s)
                return NULL;
+       if (s->flags & SLAB_KMALLOC)
+               offset = flags_to_slab_id(gfpflags);
 redo:
        /*
         * Must read kmem_cache cpu data via this cpu ptr. Preemption is
@@ -2679,8 +2682,8 @@ static __always_inline void *slab_alloc_node(struct kmem_cache *s,
         * to check if it is matched or not.
         */
        do {
-               tid = this_cpu_read(s->cpu_slab->tid);
-               c = raw_cpu_ptr(s->cpu_slab);
+               tid = this_cpu_read((&s->cpu_slab[offset])->tid);
+               c = raw_cpu_ptr(&s->cpu_slab[offset]);
        } while (IS_ENABLED(CONFIG_PREEMPT) &&
                 unlikely(tid != READ_ONCE(c->tid)));
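
The diff above calls flags_to_slab_id() without defining it. One way it
might look, as a userspace sketch only: the flag bit values below are
placeholders rather than the kernel's real GFP bits, every name is prefixed
with sketch_ to mark it as hypothetical, and letting DMA take priority over
reclaimable when both are set is an assumption made for this example.

```c
/* Placeholder flag bits for this sketch only; NOT the kernel's GFP values. */
#define SKETCH_GFP_DMA         0x01u
#define SKETCH_GFP_RECLAIMABLE 0x10u

/* One index per memory type, matching the "one list per memory type" idea. */
enum sketch_slab_id {
	SKETCH_SLAB_NORMAL = 0,
	SKETCH_SLAB_DMA = 1,
	SKETCH_SLAB_RECLAIMABLE = 2,
	SKETCH_NR_SLAB_IDS
};

/*
 * Hypothetical mapping from allocation flags to an index into the
 * per-type cpu_slab array. DMA wins if both bits are set (assumption).
 */
static unsigned int sketch_flags_to_slab_id(unsigned int gfpflags)
{
	if (gfpflags & SKETCH_GFP_DMA)
		return SKETCH_SLAB_DMA;
	if (gfpflags & SKETCH_GFP_RECLAIMABLE)
		return SKETCH_SLAB_RECLAIMABLE;
	return SKETCH_SLAB_NORMAL;
}
```

An allocation with neither bit set keeps index 0, so the common path still
lands on the first cpu_slab entry.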

> > Is this worth exploring
> > at least? I mean something like this should help with the fragmentation
> > already AFAIU. Accounting would be just free on top.
>
> Yep. It could be also CONFIG_urable so smaller systems don't need to
> deal with the memory overhead of this.
>
> So do we put it on LSF/MM agenda?

We have an agenda? :-)