Message-ID: <CADAEsF8AMf0U8JSf3Bhmm5xBf-XsQYMJijRXKZUYXYToAiW3oA@mail.gmail.com>
Date: Thu, 18 Dec 2014 09:50:20 +0800
From: Ganesh Mahendran <opensource.ganesh@...il.com>
To: Seth Jennings <sjennings@...iantweb.net>
Cc: Minchan Kim <minchan@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
Linux-MM <linux-mm@...ck.org>, Nitin Gupta <ngupta@...are.org>,
Dan Streetman <ddstreet@...e.org>,
Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
Luigi Semenzato <semenzato@...gle.com>,
Jerome Marchand <jmarchan@...hat.com>, juno.choi@....com,
seungho1.park@....com
Subject: Re: [RFC 0/6] zsmalloc support compaction
2014-12-18 7:19 GMT+08:00 Seth Jennings <sjennings@...iantweb.net>:
> On Tue, Dec 02, 2014 at 11:49:41AM +0900, Minchan Kim wrote:
>> Recently, there was an issue about zsmalloc fragmentation: I got a
>> report from Juno that a new fork failed although there were
>> plenty of free pages in the system.
>> His investigation revealed that zram is one of the culprits causing
>> heavy fragmentation, so there was no contiguous 16K page left
>> for the pgd needed by fork on ARM.
>>
>> This patchset implements *basic* zsmalloc compaction support,
>> and zram utilizes it so the admin can do
>> "echo 1 > /sys/block/zram0/compact"
>>
>> Actually, the ideal is that the mm migration code is aware of zram pages
>> and migrates them out automatically, without the admin's manual operation,
>> when the system is out of contiguous pages. However, we need more thinking
>> before adding more hooks to migrate.c. Even if we implement it,
>> we still need a manual trigger mode, so I hope we can enhance the
>> zram migration work based on these primitive functions in the future.
>>
>> I have only tested it on x86, so it needs more testing on other arches.
>> Additionally, I should have numbers for the zsmalloc regression
>> caused by the indirection layer. Unfortunately, I don't have any
>> ARM test machine on my desk. I will get one soon and test it.
>> Anyway, before further work, I'd like to hear opinion.
>>
>> Patchset is based on v3.18-rc6-mmotm-2014-11-26-15-45.
>
> Hey Minchan, sorry it has taken a while for me to look at this.
>
> I have prototyped this for zbud too, and I see you face some of the same
> issues, some of them much worse for zsmalloc, like the large number of
> objects that must be moved to reclaim a page (with zbud, the max is 1).
>
> I see you are using zsmalloc itself for allocating the handles. Why not
> kmalloc()? Then you wouldn't need to track the handle_class stuff and
> adjust the class sizes (just in the interest of changing only what is
> needed to achieve the functionality).
>
> I used kmalloc(), but that is not without issue: the handles can be
> allocated from many slabs, and any slab that contains a handle can't be
> freed. This basically results in the handles themselves needing to be
> compacted, which they can't be, because the user handle is a pointer to
> them.
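
Just to make the indirection concrete, here is a minimal sketch of the idea
both approaches rely on (names are hypothetical, not the actual zsmalloc API):
the user-visible handle is the address of a small slot that records where the
object currently lives, so compaction can move the object and rewrite the slot
while the handle stays stable. With kmalloc() the slot itself is pinned in
whatever slab it landed in, which is exactly the problem described above.

#include <linux/slab.h>

struct handle_slot {
        unsigned long obj;      /* encoded location of the object */
};

/* kmalloc() variant: the slot lives in ordinary slab memory */
static unsigned long handle_alloc(gfp_t gfp)
{
        struct handle_slot *slot = kmalloc(sizeof(*slot), gfp);

        return (unsigned long)slot;     /* 0 on failure */
}

/* compaction rewrites the slot; the user's handle value never changes */
static void handle_relocate(unsigned long handle, unsigned long new_obj)
{
        ((struct handle_slot *)handle)->obj = new_obj;
}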
>
> One way to fix this, but it would be some amount of work, is to have the
> user (zswap/zbud) provide the space for the handle to zbud/zsmalloc.
> The zswap/zbud layer knows the size of the device (i.e. handle space)
> and could allocate a statically sized vmalloc area for holding handles
> so they don't get spread all over memory. I haven't fully explored this
> idea yet.
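
For what it's worth, a rough sketch of that idea (all names hypothetical and
untested): zswap/zbud sizes a vmalloc arena up front and hands out handle
slots from it, so handles never pin arbitrary slabs.

#include <linux/vmalloc.h>
#include <linux/slab.h>
#include <linux/spinlock.h>

struct handle_arena {
        unsigned long *slots;           /* one slot per possible object */
        unsigned long *bitmap;          /* which slots are in use */
        size_t nr_slots;
        spinlock_t lock;
};

static struct handle_arena *arena_create(size_t nr_slots)
{
        struct handle_arena *arena = kzalloc(sizeof(*arena), GFP_KERNEL);

        if (!arena)
                return NULL;

        arena->nr_slots = nr_slots;
        arena->slots = vzalloc(nr_slots * sizeof(unsigned long));
        arena->bitmap = vzalloc(BITS_TO_LONGS(nr_slots) * sizeof(long));
        spin_lock_init(&arena->lock);

        if (!arena->slots || !arena->bitmap) {
                vfree(arena->slots);
                vfree(arena->bitmap);
                kfree(arena);
                return NULL;
        }
        return arena;
}

/* hand out a stable slot; its address is the user-visible handle */
static unsigned long *arena_get_slot(struct handle_arena *arena)
{
        unsigned long idx;

        spin_lock(&arena->lock);
        idx = find_first_zero_bit(arena->bitmap, arena->nr_slots);
        if (idx < arena->nr_slots)
                set_bit(idx, arena->bitmap);
        spin_unlock(&arena->lock);

        return idx < arena->nr_slots ? &arena->slots[idx] : NULL;
}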
>
> It is pretty limiting having the user trigger the compaction. Can we
> have a work item that periodically does some amount of compaction?
> Maybe also have something analogous to direct reclaim, so that when
> zs_malloc() fails to secure a new page, it tries to compact to get one?
> I understand this is a first step. Maybe too much.
Yes, the user does not know when to do the compaction.
Actually, it is zsmalloc's responsibility to keep fragmentation at a low level.
How about dynamically monitoring the fragmentation and doing the compaction
when there is too much of it?

I am working on another patch to collect statistics on zsmalloc objects.
Maybe that will be helpful for this.
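
To illustrate what I mean, combined with the work-item idea above, something
roughly like this (completely hypothetical and untested: zs_pool_pages_used(),
zs_pool_bytes_in_use(), zs_compact() and the compact_work member stand in for
the statistics and the compaction entry point):

static bool zs_too_fragmented(struct zs_pool *pool)
{
        unsigned long used = zs_pool_pages_used(pool);
        unsigned long needed = DIV_ROUND_UP(zs_pool_bytes_in_use(pool),
                                            PAGE_SIZE);

        /* example policy: compact when more than 25% of the pages are waste */
        return used > needed + needed / 4;
}

static void zs_compact_work(struct work_struct *work)
{
        /* assumes a (hypothetical) compact_work member in struct zs_pool */
        struct zs_pool *pool = container_of(to_delayed_work(work),
                                            struct zs_pool, compact_work);

        if (zs_too_fragmented(pool))
                zs_compact(pool);

        schedule_delayed_work(&pool->compact_work, 60 * HZ);
}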
Thanks.
>
> Also worth pointing out that the fullness groups are very coarse.
> Combining the objects from a ZS_ALMOST_EMPTY zspage and a ZS_ALMOST_FULL
> zspage might not result in very tight packing. In the worst case, the
> destination zspage would be slightly over 1/4 full (see
> fullness_threshold_frac).
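
To make the worst case concrete, a simplified model of the grouping (not the
exact kernel code): with fullness_threshold_frac == 4, anything at or below a
quarter of its slots counts as ZS_ALMOST_EMPTY and everything else short of
full counts as ZS_ALMOST_FULL, so draining a nearly empty source into a
destination that is barely past the threshold still leaves the destination
only slightly over 1/4 full.

static enum fullness_group classify(int inuse, int max_objects)
{
        if (inuse == 0)
                return ZS_EMPTY;
        if (inuse == max_objects)
                return ZS_FULL;
        if (inuse <= max_objects / fullness_threshold_frac)
                return ZS_ALMOST_EMPTY;
        return ZS_ALMOST_FULL;
}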
>
> It also seems that you start with the smallest size classes first.
> Seems like if we start with the biggest first, we move fewer objects and
> reclaim more pages.
>
> It does add a lot of code :-/ Not sure if there is any way around that
> though if we want this functionality for zsmalloc.
>
> Seth
>
>>
>> Thanks.
>>
>> Minchan Kim (6):
>> zsmalloc: expand size class to support sizeof(unsigned long)
>> zsmalloc: add indrection layer to decouple handle from object
>> zsmalloc: implement reverse mapping
>> zsmalloc: encode alloced mark in handle object
>> zsmalloc: support compaction
>> zram: support compaction
>>
>> drivers/block/zram/zram_drv.c | 24 ++
>> drivers/block/zram/zram_drv.h | 1 +
>> include/linux/zsmalloc.h | 1 +
>> mm/zsmalloc.c | 596 +++++++++++++++++++++++++++++++++++++-----
>> 4 files changed, 552 insertions(+), 70 deletions(-)
>>
>> --
>> 2.0.0
>>