[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <da8288f7-f6c6-00da-fcbc-5c70c42359bc@linux.ibm.com>
Date: Tue, 20 Apr 2021 20:55:12 +0530
From: Pratik Sampat <psampat@...ux.ibm.com>
To: Dennis Zhou <dennis@...nel.org>
Cc: Roman Gushchin <guro@...com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 0/4] percpu: partial chunk depopulation
Hello Dennis,
On 20/04/21 8:09 pm, Dennis Zhou wrote:
> On Tue, Apr 20, 2021 at 04:37:02PM +0530, Pratik Sampat wrote:
>> On 20/04/21 4:27 am, Dennis Zhou wrote:
>>> On Mon, Apr 19, 2021 at 10:50:43PM +0000, Dennis Zhou wrote:
>>>> Hello,
>>>>
>>>> This series is a continuation of Roman's series in [1]. It aims to solve
>>>> chunks holding onto free pages by adding a reclaim process to the percpu
>>>> balance work item.
>>>>
>>>> The main difference is that the nr_empty_pop_pages is now managed at
>>>> time of isolation instead of intermixed. This helps with deciding which
>>>> chunks to free instead of having to interleave returning chunks to
>>>> active duty.
>>>>
>>>> The allocation priority is as follows:
>>>> 1) appropriate chunk slot increasing until fit
>>>> 2) sidelined chunks
>>>> 3) full free chunks
>>>>
>>>> The last slot for to_depopulate is never used for allocations.
>>>>
>>>> A big thanks to Roman for initiating the work and being available for
>>>> iterating on these ideas.
>>>>
>>>> This patchset contains the following 4 patches:
>>>> 0001-percpu-factor-out-pcpu_check_block_hint.patch
>>>> 0002-percpu-use-pcpu_free_slot-instead-of-pcpu_nr_slots-1.patch
>>>> 0003-percpu-implement-partial-chunk-depopulation.patch
>>>> 0004-percpu-use-reclaim-threshold-instead-of-running-for-.patch
>>>>
>>>> 0001 and 0002 are clean ups. 0003 implement partial chunk depopulation
>>>> initially from Roman. 0004 adds a reclaim threshold so we do not need to
>>>> schedule for every page freed.
>>>>
>>>> This series is on top of percpu$for-5.14 67c2669d69fb.
>>>>
>>>> diffstats below:
>>>>
>>>> Dennis Zhou (2):
>>>> percpu: use pcpu_free_slot instead of pcpu_nr_slots - 1
>>>> percpu: use reclaim threshold instead of running for every page
>>>>
>>>> Roman Gushchin (2):
>>>> percpu: factor out pcpu_check_block_hint()
>>>> percpu: implement partial chunk depopulation
>>>>
>>>> mm/percpu-internal.h | 5 +
>>>> mm/percpu-km.c | 5 +
>>>> mm/percpu-stats.c | 20 ++--
>>>> mm/percpu-vm.c | 30 ++++++
>>>> mm/percpu.c | 252 ++++++++++++++++++++++++++++++++++++++-----
>>>> 5 files changed, 278 insertions(+), 34 deletions(-)
>>>>
>>>> Thanks,
>>>> Dennis
>>> Hello Pratik,
>>>
>>> Do you mind testing this series again on POWER9? The base is available
>>> here:
>>> https://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu.git/log/?h=for-5.14
>>>
>>> Thanks,
>>> Dennis
>> Hello Dennis, I have tested this patchset on POWER9.
>>
>> I have tried variations of the percpu_test in the top level and nested cgroups
>> creation as the test with 1000:10 didn't show any benefits.
> This is most likely because the 1 in every 11 still pins every page
> while 1 in 50 does not. Can you try the patch below on top? I think it
> may show slightly better perf as well. If it doesn't I'll just drop it.
I did try it out, although my test spanned only across varying the inner cgroup
folders; it didn't seem to show any benefit over the previous test without the
patch for being being able to spawn as little memory cgroup folders and seeing
partial memory depopulation.
>> The following example shows more consistent benefits with the de-allocation
>> strategy.
>> Outer: 1000
>> Inner: 50
>> # ./percpu_test.sh
>> Percpu: 6912 kB
>> Percpu: 532736 kB
>> Percpu: 278784 kB
>>
>> I believe it could be a result of bulk freeing within "free_unref_page_commit",
>> where pages are only free'd if pcp->count >= pcp->high. As POWER has a larger
>> page size it would end up creating lesser number of pages but with the
>> effects of fragmentation.
> This is unrelated to per cpu pages in slab/slub. Percpu is a separate
> memory allocator.
>
You're right, I was actually referencing incorrect code.
>> Having said that, the patchset and its behavior does look good to me.
> Thanks, can I throw the following on the appropriate patches? In the
> future it's good to be explicit about this because some prefer to credit
> different emails.
>
> Tested-by: Pratik Sampat <psampat@...ux.ibm.com>
Sure thing please feel free to add a tested-by wherever you feel appropriate.
I'll be more explicit about them in the future. Thanks!
> Thanks,
> Dennis
>
> The following may do a little better on power9:
> ---
> From a1464c4d5900cca68fd95b935178d72bb74837d5 Mon Sep 17 00:00:00 2001
> From: Dennis Zhou <dennis@...nel.org>
> Date: Tue, 20 Apr 2021 14:25:20 +0000
> Subject: [PATCH] percpu: convert free page float to bytes
>
> The percpu memory allocator keeps around a minimum number of free pages
> to ensure we can satisfy atomic allocations. However, we've always kept
> this number in terms of pages. On certain architectures like arm and
> powerpc, the default page size could be 64k instead of 4k. So, start
> with a target number of free bytes and then convert to pages.
>
> Signed-off-by: Dennis Zhou <dennis@...nel.org>
> ---
> mm/percpu.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/mm/percpu.c b/mm/percpu.c
> index ba13e683d022..287fe3091244 100644
> --- a/mm/percpu.c
> +++ b/mm/percpu.c
> @@ -80,6 +80,7 @@
> #include <linux/mutex.h>
> #include <linux/percpu.h>
> #include <linux/pfn.h>
> +#include <linux/sizes.h>
> #include <linux/slab.h>
> #include <linux/spinlock.h>
> #include <linux/vmalloc.h>
> @@ -107,11 +108,12 @@
> /* chunks in slots below this are subject to being sidelined on failed alloc */
> #define PCPU_SLOT_FAIL_THRESHOLD 3
>
> -#define PCPU_EMPTY_POP_PAGES_LOW 2
> -#define PCPU_EMPTY_POP_PAGES_HIGH 4
> +#define PCPU_EMPTY_POP_PAGES_LOW (max_t(int, (SZ_8K) / PAGE_SIZE, 1))
> +#define PCPU_EMPTY_POP_PAGES_HIGH (max_t(int, (SZ_16K) / PAGE_SIZE, \
> + PCPU_EMPTY_POP_PAGES_LOW + 1))
>
> /* only schedule reclaim if there are at least N empty pop pages sidelined */
> -#define PCPU_EMPTY_POP_RECLAIM_THRESHOLD 4
> +#define PCPU_EMPTY_POP_RECLAIM_THRESHOLD (PCPU_EMPTY_POP_PAGES_HIGH)
>
> #ifdef CONFIG_SMP
> /* default addr <-> pcpu_ptr mapping, override in asm/percpu.h if necessary */
Thanks,
Pratik
Powered by blists - more mailing lists