Message-ID: <20210617115544.GB183559@KEI>
Date: Thu, 17 Jun 2021 20:55:44 +0900
From: Janghyuck Kim <janghyuck.kim@...sung.com>
To: Matthew Wilcox <willy@...radead.org>
Cc: Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Palmer Dabbelt <palmerdabbelt@...gle.com>,
Atish Patra <atish.patra@....com>,
Gavin Shan <gshan@...hat.com>,
Zhengyuan Liu <liuzhengyuan@...kylinos.cn>,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org
Subject: Re: [PATCH 1/2] mm: support fastpath if NUMA is enabled with numa off

On Wed, Jun 16, 2021 at 06:32:50PM +0100, Matthew Wilcox wrote:
> On Wed, Jun 16, 2021 at 05:37:41PM +0900, Janghyuck Kim wrote:
> > An architecture might support a fake node when CONFIG_NUMA is enabled
> > but no node settings are provided by ACPI or device tree. In this
> > case, checking the memory policy during the memory allocation path is
> > meaningless.
> >
> > Moreover, performance degradation was observed in the minor page fault
> > test provided at https://lkml.org/lkml/2006/8/29/294.
> > Average faults/sec with NUMA enabled using a fake node was 5~6% worse
> > than with NUMA disabled. To reduce this performance regression, a
> > fastpath is introduced. The fastpath skips the memory policy checks
> > when NUMA is enabled but only a fake node is in use. If the
> > architecture doesn't support a fake node, the fastpath has no effect
> > on the memory allocation path.
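
(To illustrate, a minimal sketch of the kind of fastpath described
above. The helper name numa_off_fast_path() and the numa_off flag are
hypothetical, not taken from the actual patch:

/*
 * Illustrative sketch only: skip the mempolicy lookup when CONFIG_NUMA
 * is enabled but the system is running on a single fake node.
 */
static inline bool numa_off_fast_path(void)
{
	/* numa_off: hypothetical flag set when no real node was found */
	return IS_ENABLED(CONFIG_NUMA) && numa_off;
}

/* In the allocation path, before any mempolicy handling: */
if (numa_off_fast_path())
	return alloc_pages(gfp, order);

When the flag is false, the existing mempolicy path runs unchanged.)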
>
> This patch doesn't even apply to the current kernel, but putting that
> aside, what's the expensive part of the current code? That is,
> comparing performance stats between numa_off enabled and numa_off
> disabled, where do you see it taking a lot of time?
>
The mempolicy-related code that this patch skips takes only a short
time, a few tens of nanoseconds, which is difficult to measure at
sched_clock's degree of precision. But it can affect the minor page
fault test with a large buffer size, because handling one page fault
takes several ms. As I replied in the previous mail, the performance
regression has been reduced from 5~6% to 2~3%.
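
(For reference, a self-contained stand-in for that minor fault test,
not the original pft program, just an illustration: it mmaps a large
anonymous buffer, touches one byte per page so each touch takes one
minor fault, and reports faults/sec.

#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/mman.h>

int main(void)
{
	const size_t len = 1UL << 30;	/* 1 GiB anonymous buffer */
	long page = sysconf(_SC_PAGESIZE);
	struct timespec t0, t1;

	char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	clock_gettime(CLOCK_MONOTONIC, &t0);
	/* First touch of each page triggers one minor fault. */
	for (size_t off = 0; off < len; off += page)
		buf[off] = 1;
	clock_gettime(CLOCK_MONOTONIC, &t1);

	double secs = (t1.tv_sec - t0.tv_sec) +
		      (t1.tv_nsec - t0.tv_nsec) / 1e9;
	printf("%.0f faults/sec\n", (len / page) / secs);
	munmap(buf, len);
	return 0;
}

Running it on otherwise identical kernels with NUMA disabled, NUMA with
a fake node, and NUMA with a fake node plus the fastpath is how the
regression figures above can be compared.)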