Message-ID: <6260c66e-68a3-ab3e-4bd9-4a290d068e1f@linux.intel.com>
Date: Mon, 29 Jun 2020 09:57:42 -0700
From: Tim Chen <tim.c.chen@...ux.intel.com>
To: Andrew Morton <akpm@...ux-foundation.org>,
Matthew Wilcox <willy@...radead.org>
Cc: Vladimir Davydov <vdavydov@...tuozzo.com>,
Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...e.cz>,
Dave Hansen <dave.hansen@...el.com>,
Ying Huang <ying.huang@...el.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [Patch] mm: Increase pagevec size on large system
On 6/26/20 8:47 PM, Andrew Morton wrote:
> On Sat, 27 Jun 2020 04:13:04 +0100 Matthew Wilcox <willy@...radead.org> wrote:
>
>> On Fri, Jun 26, 2020 at 02:23:03PM -0700, Tim Chen wrote:
>>> Enlarge the pagevec size to 31 to reduce LRU lock contention for
>>> large systems.
>>>
>>> The LRU lock contention is reduced from 8.9% of total CPU cycles
>>> to 2.2% of CPU cycles. And the pmbench throughput increases
>>> from 88.8 Mpages/sec to 95.1 Mpages/sec.
>>
>> The downside here is that pagevecs are often stored on the stack (eg
>> truncate_inode_pages_range()) as well as being used for the LRU list.
>> On a 64-bit system, this increases the stack usage from 128 to 256 bytes
>> for this array.
>>
>> I wonder if we could do something where we transform the ones on the
>> stack to DECLARE_STACK_PAGEVEC(pvec), and similarly DECLARE_LRU_PAGEVEC
>> the ones used for the LRUs. There's plenty of space in the header to
>> add an unsigned char sz, delete PAGEVEC_SIZE and make it a variable
>> length struct.
>>
>> Or maybe our stacks are now big enough that we just don't care.
>> What do you think?
>
> And I wonder how useful CONFIG_NR_CPUS is for making this decision.
> Presumably a lot of general-purpose kernel builds have CONFIG_NR_CPUS
> much larger than the actual number of CPUs.
>
> I can't think of much of a fix for this, apart from making it larger on
> all kernels. Is there a downside to this?
>
Thanks to Matthew and Andrew for the feedback.
I am okay with Matthew's suggestion of keeping the stack pagevec size unchanged.
Andrew, do you have a preference?
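A minimal user-space sketch of what Matthew's suggestion might look like, not a tested kernel patch. DECLARE_STACK_PAGEVEC, DECLARE_LRU_PAGEVEC and the sz field come from his mail; the helper macro DECLARE_PAGEVEC_SIZED, the union-based storage and the exact field layout are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct page;  /* opaque stand-in for the kernel's struct page */

/* Variable-length pagevec: the capacity lives in the header (sz)
 * instead of a global PAGEVEC_SIZE constant. Layout is an assumption. */
struct pagevec {
	unsigned char nr;      /* slots currently in use */
	unsigned char sz;      /* capacity of pages[] */
	struct page *pages[];  /* flexible array member */
};

/* Hypothetical helper: reserve the header plus n slots and expose the
 * result through a pointer called `name`. */
#define DECLARE_PAGEVEC_SIZED(name, n)                                  \
	union {                                                         \
		struct pagevec pvec;                                    \
		unsigned char bytes[sizeof(struct pagevec) +            \
				    (n) * sizeof(struct page *)];       \
	} name##_storage = { .pvec = { .nr = 0, .sz = (n) } };          \
	struct pagevec *name = &name##_storage.pvec

/* 15 is the existing size, 31 the size proposed for the LRU pagevecs. */
#define DECLARE_STACK_PAGEVEC(name) DECLARE_PAGEVEC_SIZED(name, 15)
#define DECLARE_LRU_PAGEVEC(name)   DECLARE_PAGEVEC_SIZED(name, 31)

/* Add a page and return the number of free slots left. */
static inline unsigned int pagevec_add(struct pagevec *pvec, struct page *page)
{
	assert(pvec->nr < pvec->sz);
	pvec->pages[pvec->nr++] = page;
	return pvec->sz - pvec->nr;
}
```

On a typical 64-bit build the stack variant takes 8 + 15 * 8 = 128 bytes and the LRU variant 8 + 31 * 8 = 256 bytes, which matches the figures Matthew quoted, so only the LRU users would pay for the larger array.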
I was assuming that people who really care about kernel memory usage
would set CONFIG_NR_CPUS to a small value. I also have a hard time coming
up with a better scheme.
Otherwise, we would have to adjust the pagevec size once we find out
how many CPUs have actually been brought online. That route seems to
add a lot of complexity.
Tim