Message-ID: <cf736b49-57e3-51df-56af-5b71d0304e4a@redhat.com>
Date: Mon, 21 Mar 2022 15:17:54 -0400
From: Prarit Bhargava <prarit@...hat.com>
To: Justin Forbes <jforbes@...oraproject.org>,
Yu Zhao <yuzhao@...gle.com>
Cc: Andi Kleen <ak@...ux.intel.com>, kernel-team@...ts.ubuntu.com,
Vaibhav Jain <vaibhav@...ux.ibm.com>,
Rik van Riel <riel@...riel.com>,
Mel Gorman <mgorman@...e.de>,
Catalin Marinas <catalin.marinas@....com>,
Johannes Weiner <hannes@...xchg.org>,
Aneesh Kumar <aneesh.kumar@...ux.ibm.com>,
Brian Geffon <bgeffon@...gle.com>,
"open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
Jesse Barnes <jsbarnes@...gle.com>,
Sofia Trinh <sofia.trinh@....works>,
"Huang, Ying" <ying.huang@...el.com>,
linux-kernel <linux-kernel@...r.kernel.org>,
Steven Barrett <steven@...uorix.net>,
Shuang Zhai <szhai2@...rochester.edu>,
Donald Carr <d@...os-reins.com>,
Oleksandr Natalenko <oleksandr@...alenko.name>,
Holger Hoffstätte <holger@...lied-asynchrony.com>,
Will Deacon <will@...nel.org>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Jonathan Corbet <corbet@....net>,
Mike Rapoport <rppt@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Jens Axboe <axboe@...nel.dk>, Hillf Danton <hdanton@...a.com>,
Michal Hocko <mhocko@...nel.org>,
kernel <kernel@...ts.fedoraproject.org>,
Suleiman Souhlal <suleiman@...gle.com>,
Daniel Byrne <djbyrne@....edu>,
the arch/x86 maintainers <x86@...nel.org>,
Konstantin Kharlamov <Hi-Angel@...dex.ru>,
Matthew Wilcox <willy@...radead.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Michael Larabel <Michael@...haellarabel.com>,
Linux-MM <linux-mm@...ck.org>,
Kernel Page Reclaim v2 <page-reclaim@...gle.com>,
Jan Alexander Steffens <heftig@...hlinux.org>,
Linux ARM <linux-arm-kernel@...ts.infradead.org>
Subject: Re: [PATCH v9 05/14] mm: multi-gen LRU: groundwork

On 3/21/22 14:58, Justin Forbes wrote:
> On Mon, Mar 14, 2022 at 4:30 AM Yu Zhao <yuzhao@...gle.com> wrote:
>>
>> On Mon, Mar 14, 2022 at 2:09 AM Huang, Ying <ying.huang@...el.com> wrote:
>>>
>>> Hi, Yu,
>>>
>>> Yu Zhao <yuzhao@...gle.com> writes:
>>>> diff --git a/mm/Kconfig b/mm/Kconfig
>>>> index 3326ee3903f3..747ab1690bcf 100644
>>>> --- a/mm/Kconfig
>>>> +++ b/mm/Kconfig
>>>> @@ -892,6 +892,16 @@ config ANON_VMA_NAME
>>>> area from being merged with adjacent virtual memory areas due to the
>>>> difference in their name.
>>>>
>>>> +# the multi-gen LRU {
>>>> +config LRU_GEN
>>>> + bool "Multi-Gen LRU"
>>>> + depends on MMU
>>>> + # the following options can use up the spare bits in page flags
>>>> + depends on !MAXSMP && (64BIT || !SPARSEMEM || SPARSEMEM_VMEMMAP)
>>>
>>> LRU_GEN depends on !MAXSMP. So, What is the maximum NR_CPUS supported
>>> by LRU_GEN?
>>
>> LRU_GEN doesn't really care about NR_CPUS. IOW, it doesn't impose a
>> max number. The dependency is with NODES_SHIFT selected by MAXSMP:
>> default "10" if MAXSMP
>> This combined with LAST_CPUPID_SHIFT can exhaust the spare bits in page flags.
>>
>> MAXSMP is meant for kernel developers to test their code, and it
>> should not be used in production [1]. But some distros unfortunately
>> ship kernels built with this option, e.g., Fedora and Ubuntu. And
>> their users reported build errors to me after they applied MGLRU on
>> those kernels ("Not enough bits in page flags"). Let me add Fedora and
>> Ubuntu to this thread.
>>
>> Fedora and Ubuntu,
>>
>> Could you please clarify if there is a reason to ship kernels built
>> with MAXSMP? Otherwise, please consider disabling this option. Thanks.
>>
>> As per above, MAXSMP enables ridiculously large numbers of CPUs and
>> NUMA nodes for testing purposes. It is detrimental to performance,
>> e.g., CPUMASK_OFFSTACK.
>
> It was enabled for Fedora, and RHEL because we did need more than 512
> CPUs, originally only in RHEL until SGI (years ago) complained that
> they were testing very large machines with Fedora. The testing done
> on RHEL showed that the performance impact was minimal. For a very
> long time we had MAXSMP off and carried a patch which allowed us to
> turn on CPUMASK_OFFSTACK without debugging because there was supposed
> to be "something else" coming. In 2019 we gave up, dropped that patch
> and just turned on MAXSMP.
>
> I do not have any metrics for how often someone runs Fedora on a
> ridiculously large machine these days, but I would guess that number
> is not 0.
It is not 0. I've seen data from large systems (1000+ logical threads)
that are running Fedora, albeit with a modified Fedora kernel.
Additionally, the max limit for CPUs in RHEL is 1792; however, we have
recently had a request to *double* that to 3584. You should just assume
that number will continue to increase.

P.
>
> Justin
>
>> [1] https://lore.kernel.org/lkml/20131106055634.GA24044@gmail.com/
>>
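
Editor's note: the bit-budget argument quoted above (NODES_SHIFT defaulting
to "10" under MAXSMP, combined with LAST_CPUPID_SHIFT, exhausting the spare
bits in page flags) can be sketched as plain arithmetic. This is only an
illustration, not the kernel's actual layout code. The class name, field
names, and every width except NODES_SHIFT = 10 are assumptions chosen to make
the effect visible; they are not taken from any real kernel configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlagsLayout:
    """Hypothetical model of how one page-flags word is divided up."""
    bits_per_long: int      # width of the flags word
    nr_pageflags: int       # bits used by the flags themselves (assumed)
    zones_width: int        # bits for the zone index (assumed)
    sections_width: int     # bits for the sparsemem section (assumed)
    nodes_shift: int        # bits for the NUMA node id
    last_cpupid_shift: int  # bits for the last CPU/PID field (assumed)
    extra_width: int = 0    # bits a new feature would like to claim (assumed)

    def spare_bits(self) -> int:
        """Bits left over; a negative result means the word is over-committed."""
        used = (self.nr_pageflags + self.zones_width + self.sections_width
                + self.nodes_shift + self.last_cpupid_shift + self.extra_width)
        return self.bits_per_long - used

    def fits(self) -> bool:
        return self.spare_bits() >= 0


# Illustrative numbers only. The two layouts differ solely in nodes_shift,
# to isolate the effect of the MAXSMP default of 10.
common = dict(bits_per_long=64, nr_pageflags=24, zones_width=3,
              sections_width=0, last_cpupid_shift=21, extra_width=7)
maxsmp_like = FlagsLayout(nodes_shift=10, **common)
smaller = FlagsLayout(nodes_shift=6, **common)

for name, layout in (("nodes_shift=10", maxsmp_like), ("nodes_shift=6", smaller)):
    status = "fits" if layout.fits() else "not enough bits in page flags"
    print(f"{name}: spare={layout.spare_bits()} -> {status}")
```

With these made-up widths, the four extra node bits alone turn a layout
with room to spare into one that is over-committed, which is the shape of
the build failure reported in the thread.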