[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <df0e5f64-10bc-4c3c-a515-288a6f501065@redhat.com>
Date: Wed, 8 Jul 2020 11:25:36 +0200
From: David Hildenbrand <david@...hat.com>
To: Mike Rapoport <rppt@...nel.org>
Cc: Mike Rapoport <rppt@...ux.ibm.com>,
Dan Williams <dan.j.williams@...el.com>,
Michal Hocko <mhocko@...nel.org>, Jia He <justin.he@....com>,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will@...nel.org>,
Vishal Verma <vishal.l.verma@...el.com>,
Dave Jiang <dave.jiang@...el.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Baoquan He <bhe@...hat.com>,
Chuhong Yuan <hslester96@...il.com>,
Linux ARM <linux-arm-kernel@...ts.infradead.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Linux MM <linux-mm@...ck.org>,
linux-nvdimm <linux-nvdimm@...ts.01.org>,
Kaly Xin <Kaly.Xin@....com>
Subject: Re: [PATCH v2 1/3] arm64/numa: export memory_add_physaddr_to_nid as
EXPORT_SYMBOL_GPL
On 08.07.20 11:15, Mike Rapoport wrote:
> On Wed, Jul 08, 2020 at 10:45:17AM +0200, David Hildenbrand wrote:
>> On 08.07.20 10:39, Mike Rapoport wrote:
>>> On Wed, Jul 08, 2020 at 10:26:41AM +0200, David Hildenbrand wrote:
>>>> On 08.07.20 09:50, Dan Williams wrote:
>>>>> On Wed, Jul 8, 2020 at 12:22 AM David Hildenbrand <david@...hat.com> wrote:
>>>>>>
>>>>>>>>>>>> On Tue 07-07-20 13:59:15, Jia He wrote:
>>>>>>>>>>>>> This exports memory_add_physaddr_to_nid() for module driver to use.
>>>>>>>>>>>>>
>>>>>>>>>>>>> memory_add_physaddr_to_nid() is a fallback option to get the nid in case
>>>>>>>>>>>>> NUMA_NO_NID is detected.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Suggested-by: David Hildenbrand <david@...hat.com>
>>>>>>>>>>>>> Signed-off-by: Jia He <justin.he@....com>
>>>>>>>>>>>>> ---
>>>>>>>>>>>>> arch/arm64/mm/numa.c | 5 +++--
>>>>>>>>>>>>> 1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>>>>>>>>>
>>>>>>>>>>>>> diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
>>>>>>>>>>>>> index aafcee3e3f7e..7eeb31740248 100644
>>>>>>>>>>>>> --- a/arch/arm64/mm/numa.c
>>>>>>>>>>>>> +++ b/arch/arm64/mm/numa.c
>>>>>>>>>>>>> @@ -464,10 +464,11 @@ void __init arm64_numa_init(void)
>>>>>>>>>>>>>
>>>>>>>>>>>>> /*
>>>>>>>>>>>>> * We hope that we will be hotplugging memory on nodes we already know about,
>>>>>>>>>>>>> - * such that acpi_get_node() succeeds and we never fall back to this...
>>>>>>>>>>>>> + * such that acpi_get_node() succeeds. But when SRAT is not present, the node
>>>>>>>>>>>>> + * id may be probed as NUMA_NO_NODE by acpi, Here provide a fallback option.
>>>>>>>>>>>>> */
>>>>>>>>>>>>> int memory_add_physaddr_to_nid(u64 addr)
>>>>>>>>>>>>> {
>>>>>>>>>>>>> - pr_warn("Unknown node for memory at 0x%llx, assuming node 0\n", addr);
>>>>>>>>>>>>> return 0;
>>>>>>>>>>>>> }
>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(memory_add_physaddr_to_nid);
>>>>>>>>>>>>
>>>>>>>>>>>> Does it make sense to export a noop function? Wouldn't make more sense
>>>>>>>>>>>> to simply make it static inline somewhere in a header? I haven't checked
>>>>>>>>>>>> whether there is an easy way to do that sanely bu this just hit my eyes.
>>>
>>>> I'd be curious if what we are trying to optimize here is actually worth
>>>> optimizing. IOW, is there a well-known scenario where the dummy value on
>>>> arm64 would be problematic and is worth the effort?
>>>
>>> Well, it started with Michal's comment above that EXPORT_SYMBOL_GPL()
>>> for a stub might be an overkill.
>>>
>>> I think Jia's suggestion [1] with addition of a comment that explains
>>> why and when the stub will be used, can work for both
>>> memory_add_physaddr_to_nid() and phys_to_target_node().
>>
>> Agreed.
>>
>>>
>>> But on more theoretical/fundmanetal level, I think we lack a generic
>>> abstraction similar to e.g. x86 'struct numa_meminfo' that serves as
>>> translaton of firmware supplied information into data that can be used
>>> by the generic mm without need to reimplement it for each and every
>>> arch.
>>
>> Right. As I expressed, I am not a friend of using memblock for that, and
>> the pgdat node span is tricky.
>>
>> Maybe abstracting that x86 concept is possible in some way (and we could
>> restrict the information to boot-time properties, so we don't have to
>> mess with memory hot(un)plug - just as done for numa_meminfo AFAIKS).
>
> I agree with pgdat part and disagree about memblock. It already has
> non-init physmap, why won't we add memblock.memory to the mix? ;-)
Can we generalize and tweak physmap to contain node info? That's all we
need, no? (the special mem= parameter handling should not matter for our
use case, where "physmap" and "memory" would differ)
>
> Now, seriously, memblock already has all the necessary information about
> the coldplug memory for several architectures. x86 being an exception
> because for some reason the reserved memory is not considered memory
> there. The infrastructure for quiering and iterating memory regions is
> already there. We just need to leave out the irrelevant parts, like
> memblock.reserved and allocation funcions.
I *really* don't want to mess with memblocks on memory hot(un)plug on
x86 and s390x (+other architectures in the future). I also thought about
stopping to create memblocks for hotplugged memory on arm64, by tweaking
pfn_valid() to query memblocks only for early sections.
If "physmem" is not an option, can we at least introduce something like
ARCH_UPDTAE_MEMBLOCK_ON_HOTPLUG to avoid doing that on x86 and s390x for
now (and later maybe for others)?
--
Thanks,
David / dhildenb
Powered by blists - more mailing lists