Message-ID: <6aae78fa-b505-0f76-087b-d8b2146c62f1@redhat.com>
Date:   Wed, 8 Jul 2020 09:16:01 +0200
From:   David Hildenbrand <david@...hat.com>
To:     Dan Williams <dan.j.williams@...el.com>
Cc:     Mike Rapoport <rppt@...ux.ibm.com>, Justin He <Justin.He@....com>,
        Michal Hocko <mhocko@...nel.org>,
        Catalin Marinas <Catalin.Marinas@....com>,
        Will Deacon <will@...nel.org>,
        Vishal Verma <vishal.l.verma@...el.com>,
        Dave Jiang <dave.jiang@...el.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Baoquan He <bhe@...hat.com>,
        Chuhong Yuan <hslester96@...il.com>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "linux-nvdimm@...ts.01.org" <linux-nvdimm@...ts.01.org>,
        Kaly Xin <Kaly.Xin@....com>
Subject: Re: [PATCH v2 1/3] arm64/numa: export memory_add_physaddr_to_nid as
 EXPORT_SYMBOL_GPL

On 08.07.20 09:04, Dan Williams wrote:
> On Tue, Jul 7, 2020 at 11:59 PM David Hildenbrand <david@...hat.com> wrote:
>>
>> On 08.07.20 08:22, Mike Rapoport wrote:
>>> On Tue, Jul 07, 2020 at 09:27:43PM -0700, Dan Williams wrote:
>>>> On Tue, Jul 7, 2020 at 9:08 PM Justin He <Justin.He@....com> wrote:
>>>> [..]
>>>>>> Especially for architectures that use memblock info for numa info
>>>>>> (which seems to be everyone except x86) why not implement a generic
>>>>>> memory_add_physaddr_to_nid() that does:
>>>>>>
>>>>>> int memory_add_physaddr_to_nid(u64 addr)
>>>>>> {
>>>>>>         unsigned long start_pfn, end_pfn, pfn = PHYS_PFN(addr);
>>>>>>         int nid;
>>>>>>
>>>>>>         for_each_online_node(nid) {
>>>>>>                 get_pfn_range_for_nid(nid, &start_pfn, &end_pfn);
>>>>>>                 if (pfn >= start_pfn && pfn <= end_pfn)
>>>>>>                         return nid;
>>>>>>         }
>>>>>>         return NUMA_NO_NODE;
>>>>>> }
>>>>>
>>>>> Thanks for your suggestion,
>>>>> Could I wrap the codes and let memory_add_physaddr_to_nid simply invoke
>>>>> phys_to_target_node()?
>>>>
>>>> I think it needs to be the reverse. phys_to_target_node() should call
>>>> memory_add_physaddr_to_nid() by default, but fall back to searching
>>>> reserved memory address ranges in memblock. See phys_to_target_node()
>>>> in arch/x86/mm/numa.c. That one uses numa_meminfo instead of memblock,
>>>> but the principle is the same i.e. that a target node may not be
>>>> represented in memblock.memory, but memblock.reserved. I'm working on
>>>> a patch to provide a function similar to get_pfn_range_for_nid() that
>>>> operates on reserved memory.
>>>
>>> Do we really need yet another memblock iterator?
>>> I think only x86 has memory that is not in memblock.memory but only in
>>> memblock.reserved.
>>
>> Reading about abusing the memblock allocator once again in memory
>> hotplug paths makes me shiver.
> 
> Technical reasoning please?

ARCH_KEEP_MEMBLOCK is (AFAIK) only a hack for arm64 to implement
pfn_valid(), because they zap out individual pages corresponding to
memory holes of full sections.
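
To illustrate the point about holes inside full sections, here is a
minimal userspace sketch, not kernel code. All model_* names and the toy
section size are hypothetical; it only assumes that a section-granular
check cannot see a hole that a range-based (memblock-style) check can.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy section size, NOT a real kernel constant. */
#define MODEL_PAGES_PER_SECTION 8UL

/* A populated pfn range; end_pfn is exclusive. */
struct model_range {
	uint64_t start_pfn;
	uint64_t end_pfn;
};

/* Stand-in for memblock.memory: section 0 has a hole at pfns 3..4. */
static const struct model_range model_memory[] = {
	{ 0, 3 },
	{ 5, 8 },
};

#define MODEL_NR_RANGES (sizeof(model_memory) / sizeof(model_memory[0]))

/* Section-granular view: any populated pfn marks the whole section present. */
static bool model_section_present(uint64_t pfn)
{
	uint64_t sec_start = pfn / MODEL_PAGES_PER_SECTION * MODEL_PAGES_PER_SECTION;
	uint64_t sec_end = sec_start + MODEL_PAGES_PER_SECTION;

	for (size_t i = 0; i < MODEL_NR_RANGES; i++) {
		if (model_memory[i].start_pfn < sec_end &&
		    model_memory[i].end_pfn > sec_start)
			return true;
	}
	return false;
}

/* Range-based view: the pfn itself must lie inside a populated range. */
static bool model_pfn_valid(uint64_t pfn)
{
	if (!model_section_present(pfn))
		return false;

	for (size_t i = 0; i < MODEL_NR_RANGES; i++) {
		if (pfn >= model_memory[i].start_pfn &&
		    pfn < model_memory[i].end_pfn)
			return true;
	}
	return false;
}
```

In this model pfn 3 sits in a present section but is only rejected by the
range lookup, which is why the range data has to stay around after init.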

I am not a friend of adding more post-init code to rely on memblock
data. It just makes it harder to eventually get rid of ARCH_KEEP_MEMBLOCK.

> 
> arm64 numa information is established from memblock data. It seems
> counterproductive to ignore that fact if we're already touching
> memory_add_physaddr_to_nid() and have a use case for a driver to call
> it.

... and we are trying to handle the "only a single dummy node" case
(patch #2), or what am I missing? What is there to optimize currently?
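
For comparison, a minimal userspace sketch of the two behaviours being
discussed: a per-node range lookup like the one quoted above, and a dummy
that always reports a single node. This is not kernel code; every model_*
name is hypothetical, and the sketch assumes end_pfn is exclusive, so it
compares with "<" rather than "<=".

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for NUMA_NO_NODE in this userspace model. */
#define MODEL_NUMA_NO_NODE (-1)

/* pfn range owned by one node; end_pfn is assumed to be exclusive. */
struct model_node_range {
	int nid;
	uint64_t start_pfn;
	uint64_t end_pfn;
};

/* Range lookup: return the node whose pfn range contains pfn. */
static int model_physaddr_to_nid(const struct model_node_range *nodes,
				 size_t nr_nodes, uint64_t pfn)
{
	for (size_t i = 0; i < nr_nodes; i++) {
		if (pfn >= nodes[i].start_pfn && pfn < nodes[i].end_pfn)
			return nodes[i].nid;
	}
	return MODEL_NUMA_NO_NODE;
}

/* Single dummy node: every address maps to node 0, no lookup needed. */
static int model_dummy_physaddr_to_nid(uint64_t pfn)
{
	(void)pfn;
	return 0;
}
```

With only one node, both variants give the same answer for any address
the node covers, so the lookup adds nothing in that case.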

-- 
Thanks,

David / dhildenb
