Message-ID: <e5d9423e-5a61-4fbe-b971-52e4283c1afd@redhat.com>
Date:   Tue, 31 Oct 2023 11:13:14 +0100
From:   David Hildenbrand <david@...hat.com>
To:     "Verma, Vishal L" <vishal.l.verma@...el.com>,
        "Williams, Dan J" <dan.j.williams@...el.com>,
        "Jiang, Dave" <dave.jiang@...el.com>,
        "osalvador@...e.de" <osalvador@...e.de>,
        "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>
Cc:     "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
        "Huang, Ying" <ying.huang@...el.com>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "aneesh.kumar@...ux.ibm.com" <aneesh.kumar@...ux.ibm.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-cxl@...r.kernel.org" <linux-cxl@...r.kernel.org>,
        "Hocko, Michal" <mhocko@...e.com>,
        "nvdimm@...ts.linux.dev" <nvdimm@...ts.linux.dev>,
        "jmoyer@...hat.com" <jmoyer@...hat.com>,
        "Jonathan.Cameron@...wei.com" <Jonathan.Cameron@...wei.com>
Subject: Re: [PATCH v7 2/3] mm/memory_hotplug: split memmap_on_memory requests
 across memblocks

On 31.10.23 03:14, Verma, Vishal L wrote:
> On Mon, 2023-10-30 at 11:20 +0100, David Hildenbrand wrote:
>> On 26.10.23 00:44, Vishal Verma wrote:
>>>
> [..]
> 
>>> @@ -2146,11 +2186,69 @@ void try_offline_node(int nid)
>>>    }
>>>    EXPORT_SYMBOL(try_offline_node);
>>>    
>>> -static int __ref try_remove_memory(u64 start, u64 size)
>>> +static void __ref remove_memory_blocks_and_altmaps(u64 start, u64 size)
>>>    {
>>> -       struct memory_block *mem;
>>> -       int rc = 0, nid = NUMA_NO_NODE;
>>> +       unsigned long memblock_size = memory_block_size_bytes();
>>>          struct vmem_altmap *altmap = NULL;
>>> +       struct memory_block *mem;
>>> +       u64 cur_start;
>>> +       int rc;
>>> +
>>> +       /*
>>> +        * For memmap_on_memory, the altmaps could have been added on
>>> +        * a per-memblock basis. Loop through the entire range if so,
>>> +        * and remove each memblock and its altmap.
>>> +        */
>>
>> /*
>>    * altmaps were added on a per-memblock basis; we have to process
>>    * each individual memory block.
>>    */
>>
>>> +       for (cur_start = start; cur_start < start + size;
>>> +            cur_start += memblock_size) {
>>> +               rc = walk_memory_blocks(cur_start, memblock_size, &mem,
>>> +                                       test_has_altmap_cb);
>>> +               if (rc) {
>>> +                       altmap = mem->altmap;
>>> +                       /*
>>> +                        * Mark altmap NULL so that we can add a debug
>>> +                        * check on memblock free.
>>> +                        */
>>> +                       mem->altmap = NULL;
>>> +               }
>>
>> Simpler (especially, we know that there must be an altmap):
>>
>> mem = find_memory_block(pfn_to_section_nr(cur_start));
>> altmap = mem->altmap;
>> mem->altmap = NULL;
>>
>> I think we might be able to remove test_has_altmap_cb() then.
>>
>>> +
>>> +               remove_memory_block_devices(cur_start, memblock_size);
>>> +
>>> +               arch_remove_memory(cur_start, memblock_size, altmap);
>>> +
>>> +               /* Verify that all vmemmap pages have actually been freed. */
>>> +               if (altmap) {
>>
>> There must be an altmap, so this can be done unconditionally.
> 
> Hi David,

Hi!

> 
> All other comments make sense, making those changes now.
> 
> However for this one, does the WARN() below go away then?
> 
> I was wondering if maybe arch_remove_memory() is responsible for
> freeing the altmap here, and at this stage we're just checking if that
> happened. If it didn't, WARN and then free it.

I think that has to stay, to make sure arch_remove_memory() did the 
right thing and we don't -- by BUG -- still have some altmap pages in 
use after they should have been completely freed.

> 
> I drilled down the path, and I don't see altmap actually getting freed
> in vmem_altmap_free(), but I wasn't sure if <something else> was meant
> to free it as altmap->alloc went down to 0.


vmem_altmap_free() does the "altmap->alloc -= nr_pfns". It is called 
when arch_remove_memory() frees the vmemmap pages and detects that they 
actually come from the altmap reserve and not from the buddy/earlyboot 
allocator.

Freeing an altmap is just unaccounting it in the altmap structure; and 
here we make sure that we are actually back down to 0 and don't have 
some weird altmap freeing BUG in arch_remove_memory().
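
For illustration only, the accounting described above can be modelled in a
few lines of userspace C. This is a sketch under stated assumptions, not
the kernel code: struct model_altmap and the model_*() helpers are
hypothetical names, only the "alloc" counter is modelled, and the
fprintf() merely stands in for the warning that the removal path would
emit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical userspace stand-in for the kernel's struct vmem_altmap.
 * Only the counter discussed in this thread is modelled.
 */
struct model_altmap {
	unsigned long alloc;	/* vmemmap pages currently taken from the reserve */
};

/* Model of handing out vmemmap pages from the altmap reserve on hot-add. */
void model_altmap_alloc(struct model_altmap *altmap, unsigned long nr_pfns)
{
	altmap->alloc += nr_pfns;
}

/* Model of vmem_altmap_free(): "freeing" only un-accounts the pages. */
void model_altmap_free(struct model_altmap *altmap, unsigned long nr_pfns)
{
	/* Model-only guard against counter underflow (an assumption here). */
	assert(nr_pfns <= altmap->alloc);
	altmap->alloc -= nr_pfns;
}

/*
 * Model of the check done after arch_remove_memory(): returns true if
 * pages are still accounted, i.e. the case in which a warning would fire.
 */
bool model_altmap_leaked(const struct model_altmap *altmap)
{
	if (altmap->alloc) {
		fprintf(stderr, "model: %lu altmap pages still accounted\n",
			altmap->alloc);
		return true;
	}
	return false;
}
```

In this model, a removal path that un-accounts fewer pages than were
handed out leaves "alloc" non-zero, which is exactly the situation the
check after arch_remove_memory() is meant to catch.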

-- 
Cheers,

David / dhildenb
