Date:   Tue, 21 Nov 2023 20:24:48 +0100
From:   David Hildenbrand <david@...hat.com>
To:     Sumanth Korikkar <sumanthk@...ux.ibm.com>
Cc:     Gerald Schaefer <gerald.schaefer@...ux.ibm.com>,
        linux-mm <linux-mm@...ck.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Oscar Salvador <osalvador@...e.de>,
        Michal Hocko <mhocko@...e.com>,
        "Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com>,
        Anshuman Khandual <anshuman.khandual@....com>,
        Alexander Gordeev <agordeev@...ux.ibm.com>,
        Heiko Carstens <hca@...ux.ibm.com>,
        Vasily Gorbik <gor@...ux.ibm.com>,
        linux-s390 <linux-s390@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 0/8] implement "memmap on memory" feature on s390

On 21.11.23 14:13, Sumanth Korikkar wrote:
> On Fri, Nov 17, 2023 at 04:37:29PM +0100, David Hildenbrand wrote:
>>>
>>> Maybe there is also already a common code bug with that, s390 might be
>>> special but that is often also good for finding bugs in common code ...
>>
>> If it's only the page_init_poison() as noted by Sumanth, we could disable
>> that on s390x with an altmap one way or the other; should be possible.
>>
>> I mean, you effectively have your own poisoning if the altmap is
>> inaccessible and makes your CPU angry on access :)
>>
>> Last but not least, support for an inaccessible altmap might come in handy
>> for virtio-mem eventually, and make altmap support eventually simpler. So
>> added bonus points.
> 
> We tried out two possibilities for dealing with vmemmap altmap inaccessibility.
> Approach 1: Add an MHP_ALTMAP_INACCESSIBLE flag and pass it in add_memory()
> 
> diff --git a/drivers/s390/char/sclp_cmd.c b/drivers/s390/char/sclp_cmd.c
> index 075094ca59b4..ab2dfcc7e9e4 100644
> --- a/drivers/s390/char/sclp_cmd.c
> +++ b/drivers/s390/char/sclp_cmd.c
> @@ -358,6 +358,13 @@ static int sclp_mem_notifier(struct notifier_block *nb,
>   		 * buddy allocator later.
>   		 */
>   		__arch_set_page_nodat((void *)__va(start), memory_block->altmap->free);
> +		/*
> +		 * Poison the struct pages after memory block is accessible.
> +		 * This is needed only with an altmap. Without an altmap, the
> +		 * struct pages are poisoned in sparse_add_section().
> +		 */
> +		if (memory_block->altmap->inaccessible)
> +			page_init_poison(pfn_to_page(arg->start_pfn), memory_block->altmap->free);

See below, likely it's best if the core will do that.

>   		break;
>   	case MEM_FINISH_OFFLINE:
>   		sclp_mem_change_state(start, size, 0);
> @@ -412,7 +419,7 @@ static void __init add_memory_merged(u16 rn)
>   		goto skip_add;
>   	for (addr = start; addr < start + size; addr += block_size)
>   		add_memory(0, addr, block_size,
> -			   MACHINE_HAS_EDAT1 ? MHP_MEMMAP_ON_MEMORY : MHP_NONE);
> +			   MACHINE_HAS_EDAT1 ? MHP_MEMMAP_ON_MEMORY|MHP_ALTMAP_INACCESSIBLE : MHP_NONE);
>   skip_add:
>   	first_rn = rn;
>   	num = 1;
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 7d2076583494..5c70707e706f 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -106,6 +106,11 @@ typedef int __bitwise mhp_t;
>    * implies the node id (nid).
>    */
>   #define MHP_NID_IS_MGID		((__force mhp_t)BIT(2))
> +/*
> + * Mark memmap on memory (struct pages array) as inaccessible during memory
> + * hotplug addition phase.
> + */
> +#define MHP_ALTMAP_INACCESSIBLE	((__force mhp_t)BIT(3))

If we go that path, maybe rather name it MHP_OFFLINE_INACCESSIBLE, and 
document how it relates to/interacts with the memory notifier callbacks 
and the altmap.

Then, we can logically abstract this from altmap handling: the memory 
simply must not be read or written before the memory notifier has been 
called.

In the core, you can do the poisoning for the altmap in that case after 
calling the notifier, probably in mhp_init_memmap_on_memory() as you do 
below.

>   
>   /*
>    * Extended parameters for memory hotplug:
> diff --git a/include/linux/memremap.h b/include/linux/memremap.h
> index 744c830f4b13..9837f3e6fb95 100644
> --- a/include/linux/memremap.h
> +++ b/include/linux/memremap.h
> @@ -25,6 +25,7 @@ struct vmem_altmap {
>   	unsigned long free;
>   	unsigned long align;
>   	unsigned long alloc;
> +	bool inaccessible;

We should then likely remember that information for the memory block, 
not the altmap.

[...]

> 
> 
> Approach 2:
> ===========
> Shouldn't the kasan zero shadow mapping be performed first, before
> accessing/initializing the memmap via page_init_poison()? If that is

Likely the existing way. The first access to the poisoned memmap should 
be a write, not a read. But I didn't look into the details.

Can you try reproducing?

> true, then it is a problem for all architectures and could be
> fixed like:
> 
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 7a5fc89a8652..eb3975740537 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1093,6 +1093,7 @@ int mhp_init_memmap_on_memory(unsigned long pfn, unsigned long nr_pages,
>   	if (ret)
>   		return ret;
> 
> +	page_init_poison(pfn_to_page(pfn), sizeof(struct page) * nr_pages);
>   	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_UNMOVABLE);
> 
>   	for (i = 0; i < nr_pages; i++)
> diff --git a/mm/sparse.c b/mm/sparse.c
> index 77d91e565045..4ddf53f52075 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -906,8 +906,11 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn,
>   	/*
>   	 * Poison uninitialized struct pages in order to catch invalid flags
>   	 * combinations.
> +	 * For altmap, do this later when onlining the memory, as it might
> +	 * not be accessible at this point.
>   	 */
> -	page_init_poison(memmap, sizeof(struct page) * nr_pages);
> +	if (!altmap)
> +		page_init_poison(memmap, sizeof(struct page) * nr_pages);
> 
>   	ms = __nr_to_section(section_nr);
>   	set_section_nid(section_nr, nid);
> 

That's too generic when it comes to other altmap users, especially DAX, 
or when the altmap is accessible while memory is offlining and we want 
to catch invalid accesses.

> 
> 
> Also, if this approach is taken, should page_init_poison() be performed
> with cond_resched() as mentioned in commit d33695b16a9f
> ("mm/memory_hotplug: poison memmap in remove_pfn_range_from_zone()") ?

Likely as soon as possible after it is accessible :)


-- 
Cheers,

David / dhildenb
