Message-ID: <ZiAkRMUfiPDUGPdL@kernel.org>
Date: Wed, 17 Apr 2024 22:34:28 +0300
From: Mike Rapoport <rppt@...nel.org>
To: Nam Cao <namcao@...utronix.de>
Cc: Matthew Wilcox <willy@...radead.org>,
	Björn Töpel <bjorn@...nel.org>,
	Christian Brauner <brauner@...nel.org>,
	Andreas Dilger <adilger@...ger.ca>,
	Al Viro <viro@...iv.linux.org.uk>,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>,
	Jan Kara <jack@...e.cz>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	linux-riscv@...ts.infradead.org, Theodore Ts'o <tytso@....edu>,
	Ext4 Developers List <linux-ext4@...r.kernel.org>,
	Conor Dooley <conor@...nel.org>,
	Anders Roxell <anders.roxell@...aro.org>,
	Alexandre Ghiti <alex@...ti.fr>
Subject: Re: riscv32 EXT4 splat, 6.8 regression?

On Wed, Apr 17, 2024 at 12:36:39AM +0200, Nam Cao wrote:
> On 2024-04-16 Mike Rapoport wrote:
> > On Tue, Apr 16, 2024 at 06:00:29PM +0100, Matthew Wilcox wrote:
> > > On Tue, Apr 16, 2024 at 07:31:54PM +0300, Mike Rapoport wrote:
> > > > > -	if (!IS_ENABLED(CONFIG_64BIT)) {
> > > > > -		max_mapped_addr = __pa(~(ulong)0);
> > > > > -		if (max_mapped_addr == (phys_ram_end - 1))
> > > > > -			memblock_set_current_limit(max_mapped_addr - 4096);
> > > > > -	}
> > > > > +	memblock_reserve(__pa(-PAGE_SIZE), PAGE_SIZE);
> > > > 
> > > > Ack.
> > > 
> > > Can this go to generic code instead of letting architecture maintainers
> > > fall over it?
> > 
> > Yes, it just has to happen before setup_arch(), where most architectures
> > enable memblock allocations.
> 
> This also works, the reported problem disappears.
> 
> However, I am confused about one thing: doesn't this make one page of
> physical memory inaccessible?
> 
> Is it better to solve this by setting max_low_pfn instead? Then at
> least the page is still accessible as high memory.

It could be if riscv had support for HIGHMEM.
 
> Best regards,
> Nam
> 
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index fa34cf55037b..6e3130cae675 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -197,7 +197,6 @@ early_param("mem", early_mem);
>  static void __init setup_bootmem(void)
>  {
>  	phys_addr_t vmlinux_end = __pa_symbol(&_end);
> -	phys_addr_t max_mapped_addr;
>  	phys_addr_t phys_ram_end, vmlinux_start;
>  
>  	if (IS_ENABLED(CONFIG_XIP_KERNEL))
> @@ -235,23 +234,9 @@ static void __init setup_bootmem(void)
>  	if (IS_ENABLED(CONFIG_64BIT))
>  		kernel_map.va_pa_offset = PAGE_OFFSET - phys_ram_base;
>  
> -	/*
> -	 * memblock allocator is not aware of the fact that last 4K bytes of
> -	 * the addressable memory can not be mapped because of IS_ERR_VALUE
> -	 * macro. Make sure that last 4k bytes are not usable by memblock
> -	 * if end of dram is equal to maximum addressable memory.  For 64-bit
> -	 * kernel, this problem can't happen here as the end of the virtual
> -	 * address space is occupied by the kernel mapping then this check must
> -	 * be done as soon as the kernel mapping base address is determined.
> -	 */
> -	if (!IS_ENABLED(CONFIG_64BIT)) {
> -		max_mapped_addr = __pa(~(ulong)0);
> -		if (max_mapped_addr == (phys_ram_end - 1))
> -			memblock_set_current_limit(max_mapped_addr - 4096);

To be strict about the conflict between mapping a page at 0xfffff000 and
IS_ERR_VALUE, memblock should not allocate that page, so
memblock_set_current_limit() should remain. It does not need the
surrounding if, though; just setting the limit for -PAGE_SIZE should do.

I suspect, though, that this call to memblock_set_current_limit() is what
caused the splat in ext4. Without that limit enforcement, the last page
would be the first one memblock allocates, and it most likely would have
ended up in the kernel page tables and would never be checked for IS_ERR.
With the limit set, that page made it to the buddy allocator and got
allocated by code that actually does IS_ERR checks.
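
To make the conflict concrete, here is a minimal userspace sketch of the
IS_ERR_VALUE comparison narrowed to 32 bits. The model_* names are
hypothetical and exist only for this illustration; the assumed values are
MAX_ERRNO = 4095 and 4K pages. It shows that every address in the last page
except its very first byte falls into the error range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model of a 32-bit kernel: addresses are plain 32-bit values. */
#define MODEL_PAGE_SIZE 4096u
#define MODEL_MAX_ERRNO 4095u	/* assumed to match the kernel's MAX_ERRNO */

/* The error range is exactly one byte short of a full page. */
static_assert(MODEL_PAGE_SIZE == MODEL_MAX_ERRNO + 1u,
	      "error range must fit in the last page");

/* Same unsigned comparison as IS_ERR_VALUE(), narrowed to 32 bits. */
static inline bool model_is_err_value(uint32_t addr)
{
	return addr >= (uint32_t)-MODEL_MAX_ERRNO;	/* >= 0xfffff001 */
}

/*
 * True if the last byte address of the page starting at 'page' would be
 * mistaken for an error code. 'page' must be page aligned.
 */
static inline bool model_page_hits_err_range(uint32_t page)
{
	uint32_t last_byte = page + (MODEL_PAGE_SIZE - 1u);

	return model_is_err_value(last_byte);
}
```

In this model, model_page_hits_err_range(0xfffff000u) is true while the
page just below it, 0xffffe000u, is unaffected. A pointer into the middle
of the last page, such as 0xfffff400, is indistinguishable from an error
code for any caller that checks IS_ERR.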

> -	}
> -
>  	min_low_pfn = PFN_UP(phys_ram_base);
> -	max_low_pfn = max_pfn = PFN_DOWN(phys_ram_end);
> +	max_pfn = PFN_DOWN(phys_ram_end);
> +	max_low_pfn = min(max_pfn, PFN_DOWN(__pa(-PAGE_SIZE)));
>  	high_memory = (void *)(__va(PFN_PHYS(max_low_pfn)));
>  
>  	dma32_phys_limit = min(4UL * SZ_1G, (unsigned long)PFN_PHYS(max_low_pfn));
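
The max_low_pfn arithmetic in the hunk above can be worked through with a
small userspace sketch. Everything named model_* is hypothetical, and the
layout (PAGE_OFFSET 0xc0000000, RAM starting at 0x80000000, 4K pages) is
only an assumed example, not something stated in this thread.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12u
#define MODEL_PAGE_SIZE  (1u << MODEL_PAGE_SHIFT)

static_assert(MODEL_PAGE_SIZE == 4096u, "sketch assumes 4K pages");

/* Assumed 32-bit linear map: VA = PA - phys_ram_base + page_offset. */
struct model_layout {
	uint32_t page_offset;	/* virtual base of the linear map */
	uint64_t phys_ram_base;	/* first byte of RAM */
	uint64_t phys_ram_end;	/* one past the last byte of RAM */
};

/* Model of __pa() for a linear-map virtual address. */
static inline uint64_t model_pa(const struct model_layout *l, uint32_t va)
{
	return (uint64_t)(va - l->page_offset) + l->phys_ram_base;
}

/* Model of PFN_DOWN(). */
static inline uint64_t model_pfn_down(uint64_t pa)
{
	return pa >> MODEL_PAGE_SHIFT;
}

/* min(max_pfn, PFN_DOWN(__pa(-PAGE_SIZE))) as in the proposed hunk. */
static inline uint64_t model_max_low_pfn(const struct model_layout *l)
{
	uint64_t max_pfn = model_pfn_down(l->phys_ram_end);
	uint64_t last_va_pfn =
		model_pfn_down(model_pa(l, (uint32_t)-MODEL_PAGE_SIZE));

	return max_pfn < last_va_pfn ? max_pfn : last_va_pfn;
}
```

With 1G of RAM in this assumed layout (phys_ram_end = 0xc0000000), max_pfn
is 0xc0000 but max_low_pfn is clamped to 0xbffff, so the page that would be
mapped at 0xfffff000 is excluded from low memory. With 512M of RAM
(phys_ram_end = 0xa0000000) the clamp has no effect and max_low_pfn stays
at 0xa0000.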

-- 
Sincerely yours,
Mike.
