Message-ID: <87bjzalhzc.fsf@draig.linaro.org>
Date: Wed, 23 Oct 2024 20:47:03 +0100
From: Alex Bennée <alex.bennee@...aro.org>
To: "Arnd Bergmann" <arnd@...db.de>
Cc: "Naresh Kamboju" <naresh.kamboju@...aro.org>,  "open list"
 <linux-kernel@...r.kernel.org>,  "Linux ARM"
 <linux-arm-kernel@...ts.infradead.org>,  lkft-triage@...ts.linaro.org,
  "Linux Regressions" <regressions@...ts.linux.dev>,
  qemu-devel@...gnu.org,  "Mark Brown" <broonie@...nel.org>,  "Catalin
 Marinas" <catalin.marinas@....com>,  "Aishwarya TCV"
 <Aishwarya.TCV@....com>,  "Peter Maydell" <peter.maydell@...aro.org>,
  "Anders Roxell" <anders.roxell@...aro.org>,  "Vincenzo Frascino"
 <vincenzo.frascino@....com>,  "Thomas Gleixner" <tglx@...utronix.de>,
  "Geert Uytterhoeven" <geert@...ux-m68k.org>
Subject: Re: Qemu v9.0.2: Boot failed qemu-arm with Linux next-20241017 tag.

"Arnd Bergmann" <arnd@...db.de> writes:

> On Sun, Oct 20, 2024, at 17:39, Naresh Kamboju wrote:
>> On Fri, 18 Oct 2024 at 12:35, Naresh Kamboju <naresh.kamboju@...aro.org> wrote:
>>>
>>> The QEMU-ARMv7 boot has failed with the Linux next-20241017 tag.
>>> The boot log is incomplete, and no kernel crash was detected.
>>> However, the system did not proceed far enough to reach the login prompt.
>>>
>
>> Anders bisected this boot regression and found,
>> # first bad commit:
>>   [efe8419ae78d65e83edc31aad74b605c12e7d60c]
>>     vdso: Introduce vdso/page.h
>>
>> We are investigating the reason for boot failure due to this commit.
>
> Anders and I did the analysis on this, the problem turned out
> to be the early_init_dt_add_memory_arch() function in
> drivers/of/fdt.c, which does bitwise operations on PAGE_MASK
> with a 'u64' instead of phys_addr_t:
>
> void __init __weak early_init_dt_add_memory_arch(u64 base, u64 size)
> {
>         const u64 phys_offset = MIN_MEMBLOCK_ADDR;
>  
>         if (size < PAGE_SIZE - (base & ~PAGE_MASK)) {
>                 pr_warn("Ignoring memory block 0x%llx - 0x%llx\n",
>                         base, base + size);
>                 return;
>         }
>
>         if (!PAGE_ALIGNED(base)) {
>                 size -= PAGE_SIZE - (base & ~PAGE_MASK);
>                 base = PAGE_ALIGN(base);
>         }
>
> On non-LPAE arm32, this broke the existing behavior for
> large 32-bit memory sizes. The obvious fix is to change
> back the PAGE_MASK definition for 32-bit arm to a signed
> number.

Agreed. However, I think the old behaviour was masking an issue with how
QEMU was being invoked, namely that:

    /* Actual RAM size depends on initial RAM and device memory settings */
    [VIRT_MEM] =                { GiB, LEGACY_RAMLIMIT_BYTES },

And:

  -m 4G

make no sense without ARM_LPAE (which the kernel didn't have enabled), but if you
pass -machine virt,gic-version=3,highmem=off (the default changed a while
back) you will get a warning:

  qemu-system-arm: Addressing limited to 32 bits, but memory exceeds it by 1073741824 bytes

but I guess that didn't trigger for some reason before this patch?

> mips32, ppc32 and hexagon had the same definition as
> well, so I think we should change at least those in order
> to restore the previous behavior in case they are affected
> by the same bug (or a different one).
>
> x86-32 and arc got flipped the other way by the patch,
> from unsigned to signed, when CONFIG_ARC_HAS_PAE40
> or CONFIG_X86_PAE are set. I think we should keep
> the 'signed' behavior as this was a bugfix by itself,
> but we may want to change arc and x86-32 with short
> phys_addr_t the same way for consistency.
>
> On csky, m68k, microblaze, nios2, openrisc, parisc32,
> riscv32, sh, sparc32, um and xtensa, we've always used
> the 'unsigned' PAGE_MASK, and there is no 64-bit
> phys_addr_t, so I would lean towards staying with
> 'unsigned' in order to not introduce a regression.
> Alternatively we could choose to go with the 'signed'
> version on all 32-bit architectures unconditionally
> for consistency. Any preferences?
>
>       Arnd
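
For anyone following along, the signed vs. unsigned difference can be
reproduced in userspace. This is only a sketch: it assumes the function
goes on to AND a u64 with PAGE_MASK directly (that step is not in the
excerpt quoted above), it emulates the 32-bit mask types with
int32_t/uint32_t, and the helper names are made up for illustration.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12

/* Hypothetical helper: a signed 32-bit mask is sign-extended when
 * combined with a 64-bit value, so the upper 32 bits survive. */
uint64_t mask_with_signed(uint64_t val)
{
	int32_t page_mask = ~((INT32_C(1) << PAGE_SHIFT) - 1);   /* -4096 */

	return val & page_mask;   /* mask becomes 0xfffffffffffff000 */
}

/* Hypothetical helper: an unsigned 32-bit mask is zero-extended, so
 * the AND silently clears the upper 32 bits of the 64-bit value. */
uint64_t mask_with_unsigned(uint64_t val)
{
	uint32_t page_mask = ~((UINT32_C(1) << PAGE_SHIFT) - 1); /* 0xfffff000 */

	return val & page_mask;   /* mask becomes 0x00000000fffff000 */
}

/* Runs both variants on a 4 GiB size, i.e. the "-m 4G" case. */
void page_mask_demo(void)
{
	uint64_t size = UINT64_C(4) << 30;

	assert(mask_with_signed(size) == size);
	assert(mask_with_unsigned(size) == 0);

	printf("signed:   0x%" PRIx64 "\n", mask_with_signed(size));
	printf("unsigned: 0x%" PRIx64 "\n", mask_with_unsigned(size));
}
```

Calling page_mask_demo() from a main() shows the 4 GiB size kept intact
with the signed mask and reduced to zero with the unsigned one, which
would be consistent with memory going missing on non-LPAE arm32.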

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro
