Message-Id: <10D18276-0485-4368-BFDE-4EC13E42AE22@lca.pw>
Date:   Wed, 22 Apr 2020 14:35:26 -0400
From:   Qian Cai <cai@....pw>
To:     Christoph Hellwig <hch@....de>
Cc:     Borislav Petkov <bp@...e.de>,
        "Peter Zijlstra (Intel)" <peterz@...radead.org>,
        x86 <x86@...nel.org>, LKML <linux-kernel@...r.kernel.org>,
        kasan-dev <kasan-dev@...glegroups.com>
Subject: Re: AMD boot woe due to "x86/mm: Cleanup pgprot_4k_2_large() and
 pgprot_large_2_4k()"



> On Apr 22, 2020, at 1:01 PM, Christoph Hellwig <hch@....de> wrote:
> 
> On Wed, Apr 22, 2020 at 11:55:54AM -0400, Qian Cai wrote:
>> Reverted the linux-next commit and its dependency,
>> 
>> a85573f7e741 ("x86/mm: Unexport __cachemode2pte_tbl")
>> 9e294786c89a ("x86/mm: Cleanup pgprot_4k_2_large() and pgprot_large_2_4k()")
>> 
>> fixed crashes or hard reset on AMD machines during boot that have been flagged by
>> KASAN in different forms indicating some sort of memory corruption with this config,
> 
> Interesting.  Your config seems to boot fine in my VM until the point
> where the lack of virtio-blk support stops it from mounting the root
> file system.
> 
> Looking at the patch I found one bug, although that should not affect
> your config (it should use the pgprotval_t type), and one difference
> that could affect code generation, although I prefer the new version
> (use of __pgprot vs a local variable + pgprot_val()).
> 
> Two patches attached, can you try them?
> <0001-x86-Use-pgprotval_t-in-protval_4k_2_large-and-pgprot.patch><0002-foo.patch>
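
The type remark above can be illustrated with a small user-space sketch. It is not the kernel code and not the attached patch: the `sk_` names are hypothetical, and the bit positions (PAT at bit 7 for 4k entries, bit 12 for large entries, NX at bit 63) are assumed from the x86 page-table layout. It shows why a conversion helper would want the full-width protection type: if the value passes through a narrower integer (as could happen where `unsigned long` is 32 bits but the protection value is 64 bits), high bits such as NX are silently dropped. On a 64-bit build the two types have the same width, which would be consistent with the remark that this bug should not affect the reported config.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's pgprotval_t (assumed 64-bit). */
typedef uint64_t sk_pgprotval_t;

/* Assumed bit positions, mirroring the x86 page-table layout. */
#define SK_PAGE_BIT_PAT        7   /* PAT bit in a 4k entry */
#define SK_PAGE_BIT_PAT_LARGE  12  /* PAT bit in a large-page entry */
#define SK_PAGE_PAT        ((sk_pgprotval_t)1 << SK_PAGE_BIT_PAT)
#define SK_PAGE_PAT_LARGE  ((sk_pgprotval_t)1 << SK_PAGE_BIT_PAT_LARGE)
#define SK_PAGE_NX         ((sk_pgprotval_t)1 << 63)

/* Move the PAT bit from its 4k position to its large-page position,
 * preserving all other bits. Uses the full-width type throughout. */
static inline sk_pgprotval_t sk_4k_2_large(sk_pgprotval_t val)
{
	return (val & ~(SK_PAGE_PAT | SK_PAGE_PAT_LARGE)) |
	       ((val & SK_PAGE_PAT) << (SK_PAGE_BIT_PAT_LARGE - SK_PAGE_BIT_PAT));
}

/* Same conversion, but the input is first squeezed through a 32-bit
 * temporary, so every bit above bit 31 (including NX) is lost. */
static inline sk_pgprotval_t sk_4k_2_large_narrow(sk_pgprotval_t val)
{
	uint32_t narrow = (uint32_t)val; /* truncation happens here */

	return sk_4k_2_large(narrow);
}

/* Small self-check showing the difference between the two helpers. */
static inline void sk_selfcheck(void)
{
	sk_pgprotval_t in = SK_PAGE_PAT | SK_PAGE_NX;

	assert(sk_4k_2_large(in) == (SK_PAGE_PAT_LARGE | SK_PAGE_NX));
	assert(sk_4k_2_large_narrow(in) == SK_PAGE_PAT_LARGE);
}
```

Calling `sk_selfcheck()` from a test program exercises both paths: the full-width helper keeps NX, the narrowed one does not.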

Yes, but neither patch helps here. This time the failure is flagged by UBSAN:

static void dump_pagetable(unsigned long address)
{
        pgd_t *base = __va(read_cr3_pa());
        pgd_t *pgd = base + pgd_index(address); /* <-- shift-out-of-bounds here */

[    4.452663][    T0] ACPI: LAPIC_NMI (acpi_id[0x73] high level lint[0x1])
[    4.459391][    T0] ACPI: LAPIC_NMI (acpi_id[0x74] high level lint[0x1])
[    4.466115][    T0] ACPI: LAPIC_NMI (acpi_id[0x75] high level lint[0x1])
[    4.472842][    T0] ACPI: LAPIC_NMI (acpi_id[0x76] high level lint[0x1])
[    4.479567][    T0] ACPI: LAPIC_NMI (acpi_id[0x77] high level lint[0x1])
[    4.486294][    T0] ACPI: LAPIC_NMI (acpi_id[0x78] high level lint[0x1])
[    4.493021][    T0] ACPI: LAPIC_NMI (acpi_id[0x79] high level lint[0x1])
[    4.499745][    T0] ACPI: LAPIC_NMI (acpi_id[0x7a] high level lint[0x1])
[    4.506471][    T0] ACPI: LAPIC_NMI (acpi_id[0x7b] high level li[...]
[...]ad access in kernel mode
[    4.901030][    T0] #PF: error_code(0x0000) - not-present page
[    4.906884][    T0] BUG: unable to handle page fault for address: ffffed11509c29da
[    4.914483][    T0] #PF: supervisor read access in kernel mode
[    4.920334][    T0] #PF: error_code(0x0000) - not-present page
[    4.926189][    T0] BUG: unable to handle page fault for address: ffffed11509c29da
[    4.933786][    T0] #PF: supervisor read access in kernel mode
[    4.939640][    T0] #PF: error_code(0x0000) - not-present page
[    4.945492][    T0] BUG: unable to handle page fault for address: ffffed11509c29da
[    4.953091][    T0] #PF: supervisor read access in kernel mode
[    4.958943][    T0] #PF: error_code(0x0000) - not-present page
[    4.964797][    T0] BUG: unable to handle page fault for address: ffffed11509c29da
[    4.972395][    T0] #PF: supervisor read access in kernel mode
[    4.978247][    T0] #PF: error_code(0x0000) - not-present page
[    4.984102][    T0] BUG: unable to handle page fault for address: ffffed11509c29da
[    4.9917[...]
[...]age fault for address: ffffed11509c29da
[    5.481007][    T0] #PF: supervisor read access in kernel mode
[    5.486862][    T0] #PF: error_code(0x0000) - not-present page
[    5.492713][    T0] BUG: unable to handle page fault for address: ffffed11509c29da
[    5.500314][    T0] #PF: supervisor read access in kernel mode
[    5.506165][    T0] #PF: error_code(0x0000) - not-present page
[    5.512020][    T0] ================================================================================
[    5.521193][    T0] UBSAN: shift-out-of-bounds in arch/x86/mm/fault.c:450:22
[    5.528268][    T0] shift exponent 4294967295 is too large for 64-bit type 'long unsigned int'
[    5.536916][    T0] CPU: 0 PID: 0 Comm: swapper Tainted: G    B             5.7.0-rc2-next-20200422+ #10
[    5.546434][    T0] Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
[    5.555692][    T0] Call Trace:
[    5.558837][    T0] ================================================================================
[    5.568012][T0] BUG: unable to handle page fault for address: 0000000a2b84dda8
[    5.961699][    T0] #PF: supervisor read access in kernel mode
[    5.967550][    T0] #PF: error_code(0x0000) - not-present page
[    5.973405][    T0] BUG: unable to handle page fault for address: 0000000a2b84dda8
[    5.981005][    T0] #PF: supervisor read access in kernel mode
[    5.986856][    T0] #PF: error_code(0x0000) - not-present page
[    5.992708][    T0] BUG: unable to handle page fault for address: 0000000a2b84dda8
[    6.000308][    T0] #PF: supervisor read access in kernel mode
[    6.006159][    T0] #PF: error_code(0x0000) - not-present page
[    6.012013][    T0] BUG: unable to handle page fault for address: 0000000a2b84dda8
[    6.019612][    T0] #PF: supervisor read access in kernel mode
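
For reference, the shift exponent UBSAN reports, 4294967295, equals UINT32_MAX, which is what -1 looks like as a 32-bit unsigned value. The log does not say where that value came from. The following user-space sketch only shows why such an exponent is invalid for a 64-bit operand and how an index computation of the form `(address >> shift) & (entries - 1)` can be guarded. The `sk_` names and the 512-entry table size are assumptions for illustration, not the kernel's `pgd_index()`.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed: 512 entries in the top-level page table. */
#define SK_PTRS_PER_PGD 512u

/*
 * Compute (address >> shift) & (SK_PTRS_PER_PGD - 1), but only when the
 * shift is smaller than the width of the operand. Shifting a 64-bit value
 * by 64 or more is undefined behaviour in C, which is what UBSAN flags.
 * Returns false and leaves *index untouched when the shift is invalid.
 */
static inline bool sk_pgd_index(uint64_t address, unsigned int shift,
				unsigned int *index)
{
	if (shift >= sizeof(address) * CHAR_BIT)
		return false;

	*index = (unsigned int)((address >> shift) & (SK_PTRS_PER_PGD - 1));
	return true;
}

/* Self-check: a plausible shift works, the reported exponent is rejected. */
static inline void sk_shift_selfcheck(void)
{
	unsigned int idx = 0;

	/* 39 is assumed here as a typical 4-level top-level shift. */
	assert(sk_pgd_index((uint64_t)3 << 39, 39, &idx) && idx == 3);

	/* The exponent from the UBSAN report is far out of range. */
	assert(!sk_pgd_index(0xffffed11509c29daULL, UINT32_MAX, &idx));
}
```

With a shift of 39, the faulting address from the log, ffffed11509c29da, would map to index 474 under these assumptions; with the reported exponent the helper refuses to shift at all.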
