linux-kernel - Re: [lkp] [x86/mtrr] edfe63ec97: kernel BUG at arch/x86/mm/physaddr.c:79!

Message-ID: <1459539880.3085.32.camel@hpe.com>
Date:	Fri, 01 Apr 2016 13:44:40 -0600
From:	Toshi Kani <toshi.kani@....com>
To:	kernel test robot <ying.huang@...ux.intel.com>
Cc:	lkp@...org, linux-kernel@...r.kernel.org,
	Toshi Kani <toshi.kani@....com>,
	Peter Zijlstra <peterz@...radead.org>,
	"Luis R.Rodriguez" <mcgrof@...e.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Juergen Gross <jgross@...e.com>,
	"H.Peter Anvin" <hpa@...or.com>,
	Denys Vlasenko <dvlasenk@...hat.com>,
	Brian Gerst <brgerst@...il.com>, Borislav Petkov <bp@...e.de>,
	Borislav Petkov <bp@...en8.de>,
	Andy Lutomirski <luto@...capital.net>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...nel.org>
Subject: Re: [lkp] [x86/mtrr] edfe63ec97: kernel BUG at
 arch/x86/mm/physaddr.c:79!

On Fri, 2016-04-01 at 11:05 +0800, kernel test robot wrote:
> FYI, we noticed the below changes on
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/mm
> commit edfe63ec97ed8d4496225f7ba54c9ce4207c5431 ("x86/mtrr: Fix Xorg
> crashes in Qemu sessions")
> 
> 
> [   10.429879] hgafb: HGA card not detected.
> [   10.430521] hgafb: probe of hgafb.0 failed with error -22
> [   10.434199] ------------[ cut here ]------------
> [   10.434889] kernel BUG at arch/x86/mm/physaddr.c:79!
> [   10.435784] invalid opcode: 0000 [#1] DEBUG_PAGEALLOC 
> [   10.436627] CPU: 0 PID: 117 Comm: v86d Not tainted 4.6.0-rc1-00015-
> gedfe63e #1
> [   10.437696] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS Debian-1.8.2-1 04/01/2014
> [   10.438929] task: cf91d900 ti: cf8fa000 task.ti: cf8fa000
> [   10.439664] EIP: 0060:[<c1033290>] EFLAGS: 00010206 CPU: 0
> [   10.440409] EIP is at __phys_addr+0x80/0x90
> [   10.441022] EAX: 13fe0000 EBX: 13fe0000 ECX: 00000000 EDX: 13fe0000
> [   10.441975] ESI: 00000000 EDI: 00000000 EBP: cf8fbe4c ESP: cf8fbe48
> [   10.442804]  DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
> [   10.443534] CR0: 80050033 CR2: 08063e48 CR3: 0f9f8f20 CR4: 000006b0
> [   10.444362] Stack:
> [   10.444772]  cf9e4dfc cf8fbe60 c1031eef 00000001 00001000 00000000
> cf8fbea8 c15952d1
> [   10.446322]  cf9e4dfc d3a23518 c10fce12 024080c0 024080c0 d2b05c80
> 00000000 00000000
> [   10.447870]  d15da220 cf9e4dd8 00001000 00001000 00000000 cf9ed790
> b7752000 cf9ed788
> [   10.449424] Call Trace:
> [   10.449877]  [<c1031eef>] phys_mem_access_prot_allowed+0xaf/0xf0
> [   10.450670]  [<c15952d1>] mmap_mem+0xa1/0x170
> [   10.451308]  [<c10fce12>] ? mmap_region+0x242/0x510
> [   10.451993]  [<c10fce9a>] mmap_region+0x2ca/0x510
> [   10.452657]  [<c10fd30d>] do_mmap+0x22d/0x300
> [   10.453313]  [<c10e7d74>] vm_mmap_pgoff+0x54/0x80
> [   10.453985]  [<c10fb211>] SyS_mmap_pgoff+0xa1/0x100
> [   10.454665]  [<c10013c3>] do_int80_syscall_32+0x63/0x150
> [   10.455396]  [<c1b2684e>] entry_INT80_32+0x36/0x36

In short, this is a bug in code that was previously (and unintentionally)
dead.

After commit edfe63ec97, PAT is now properly disabled when MTRRs are
disabled.  This brought the following dead code back to life on x86/32.

phys_mem_access_prot_allowed()
 :
#ifdef CONFIG_X86_32
        /*
         * On the PPro and successors, the MTRRs are used to set
         * memory types for physical addresses outside main memory,
         * so blindly setting UC or PWT on those pages is wrong.
         * For Pentiums and earlier, the surround logic should disable
         * caching for the high addresses through the KEN pin, but
         * we maintain the tradition of paranoia in this code.
         */
        if (!pat_enabled() &&
            !(boot_cpu_has(X86_FEATURE_MTRR) ||
              boot_cpu_has(X86_FEATURE_K6_MTRR) ||
              boot_cpu_has(X86_FEATURE_CYRIX_ARR) ||
              boot_cpu_has(X86_FEATURE_CENTAUR_MCR)) &&
            (pfn << PAGE_SHIFT) >= __pa(high_memory)) {
                pcm = _PAGE_CACHE_MODE_UC;
        }
#endif

When the system does not have much memory, 'high_memory' points to the
maximum memory address + 1, which is not backed by any memory.  When
CONFIG_DEBUG_VIRTUAL is also set, __pa() calls __phys_addr(), which in
turn calls slow_virt_to_phys() on high_memory.  Because high_memory does
not point to a valid memory address, this address is not mapped, and the
BUG_ON() in __phys_addr() fires.
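
To illustrate the failure mode, here is a minimal userspace model in C.
It is not kernel code: MODEL_PAGE_OFFSET, MODEL_LOWMEM_BYTES and all
model_*() functions are hypothetical names with example values, and they
only mimic the idea of a debug-checked translation.  The linear-map
arithmetic works for any address, but the extra "is this address mapped?"
check rejects the one-past-the-end address.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical example values, not taken from any real system. */
#define MODEL_PAGE_OFFSET  0xC0000000u  /* start of the modeled linear map */
#define MODEL_LOWMEM_BYTES 0x13FE0000u  /* size of the modeled low memory  */

/* True if vaddr lies inside the modeled linear map [OFFSET, OFFSET + SIZE). */
static bool model_is_mapped(uint32_t vaddr)
{
    return vaddr >= MODEL_PAGE_OFFSET &&
           vaddr - MODEL_PAGE_OFFSET < MODEL_LOWMEM_BYTES;
}

/* Pure arithmetic translation, in the spirit of __pa_nodebug(). */
static uint32_t model_pa_nodebug(uint32_t vaddr)
{
    return vaddr - MODEL_PAGE_OFFSET;
}

/*
 * Checked translation, in the spirit of __pa() with CONFIG_DEBUG_VIRTUAL.
 * Returns false where the real kernel would hit BUG_ON().
 */
static bool model_pa_checked(uint32_t vaddr, uint32_t *paddr)
{
    if (!model_is_mapped(vaddr))
        return false;
    *paddr = model_pa_nodebug(vaddr);
    return true;
}

/* One past the last mapped byte, like high_memory as described above. */
static uint32_t model_high_memory(void)
{
    return MODEL_PAGE_OFFSET + MODEL_LOWMEM_BYTES;
}
```

In this model, model_pa_checked(model_high_memory(), ...) fails, while
the same call on model_high_memory() - 1 succeeds, and
model_pa_nodebug(model_high_memory()) simply returns the boundary value.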

This can be fixed by changing the __pa(high_memory) call to
either __pa(high_memory - 1) or __pa_nodebug(high_memory).  Since the
code does not expect a valid virtual address for high_memory, I think
using __pa_nodebug() is appropriate here.  I am going to send a patch
with this change.
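
The two options yield different values, so it is worth checking that the
comparison against (pfn << PAGE_SHIFT) would come out the same either
way.  Below is a small userspace check in C, assuming 4 KiB pages and a
page-aligned, nonzero boundary.  SKETCH_PAGE_SHIFT,
above_boundary_nodebug(), above_boundary_minus_one() and variants_agree()
are hypothetical names, not kernel symbols.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u  /* assumption: 4 KiB pages */

/* Models (pfn << PAGE_SHIFT) >= __pa_nodebug(high_memory). */
static bool above_boundary_nodebug(uint64_t pfn, uint64_t high_phys)
{
    return (pfn << SKETCH_PAGE_SHIFT) >= high_phys;
}

/* Models (pfn << PAGE_SHIFT) >= __pa(high_memory - 1). */
static bool above_boundary_minus_one(uint64_t pfn, uint64_t high_phys)
{
    return (pfn << SKETCH_PAGE_SHIFT) >= high_phys - 1;
}

/* True if both variants give the same answer for every pfn in [0, max_pfn]. */
static bool variants_agree(uint64_t max_pfn, uint64_t high_phys)
{
    for (uint64_t pfn = 0; pfn <= max_pfn; pfn++) {
        if (above_boundary_nodebug(pfn, high_phys) !=
            above_boundary_minus_one(pfn, high_phys))
            return false;
    }
    return true;
}
```

With a page-aligned, nonzero high_phys the two predicates agree for every
pfn, because the left-hand side is itself page-aligned.  They can differ
only when the boundary is not page-aligned, which this sketch assumes
does not happen for high_memory.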

Note that the code should not really be using high_memory for this
check.  I have a separate patch for the /dev/mem driver that checks
whether a target address is backed by any memory (Ingo, any update on
this one?).  I consider that an enhancement, though, so I am not going
to replace the high_memory check as part of this bug fix.
https://lkml.org/lkml/2016/2/9/935
https://lkml.org/lkml/2016/2/17/493

Thanks,
-Toshi