lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 16 Sep 2022 11:59:14 +0800
From:   Yujie Liu <yujie.liu@...el.com>
To:     Dave Hansen <dave.hansen@...ux.intel.com>
CC:     <linux-kernel@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Andy Lutomirski <luto@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        "Ingo Molnar" <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        <x86@...nel.org>, "H. Peter Anvin" <hpa@...or.com>,
        lkp <lkp@...el.com>
Subject: Re: [PATCH] x86/mm: Set NX bit when making pages present

Hi Dave,

On Fri, Sep 09, 2022 at 08:27:21AM -0700, Dave Hansen wrote:
> The x86 mm code now actively refuses to create writable, executable
> mappings and warns when there is an attempt to create one.
> 
> 0day ran across a case triggered by module unloading, but that looks
> to be a generic problem.  It presumably goes like this:
> 
> 	1. Load module with direct map, P=1,W=1,NX=1
> 	2. Map module executable, set P=1,W=0,NX=0
> 	3. Free module, land in vfree()->vm_remove_mappings()
> 	4. Set P=0 during alias processing, P=0,W=0,NX=0
> 	5. Restore kernel mapping via set_direct_map_default_noflush(),
> 	   set P=1,W=1, resulting in P=1,W=1,NX=0
> 
> That's clearly a writable, executable mapping which is a no-no.  The
> new W^X code is clearly doing its job.
> 
> Fix it by actively setting _PAGE_NX when creating writable mappings.
> 
> One concern: I haven't been able to actually reproduce this, even by
> loading and unloading the module that 0day hit it with.  I'd like to
> be able to reproduce this before committing a fix.

We followed the steps in original report and confirmed the case could be
reproduced by running in qemu. Here we put a brief log for your
information.

$ lkp-tests/bin/lkp qemu -k bzImage -m modules.cgz job-script
...
exec command: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -fsdev local,id=test_dev,path=/root/.lkp//result/rcuscale/300s-srcu/vm-snb/debian-i386-20191205.cgz/i386-randconfig-a013-20211012/gcc-11/652c5bf380ad018e15006a7f8349800245ddbbad/0,security_model=none -device virtio-9p-pci,fsdev=test_dev,mount_tag=9p/virtfs_mount -kernel bzImage -append root=/dev/ram0 RESULT_ROOT=/result/rcuscale/300s-srcu/vm-snb/debian-i386-20191205.cgz/i386-randconfig-a013-20211012/gcc-11/652c5bf380ad018e15006a7f8349800245ddbbad/3 BOOT_IMAGE=/pkg/linux/i386-randconfig-a013-20211012/gcc-11/652c5bf380ad018e15006a7f8349800245ddbbad/vmlinuz-5.19.0-00430-g652c5bf380ad branch=tip/x86/mm job=/lkp/jobs/scheduled/vm-meta-159/rcuscale-300s-srcu-debian-i386-20191205.cgz-652c5bf380ad018e15006a7f8349800245ddbbad-20220904-96286-fjjdsq-5.yaml user=lkp ARCH=i386 kconfig=i386-randconfig-a013-20211012 commit=652c5bf380ad018e15006a7f8349800245ddbbad vmalloc=256M initramfs_async=0 page_owner=on max_uptime=2100 LKP_LOCAL_RUN=1 selinux=0 debug apic=debug sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100 net.ifnames=0 printk.devkmsg=on panic=-1 softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 prompt_ramdisk=0 drbd.minor_count=8 systemd.log_level=err ignore_loglevel console=tty0 earlyprintk=ttyS0,115200 console=ttyS0,115200 vga=normal rw  ip=dhcp result_service=9p/virtfs_mount -initrd /root/.lkp/cache/final_initrd -smp 2 -m 16384M -no-reboot -watchdog i6300esb -rtc base=localtime -device e1000,netdev=net0 -netdev user,id=net0 -display none -monitor null -serial stdio -drive file=/tmp/vdisk-root/disk-vm-snb-0,media=disk,if=virtio -drive file=/tmp/vdisk-root/disk-vm-snb-1,media=disk,if=virtio -drive file=/tmp/vdisk-root/disk-vm-snb-2,media=disk,if=virtio
early console in setup code
early console in extract_kernel
input_data: 0x02c1a094
input_len: 0x010390d3
output: 0x01000000
output_len: 0x0239fed0
kernel_total_size: 0x02c6b000
needed_size: 0x02c6b000

Decompressing Linux... Parsing ELF... No relocation needed... done.
Booting the kernel.
...
LKP: ttyS0: 287: Kernel tests: Boot OK!
LKP: ttyS0: 287: HOSTNAME vm-snb, MAC 52:54:00:12:34:56, kernel 5.19.0-00430-g652c5bf380ad 1
[  OK  ] Reached target Printer.
LKP: ttyS0: 287:  /lkp/lkp/src/bin/run-lkp /lkp/jobs/scheduled/vm-meta-159/rcuscale-300s-srcu-debian-i386-20191205.cgz-652c5bf380ad018e15006a7f8349800245ddbbad-20220904-96286-fjjdsq-5.yaml
[   11.660921][  T203] ------------[ cut here ]------------
[   11.661292][  T203] CPA refuse W^X violation: 0000000000000060 -> 0000000000000063 range: 0x00000000bab7e000 - 0x00000000bab7efff PFN 7ab7e
[   11.662112][  T203] WARNING: CPU: 0 PID: 203 at arch/x86/mm/pat/set_memory.c:600 __change_page_attr+0x245/0x260
[   11.662762][  T203] Modules linked in: torture
[   11.663057][  T203] CPU: 0 PID: 203 Comm: kworker/0:3 Not tainted 5.19.0-00430-g652c5bf380ad #1
[   11.663605][  T203] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
[   11.664175][  T203] Workqueue: events do_free_init
[   11.664545][  T203] EIP: __change_page_attr+0x245/0x260
[   11.664881][  T203] Code: ff ff ff 8d 87 ff 0f 00 00 ff 75 e4 31 d2 50 8b 45 e0 57 52 31 d2 51 52 50 68 18 49 8d 42 c6 05 bc 1e 02 43 01 e8 c1 e0 e7 00 <0f> 0b 83 c4 20 e9 40 ff ff ff e8 4c 8d f2 00 8d b4 26 00 00 00 00
[   11.666092][  T203] EAX: 00000077 EBX: 7ab7e060 ECX: 42af1540 EDX: 42af153c
[   11.666526][  T203] ESI: beb69ea4 EDI: bab7e000 EBP: beb69e4c ESP: beb69e0c
[   11.666970][  T203] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 EFLAGS: 00010292
[   11.667441][  T203] CR0: 80050033 CR2: 0805ae40 CR3: 7ab70000 CR4: 00040690
[   11.667884][  T203] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[   11.668326][  T203] DR6: fffe0ff0 DR7: 00000400
[   11.668618][  T203] Call Trace:
[   11.668832][  T203]  __change_page_attr_set_clr+0x49/0x170
[   11.669178][  T203]  ? _vm_unmap_aliases+0x101/0x120
[   11.669494][  T203]  set_direct_map_default_noflush+0x49/0x60
[   11.669857][  T203]  __vunmap+0x192/0x270
[   11.670114][  T203]  __vfree+0x20/0x50
[   11.670361][  T203]  vfree+0x29/0x60
[   11.670591][  T203]  module_memfree+0x1b/0x30
[   11.670875][  T203]  do_free_init+0x2c/0x50
[   11.671142][  T203]  process_one_work+0x20c/0x480
[   11.671441][  T203]  worker_thread+0x166/0x3c0
[   11.671727][  T203]  kthread+0xbf/0xe0
[   11.671966][  T203]  ? rescuer_thread+0x310/0x310
[   11.672278][  T203]  ? kthread_complete_and_exit+0x20/0x20
[   11.672626][  T203]  ret_from_fork+0x19/0x30
[   11.672907][  T203] irq event stamp: 2183
[   11.673161][  T203] hardirqs last  enabled at (2191): [<410b8b5e>] __up_console_sem+0x6e/0x80
[   11.673706][  T203] hardirqs last disabled at (2198): [<410b8b45>] __up_console_sem+0x55/0x80
[   11.674238][  T203] softirqs last  enabled at (2164): [<41f8ed1c>] __do_softirq+0x2ac/0x3b0
[   11.674761][  T203] softirqs last disabled at (2151): [<41023525>] do_softirq_own_stack+0x25/0x30
[   11.675312][  T203] ---[ end trace 0000000000000000 ]---
[   11.677154][  T618] srcu-scale:--- Start of test: nreaders=1 nwriters=1 verbose=1 shutdown=0
[   11.677750][  T618] srcu-torture: Creating rcu_scale_reader task
[   11.684278][  T621] srcu-scale: rcu_scale_reader task started
[   11.688805][  T618] srcu-torture: Creating rcu_scale_writer task
[   11.692377][  T622] srcu-scale: rcu_scale_writer task started
[   11.692962][  T204] BUG: unable to handle page fault for address: bab7f400
[   11.693401][  T204] #PF: supervisor write access in kernel mode
[   11.693791][  T204] #PF: error_code(0x0002) - not-present page
[   11.694157][  T204] *pde = 7ab81063 *pte = 7ab7f060
[   11.694477][  T204] Oops: 0002 [#1]
...

> 
> Reported-by: kernel test robot <yujie.liu@...el.com>
> Signed-off-by: Dave Hansen <dave.hansen@...ux.intel.com>
> Cc: Peter Zijlstra (Intel) <peterz@...radead.org>
> Cc: Andy Lutomirski <luto@...nel.org>
> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: Thomas Gleixner <tglx@...utronix.de>
> Cc: Ingo Molnar <mingo@...hat.com>
> Cc: Borislav Petkov <bp@...en8.de>
> Cc: x86@...nel.org
> Cc: "H. Peter Anvin" <hpa@...or.com>
> Link: https://lore.kernel.org/all/fcf89147-440b-e478-40c9-228c9fe56691@intel.com/
> 
> --
> 
> 0day folks, please do share these as they come up.  We want to keep
> fixing them.

We sent another report at:
https://lore.kernel.org/all/202209141701.e293ea84-yujie.liu@intel.com/

The WARNING and Call Trace are slightly different from those in the
first report. Please kindly check if this is a unique one or a
duplicate. Thanks.

> ---
>  arch/x86/mm/pat/set_memory.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
> index 1a2d6376251c..5fb5874ea2c6 100644
> --- a/arch/x86/mm/pat/set_memory.c
> +++ b/arch/x86/mm/pat/set_memory.c
> @@ -2247,6 +2247,12 @@ static int __set_pages_p(struct page *page, int numpages)
>  				.mask_clr = __pgprot(0),
>  				.flags = 0};
>  
> +	/*
> +	 * Avoid W^X mappings that occur if the old
> +	 * mapping was !_PAGE_RW and !_PAGE_NX.
> +	 */
> +	pgprot_val(cpa.mask_set) |= __supported_pte_mask & _PAGE_NX;
> +
>  	/*
>  	 * No alias checking needed for setting present flag. otherwise,
>  	 * we may need to break large pages for 64-bit kernel text
> -- 
> 2.34.1
> 

We tested this patch, but found it couldn't fix the problem:

=========================================================================================
compiler/kconfig/rootfs/runtime/scale_type/tbox_group/testcase:
  gcc-11/i386-randconfig-a013-20211012/debian-i386-20191205.cgz/300s/srcu/vm-snb/rcuscale

commit:
  652c5bf380ad0 ("x86/mm: Refuse W^X violations")
  f7b4797d892f9 ("x86/mm: Set NX bit when making pages present")

652c5bf380ad018e f7b4797d892f968e0980fe820f7
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
          6:6            0%           6:6     dmesg.BUG:unable_to_handle_page_fault_for_address
          6:6            0%           6:6     dmesg.EIP:__change_page_attr
          6:6            0%           6:6     dmesg.Kernel_panic-not_syncing:Fatal_exception
          6:6            0%           6:6     dmesg.Oops:#[##]
          6:6            0%           6:6     dmesg.WARNING:at_arch/x86/mm/pat/set_memory.c:#__change_page_attr
          6:6            0%           6:6     dmesg.boot_failures

=========================================================================================
compiler/kconfig/rootfs/runtime/tbox_group/test/testcase/torture_type:
  gcc-11/i386-randconfig-a004-20211104/debian-i386-20191205.cgz/300s/vm-snb/cpuhotplug/rcutorture/tasks-tracing

commit:
  652c5bf380ad0 ("x86/mm: Refuse W^X violations")
  f7b4797d892f9 ("x86/mm: Set NX bit when making pages present")

652c5bf380ad018e f7b4797d892f968e0980fe820f7
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
          6:6            0%           6:6     dmesg.BUG:unable_to_handle_page_fault_for_address
          6:6            0%           6:6     dmesg.EIP:verify_rwx
          6:6            0%           6:6     dmesg.Kernel_panic-not_syncing:Fatal_exception
          6:6            0%           6:6     dmesg.Oops:#[##]
          6:6            0%           6:6     dmesg.WARNING:at_arch/x86/mm/pat/set_memory.c:#verify_rwx
          6:6            0%           6:6     dmesg.boot_failures

--
Best Regards,
Yujie

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ