Message-ID: <06E9A7A4-AEF9-43CC-8AD5-B7B6263ADF73@nvidia.com>
Date:   Tue, 10 May 2022 10:30:33 -0400
From:   Zi Yan <ziy@...dia.com>
To:     kernel test robot <oliver.sang@...el.com>
Cc:     Johannes Weiner <hannes@...xchg.org>,
        kernel test robot <lkp@...el.com>,
        Christophe Leroy <christophe.leroy@...roup.eu>,
        David Hildenbrand <david@...hat.com>,
        Eric Ren <renzhengeek@...il.com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Mike Rapoport <rppt@...ux.ibm.com>,
        Minchan Kim <minchan@...nel.org>,
        Oscar Salvador <osalvador@...e.de>,
        Vlastimil Babka <vbabka@...e.cz>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org
Subject: Re: [mm] 23e12fc477: UBSAN:shift-out-of-bounds_in_mm/page_isolation.c

Hi kernel test robot,

A fixup patch for this commit is available: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-make-alloc_contig_range-work-at-pageblock-granularity-fix.patch
I verified that it fixes the issue by following the reproduction steps below; the boot no longer hangs.
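
For context on the UBSAN report quoted below ("shift exponent 64 is too large for 64-bit type 'unsigned long'"): in C, shifting a value by an amount greater than or equal to the width of its type is undefined behavior. The following is only a generic sketch of that hazard and one way to guard against it. It is not the code at mm/page_isolation.c:416 and not the content of the fixup patch; ULONG_BITS and safe_shl() are hypothetical names made up for this illustration.

```c
#include <limits.h>

/* Width of unsigned long in bits (64 on x86_64 Linux). Hypothetical macro. */
#define ULONG_BITS (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper, not a kernel API: shift left, but return 0 instead
 * of invoking undefined behavior when the shift count reaches or exceeds
 * the width of the type. A plain "value << shift" with shift == 64 on a
 * 64-bit unsigned long is exactly what UBSAN flags as shift-out-of-bounds.
 */
static unsigned long safe_shl(unsigned long value, unsigned int shift)
{
	if (shift >= ULONG_BITS)
		return 0;
	return value << shift;
}
```

With this guard, safe_shl(1UL, 3) evaluates to 8, and safe_shl(1UL, ULONG_BITS) evaluates to 0 rather than triggering the undefined shift. Whether returning 0 is the right fallback depends on the caller; the point is only that the shift count has to be range-checked before the shift is performed.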

--
Best Regards,
Yan, Zi

On 10 May 2022, at 5:58, kernel test robot wrote:

> Greeting,
>
> FYI, we noticed the following commit (built with clang-15):
>
> commit: 23e12fc477f1c2729af51c427087e777d6e63803 ("mm: make alloc_contig_range work at pageblock granularity")
> https://github.com/hnaz/linux-mm master
>
> in testcase: boot
>
> on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
>
> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
>
>
>
> If you fix the issue, kindly add following tag
> Reported-by: kernel test robot <oliver.sang@...el.com>
>
>
> [  103.625478][    T1] ================================================================================
> [  103.628487][    T1] UBSAN: shift-out-of-bounds in mm/page_isolation.c:416:17
> [  103.631041][    T1] shift exponent 64 is too large for 64-bit type 'unsigned long'
> [  103.633539][    T1] CPU: 0 PID: 1 Comm: swapper Not tainted 5.18.0-rc4-mm1-00249-g23e12fc477f1 #1 4cafac2312e666eae49f8458f1d93cbe9d5338b2
> [  103.637394][    T1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
> [  103.640378][    T1] Call Trace:
> [  103.641583][    T1]  <TASK>
> [  103.642670][    T1]  __ubsan_handle_shift_out_of_bounds+0x356/0x3a0
> [  103.644703][    T1]  isolate_single_pageblock+0x683/0x870
> [  103.646498][    T1]  start_isolate_page_range+0x69/0xb10
> [  103.648349][    T1]  alloc_contig_range+0x27b/0x680
> [  103.650010][    T1]  alloc_contig_pages+0x413/0x550
> [  103.651549][    T1]  debug_vm_pgtable_alloc_huge_page+0x27/0xc1
> [  103.653486][    T1]  init_args+0xa5f/0xe06
> [  103.654924][    T1]  ? __hugetlb_cgroup_file_legacy_init+0x61f/0x61f
> [  103.656949][    T1]  debug_vm_pgtable+0x56/0x3e0
> [  103.658484][    T1]  ? __hugetlb_cgroup_file_legacy_init+0x61f/0x61f
> [  103.660556][    T1]  do_one_initcall+0x2bd/0x740
> [  103.662132][    T1]  ? __hugetlb_cgroup_file_legacy_init+0x61f/0x61f
> [  103.664179][    T1]  ? __llvm_gcov_reset+0x740/0x1320
> [  103.665837][    T1]  do_initcall_level+0x13c/0x284
> [  103.667460][    T1]  do_initcalls+0x75/0xb7
> [  103.668995][    T1]  kernel_init_freeable+0x158/0x1f6
> [  103.670678][    T1]  ? rest_init+0x2f0/0x2f0
> [  103.672143][    T1]  kernel_init+0x18/0x2a0
> [  103.673544][    T1]  ? rest_init+0x2f0/0x2f0
> [  103.675026][    T1]  ret_from_fork+0x22/0x30
> [  103.676494][    T1]  </TASK>
> [  103.677587][    T1] ================================================================================
> [  140.018114][    C0] BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 32s!
> [  140.021174][    C0] Showing busy workqueues and worker pools:
> [  140.022912][    C0] workqueue events_power_efficient: flags=0x80
> [  140.024730][    C0]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=6/256 refcnt=7
> [  140.024759][    C0]     pending: neigh_managed_work, neigh_managed_work, neigh_managed_work, neigh_periodic_work, neigh_periodic_work, neigh_periodic_work
>
>
>
>
> To reproduce:
>
>         # build kernel
>         cd linux
>         cp config-5.18.0-rc4-mm1-00249-g23e12fc477f1 .config
>         make HOSTCC=clang-15 CC=clang-15 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage modules
>         make HOSTCC=clang-15 CC=clang-15 ARCH=x86_64 INSTALL_MOD_PATH=<mod-install-dir> modules_install
>         cd <mod-install-dir>
>         find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz
>
>
>         git clone https://github.com/intel/lkp-tests.git
>         cd lkp-tests
>         bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email
>
>         # if come across any failure that blocks the test,
>         # please remove ~/.lkp and /lkp dir to run from a clean state.
>
>
>
> -- 
> 0-DAY CI Kernel Test Service
> https://01.org/lkp
