Message-ID: <202211061521.28931f7-oliver.sang@intel.com>
Date:   Sun, 6 Nov 2022 16:14:10 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Peter Xu <peterx@...hat.com>
CC:     <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-mm@...ck.org>,
        <linux-kernel@...r.kernel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        James Houghton <jthoughton@...gle.com>,
        Miaohe Lin <linmiaohe@...wei.com>,
        David Hildenbrand <david@...hat.com>,
        Muchun Song <songmuchun@...edance.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Nadav Amit <nadav.amit@...il.com>,
        Mike Kravetz <mike.kravetz@...cle.com>, <peterx@...hat.com>,
        Rik van Riel <riel@...riel.com>
Subject: Re: [PATCH RFC 05/10] mm/hugetlb: Make walk_hugetlb_range() RCU-safe


Greetings,

FYI, we noticed WARNING:suspicious_RCU_usage due to the following commit (built with gcc-11):

commit: 8b7e3b7ca3897ebc4cb7b23c65a4618d64056e3b ("[PATCH RFC 05/10] mm/hugetlb: Make walk_hugetlb_range() RCU-safe")
url: https://github.com/intel-lab-lkp/linux/commits/Peter-Xu/mm-hugetlb-Make-huge_pte_offset-thread-safe-for-pmd-unshare/20221031-053221
base: https://git.kernel.org/cgit/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/lkml/20221030212929.335473-6-peterx@redhat.com
patch subject: [PATCH RFC 05/10] mm/hugetlb: Make walk_hugetlb_range() RCU-safe

in testcase: kernel-selftests
version: kernel-selftests-x86_64-9313ba54-1_20221017
with the following parameters:

	sc_nr_hugepages: 2
	group: vm

test-description: The kernel contains a set of "self tests" under the tools/testing/selftests/ directory. These are intended to be small unit tests to exercise individual code paths in the kernel.
test-url: https://www.kernel.org/doc/Documentation/kselftest.txt


on test machine: 12 threads, 1 socket, Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz (Coffee Lake) with 16G memory

caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):


If you fix the issue, kindly add the following tags:
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Link: https://lore.kernel.org/oe-lkp/202211061521.28931f7-oliver.sang@intel.com


kern  :warn  : [  181.942648] WARNING: suspicious RCU usage
kern  :warn  : [  181.943175] 6.1.0-rc1-00309-g8b7e3b7ca389 #1 Tainted: G S
kern  :warn  : [  181.943972] -----------------------------
kern  :warn  : [  181.944526] include/linux/rcupdate.h:364 Illegal context switch in RCU read-side critical section!
kern  :warn  : [  181.945559]
other info that might help us debug this:

kern  :warn  : [  181.946625]
rcu_scheduler_active = 2, debug_locks = 1
kern  :warn  : [  181.947473] 2 locks held by hmm-tests/9934:
kern :warn : [  181.948016] #0: ffff8884325b2d18 (&mm->mmap_lock#2){++++}-{3:3}, at: dmirror_fault (test_hmm.c:?) test_hmm
kern :warn : [  181.949129] #1: ffffffff858a7860 (rcu_read_lock){....}-{1:2}, at: walk_hugetlb_range (pagewalk.c:?) 
kern  :warn  : [  181.950161]
stack backtrace:
kern  :warn  : [  181.950780] CPU: 9 PID: 9934 Comm: hmm-tests Tainted: G S                 6.1.0-rc1-00309-g8b7e3b7ca389 #1
kern  :warn  : [  181.951863] Hardware name: Dell Inc. Vostro 3670/0HVPDY, BIOS 1.5.11 12/24/2018
kern  :warn  : [  181.952709] Call Trace:
kern  :warn  : [  181.953070]  <TASK>
kern :warn : [  181.953403] dump_stack_lvl (??:?) 
kern :warn : [  181.953890] __might_resched (??:?) 
kern :warn : [  181.954403] __mutex_lock (mutex.c:?) 
kern :warn : [  181.954886] ? validate_chain (lockdep.c:?) 
kern :warn : [  181.955405] ? hugetlb_fault (??:?) 
kern :warn : [  181.955926] ? mark_lock+0xca/0xac0 
kern :warn : [  181.956450] ? mutex_lock_io_nested (mutex.c:?) 
kern :warn : [  181.957039] ? check_prev_add (lockdep.c:?) 
kern :warn : [  181.957580] ? hugetlb_vm_op_pagesize (hugetlb.c:?) 
kern :warn : [  181.958177] ? hugetlb_fault (??:?) 
kern :warn : [  181.958690] hugetlb_fault (??:?) 
kern :warn : [  181.959199] ? find_held_lock (lockdep.c:?) 
kern :warn : [  181.959709] ? hugetlb_no_page (??:?) 
kern :warn : [  181.960255] ? __lock_release (lockdep.c:?) 
kern :warn : [  181.960772] ? lock_downgrade (lockdep.c:?) 
kern :warn : [  181.961292] ? lock_is_held_type (??:?) 
kern :warn : [  181.961830] ? handle_mm_fault (??:?) 
kern :warn : [  181.962363] handle_mm_fault (??:?) 
kern :warn : [  181.962870] ? hmm_vma_walk_hugetlb_entry (hmm.c:?) 
kern :warn : [  181.963501] hmm_vma_fault (hmm.c:?) 
kern :warn : [  181.964096] walk_hugetlb_range (pagewalk.c:?) 
kern :warn : [  181.964639] __walk_page_range (pagewalk.c:?) 
kern :warn : [  181.965160] walk_page_range (??:?) 
kern :warn : [  181.965670] ? __walk_page_range (??:?) 
kern :warn : [  181.966213] ? rcu_read_unlock (main.c:?) 
kern :warn : [  181.966718] ? lock_is_held_type (??:?) 
kern :warn : [  181.967259] ? mmu_interval_read_begin (??:?) 
kern :warn : [  181.967855] ? lock_is_held_type (??:?) 
kern :warn : [  181.968400] hmm_range_fault (??:?) 
kern :warn : [  181.968911] ? down_read (??:?) 
kern :warn : [  181.969383] ? hmm_vma_fault (??:?) 
kern :warn : [  181.969891] ? __lock_release (lockdep.c:?) 
kern :warn : [  181.970416] dmirror_fault (test_hmm.c:?) test_hmm
kern :warn : [  181.971012] ? dmirror_migrate_to_system+0x590/0x590 test_hmm
kern :warn : [  181.971847] ? find_held_lock (lockdep.c:?) 
kern :warn : [  181.972355] ? dmirror_write+0x202/0x310 test_hmm
kern :warn : [  181.973069] ? __lock_release (lockdep.c:?) 
kern :warn : [  181.973586] ? lock_downgrade (lockdep.c:?) 
kern :warn : [  181.974107] ? lock_is_held_type (??:?) 
kern :warn : [  181.974641] ? dmirror_write+0x202/0x310 test_hmm
kern :warn : [  181.975355] ? lock_release (??:?) 
kern :warn : [  181.975845] ? __mutex_unlock_slowpath (mutex.c:?) 
kern :warn : [  181.976444] ? bit_wait_io_timeout (mutex.c:?) 
kern :warn : [  181.977008] ? lock_is_held_type (??:?) 
kern :warn : [  181.977547] ? dmirror_do_write (test_hmm.c:?) test_hmm
kern :warn : [  181.978185] dmirror_write+0x1bf/0x310 test_hmm
kern :warn : [  181.978881] ? dmirror_fault (test_hmm.c:?) test_hmm
kern :warn : [  181.979484] ? lock_is_held_type (??:?) 
kern :warn : [  181.980021] ? __might_fault (??:?) 
kern :warn : [  181.980523] ? lock_release (??:?) 
kern :warn : [  181.981019] dmirror_fops_unlocked_ioctl (test_hmm.c:?) test_hmm
kern :warn : [  181.981732] ? dmirror_exclusive+0x780/0x780 test_hmm
kern :warn : [  181.982485] ? do_user_addr_fault (fault.c:?) 
kern :warn : [  181.983042] ? __lock_release (lockdep.c:?) 
kern :warn : [  181.983562] __x64_sys_ioctl (??:?) 
kern :warn : [  181.984074] do_syscall_64 (??:?) 
kern :warn : [  181.984545] ? do_user_addr_fault (fault.c:?) 
kern :warn : [  181.985103] ? do_user_addr_fault (fault.c:?) 
kern :warn : [  181.985654] ? irqentry_exit_to_user_mode (??:?) 
kern :warn : [  181.986256] ? lockdep_hardirqs_on_prepare (lockdep.c:?) 
kern :warn : [  181.986945] entry_SYSCALL_64_after_hwframe (??:?) 
kern  :warn  : [  181.987569] RIP: 0033:0x7fac2f598e9b
kern :warn : [ 181.988047] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1b 48 8b 44 24 18 64 48 2b 04 25 28 00
All code
========
   0:	00 48 89             	add    %cl,-0x77(%rax)
   3:	44 24 18             	rex.R and $0x18,%al
   6:	31 c0                	xor    %eax,%eax
   8:	48 8d 44 24 60       	lea    0x60(%rsp),%rax
   d:	c7 04 24 10 00 00 00 	movl   $0x10,(%rsp)
  14:	48 89 44 24 08       	mov    %rax,0x8(%rsp)
  19:	48 8d 44 24 20       	lea    0x20(%rsp),%rax
  1e:	48 89 44 24 10       	mov    %rax,0x10(%rsp)
  23:	b8 10 00 00 00       	mov    $0x10,%eax
  28:	0f 05                	syscall 
  2a:*	41 89 c0             	mov    %eax,%r8d		<-- trapping instruction
  2d:	3d 00 f0 ff ff       	cmp    $0xfffff000,%eax
  32:	77 1b                	ja     0x4f
  34:	48 8b 44 24 18       	mov    0x18(%rsp),%rax
  39:	64                   	fs
  3a:	48                   	rex.W
  3b:	2b                   	.byte 0x2b
  3c:	04 25                	add    $0x25,%al
  3e:	28 00                	sub    %al,(%rax)

Code starting with the faulting instruction
===========================================
   0:	41 89 c0             	mov    %eax,%r8d
   3:	3d 00 f0 ff ff       	cmp    $0xfffff000,%eax
   8:	77 1b                	ja     0x25
   a:	48 8b 44 24 18       	mov    0x18(%rsp),%rax
   f:	64                   	fs
  10:	48                   	rex.W
  11:	2b                   	.byte 0x2b
  12:	04 25                	add    $0x25,%al
  14:	28 00                	sub    %al,(%rax)
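
The held-locks list and backtrace above suggest this sequence: walk_hugetlb_range takes rcu_read_lock, its callback hmm_vma_fault reaches handle_mm_fault and hugetlb_fault, and hugetlb_fault calls __mutex_lock, where __might_resched reports the illegal context switch. The following is a minimal toy model of that check in Python, not kernel code. ToyTask, RcuSleepError, walk_with_rcu, and faulting_callback are hypothetical names introduced only for illustration, and the mutex name is a placeholder because the log does not identify it.

```python
class RcuSleepError(RuntimeError):
    """Raised when the toy model sees a possible sleep inside an RCU read section."""


class ToyTask:
    """Tracks RCU read-side nesting for one simulated task (illustrative only)."""

    def __init__(self):
        self.rcu_depth = 0

    def rcu_read_lock(self):
        # Entering a read-side critical section just bumps the nesting depth.
        self.rcu_depth += 1

    def rcu_read_unlock(self):
        if self.rcu_depth == 0:
            raise RuntimeError("unbalanced rcu_read_unlock")
        self.rcu_depth -= 1

    def might_sleep(self, what):
        # Stand-in for the check that produced the warning in the log:
        # sleeping is not allowed while the nesting depth is non-zero.
        if self.rcu_depth > 0:
            raise RcuSleepError(
                f"{what}: illegal context switch in RCU read-side critical section"
            )

    def mutex_lock(self, name):
        # A mutex may block, so it checks for a sleepable context first.
        self.might_sleep(f"mutex_lock({name})")


def walk_with_rcu(task, callback):
    """Mimics a walker that invokes its callback while holding rcu_read_lock."""
    task.rcu_read_lock()
    try:
        callback(task)
    finally:
        task.rcu_read_unlock()


def faulting_callback(task):
    """Mimics a callback that ends up taking a mutex (placeholder name)."""
    task.mutex_lock("placeholder_mutex")


if __name__ == "__main__":
    task = ToyTask()
    try:
        walk_with_rcu(task, faulting_callback)
    except RcuSleepError as err:
        print("toy warning:", err)
```

In this model the same callback is harmless when called outside walk_with_rcu, which mirrors why the warning appears only once the walker holds the RCU read lock around the callback.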


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if you come across any failure that blocks the test,
        # please remove the ~/.lkp and /lkp directories to run from a clean state.



-- 
0-DAY CI Kernel Test Service
https://01.org/lkp



View attachment "config-6.1.0-rc1-00309-g8b7e3b7ca389" of type "text/plain" (171287 bytes)

View attachment "job-script" of type "text/plain" (6219 bytes)

Download attachment "kmsg.xz" of type "application/x-xz" (49020 bytes)

View attachment "kernel-selftests" of type "text/plain" (224193 bytes)

View attachment "job.yaml" of type "text/plain" (4855 bytes)

View attachment "reproduce" of type "text/plain" (273 bytes)
