Date:   Wed, 29 Nov 2017 14:39:13 +0800
From:   kernel test robot <fengguang.wu@...el.com>
To:     Dave Hansen <dave.hansen@...ux.intel.com>
Cc:     LKP <lkp@...org>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...nel.org>, wfg@...ux.intel.com
Subject: b345a34006 ("x86/mm/kaiser: Use PCID feature to make user and
 .."):  WARNING: possible circular locking dependency detected

Greetings,

0day kernel testing robot got the below dmesg and the first bad commit is

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master

commit b345a34006d85c6cc2fd37baddce5bdbc0b3aef6
Author:     Dave Hansen <dave.hansen@...ux.intel.com>
AuthorDate: Wed Nov 22 16:35:09 2017 -0800
Commit:     Ingo Molnar <mingo@...nel.org>
CommitDate: Mon Nov 27 15:04:35 2017 +0100

    x86/mm/kaiser: Use PCID feature to make user and kernel switches faster
    
    Short summary: Use x86 PCID feature to avoid flushing the TLB at all
    interrupts and syscalls.  Speed them up.  Makes context switches
    and TLB flushing slower.
    
    Background:
    
    KAISER keeps two copies of the page tables.  Switches between the
    copies are performed by writing to the CR3 register.  But, CR3
    was really designed for context switches and writes to it also
    flush the entire TLB (modulo global pages).  This TLB flush
    increases the cost of interrupts and context switches.  For
    syscall-heavy microbenchmarks it can cut the rate of syscalls by 2/3.
    
    The kernel recently gained support for an Intel CPU feature
    called Process Context IDentifiers (PCID) thanks to Andy
    Lutomirski.  This feature is intended to allow you to switch
    between contexts without flushing the TLB.
    
    Implementation:
    
    PCIDs can be used to avoid flushing the TLB at kernel entry/exit.
    This speeds up both interrupts and syscalls.
    
    First, the kernel and userspace must be assigned different ASIDs.
    On entry from userspace, move over to the kernel page tables
    *and* ASID.  On exit, restore the user page tables and ASID.
    Fortunately, the ASID is programmed via CR3, which is already
    being used to switch between the user and kernel page tables.
    This gives us convenient, one-stop shopping.
    
    The CR3 write which is used to switch between processes provides
    all the TLB flushing normally required at context switch time.
    But, with KAISER, that CR3 write only flushes the current
    (kernel) ASID.  An extra TLB flush operation is now required in
    order to flush the user ASID.  This new instruction (INVPCID) is
    probably ~100 cycles, but this is done with the assumption that
    the time lost in context switches is more than made up for by
    lower cost of interrupts and syscalls.
    
    Support:
    
    PCIDs are generally available on Sandybridge and newer CPUs.  However,
    the accompanying INVPCID instruction did not become available until
    Haswell (the ones with "v4", or called fourth-generation Core).  This
    instruction allows non-current-PCID TLB entries to be flushed without
    switching CR3 and global pages to be flushed without a double
    MOV-to-CR4.
    
    Without INVPCID, PCIDs are much harder to use.  TLB invalidation gets
    much more onerous:
    
    1. Every kernel TLB flush (even for a single page) requires an
       interrupts-off MOV-to-CR4 which is very expensive.  This is because
       there is no way to flush a kernel address that might be loaded
       in *EVERY* PCID.  Right now, there are "only" ~12 of these per-CPU,
       but that is still too many to flush one by one with MOV-to-CR3.  That
       leaves only the MOV-to-CR4.
    
    2. Every userspace flush (even for a single page) requires one of the
       following:
       a. A pair of flushing (bit 63 clear) CR3 writes: one for
          the kernel ASID and another for userspace.
       b. A pair of non-flushing CR3 writes (bit 63 set) with the
          flush done for each.  For instance, what is currently a
          single instruction without KAISER:
    
                    invpcid_flush_one(current_pcid, addr);
    
          becomes this with KAISER:
    
                    invpcid_flush_one(current_kern_pcid, addr);
                    invpcid_flush_one(current_user_pcid, addr);
    
          and this without INVPCID:
    
                    __native_flush_tlb_single(addr);
                    write_cr3(mm->pgd | current_user_pcid | NOFLUSH);
                    __native_flush_tlb_single(addr);
                    write_cr3(mm->pgd | current_kern_pcid | NOFLUSH);
    
    So, for now, fully disable PCIDs with KAISER when INVPCID is not
    available.  This is fixable, but it's an optimization that can be
    performed later.
    
    Hugh Dickins also points out that PCIDs really have two distinct
    use-cases in the context of KAISER.  The first way they can be used
    is as "TLB preservation across context-switch", which is what
    Andy Lutomirski's 4.14 PCID code does.  They can also be used as
    a "KAISER syscall/interrupt accelerator".  If we just use them to
    speed up syscall/interrupts (and ignore the context-switch TLB
    preservation), then the deficiency of not having INVPCID
    becomes much less onerous.
    
    Signed-off-by: Dave Hansen <dave.hansen@...ux.intel.com>
    Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
    Cc: Andy Lutomirski <luto@...nel.org>
    Cc: Borislav Petkov <bp@...en8.de>
    Cc: Brian Gerst <brgerst@...il.com>
    Cc: Denys Vlasenko <dvlasenk@...hat.com>
    Cc: H. Peter Anvin <hpa@...or.com>
    Cc: Josh Poimboeuf <jpoimboe@...hat.com>
    Cc: Linus Torvalds <torvalds@...ux-foundation.org>
    Cc: Peter Zijlstra <peterz@...radead.org>
    Cc: Rik van Riel <riel@...hat.com>
    Cc: daniel.gruss@...k.tugraz.at
    Cc: hughd@...gle.com
    Cc: keescook@...gle.com
    Cc: linux-mm@...ck.org
    Cc: michael.schwarz@...k.tugraz.at
    Cc: moritz.lipp@...k.tugraz.at
    Cc: richard.fellner@...dent.tugraz.at
    Link: https://lkml.kernel.org/r/20171123003509.EC42DD15@viggo.jf.intel.com
    Signed-off-by: Ingo Molnar <mingo@...nel.org>
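
The commit message above describes programming the ASID through CR3 and using bit 63 to request a non-flushing write. As a rough illustration only, the sketch below composes such a CR3 value in plain C. The helper names build_cr3() and user_pcid(), and the choice of bit 11 to tell the user ASID apart from the kernel ASID, are assumptions made for this example and are not taken from the patch. The layout facts it relies on are that the PCID occupies CR3 bits 11:0 and that the page-table base is 4 KiB aligned.

```c
#include <assert.h>
#include <stdint.h>

/* With CR4.PCIDE set, CR3 bits 11:0 carry the PCID. */
#define CR3_PCID_MASK   0xFFFULL
/* Bit 63 set on a CR3 write asks the CPU not to flush that PCID's entries. */
#define CR3_NOFLUSH     (1ULL << 63)
/* Assumption for this sketch: user ASID = kernel ASID with bit 11 set. */
#define USER_PCID_BIT   (1ULL << 11)

/* Derive the (hypothetical) user PCID that pairs with a kernel PCID. */
static inline uint16_t user_pcid(uint16_t kern_pcid)
{
    assert(kern_pcid < USER_PCID_BIT);
    return (uint16_t)(kern_pcid | USER_PCID_BIT);
}

/* Combine a page-table base, a PCID and the optional NOFLUSH bit. */
static inline uint64_t build_cr3(uint64_t pgd_phys, uint16_t pcid, int noflush)
{
    assert((pgd_phys & CR3_PCID_MASK) == 0);   /* base must be 4 KiB aligned */
    assert(pcid <= CR3_PCID_MASK);             /* PCID is a 12-bit field */

    uint64_t cr3 = pgd_phys | pcid;
    if (noflush)
        cr3 |= CR3_NOFLUSH;
    return cr3;
}
```

For example, build_cr3(0x1000, 5, 1) evaluates to 0x8000000000001005, and user_pcid(5) evaluates to 0x805. Real kernel code would write the result to CR3 with a privileged instruction, which this userspace sketch deliberately leaves out.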

e794054d5a  x86/mm: Allow flushing for future ASID switches
b345a34006  x86/mm/kaiser: Use PCID feature to make user and kernel switches faster
5bef2980ad  Add linux-next specific files for 20171128
+-------------------------------------------------------+------------+------------+---------------+
|                                                       | e794054d5a | b345a34006 | next-20171128 |
+-------------------------------------------------------+------------+------------+---------------+
| boot_successes                                        | 24         | 0          | 0             |
| boot_failures                                         | 37         | 20         | 49            |
| WARNING:possible_circular_locking_dependency_detected | 37         | 11         | 45            |
| IP-Config:Auto-configuration_of_network_failed        | 2          | 0          | 18            |
| kernel_BUG_at_arch/x86/kernel/mpparse.c               | 0          | 9          | 4             |
| PANIC:early_exception                                 | 0          | 9          | 4             |
| RIP:default_get_smp_config                            | 0          | 9          | 4             |
| kernel_BUG_at_drivers/base/driver.c                   | 0          | 7          | 8             |
| invalid_opcode:#[##]                                  | 0          | 7          | 8             |
| RIP:driver_register                                   | 0          | 7          | 8             |
| Kernel_panic-not_syncing:Fatal_exception              | 0          | 7          | 8             |
+-------------------------------------------------------+------------+------------+---------------+

[    0.983311] Unpacking initramfs...
[    1.066328] Freeing initrd memory: 3932K
[    1.067813] platform rtc_cmos: registered platform RTC device (no PNP device found)
[    1.584102] 
[    1.584309] ======================================================
[    1.584858] WARNING: possible circular locking dependency detected
[    1.585360] 4.14.0-01253-gb345a340 #1 Not tainted
[    1.585748] ------------------------------------------------------
[    1.586248] kworker/0:1/14 is trying to acquire lock:
[    1.586662]  (ww_class_mutex){+.+.}, at: [<ffffffff810a9ca2>] test_abba_work+0x37/0xb7
[    1.587312] 
[    1.587312] but now in release context of a crosslock acquired at the following:
[    1.588004]  ((completion)&abba.b_ready){+.+.}, at: [<ffffffff810aa5d6>] test_abba+0x146/0x234
[    1.588004] 
[    1.588004] which lock already depends on the new lock.
[    1.588004] 
[    1.588004] the existing dependency chain (in reverse order) is:
[    1.588004] 
[    1.588004] -> #1 ((completion)&abba.b_ready){+.+.}:
[    1.588004]        __lock_acquire+0xb86/0xe99
[    1.588004]        test_abba+0x146/0x234
[    1.588004]        schedule_timeout+0x0/0xd3
[    1.588004]        lock_acquire+0x82/0xad
[    1.588004]        test_abba+0x146/0x234
[    1.588004]        __wait_for_common+0x57/0x219
[    1.588004]        test_abba+0x146/0x234
[    1.588004]        mark_held_locks+0x50/0x6d
[    1.588004]        _raw_spin_unlock_irqrestore+0x3d/0x61
[    1.588004]        test_abba+0x146/0x234
[    1.588004]        test_abba_work+0x0/0xb7
[    1.588004]        test_ww_mutex_init+0xe4/0x44d
[    1.588004]        test_ww_mutex_init+0x0/0x44d
[    1.588004]        do_one_initcall+0xd2/0x24c
[    1.588004]        parse_args+0x135/0x221
[    1.588004]        kernel_init_freeable+0x153/0x279
[    1.588004]        kernel_init+0x0/0xe6
[    1.588004]        kernel_init+0x5/0xe6
[    1.588004]        ret_from_fork+0x24/0x30
[    1.588004] 
[    1.588004] -> #0 (ww_class_mutex){+.+.}:
[    1.588004]        test_abba_work+0x37/0xb7
[    1.588004] 
[    1.588004] other info that might help us debug this:
[    1.588004] 
[    1.588004]  Possible unsafe locking scenario by crosslock:
[    1.588004] 
[    1.588004]        CPU0                    CPU1
[    1.588004]        ----                    ----
[    1.588004]   lock(ww_class_mutex);
[    1.588004]   lock((completion)&abba.b_ready);
[    1.588004]                                lock(ww_class_mutex);
[    1.588004]                                unlock((completion)&abba.b_ready);
[    1.588004] 
[    1.588004]  *** DEADLOCK ***
[    1.588004] 
[    1.588004] 5 locks held by kworker/0:1/14:
[    1.588004]  #0:  ((wq_completion)"events"){+.+.}, at: [<ffffffff8109036a>] process_one_work+0x164/0x303
[    1.588004]  #1:  ((work_completion)(&abba.work)){+.+.}, at: [<ffffffff8109036a>] process_one_work+0x164/0x303
[    1.588004]  #2:  (ww_class_acquire){+.+.}, at: [<ffffffff810a9c97>] test_abba_work+0x2c/0xb7
[    1.588004]  #3:  (ww_class_mutex){+.+.}, at: [<ffffffff810a9ca2>] test_abba_work+0x37/0xb7
[    1.588004]  #4:  (&x->wait#5){....}, at: [<ffffffff8109f90a>] complete+0x13/0x4b
[    1.588004] 
[    1.588004] stack backtrace:
[    1.588004] CPU: 0 PID: 14 Comm: kworker/0:1 Not tainted 4.14.0-01253-gb345a340 #1
[    1.588004] Workqueue: events test_abba_work
[    1.588004] Call Trace:
[    1.588004]  ? print_circular_bug+0x2a0/0x2ae
[    1.588004]  ? check_prev_add+0x95/0x253
[    1.588004]  ? look_up_lock_class+0x114/0x114
[    1.588004]  ? lock_commit_crosslock+0x322/0x3e1
[    1.588004]  ? complete+0x1f/0x4b
[    1.588004]  ? test_abba_work+0x43/0xb7
[    1.588004]  ? process_one_work+0x1d5/0x303
[    1.588004]  ? process_one_work+0x164/0x303
[    1.588004]  ? process_scheduled_works+0x27/0x27
[    1.588004]  ? worker_thread+0x1a7/0x25d
[    1.588004]  ? process_scheduled_works+0x27/0x27
[    1.588004]  ? kthread+0x126/0x12e
[    1.588004]  ? __list_del_entry+0x1d/0x1d
[    1.588004]  ? ret_from_fork+0x24/0x30
[    2.080076] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x25641074d3b, max_idle_ns: 440795244898 ns
[    7.641811] Initialise system trusted keyrings
[    7.642238] workingset: timestamp_bits=46 max_order=17 bucket_order=0
[    7.642782] zbud: loaded
[    7.643661] QNX4 filesystem 0.2.3 registered.
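
The warning in the log above reports that acquiring ww_class_mutex would close a cycle with the (completion)&abba.b_ready dependency already on record. The sketch below shows the general idea of that kind of check: remember "B was acquired while A was held" edges and refuse a new edge if the reverse path already exists. It is a toy model under stated assumptions, not lockdep's implementation; struct lock_graph, graph_init(), reachable() and add_dependency() are names invented for this example, and locks are represented by small integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_LOCKS 8

/* dep[a][b] is true when lock b was acquired while lock a was held. */
struct lock_graph {
    bool dep[MAX_LOCKS][MAX_LOCKS];
};

void graph_init(struct lock_graph *g)
{
    memset(g, 0, sizeof(*g));
}

/* Depth-first search: can 'to' be reached from 'from' along recorded edges? */
bool reachable(const struct lock_graph *g, int from, int to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < MAX_LOCKS; next++) {
        if (g->dep[from][next] && !seen[next] &&
            reachable(g, next, to, seen))
            return true;
    }
    return false;
}

/*
 * Record the edge held -> acquired.  Returns false, recording nothing,
 * when the edge would close a cycle (a possible circular dependency).
 */
bool add_dependency(struct lock_graph *g, int held, int acquired)
{
    assert(held >= 0 && held < MAX_LOCKS);
    assert(acquired >= 0 && acquired < MAX_LOCKS);

    bool seen[MAX_LOCKS] = { false };
    if (reachable(g, acquired, held, seen))
        return false;

    g->dep[held][acquired] = true;
    return true;
}
```

With lock 0 standing in for ww_class_mutex and lock 1 for the completion, add_dependency(&g, 0, 1) succeeds and a later add_dependency(&g, 1, 0) is rejected, which mirrors the two-step chain printed in the report.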

                                                          # HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD
git bisect start 5bef2980adef8a6032d4f4709aebe9486181052f 4fbd8d194f06c8a3fd2af1ce560ddb31f7ec8323 --
git bisect good 6dae39d936707941d0c1fce8426028c01203d050  # 07:00  G     13     0    8  10  Merge remote-tracking branch 'hwmon-staging/hwmon-next'
git bisect good 6959ac327b8a044f95a4485aaa6f4e2b1f7084a3  # 07:58  G     13     0    8   8  Merge remote-tracking branch 'vfio/next'
git bisect  bad 2ca8454a2f35c2780adeadf8a92561e2bbfcd235  # 08:25  B      4     3    4   4  Merge remote-tracking branch 'staging/staging-next'
git bisect  bad 0d1f02010f2b7705f76bab40d6ee8ea9c7d2598e  # 08:53  B      2     9    2   2  Merge remote-tracking branch 'percpu/for-next'
git bisect  bad bdc88674ac338d2d1f769f80b87dddf46e951c90  # 09:36  B      6     6    6  10  Merge remote-tracking branch 'clockevents/clockevents/next'
git bisect good f85fb1ec7c5d1334a2083f931386963db58e52f9  # 10:06  G     16     0    7  11  Merge remote-tracking branch 'spi/for-next'
git bisect  bad 102f0423f953969b3191c603fabc64338f6adcd7  # 10:50  B      6     6    6   6  Merge remote-tracking branch 'tip/auto-latest'
git bisect good f6751f178eeaf3da8c156d2a2fd7a0feccfab5ae  # 11:23  G     16     0    5   9  tools/headers: Synchronize kernel x86 UAPI headers
git bisect good e0c2535f18156c71b68b25861e76e06ca77151e5  # 11:44  G     16     0   10  12  Merge branch 'x86/urgent'
git bisect good 83529b2d6168ee82520a4ec7cc3df9b18603aea4  # 12:02  G     16     0   11  11  x86/mm/kaiser: Prepare the x86/entry assembly code for entry/exit CR3 switching
git bisect good 01e673bc37a640d9708bf9e7f30ad06a89ecafae  # 12:15  G     15     0   11  11  x86/mm: Remove hard-coded ASID limit checks
git bisect  bad 293a2ca794ee17d490daa036171f79dd090fbc8c  # 12:30  B      2     6    2   3  x86/mm/kaiser: Respect disabled CPU features
git bisect  bad b345a34006d85c6cc2fd37baddce5bdbc0b3aef6  # 12:47  B      9     7    9  11  x86/mm/kaiser: Use PCID feature to make user and kernel switches faster
git bisect good e794054d5a5dd62e38ed47be37072f7d2ed7879b  # 13:17  G     17     0    6   8  x86/mm: Allow flushing for future ASID switches
# first bad commit: [b345a34006d85c6cc2fd37baddce5bdbc0b3aef6] x86/mm/kaiser: Use PCID feature to make user and kernel switches faster
git bisect good e794054d5a5dd62e38ed47be37072f7d2ed7879b  # 13:34  G     51     0   27  37  x86/mm: Allow flushing for future ASID switches
# extra tests with debug options
git bisect  bad b345a34006d85c6cc2fd37baddce5bdbc0b3aef6  # 13:50  B      5    12    5   5  x86/mm/kaiser: Use PCID feature to make user and kernel switches faster
# extra tests on HEAD of linux-next/master
git bisect  bad 5bef2980adef8a6032d4f4709aebe9486181052f  # 13:51  B      0     8  101  41  Add linux-next specific files for 20171128
# extra tests on tree/branch linux-next/master
git bisect  bad 5bef2980adef8a6032d4f4709aebe9486181052f  # 13:52  B      0     8  101  41  Add linux-next specific files for 20171128
# extra tests with first bad commit reverted
git bisect good a4f77bcdc73cb380ae912949b819359287ac7185  # 14:37  G     17     0    7   7  Revert "x86/mm/kaiser: Use PCID feature to make user and kernel switches faster"
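
The bisect log above narrows a known-good/known-bad range down to the first bad commit. As a minimal sketch of the underlying search, assuming a linear history indexed 0..n-1 and a deterministic test, the code below halves the range until good and bad are adjacent. The names first_bad_commit(), is_bad_fn and bad_from_threshold() are made up for this example. Real git bisect works on a commit graph, and the log shows several boots per step, which a single yes/no test as used here does not model.

```c
#include <assert.h>
#include <stddef.h>

/* Test callback: returns nonzero when the commit at 'index' is bad. */
typedef int (*is_bad_fn)(size_t index, void *ctx);

/*
 * 'good' indexes a commit known to pass, 'bad' one known to fail, and
 * good < bad.  Returns the index of the first failing commit, assuming
 * every commit after the first bad one is bad as well.
 */
size_t first_bad_commit(size_t good, size_t bad, is_bad_fn is_bad, void *ctx)
{
    assert(good < bad);
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;
        if (is_bad(mid, ctx))
            bad = mid;      /* failure reproduces: culprit is at or before mid */
        else
            good = mid;     /* still passing: culprit is after mid */
    }
    return bad;
}

/* Sample predicate: every commit at or after *ctx is treated as bad. */
int bad_from_threshold(size_t index, void *ctx)
{
    return index >= *(const size_t *)ctx;
}
```

Calling first_bad_commit(0, 13, bad_from_threshold, &first) with first set to 11 returns 11 after testing only a handful of midpoints, which is why the log needs roughly a dozen steps rather than one test per commit.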

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/lkp                          Intel Corporation

Download attachment "dmesg-yocto-lkp-hsw01-106:20171129124645:x86_64-randconfig-s2-11282118:4.14.0-01253-gb345a340:1.gz" of type "application/gzip" (18652 bytes)

Download attachment "dmesg-vm-ivb41-yocto-ia32-29:20171129130752:x86_64-randconfig-s2-11282118:4.14.0-01252-ge794054:1.gz" of type "application/gzip" (21730 bytes)

View attachment "reproduce-yocto-lkp-hsw01-106:20171129124645:x86_64-randconfig-s2-11282118:4.14.0-01253-gb345a340:1" of type "text/plain" (896 bytes)

View attachment "config-4.14.0-01253-gb345a340" of type "text/plain" (114116 bytes)
