Message-ID: <5a1aaa36.CWNgvwmmRFzeAlPc%fengguang.wu@intel.com>
Date: Sun, 26 Nov 2017 19:49:10 +0800
From: kernel test robot <fengguang.wu@...el.com>
To: Dave Hansen <dave.hansen@...ux.intel.com>
Cc: LKP <lkp@...org>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...nel.org>, wfg@...ux.intel.com
Subject: 2f47e7e19f ("x86/mm/kaiser: Use PCID feature to make user and
.."): WARNING: CPU: 0 PID: 1 at mm/early_ioremap.c:114 __early_ioremap
Greetings,
0day kernel testing robot got the below dmesg and the first bad commit is
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.x86/mm
commit 2f47e7e19f351692bc8048a0e6f3960dc6734cfd
Author: Dave Hansen <dave.hansen@...ux.intel.com>
AuthorDate: Wed Nov 22 16:35:09 2017 -0800
Commit: Ingo Molnar <mingo@...nel.org>
CommitDate: Fri Nov 24 14:47:24 2017 +0100
x86/mm/kaiser: Use PCID feature to make user and kernel switches faster
Short summary: Use x86 PCID feature to avoid flushing the TLB at all
interrupts and syscalls. Speed them up. Makes context switches
and TLB flushing slower.
Background:
KAISER keeps two copies of the page tables. Switches between the
copies are performed by writing to the CR3 register. But, CR3
was really designed for context switches and writes to it also
flush the entire TLB (modulo global pages). This TLB flush
increases the cost of interrupts and context switches. For
syscall-heavy microbenchmarks it can cut the rate of syscalls by
2/3.
The kernel recently gained support for an Intel CPU feature
called Process Context IDentifiers (PCID) thanks to Andy
Lutomirski. This feature is intended to allow you to switch
between contexts without flushing the TLB.
Implementation:
PCIDs can be used to avoid flushing the TLB at kernel entry/exit.
This speeds up both interrupts and syscalls.
First, the kernel and userspace must be assigned different ASIDs.
On entry from userspace, move over to the kernel page tables
*and* ASID. On exit, restore the user page tables and ASID.
Fortunately, the ASID is programmed via CR3, which is already
being used to switch between the user and kernel page tables.
This gives us convenient, one-stop shopping.
The CR3 write which is used to switch between processes provides
all the TLB flushing normally required at context switch time.
But, with KAISER, that CR3 write only flushes the current
(kernel) ASID. An extra TLB flush operation is now required in
order to flush the user ASID. This new instruction (INVPCID) is
probably ~100 cycles, but this is done with the assumption that
the time lost in context switches is more than made up for by
lower cost of interrupts and syscalls.
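As a rough illustration of the "one-stop shopping" point above, the following sketch composes a CR3 value from a page-table base, a PCID, and the no-flush bit. It relies only on the architectural layout (PCID in bits 11:0 when CR4.PCIDE is set, bit 63 of the written value suppressing the flush). The names build_cr3, cr3_pcid, CR3_PCID_MASK and CR3_NOFLUSH are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; not the identifiers used by the patch. */
#define CR3_PCID_MASK 0xFFFULL        /* bits 11:0 hold the PCID when CR4.PCIDE=1 */
#define CR3_NOFLUSH   (1ULL << 63)    /* bit 63 set: keep TLB entries for this PCID */

/* Compose the value that would be written to CR3. */
uint64_t build_cr3(uint64_t pgd_phys, uint16_t pcid, int noflush)
{
    uint64_t cr3;

    assert(pcid <= CR3_PCID_MASK);    /* a PCID is a 12-bit value */

    cr3 = (pgd_phys & ~CR3_PCID_MASK) | pcid;
    if (noflush)
        cr3 |= CR3_NOFLUSH;
    return cr3;
}

/* Extract the PCID field from a CR3 value, e.g. one printed in an oops. */
uint16_t cr3_pcid(uint64_t cr3)
{
    return (uint16_t)(cr3 & CR3_PCID_MASK);
}
```

Under these assumptions, build_cr3(0x2c5d000, 1, 0) gives 0x2c5d001, which is consistent with the CR3 value printed in the dmesg below (CR4 there has the PCIDE bit set, so the low bits would be a PCID of 1).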
Support:
PCIDs are generally available on Sandybridge and newer CPUs. However,
the accompanying INVPCID instruction did not become available until
Haswell (the ones with "v4", or called fourth-generation Core). This
instruction allows non-current-PCID TLB entries to be flushed without
switching CR3 and global pages to be flushed without a double
MOV-to-CR4.
Without INVPCID, PCIDs are much harder to use. TLB invalidation gets
much more onerous:
1. Every kernel TLB flush (even for a single page) requires an
   interrupts-off MOV-to-CR4 which is very expensive. This is because
   there is no way to flush a kernel address that might be loaded
   in *EVERY* PCID. Right now, there are "only" ~12 of these per-cpu,
   but that's too painful to use the MOV-to-CR3 to flush them. That
   leaves only the MOV-to-CR4.
2. Every userspace flush (even for a single page) requires one of the
   following:
   a. A pair of flushing (bit 63 clear) CR3 writes: one for
      the kernel ASID and another for userspace.
   b. A pair of non-flushing CR3 writes (bit 63 set) with the
      flush done for each. For instance, what is currently a
      single instruction without KAISER:
            invpcid_flush_one(current_pcid, addr);
      becomes this with KAISER:
            invpcid_flush_one(current_kern_pcid, addr);
            invpcid_flush_one(current_user_pcid, addr);
      and this without INVPCID:
            __native_flush_tlb_single(addr);
            write_cr3(mm->pgd | current_user_pcid | NOFLUSH);
            __native_flush_tlb_single(addr);
            write_cr3(mm->pgd | current_kern_pcid | NOFLUSH);
So, for now, fully disable PCIDs with KAISER when INVPCID is not
available. This is fixable, but it's an optimization that can be
performed later.
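The policy stated above (use PCIDs with KAISER only when INVPCID is also present) can be sketched as a pair of CPUID feature-bit checks. The bit positions are the architectural ones (CPUID leaf 1, ECX bit 17 for PCID; leaf 7 subleaf 0, EBX bit 10 for INVPCID). The function and macro names are hypothetical, and the kernel itself would test its own CPU feature flags rather than raw register values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural CPUID feature bits; the macro names are illustrative only. */
#define CPUID_1_ECX_PCID     (1u << 17)   /* leaf 1, ECX: PCID supported */
#define CPUID_7_EBX_INVPCID  (1u << 10)   /* leaf 7 subleaf 0, EBX: INVPCID supported */

bool cpu_has_pcid(uint32_t leaf1_ecx)
{
    return (leaf1_ecx & CPUID_1_ECX_PCID) != 0;
}

bool cpu_has_invpcid(uint32_t leaf7_ebx)
{
    return (leaf7_ebx & CPUID_7_EBX_INVPCID) != 0;
}

/*
 * Hypothetical policy helper mirroring the text: with KAISER, PCIDs are
 * used only if the CPU can also flush a non-current PCID with INVPCID.
 */
bool kaiser_can_use_pcid(uint32_t leaf1_ecx, uint32_t leaf7_ebx)
{
    return cpu_has_pcid(leaf1_ecx) && cpu_has_invpcid(leaf7_ebx);
}
```

A CPU that reports PCID but not INVPCID (the Sandybridge-class case described above) makes kaiser_can_use_pcid() return false, so such a system would fall back to flushing on every CR3 switch.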
Hugh Dickins also points out that PCIDs really have two distinct
use-cases in the context of KAISER. The first way they can be used
is as "TLB preservation across context-switch", which is what
Andy Lutomirski's 4.14 PCID code does. They can also be used as
a "KAISER syscall/interrupt accelerator". If we just use them to
speed up syscall/interrupts (and ignore the context-switch TLB
preservation), then the deficiency of not having INVPCID
becomes much less onerous.
Signed-off-by: Dave Hansen <dave.hansen@...ux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: daniel.gruss@...k.tugraz.at
Cc: hughd@...gle.com
Cc: keescook@...gle.com
Cc: linux-mm@...ck.org
Cc: luto@...nel.org
Cc: michael.schwarz@...k.tugraz.at
Cc: moritz.lipp@...k.tugraz.at
Cc: richard.fellner@...dent.tugraz.at
Link: https://lkml.kernel.org/r/20171123003509.EC42DD15@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@...nel.org>
41492416f4 x86/mm/kaiser: Allow flushing for future ASID switches
2f47e7e19f x86/mm/kaiser: Use PCID feature to make user and kernel switches faster
a606c92732 Fix: "x86/mm/kaiser: Unmap kernel from userspace page tables (core patch)"
4b0560b639 Merge branch 'WIP.x86/mm'
+-------------------------------------------------------+------------+------------+------------+------------+
|                                                       | 41492416f4 | 2f47e7e19f | a606c92732 | 4b0560b639 |
+-------------------------------------------------------+------------+------------+------------+------------+
| boot_successes                                        | 32         | 0          | 4          | 4          |
| boot_failures                                         | 4          | 15         | 12         | 12         |
| WARNING:at_drivers/pci/pci-sysfs.c:#pci_mmap_resource | 4          | 3          | 0          | 2          |
| RIP:pci_mmap_resource                                 | 4          | 3          | 0          | 2          |
| BUG:kernel_hang_in_test_stage                         | 0          | 3          |            |            |
| WARNING:at_mm/early_ioremap.c:#__early_ioremap        | 0          | 10         | 9          | 10         |
| RIP:__early_ioremap                                   | 0          | 10         | 9          | 10         |
| kernel_BUG_at_arch/x86/kernel/mpparse.c               | 0          | 1          | 3          | 2          |
| PANIC:early_exception                                 | 0          | 1          | 3          | 2          |
| RIP:default_get_smp_config                            | 0          | 1          | 3          | 2          |
| BUG:kernel_hang_in_boot_stage                         | 0          | 1          |            |            |
+-------------------------------------------------------+------------+------------+------------+------------+
[ 0.026886] pinctrl core: initialized pinctrl subsystem
[ 0.027309] regulator-dummy: no parameters
[ 0.028260] NET: Registered protocol family 16
[ 0.029939] cpuidle: using governor ladder
[ 0.030048] ------------[ cut here ]------------
[ 0.030613] WARNING: CPU: 0 PID: 1 at mm/early_ioremap.c:114 __early_ioremap+0x2d/0x223
[ 0.031000] CPU: 0 PID: 1 Comm: swapper Not tainted 4.14.0-01247-g2f47e7e #1
[ 0.031000] task: ffff88000002a000 task.stack: ffffc90000000000
[ 0.031000] RIP: 0010:__early_ioremap+0x2d/0x223
[ 0.031000] RSP: 0000:ffffc90000003dd8 EFLAGS: 00010202
[ 0.031000] RAX: 8000000000000163 RBX: 000000000000040e RCX: 0000000000000002
[ 0.031000] RDX: 8000000000000163 RSI: 0000000000000002 RDI: 000000000000040e
[ 0.031000] RBP: 0000000000000002 R08: 0000000000000001 R09: 0000000000000000
[ 0.031000] R10: 0000000000000000 R11: 000000009279605b R12: 000000000000040e
[ 0.031000] R13: cccccccccccccccd R14: 0000000000000000 R15: 0000000000000000
[ 0.031000] FS: 0000000000000000(0000) GS:ffffffff82ca4000(0000) knlGS:0000000000000000
[ 0.031000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.031000] CR2: 0000000000000000 CR3: 0000000002c5d001 CR4: 00000000001606f0
[ 0.031000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 0.031000] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 0.031000] Call Trace:
[ 0.031000] ? kernfs_add_one+0x1d9/0x1f0
[ 0.031000] early_memremap+0x33/0x3d
[ 0.031000] ? cnb20le_res+0x2f2/0x2f2
[ 0.031000] __acpi_map_table+0x1d/0x28
[ 0.031000] acpi_os_map_iomem+0x1cf/0x2a0
[ 0.031000] ? cnb20le_res+0x2f2/0x2f2
[ 0.031000] acpi_os_map_memory+0xd/0x20
[ 0.031000] acpi_find_root_pointer+0x1f/0x1ec
[ 0.031000] ? cnb20le_res+0x2f2/0x2f2
[ 0.031000] acpi_os_get_root_pointer+0x18/0x25
[ 0.031000] broadcom_postcore_init+0xc/0x6c
[ 0.031000] do_one_initcall+0xc4/0x1f7
[ 0.031000] kernel_init_freeable+0x1c2/0x2b2
[ 0.031000] ? rest_init+0x1a0/0x1a0
[ 0.031000] kernel_init+0xd/0x1bc
[ 0.031000] ret_from_fork+0x1f/0x30
[ 0.031000] Code: 41 56 48 89 f1 41 55 41 54 55 53 48 83 ec 18 48 ff 05 bd 86 bd 00 83 3d 7a 68 aa ff 00 48 89 54 24 08 74 09 48 ff 05 b0 86 bd 00 <0f> ff 48 ff 05 af 86 bd 00 31 d2 48 8b 04 d5 80 7f 91 83 41 89
[ 0.031000] ---[ end trace 3959ffc51c8abd55 ]---
[ 0.031029] ------------[ cut here ]------------
[ 0.031029] ------------[ cut here ]------------
[ 0.031555] WARNING: CPU: 0 PID: 1 at mm/early_ioremap.c:114 __early_ioremap+0x2d/0x223
[ 0.032000] CPU: 0 PID: 1 Comm: swapper Tainted: G W 4.14.0-01247-g2f47e7e #1
[ 0.032000] task: ffff88000002a000 task.stack: ffffc90000000000
[ 0.032000] RIP: 0010:__early_ioremap+0x2d/0x223
[ 0.032000] RSP: 0000:ffffc90000003dd8 EFLAGS: 00010202
[ 0.032000] RAX: 8000000000000163 RBX: 000000000009fc00 RCX: 0000000000000400
[ 0.032000] RDX: 8000000000000163 RSI: 0000000000000400 RDI: 000000000009fc00
[ 0.032000] RBP: 0000000000000400 R08: 0000000000000002 R09: 0000000000000000
[ 0.032000] R10: 0000000000000000 R11: 000000009279605b R12: 000000000009fc00
[ 0.032000] R13: cccccccccccccccd R14: 0000000000000000 R15: 0000000000000000
[ 0.032000] FS: 0000000000000000(0000) GS:ffffffff82ca4000(0000) knlGS:0000000000000000
[ 0.032000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.032000] CR2: 0000000000000000 CR3: 0000000002c5d001 CR4: 00000000001606f0
[ 0.032000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 0.032000] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 0.032000] Call Trace:
[ 0.032000] early_memremap+0x33/0x3d
[ 0.032000] __acpi_map_table+0x1d/0x28
[ 0.032000] acpi_os_map_iomem+0x1cf/0x2a0
[ 0.032000] acpi_os_map_memory+0xd/0x20
[ 0.032000] acpi_find_root_pointer+0x9a/0x1ec
[ 0.032000] ? cnb20le_res+0x2f2/0x2f2
[ 0.032000] acpi_os_get_root_pointer+0x18/0x25
[ 0.032000] broadcom_postcore_init+0xc/0x6c
[ 0.032000] do_one_initcall+0xc4/0x1f7
[ 0.032000] kernel_init_freeable+0x1c2/0x2b2
[ 0.032000] ? rest_init+0x1a0/0x1a0
[ 0.032000] kernel_init+0xd/0x1bc
[ 0.032000] ret_from_fork+0x1f/0x30
[ 0.032000] Code: 41 56 48 89 f1 41 55 41 54 55 53 48 83 ec 18 48 ff 05 bd 86 bd 00 83 3d 7a 68 aa ff 00 48 89 54 24 08 74 09 48 ff 05 b0 86 bd 00 <0f> ff 48 ff 05 af 86 bd 00 31 d2 48 8b 04 d5 80 7f 91 83 41 89
[ 0.032000] ---[ end trace 3959ffc51c8abd56 ]---
[ 0.032038] ------------[ cut here ]------------
[ 0.032038] ------------[ cut here ]------------
[ 0.032588] WARNING: CPU: 0 PID: 1 at mm/early_ioremap.c:114 __early_ioremap+0x2d/0x223
[ 0.033000] CPU: 0 PID: 1 Comm: swapper Tainted: G W 4.14.0-01247-g2f47e7e #1
[ 0.033000] task: ffff88000002a000 task.stack: ffffc90000000000
[ 0.033000] RIP: 0010:__early_ioremap+0x2d/0x223
[ 0.033000] RSP: 0000:ffffc90000003dd8 EFLAGS: 00010206
[ 0.033000] RAX: 8000000000000163 RBX: 00000000000e0000 RCX: 0000000000020000
[ 0.033000] RDX: 8000000000000163 RSI: 0000000000020000 RDI: 00000000000e0000
[ 0.033000] RBP: 0000000000020000 R08: 0000000000000400 R09: 0000000000000000
[ 0.033000] R10: 0000000000000052 R11: 000000009279605b R12: 00000000000e0000
[ 0.033000] R13: ffffffffff200c00 R14: 0000000000000000 R15: 0000000000000000
[ 0.033000] FS: 0000000000000000(0000) GS:ffffffff82ca4000(0000) knlGS:0000000000000000
[ 0.033000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.033000] CR2: 0000000000000000 CR3: 0000000002c5d001 CR4: 00000000001606f0
[ 0.033000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 0.033000] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 0.033000] Call Trace:
[ 0.033000] early_memremap+0x33/0x3d
[ 0.033000] __acpi_map_table+0x1d/0x28
[ 0.033000] acpi_os_map_iomem+0x1cf/0x2a0
[ 0.033000] acpi_os_map_memory+0xd/0x20
[ 0.033000] acpi_find_root_pointer+0x132/0x1ec
[ 0.033000] ? cnb20le_res+0x2f2/0x2f2
[ 0.033000] acpi_os_get_root_pointer+0x18/0x25
[ 0.033000] broadcom_postcore_init+0xc/0x6c
[ 0.033000] do_one_initcall+0xc4/0x1f7
[ 0.033000] kernel_init_freeable+0x1c2/0x2b2
[ 0.033000] ? rest_init+0x1a0/0x1a0
[ 0.033000] kernel_init+0xd/0x1bc
[ 0.033000] ret_from_fork+0x1f/0x30
[ 0.033000] Code: 41 56 48 89 f1 41 55 41 54 55 53 48 83 ec 18 48 ff 05 bd 86 bd 00 83 3d 7a 68 aa ff 00 48 89 54 24 08 74 09 48 ff 05 b0 86 bd 00 <0f> ff 48 ff 05 af 86 bd 00 31 d2 48 8b 04 d5 80 7f 91 83 41 89
[ 0.033000] ---[ end trace 3959ffc51c8abd57 ]---
[ 0.033873] PCI: Using configuration type 1 for base access
# HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD
git bisect start e407bc20c82ec3d644f03148179ad62ef1eded07 bebc6082da0a9f5d47a1ea2edc099bf671058bd4 --
git bisect bad a0c6abbabf34cb7b4c97e2e368f089e6e3c0f3fa # 08:02 B 0 8 21 0 Merge 'linux-review/Shannon-Nelson/xfrm-add-documentation-for-xfrm-device-offload-api/20171121-183400' into devel-hourly-2017112520
git bisect bad 05626bec547df3e1ea91810fe8cc40eaa49e9a0a # 08:39 B 3 8 3 3 Merge 'linux-review/Johan-Hovold/USB-chipidea-msm-fix-ulpi-node-lookup/20171116-063432' into devel-hourly-2017112520
git bisect bad fb0b88166f094f3f77108c1e9a2e2fcd9c4d3b4a # 10:50 B 2 1 2 3 Merge 'linux-review/Jesse-Chan/phy-qcom-ufs-add-missing-MODULE_DESCRIPTION-LICENSE/20171122-005904' into devel-hourly-2017112520
git bisect bad a520eb1857b691f3909377d4e72bc19760bb6689 # 11:40 B 0 9 22 0 Merge 'uml/linux-next' into devel-hourly-2017112520
git bisect bad e683d1330e1bb163f945969baa9691b954c88538 # 12:02 B 0 3 16 0 Merge 'linux-review/Jesse-Chan/mtd-nand-denali_pci-add-missing-MODULE_DESCRIPTION-AUTHOR-LICENSE/20171121-221924' into devel-hourly-2017112520
git bisect good 8686ffa141fb8e0520214198c8494907e00bdef5 # 12:32 G 12 0 1 1 Merge 'linux-review/Michal-Hocko/xfs-handle-register_shrinker-error/20171125-164820' into devel-hourly-2017112520
git bisect good 1d28afe4fd6c4c77c34e8a6acc5c12b4a4ca40f4 # 13:03 G 12 0 0 0 Merge 'andersson-remoteproc/rproc-next' into devel-hourly-2017112520
git bisect bad 2b18f915d6386c043ef49284cc6cfd0fb0b9f5ba # 13:45 B 0 2 15 0 Merge 'tip/master' into devel-hourly-2017112520
git bisect good f2be8bd52e7410c70145f73511a2e80f4797e1a5 # 14:20 G 12 0 1 1 Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 2bcc673101268dc50e52b83226c5bbf38391e16d # 14:56 G 12 0 0 0 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad a606c927321e1d58d02eb16787550d2f3ae4c8a2 # 15:16 B 2 4 2 2 Fix: "x86/mm/kaiser: Unmap kernel from userspace page tables (core patch)"
git bisect good 3643b7e05b16a9fc4077ec56b655a1f8547d259c # 15:39 G 12 0 1 1 Merge branch 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 48bb67c264d721142d32fa3da9a17e4e11c18d41 # 16:08 G 11 0 2 2 x86/asm: Fix assumptions that the HW TSS is at the beginning of cpu_tss
git bisect good aaf4ad10f1d60618459e9d8b5257e48440c4dec4 # 16:56 G 12 0 2 2 x86/mm/kaiser: Make sure static PGDs are 8k in size
git bisect bad 2f47e7e19f351692bc8048a0e6f3960dc6734cfd # 17:17 B 1 10 1 1 x86/mm/kaiser: Use PCID feature to make user and kernel switches faster
git bisect good e332b7a62359fe95fc61eba443d8ab1e9099514c # 17:37 G 12 0 0 0 x86/mm/kaiser: Map virtually-addressed performance monitoring buffers
git bisect good 7f3a99b7858cbfb41ddb76bea842768a05458774 # 18:15 G 12 0 0 0 x86/mm: Remove hard-coded ASID limit checks
git bisect good 41492416f44b01574b4dd5da69daabb14e35a6f6 # 18:40 G 12 0 1 1 x86/mm/kaiser: Allow flushing for future ASID switches
# first bad commit: [2f47e7e19f351692bc8048a0e6f3960dc6734cfd] x86/mm/kaiser: Use PCID feature to make user and kernel switches faster
git bisect good 41492416f44b01574b4dd5da69daabb14e35a6f6 # 18:45 G 35 0 3 4 x86/mm/kaiser: Allow flushing for future ASID switches
# extra tests with debug options
git bisect bad 2f47e7e19f351692bc8048a0e6f3960dc6734cfd # 18:58 B 0 11 24 0 x86/mm/kaiser: Use PCID feature to make user and kernel switches faster
# extra tests on HEAD of linux-devel/devel-hourly-2017112520
git bisect bad e407bc20c82ec3d644f03148179ad62ef1eded07 # 18:58 B 1 288 0 53 0day head guard for 'devel-hourly-2017112520'
# extra tests on tree/branch tip/WIP.x86/mm
git bisect bad a606c927321e1d58d02eb16787550d2f3ae4c8a2 # 19:03 B 0 9 26 3 Fix: "x86/mm/kaiser: Unmap kernel from userspace page tables (core patch)"
# extra tests with first bad commit reverted
git bisect good f9902ab9b24734c21dcd528d692a3b4e4f683877 # 19:27 G 12 0 2 2 Revert "x86/mm/kaiser: Use PCID feature to make user and kernel switches faster"
# extra tests on tree/branch tip/master
git bisect bad 4b0560b639decf934c503b8d85098ab9b501e8bc # 19:41 B 2 5 2 2 Merge branch 'WIP.x86/mm'
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/lkp Intel Corporation
Download attachment "dmesg-yocto-lkp-hsw01-102:20171126171552:x86_64-randconfig-s1-11260311:4.14.0-01247-g2f47e7e:1.gz" of type "application/gzip" (26457 bytes)
Download attachment "dmesg-yocto-lkp-hsw01-42:20171126184259:x86_64-randconfig-s1-11260311:4.14.0-01246-g4149241:1.gz" of type "application/gzip" (26597 bytes)
View attachment "reproduce-yocto-lkp-hsw01-102:20171126171552:x86_64-randconfig-s1-11260311:4.14.0-01247-g2f47e7e:1" of type "text/plain" (896 bytes)
View attachment "config-4.14.0-01247-g2f47e7e" of type "text/plain" (97852 bytes)