lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c6a9632e-cdcb-cf05-183e-a124e9cec0e2@intel.com>
Date:   Wed, 9 Mar 2022 15:12:51 -0800
From:   "Chang S. Bae" <chang.seok.bae@...el.com>
To:     Dave Hansen <dave.hansen@...el.com>,
        <linux-kernel@...r.kernel.org>, <x86@...nel.org>,
        <linux-pm@...r.kernel.org>
CC:     <tglx@...utronix.de>, <dave.hansen@...ux.intel.com>,
        <peterz@...radead.org>, <bp@...en8.de>, <rafael@...nel.org>,
        <ravi.v.shankar@...el.com>
Subject: Re: [PATCH v2 1/2] x86/fpu: Add a helper to prepare AMX state for
 low-power CPU idle

On 3/9/2022 2:46 PM, Dave Hansen wrote:
> On 3/9/22 14:34, Chang S. Bae wrote:
>> +/*
>> + * Initialize register state that may prevent from entering low-power idle.
>> + * This function will be invoked from the cpuidle driver only when needed.
>> + */
>> +void fpu_idle_fpregs(void)
>> +{
>> +	if (!fpu_state_size_dynamic())
>> +		return;
> 
> Is this check just an optimization?  

No. 0day reported the splat [3] with the earlier code in v1 [1]:

if (fpu_state_size_dynamic() && (xfeatures_in_use() & 
XFEATURE_MASK_XTILE)) { ... }

It looks like GCC-9 reordered to hit XGETBV without checking 
fpu_state_size_dynamic(). So this line was separated to avoid that.

> I'm having trouble imagining a situation where we would have:
> 
> 	(xfeatures_in_use() & XFEATURE_MASK_XTILE) == true
> but
> 	fpu_state_size_dynamic() == false

Yes, I don't think we have such situation in practice.

> 
>> +	if (xfeatures_in_use() & XFEATURE_MASK_XTILE) {
>> +		tile_release();
>> +		fpregs_deactivate(&current->thread.fpu);
>> +	}
>> +}
> 
> xfeatures_in_use() isn't exactly expensive either.

True.

Thanks,
Chang

[1] 
https://lore.kernel.org/lkml/20211104225226.5031-3-chang.seok.bae@intel.com/

[2] 
https://lore.kernel.org/lkml/20211104225226.5031-4-chang.seok.bae@intel.com/

[3] 0-day report:

FYI, we noticed the following commit (built with gcc-9):

commit: 66951fd5ad7a2aabae7dc54be46a4eac667a082c ("x86/fpu: Prepare AMX 
state for CPU idle") internal-devel devel-hourly-20220113-144327

[chang: this refers to [2]. ]
...

on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 
-m 16G

caused below changes (please refer to attached dmesg/kmsg for entire 
log/backtrace):

[    1.767825][    T1] x86: Booting SMP configuration:
[    1.768363][    T1] .... node  #0, CPUs:      #1
[    0.113687][    T0] kvm-clock: cpu 1, msr 38a24a041, secondary cpu clock
[    0.113687][    T0] masked ExtINT on CPU#1
[    0.113687][    T0] smpboot: CPU 1 Converting physical 0 to logical die 1
[    1.780411][    T0] general protection fault, maybe for address 0x1: 
0000 [#1] SMP KASAN PTI
[    1.781350][    T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
5.16.0-00004-g66951fd5ad7a #1
[    1.781350][    T0] Hardware name: QEMU Standard PC (i440FX + PIIX, 
1996), BIOS 1.12.0-1 04/01/2014
[ 1.781350][ T0] RIP: 0010:fpu_idle_fpregs 
(kbuild/src/consumer/arch/x86/include/asm/fpu/xcr.h:12 
kbuild/src/consumer/arch/x86/include/asm/fpu/xcr.h:32 
kbuild/src/consumer/arch/x86/kernel/fpu/core.c:772)
[ 1.781350][ T0] Code: 89 74 24 04 e8 d4 6e 82 00 8b 74 24 04 e9 f8 fe 
ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 eb 01 c3 b9 01 00 
00 00 <0f> 01 d0 a9 00 00 06 00 75 01 c3 55 53 c4 e2 78 49 c0 65 48 8b 
2c All code ========
    0:   89 74 24 04              mov    %esi,0x4(%rsp)
    4:   e8 d4 6e 82 00           callq  0x826edd
    9:   8b 74 24 04              mov    0x4(%rsp),%esi
    d:   e9 f8 fe ff ff           jmpq   0xffffffffffffff0a
   12:   66 66 2e 0f 1f 84 00     data16 nopw %cs:0x0(%rax,%rax,1)
   19:   00 00 00 00
   1d:   0f 1f 44 00 00           nopl   0x0(%rax,%rax,1)
   22:   eb 01                    jmp    0x25
   24:   c3                       retq
   25:   b9 01 00 00 00           mov    $0x1,%ecx
   2a:*  0f 01 d0                 xgetbv           <-- trapping instruction
   2d:   a9 00 00 06 00           test   $0x60000,%eax
   32:   75 01                    jne    0x35
   34:   c3                       retq
   35:   55                       push   %rbp
   36:   53                       push   %rbx
   37:   c4 e2 78 49              (bad)
   3b:   c0 65 48 8b              shlb   $0x8b,0x48(%rbp)
   3f:   2c                       .byte 0x2c

Code starting with the faulting instruction 
===========================================
    0:   0f 01 d0                 xgetbv
    3:   a9 00 00 06 00           test   $0x60000,%eax
    8:   75 01                    jne    0xb
    a:   c3                       retq
    b:   55                       push   %rbp
    c:   53                       push   %rbx
    d:   c4 e2 78 49              (bad)
   11:   c0 65 48 8b              shlb   $0x8b,0x48(%rbp)
   15:   2c                       .byte 0x2c
[    1.781350][    T0] RSP: 0000:ffffffff96007e50 EFLAGS: 00010047
[    1.781350][    T0] RAX: 0000000000000001 RBX: ffffffff96022280 RCX: 
0000000000000001
[    1.781350][    T0] RDX: 0000000000000000 RSI: 0000000000000008 RDI: 
ffffffff96db7da0
[    1.781350][    T0] RBP: 0000000000000000 R08: 0000000000000001 R09: 
fffffbfff2db6fb5
[    1.781350][    T0] R10: ffffffff96db7da7 R11: fffffbfff2db6fb4 R12: 
fffffbfff2c04450
[    1.781350][    T0] R13: ffffffff96db7da0 R14: dffffc0000000000 R15: 
0000000000000000
[    1.781350][    T0] FS:  0000000000000000(0000) 
GS:ffff88839d600000(0000) knlGS:0000000000000000
[    1.781350][    T0] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.781350][    T0] CR2: ffff88843ffff000 CR3: 0000000388a14000 CR4: 
00000000000406f0
[    1.781350][    T0] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
0000000000000000
[    1.781350][    T0] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 
0000000000000400
[    1.781350][    T0] Call Trace:
[    1.781350][    T0]  <TASK>
[ 1.781350][ T0] ? arch_cpu_idle_enter 
(kbuild/src/consumer/arch/x86/kernel/process.c:712)
[ 1.781350][ T0] ? do_idle (kbuild/src/consumer/kernel/sched/idle.c:294)
[ 1.781350][ T0] ? _raw_read_unlock_irqrestore 
(kbuild/src/consumer/kernel/locking/spinlock.c:161)
[ 1.781350][ T0] ? arch_cpu_idle_exit+0xc0/0xc0 [ 1.781350][ T0] ? 
schedule (kbuild/src/consumer/arch/x86/include/asm/bitops.h:207 
(discriminator 1) 
kbuild/src/consumer/include/asm-generic/bitops/instrumented-non-atomic.h:135 
(discriminator 1) kbuild/src/consumer/include/linux/thread_info.h:118 
(discriminator 1) kbuild/src/consumer/include/linux/sched.h:2120 
(discriminator 1) kbuild/src/consumer/kernel/sched/core.c:6328 
(discriminator 1)) [ 1.781350][ T0] ? cpu_startup_entry 
(kbuild/src/consumer/kernel/sched/idle.c:402 (discriminator 1)) [ 
1.781350][ T0] ? start_kernel (kbuild/src/consumer/init/main.c:1137)
[ 1.781350][ T0] ? secondary_startup_64_no_verify 
(kbuild/src/consumer/arch/x86/kernel/head_64.S:283)
[    1.781350][    T0]  </TASK>
[    1.781350][    T0] Modules linked in:
[    1.781350][    T0] ---[ end trace 2bef003d678c9f1a ]---
[    1.781350][    T0] general protection fault, maybe for address 0x1: 
0000 [#2] SMP KASAN PTI
[    1.781350][    T0] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G      D 
          5.16.0-00004-g66951fd5ad7a #1
[ 1.781350][ T0] RIP: 0010:fpu_idle_fpregs 
(kbuild/src/consumer/arch/x86/include/asm/fpu/xcr.h:12 
kbuild/src/consumer/arch/x86/include/asm/fpu/xcr.h:32 
kbuild/src/consumer/arch/x86/kernel/fpu/core.c:772)
[    1.781350][    T0] Hardware name: QEMU Standard PC (i440FX + PIIX, 
1996), BIOS 1.12.0-1 04/01/2014
[ 1.781350][ T0] Code: 89 74 24 04 e8 d4 6e 82 00 8b 74 24 04 e9 f8 fe 
ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 eb 01 c3 b9 01 00 
00 00 <0f> 01 d0 a9 00 00 06 00 75 01 c3 55 53 c4 e2 78 49 c0 65 48 8b 
2c All code ========
    0:   89 74 24 04              mov    %esi,0x4(%rsp)
    4:   e8 d4 6e 82 00           callq  0x826edd
    9:   8b 74 24 04              mov    0x4(%rsp),%esi
    d:   e9 f8 fe ff ff           jmpq   0xffffffffffffff0a
   12:   66 66 2e 0f 1f 84 00     data16 nopw %cs:0x0(%rax,%rax,1)
   19:   00 00 00 00
   1d:   0f 1f 44 00 00           nopl   0x0(%rax,%rax,1)
   22:   eb 01                    jmp    0x25
   24:   c3                       retq
   25:   b9 01 00 00 00           mov    $0x1,%ecx
   2a:*  0f 01 d0                 xgetbv           <-- trapping instruction
   2d:   a9 00 00 06 00           test   $0x60000,%eax
   32:   75 01                    jne    0x35
   34:   c3                       retq
   35:   55                       push   %rbp
   36:   53                       push   %rbx
   37:   c4 e2 78 49              (bad)
   3b:   c0 65 48 8b              shlb   $0x8b,0x48(%rbp)
   3f:   2c                       .byte 0x2c

Code starting with the faulting instruction 
===========================================
    0:   0f 01 d0                 xgetbv
    3:   a9 00 00 06 00           test   $0x60000,%eax
    8:   75 01                    jne    0xb
    a:   c3                       retq
    b:   55                       push   %rbp
    c:   53                       push   %rbx
    d:   c4 e2 78 49              (bad)
   11:   c0 65 48 8b              shlb   $0x8b,0x48(%rbp)
   15:   2c                       .byte 0x2c
[ 1.781350][ T0] RIP: 0010:fpu_idle_fpregs 
(kbuild/src/consumer/arch/x86/include/asm/fpu/xcr.h:12 
kbuild/src/consumer/arch/x86/include/asm/fpu/xcr.h:32 
kbuild/src/consumer/arch/x86/kernel/fpu/core.c:772)
[    1.781350][    T0] RSP: 0000:ffffffff96007e50 EFLAGS: 00010047
[ 1.781350][ T0] Code: 89 74 24 04 e8 d4 6e 82 00 8b 74 24 04 e9 f8 fe 
ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 eb 01 c3 b9 01 00 
00 00 <0f> 01 d0 a9 00 00 06 00 75 01 c3 55 53 c4 e2 78 49 c0 65 48 8b 
2c All code ========
    0:   89 74 24 04              mov    %esi,0x4(%rsp)
    4:   e8 d4 6e 82 00           callq  0x826edd
    9:   8b 74 24 04              mov    0x4(%rsp),%esi
    d:   e9 f8 fe ff ff           jmpq   0xffffffffffffff0a
   12:   66 66 2e 0f 1f 84 00     data16 nopw %cs:0x0(%rax,%rax,1)
   19:   00 00 00 00
   1d:   0f 1f 44 00 00           nopl   0x0(%rax,%rax,1)
   22:   eb 01                    jmp    0x25
   24:   c3                       retq
   25:   b9 01 00 00 00           mov    $0x1,%ecx
   2a:*  0f 01 d0                 xgetbv           <-- trapping instruction
   2d:   a9 00 00 06 00           test   $0x60000,%eax
   32:   75 01                    jne    0x35
   34:   c3                       retq
   35:   55                       push   %rbp
   36:   53                       push   %rbx
   37:   c4 e2 78 49              (bad)
   3b:   c0 65 48 8b              shlb   $0x8b,0x48(%rbp)
   3f:   2c                       .byte 0x2c

Code starting with the faulting instruction 
===========================================
    0:   0f 01 d0                 xgetbv
    3:   a9 00 00 06 00           test   $0x60000,%eax
    8:   75 01                    jne    0xb
    a:   c3                       retq
    b:   55                       push   %rbp
    c:   53                       push   %rbx
    d:   c4 e2 78 49              (bad)
   11:   c0 65 48 8b              shlb   $0x8b,0x48(%rbp)
   15:   2c                       .byte 0x2c
[    1.781350][    T0]
[    1.781350][    T0] RSP: 0000:ffffc9000010fdf8 EFLAGS: 00010047
[    1.781350][    T0] RAX: 0000000000000001 RBX: ffffffff96022280 RCX: 
0000000000000001
[    1.781350][    T0]
[    1.781350][    T0] RAX: 0000000000000001 RBX: ffff8881003e0000 RCX: 
0000000000000001
[    1.781350][    T0] RDX: 0000000000000000 RSI: 0000000000000008 RDI: 
ffffffff96db7da0
[    1.781350][    T0] RBP: 0000000000000000 R08: 0000000000000001 R09: 
fffffbfff2db6fb5
[    1.781350][    T0] RBP: 0000000000000001 R08: 0000000000000001 R09: 
fffffbfff2db6fb5
[    1.781350][    T0] R10: ffffffff96db7da7 R11: fffffbfff2db6fb4 R12: 
fffffbfff2c04450
[    1.781350][    T0] R10: ffffffff96db7da7 R11: fffffbfff2db6fb4 R12: 
ffffed102007c000
[    1.781350][    T0] R13: ffffffff96db7da0 R14: dffffc0000000000 R15: 
0000000000000000
[    1.781350][    T0] FS:  0000000000000000(0000) 
GS:ffff88839d600000(0000) knlGS:0000000000000000
[    1.781350][    T0] FS:  0000000000000000(0000) 
GS:ffff88839d700000(0000) knlGS:0000000000000000
[    1.781350][    T0] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.781350][    T0] CR2: ffff88843ffff000 CR3: 0000000388a14000 CR4: 
00000000000406f0
[    1.781350][    T0] CR2: 0000000000000000 CR3: 0000000388a14000 CR4: 
00000000000406e0
[    1.781350][    T0] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
0000000000000000
[    1.781350][    T0] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 
0000000000000400
[    1.781350][    T0] Kernel panic - not syncing: Fatal exception
[    1.781350][    T0] Call Trace:
[    1.781350][    T0]  <TASK>
[ 1.781350][ T0] ? arch_cpu_idle_enter 
(kbuild/src/consumer/arch/x86/kernel/process.c:712)
[ 1.781350][ T0] ? do_idle (kbuild/src/consumer/kernel/sched/idle.c:294)
[ 1.781350][ T0] ? _raw_spin_lock_irqsave 
(kbuild/src/consumer/arch/x86/include/asm/atomic.h:202 
kbuild/src/consumer/include/linux/atomic/atomic-instrumented.h:513 
kbuild/src/consumer/include/asm-generic/qspinlock.h:82 
kbuild/src/consumer/include/linux/spinlock.h:185 
kbuild/src/consumer/include/linux/spinlock_api_smp.h:111 
kbuild/src/consumer/kernel/locking/spinlock.c:162)
[ 1.781350][ T0] ? arch_cpu_idle_exit+0xc0/0xc0

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ