
Message-ID: <op.18o7z2biwjvjmi@hhuan26-mobl.amr.corp.intel.com>
Date:   Wed, 26 Jul 2023 11:56:16 -0500
From:   "Haitao Huang" <haitao.huang@...ux.intel.com>
To:     "Hansen, Dave" <dave.hansen@...el.com>,
        "linux-sgx@...r.kernel.org" <linux-sgx@...r.kernel.org>,
        "x86@...nel.org" <x86@...nel.org>, "bp@...en8.de" <bp@...en8.de>,
        "jarkko@...nel.org" <jarkko@...nel.org>,
        "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "hpa@...or.com" <hpa@...or.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "Huang, Kai" <kai.huang@...el.com>
Cc:     "kristen@...ux.intel.com" <kristen@...ux.intel.com>,
        "Chatre, Reinette" <reinette.chatre@...el.com>,
        "stable@...r.kernel.org" <stable@...r.kernel.org>,
        "Christopherson,, Sean" <seanjc@...gle.com>
Subject: Re: [PATCH] x86/sgx: fix a NULL pointer

On Thu, 20 Jul 2023 19:52:22 -0500, Huang, Kai <kai.huang@...el.com> wrote:

> On Fri, 2023-07-21 at 00:32 +0000, Huang, Kai wrote:
>> On Wed, 2023-07-19 at 08:53 -0500, Haitao Huang wrote:
>> > Hi Dave and Kai
>> > On Tue, 18 Jul 2023 19:21:54 -0500, Dave Hansen <dave.hansen@...el.com> wrote:
>> >
>> > > On 7/18/23 17:14, Huang, Kai wrote:
>> > > > Also perhaps the patch title is too vague.  Adding more information
>> > > > doesn't hurt I think, e.g., mentioning it is a fix for NULL pointer
>> > > > dereference in the EAUG flow.
>> > >
>> > > Yeah, let's say something like:
>> > >
>> > > 	x86/sgx: Resolve SECS reclaim vs. page fault race
>> > >
>> > The patch is not meant to resolve the SECS vs. #PF race, though the race
>> > is a necessary condition for hitting the NULL pointer. The same condition
>> > does not cause a NULL pointer in the ELDU path of #PF, only in the EAUG
>> > path of #PF.
>> >
>> > And the issue really is that the NULL pointer is not checked. The fix is
>> > to reuse, in the EAUG code path, the same code that reloads the SECS in
>> > the ELDU code path.
>> >
>> >
>> > How about this:
>> >
>> > x86/sgx:  Reload reclaimed SECS for EAUG on #PF
>> >
>> > or
>> >
>> > x86/sgx: Fix a NULL pointer to SECS used for EAUG on #PF
>> >
>>
>> Perhaps you can add "EAUG" part to what Dave suggested?
>>
>> 	x86/sgx: Resolves SECS reclaim vs. page fault race on EAUG
>>
>> (assuming Dave is fine with this :-))
Sure, I can use this too.
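
To make the condition behind that title concrete, here is a rough userspace
model. It is not the kernel code, and every name in it is made up for
illustration. It assumes the cached SECS page pointer becomes NULL once the
SECS is reclaimed, and that the fix amounts to running a reload step before
the EAUG path uses that pointer, as the ELDU path already does:

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model; all names are hypothetical, not the kernel's. */
struct model_epc_page {
	int id;
};

struct model_encl {
	struct model_epc_page *secs_page;	/* NULL once the SECS is reclaimed */
	struct model_epc_page backing;		/* stands in for the swapped-out SECS */
	int reloads;				/* how many times a reload was needed */
};

/* Stand-in for the reclaimer evicting the SECS page. */
static void model_reclaim_secs(struct model_encl *encl)
{
	encl->secs_page = NULL;
}

/* Stand-in for the reload step: bring the SECS back if it was reclaimed. */
static struct model_epc_page *model_load_secs(struct model_encl *encl)
{
	if (!encl->secs_page) {
		encl->secs_page = &encl->backing;
		encl->reloads++;
	}
	return encl->secs_page;
}

/*
 * EAUG path without the reload. Returns -1 where the real code would
 * dereference NULL, so the model reports the bug instead of crashing.
 */
static int model_eaug_unchecked(struct model_encl *encl)
{
	struct model_epc_page *secs = encl->secs_page;

	if (!secs)
		return -1;
	return secs->id;
}

/* EAUG path with the reload step in front, so the pointer is never NULL. */
static int model_eaug_with_reload(struct model_encl *encl)
{
	struct model_epc_page *secs = model_load_secs(encl);

	assert(secs != NULL);
	return secs->id;
}
```

In this model, calling model_reclaim_secs() and then model_eaug_unchecked()
hits the -1 case, while model_eaug_with_reload() reloads once and proceeds.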

> Btw, do you have a real call trace?  If you have, I think you can add that to
> the changelog too because that catches people's eye immediately.

Previously I was not able to reproduce this without the SGX cgroup patches.
Now I have managed to get a trace with a QEMU setup with a small EPC (8M),
large RAM (128G) and 128 vCPUs:

[ 1682.914263] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 1682.922966] #PF: supervisor read access in kernel mode
[ 1682.929115] #PF: error_code(0x0000) - not-present page
[ 1682.935264] PGD 0 P4D 0
[ 1682.938383] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 1682.943620] CPU: 43 PID: 2681 Comm: test_sgx Not tainted 6.3.0-rc4sgxcet #12
[ 1682.951989] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014
[ 1682.965504] RIP: 0010:sgx_encl_eaug_page+0xc7/0x210
[ 1682.971359] Code: 25 49 8b 96 98 04 00 00 48 8d 40 48 48 89 42 08 48 89 56 48 49 8d 96 98 04 00 00 48 89 56 50 49 89 86 98 04 00 00 49 8b 46 60 <8b> 10 48 c1 e2 05 488
[ 1682.993330] RSP: 0000:ffffb2e64725bc00 EFLAGS: 00010246
[ 1682.999585] RAX: 0000000000000000 RBX: ffff987e5abac428 RCX: 0000000000000000
[ 1683.008059] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff987e61aee000
[ 1683.016533] RBP: ffffb2e64725bcf0 R08: 0000000000000000 R09: ffffb2e64725bb58
[ 1683.025008] R10: 0000000000000000 R11: 00007f3f5c418fff R12: ffff987e61aee020
[ 1683.033479] R13: ffff987e505bc080 R14: ffff987e61aee000 R15: ffffb2e6420fcb20
[ 1683.041949] FS:  00007f3f5cb48740(0000) GS:ffff989cfe8c0000(0000) knlGS:0000000000000000
[ 1683.051540] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1683.058478] CR2: 0000000000000000 CR3: 0000000115896002 CR4: 0000000000770ee0
[ 1683.067018] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1683.075539] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1683.084085] PKRU: 55555554
[ 1683.087465] Call Trace:
[ 1683.090547]  <TASK>
[ 1683.093220]  ? __kmem_cache_alloc_node+0x16a/0x440
[ 1683.099034]  ? xa_load+0x6e/0xa0
[ 1683.103038]  sgx_vma_fault+0x119/0x230
[ 1683.107630]  __do_fault+0x36/0x140
[ 1683.111828]  do_fault+0x12f/0x400
[ 1683.115928]  __handle_mm_fault+0x728/0x1110
[ 1683.121050]  handle_mm_fault+0x105/0x310
[ 1683.125850]  do_user_addr_fault+0x1ee/0x750
[ 1683.130957]  ? __this_cpu_preempt_check+0x13/0x20
[ 1683.136667]  exc_page_fault+0x76/0x180
[ 1683.141265]  asm_exc_page_fault+0x27/0x30
[ 1683.146160] RIP: 0033:0x7ffc6496beea
[ 1683.150563] Code: 43 48 8b 4d 10 48 c7 c3 28 00 00 00 48 83 3c 19 00 75 31 48 83 c3 08 48 81 fb 00 01 00 00 75 ec 48 8b 19 48 8d 0d 00 00 00 00 <0f> 01 d7 48 8b 5d 101
[ 1683.172773] RSP: 002b:00007ffc64935b68 EFLAGS: 00000202
[ 1683.179138] RAX: 0000000000000003 RBX: 00007f3800000000 RCX: 00007ffc6496beea
[ 1683.187675] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 1683.196200] RBP: 00007ffc64935b70 R08: 0000000000000000 R09: 0000000000000000
[ 1683.204724] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 1683.213310] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 1683.221850]  </TASK>
[ 1683.224636] Modules linked in: isofs intel_rapl_msr intel_rapl_common binfmt_misc kvm_intel nls_iso8859_1 kvm ppdev irqbypass input_leds parport_pc joydev parport rapi
[ 1683.291173] CR2: 0000000000000000
[ 1683.295271] ---[ end trace 0000000000000000 ]---



I'll add this trace to the commit message as well.
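
For readers less used to oops output: the two "#PF:" lines in the trace are
derived from the page-fault error code. Below is a small sketch of that
decoding, using the architectural x86 error-code bit layout. The helper names
are hypothetical and are not kernel functions, and only the bits relevant to
this trace are modeled:

```c
#include <assert.h>
#include <stdio.h>

/* x86 page-fault error code bits (architectural layout). */
#define PF_PROT  (1u << 0)	/* 0: not-present page, 1: protection violation */
#define PF_WRITE (1u << 1)	/* 0: read, 1: write */
#define PF_USER  (1u << 2)	/* 0: supervisor mode, 1: user mode */
#define PF_INSTR (1u << 4)	/* 1: instruction fetch */

/* Hypothetical helper type, for illustration only. */
struct pf_info {
	int present;
	int write;
	int user;
	int instr;
};

/* Split an error code into the flags modeled above. */
static struct pf_info pf_decode(unsigned int code)
{
	struct pf_info info = {
		.present = !!(code & PF_PROT),
		.write   = !!(code & PF_WRITE),
		.user    = !!(code & PF_USER),
		.instr   = !!(code & PF_INSTR),
	};

	return info;
}

/* Print a one-line summary of the decoded error code. */
static void pf_print(FILE *out, unsigned int code)
{
	struct pf_info info = pf_decode(code);

	assert(out != NULL);
	fprintf(out, "%s %s, %s\n",
		info.user ? "user" : "supervisor",
		info.instr ? "instruction fetch" :
		info.write ? "write access" : "read access",
		info.present ? "protection violation" : "not-present page");
}
```

With error_code 0x0000 all of these bits are clear, i.e. a supervisor-mode
read of a not-present page. That is consistent with RAX and CR2 both being 0
in the trace, and the marked bytes "<8b> 10" look like a 32-bit load through
RAX, if I decode them correctly.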

Thanks
Haitao