lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <48953bf2-cee9-f818-dc50-5fb5b9b410bf@oracle.com>
Date:   Tue, 24 Jan 2023 16:08:46 +0000
From:   Liam Merwick <liam.merwick@...cle.com>
To:     Sean Christopherson <seanjc@...gle.com>,
        Chao Peng <chao.p.peng@...ux.intel.com>
Cc:     kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org, linux-fsdevel@...r.kernel.org,
        linux-arch@...r.kernel.org, linux-api@...r.kernel.org,
        linux-doc@...r.kernel.org, qemu-devel@...gnu.org,
        Paolo Bonzini <pbonzini@...hat.com>,
        Jonathan Corbet <corbet@....net>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Arnd Bergmann <arnd@...db.de>,
        Naoya Horiguchi <naoya.horiguchi@....com>,
        Miaohe Lin <linmiaohe@...wei.com>, x86@...nel.org,
        "H . Peter Anvin" <hpa@...or.com>, Hugh Dickins <hughd@...gle.com>,
        Jeff Layton <jlayton@...nel.org>,
        "J . Bruce Fields" <bfields@...ldses.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Shuah Khan <shuah@...nel.org>, Mike Rapoport <rppt@...nel.org>,
        Steven Price <steven.price@....com>,
        "Maciej S . Szmigiero" <mail@...iej.szmigiero.name>,
        Vlastimil Babka <vbabka@...e.cz>,
        Vishal Annapurve <vannapurve@...gle.com>,
        Yu Zhang <yu.c.zhang@...ux.intel.com>,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
        luto@...nel.org, jun.nakajima@...el.com, dave.hansen@...el.com,
        ak@...ux.intel.com, david@...hat.com, aarcange@...hat.com,
        ddutile@...hat.com, dhildenb@...hat.com,
        Quentin Perret <qperret@...gle.com>, tabba@...gle.com,
        Michael Roth <michael.roth@....com>, mhocko@...e.com,
        wei.w.wang@...el.com, Liam Merwick <liam.merwick@...cle.com>
Subject: Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM

On 14/01/2023 00:37, Sean Christopherson wrote:
> On Fri, Dec 02, 2022, Chao Peng wrote:
>> This patch series implements KVM guest private memory for confidential
>> computing scenarios like Intel TDX[1]. If a TDX host accesses
>> TDX-protected guest memory, machine check can happen which can further
>> crash the running host system, this is terrible for multi-tenant
>> configurations. The host accesses include those from KVM userspace like
>> QEMU. This series addresses KVM userspace induced crash by introducing
>> new mm and KVM interfaces so KVM userspace can still manage guest memory
>> via a fd-based approach, but it can never access the guest memory
>> content.
>>
>> The patch series touches both core mm and KVM code. I appreciate
>> Andrew/Hugh and Paolo/Sean can review and pick these patches. Any other
>> reviews are always welcome.
>>    - 01: mm change, target for mm tree
>>    - 02-09: KVM change, target for KVM tree
> 
> A version with all of my feedback, plus reworked versions of Vishal's selftest,
> is available here:
> 
>    git@...hub.com:sean-jc/linux.git x86/upm_base_support
> 
> It compiles and passes the selftest, but it's otherwise barely tested.  There are
> a few todos (2 I think?) and many of the commits need changelogs, i.e. it's still
> a WIP.
> 

When running LTP (https://github.com/linux-test-project/ltp) on the v10
bits (and also with Sean's branch above) I encounter the following NULL
pointer dereference with testcases/kernel/syscalls/madvise/madvise01
(100% reproducible).

It appears that in restrictedmem_error_page() 
inode->i_mapping->private_data is NULL
in the list_for_each_entry_safe(inode, next, &sb->s_inodes, i_sb_list)
but I don't know why.


[ 5365.177168] BUG: kernel NULL pointer dereference, address: 
0000000000000028
[ 5365.178881] #PF: supervisor read access in kernel mode
[ 5365.180006] #PF: error_code(0x0000) - not-present page
[ 5365.181322] PGD 8000000109dad067 P4D 8000000109dad067 PUD 107707067 PMD 0
[ 5365.183474] Oops: 0000 [#1] PREEMPT SMP PTI
[ 5365.184792] CPU: 0 PID: 22086 Comm: madvise01 Not tainted 
6.1.0-1.el8.seanjcupm.x86_64 #1
[ 5365.186572] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
BIOS 1.5.1 06/16/2021
[ 5365.188816] RIP: 0010:restrictedmem_error_page+0xc7/0x1b0
[ 5365.190081] Code: 99 00 48 8b 55 00 48 8b 02 48 8d 8a e8 fe ff ff 48 
2d 18 01 00 00 48 39 d5 0f 84 8a 00 00 00 48 8b 51 30 48 8b 92 b8 00 00 
00 <48> 8b 4a 28 4c 39 b1 d8 00 00 00 74 22 48 8b 88 18 01 00 00 48 8d
[ 5365.193984] RSP: 0018:ffff9b7343c07d80 EFLAGS: 00010206
[ 5365.195142] RAX: ffff8e5b410cfc70 RBX: 0000000000000001 RCX: 
ffff8e5b4048e580
[ 5365.196888] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 
0000000000000000
[ 5365.198399] RBP: ffff8e5b410cfd88 R08: 0000000000000000 R09: 
0000000000000000
[ 5365.200200] R10: 0000000000000000 R11: 0000000000000000 R12: 
0000000000000000
[ 5365.201843] R13: ffff8e5b410cfd80 R14: ffff8e5b47cc7618 R15: 
ffffd49d44c05080
[ 5365.203472] FS:  00007fc96de9b5c0(0000) GS:ffff8e5deda00000(0000) 
knlGS:0000000000000000
[ 5365.205485] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5365.206791] CR2: 0000000000000028 CR3: 000000012047e002 CR4: 
0000000000170ef0
[ 5365.208131] Call Trace:
[ 5365.208752]  <TASK>
[ 5365.209229]  me_pagecache_clean+0x58/0x100
[ 5365.210196]  identify_page_state+0x84/0xd0
[ 5365.211180]  memory_failure+0x231/0x8b0
[ 5365.212148]  madvise_inject_error.cold+0x8d/0xa4
[ 5365.213317]  do_madvise+0x363/0x3a0
[ 5365.214177]  __x64_sys_madvise+0x2c/0x40
[ 5365.215159]  do_syscall_64+0x3f/0xa0
[ 5365.216016]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 5365.217130] RIP: 0033:0x7fc96d8399ab
[ 5365.217953] Code: 73 01 c3 48 8b 0d dd 54 38 00 f7 d8 64 89 01 48 83 
c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 1c 00 00 00 0f 
05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ad 54 38 00 f7 d8 64 89 01 48
[ 5365.222323] RSP: 002b:00007fff62a99b18 EFLAGS: 00000206 ORIG_RAX: 
000000000000001c
[ 5365.224026] RAX: ffffffffffffffda RBX: 000000000041ce00 RCX: 
00007fc96d8399ab
[ 5365.225375] RDX: 0000000000000064 RSI: 000000000000a000 RDI: 
00007fc96de8e000
[ 5365.226999] RBP: 00007fc96de9b540 R08: 0000000000000001 R09: 
0000000000415c80
[ 5365.228641] R10: 0000000000000001 R11: 0000000000000206 R12: 
0000000000000008
[ 5365.230074] R13: 0000000000000000 R14: 0000000000000000 R15: 
0000000000000000

Regards,
Liam

> As for next steps, can you (handwaving all of the TDX folks) take a look at what
> I pushed and see if there's anything horrifically broken, and that it still works
> for TDX?
> 
> Fuad (and pKVM folks) same ask for you with respect to pKVM.  Absolutely no rush
> (and I mean that).
> 
> On my side, the two things on my mind are (a) tests and (b) downstream dependencies
> (SEV and TDX).  For tests, I want to build a lists of tests that are required for
> merging so that the criteria for merging are clear, and so that if the list is large
> (haven't thought much yet), the work of writing and running tests can be distributed.
> 
> Regarding downstream dependencies, before this lands, I want to pull in all the
> TDX and SNP series and see how everything fits together.  Specifically, I want to
> make sure that we don't end up with a uAPI that necessitates ugly code, and that we
> don't miss an opportunity to make things simpler.  The patches in the SNP series to
> add "legacy" SEV support for UPM in particular made me slightly rethink some minor
> details.  Nothing remotely major, but something that needs attention since it'll
> be uAPI.
> 
> I'm off Monday, so it'll be at least Tuesday before I make any more progress on
> my side.
> 
> Thanks!
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ