[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <0893e873-20c4-7e07-e7e4-3971dbb79118@redhat.com>
Date: Thu, 13 Jan 2022 16:56:53 +0100
From: David Hildenbrand <david@...hat.com>
To: Chao Peng <chao.p.peng@...ux.intel.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, linux-fsdevel@...r.kernel.org,
qemu-devel@...gnu.org, Paolo Bonzini <pbonzini@...hat.com>,
Jonathan Corbet <corbet@....net>,
Sean Christopherson <seanjc@...gle.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Joerg Roedel <joro@...tes.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
x86@...nel.org, "H . Peter Anvin" <hpa@...or.com>,
Hugh Dickins <hughd@...gle.com>,
Jeff Layton <jlayton@...nel.org>,
"J . Bruce Fields" <bfields@...ldses.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Yu Zhang <yu.c.zhang@...ux.intel.com>,
"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
luto@...nel.org, john.ji@...el.com, susie.li@...el.com,
jun.nakajima@...el.com, dave.hansen@...el.com, ak@...ux.intel.com
Subject: Re: [PATCH v3 kvm/queue 01/16] mm/shmem: Introduce
F_SEAL_INACCESSIBLE
On 06.01.22 14:06, Chao Peng wrote:
> On Tue, Jan 04, 2022 at 03:22:07PM +0100, David Hildenbrand wrote:
>> On 23.12.21 13:29, Chao Peng wrote:
>>> From: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
>>>
>>> Introduce a new seal F_SEAL_INACCESSIBLE indicating the content of
>>> the file is inaccessible from userspace in any possible ways like
>>> read(),write() or mmap() etc.
>>>
>>> It provides semantics required for KVM guest private memory support
>>> that a file descriptor with this seal set is going to be used as the
>>> source of guest memory in confidential computing environments such
>>> as Intel TDX/AMD SEV but may not be accessible from host userspace.
>>>
>>> At this time only shmem implements this seal.
>>>
>>> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
>>> Signed-off-by: Chao Peng <chao.p.peng@...ux.intel.com>
>>> ---
>>> include/uapi/linux/fcntl.h | 1 +
>>> mm/shmem.c | 37 +++++++++++++++++++++++++++++++++++--
>>> 2 files changed, 36 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
>>> index 2f86b2ad6d7e..e2bad051936f 100644
>>> --- a/include/uapi/linux/fcntl.h
>>> +++ b/include/uapi/linux/fcntl.h
>>> @@ -43,6 +43,7 @@
>>> #define F_SEAL_GROW 0x0004 /* prevent file from growing */
>>> #define F_SEAL_WRITE 0x0008 /* prevent writes */
>>> #define F_SEAL_FUTURE_WRITE 0x0010 /* prevent future writes while mapped */
>>> +#define F_SEAL_INACCESSIBLE 0x0020 /* prevent file from accessing */
>>
>> I think this needs more clarification: the file content can still be
>> accessed using in-kernel mechanisms such as MEMFD_OPS for KVM. It
>> effectively disallows traditional access to a file (read/write/mmap)
>> that will result in ordinary MMU access to file content.
>>
>> Not sure how to best clarify that: maybe, prevent ordinary MMU access
>> (e.g., read/write/mmap) to file content?
>
> Or: prevent userspace access (e.g., read/write/mmap) to file content?
The issue with that phrasing is that userspace will be able to access
that content, just via a different mechanism eventually ... e.g., via
the KVM MMU indirectly. If that makes it clearer what I mean :)
>>
>>> /* (1U << 31) is reserved for signed error codes */
>>>
>>> /*
>>> diff --git a/mm/shmem.c b/mm/shmem.c
>>> index 18f93c2d68f1..faa7e9b1b9bc 100644
>>> --- a/mm/shmem.c
>>> +++ b/mm/shmem.c
>>> @@ -1098,6 +1098,10 @@ static int shmem_setattr(struct user_namespace *mnt_userns,
>>> (newsize > oldsize && (info->seals & F_SEAL_GROW)))
>>> return -EPERM;
>>>
>>> + if ((info->seals & F_SEAL_INACCESSIBLE) &&
>>> + (newsize & ~PAGE_MASK))
>>> + return -EINVAL;
>>> +
>>
>> What happens when sealing and there are existing mmaps?
>
> I think this is similar to ftruncate, in either case we just allow that.
> The existing mmaps will be unmapped and KVM will be notified to
> invalidate the mapping in the secondary MMU as well. This assume we
> trust the userspace even though it can not access the file content.
Can't we simply check+forbid instead?
--
Thanks,
David / dhildenb
Powered by blists - more mailing lists