linux-kernel - Re: [PATCH] Drivers: hv: Introduce mshv

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <8f83fbdb-0aee-4602-ad8a-58bbd22dbdc9@linux.microsoft.com>
Date: Wed, 7 May 2025 12:20:36 -0700
From: Roman Kisel <romank@...ux.microsoft.com>
To: Saurabh Singh Sengar <ssengar@...rosoft.com>,
 Naman Jain <namjain@...ux.microsoft.com>, KY Srinivasan <kys@...rosoft.com>,
 Haiyang Zhang <haiyangz@...rosoft.com>, Wei Liu <wei.liu@...nel.org>,
 Dexuan Cui <decui@...rosoft.com>
Cc: Anirudh Rayabharam <anrayabh@...ux.microsoft.com>,
 Saurabh Sengar <ssengar@...ux.microsoft.com>,
 Stanislav Kinsburskii <skinsburskii@...ux.microsoft.com>,
 Nuno Das Neves <nunodasneves@...ux.microsoft.com>,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
 "linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>
Subject: Re: [PATCH] Drivers: hv: Introduce mshv_vtl driver

On 5/7/2025 6:02 AM, Saurabh Singh Sengar wrote:
> 
[..]

>> +	}
>> +
>> +	local_irq_save(flags);
>> +	in = *this_cpu_ptr(hyperv_pcpu_input_arg);
>> +	out = *this_cpu_ptr(hyperv_pcpu_output_arg);
>> +
>> +	if (copy_from_user(in, (void __user *)hvcall.input_ptr,
>> hvcall.input_size)) {
> 
> Here is an issue related to usage of user copy functions when interrupt are disabled.
> It was reported by Michael K here:
> https://github.com/microsoft/OHCL-Linux-Kernel/issues/33

 From the practical point of view, that memory will be touched by the
user mode by virtue of Rust requiring initialization so the a possible
page fault would be resolved before the IOCTL. OpenHCL runs without swap
so the the memory will not be paged out to require page faults to be
brought in back.

I do agree that might be turned into a footgun by the user land if
they malloc a page w/o prefaulting (so it's just a VA range, not backed
with the physical page), and then send its address straight over here
right after w/o writing any data to it. Perhaps likelier with the output
data. Anyway, yes, relying on the user land doing sane things isn't
the best approach to the kernel programming.

If we're inclined to fix this, I'd encourage to take an approach that
works for the confidential VMs as well so we don't have to fix that
again when start upstreaming what we have for SNP and TDX. The
allocation *must* be visible to the hypervisor in the confidential
scenarios.

Or, maybe we could avoid the allocations by reading the first byte
of the user land buffer to "pre-fault" the page outside of the
scope that disables interrupts. Why allocate if we can avoid that?
Could set up also the SMP remote calls to run this on the desired
CPU.

Summarizing for the case you want to change this:

1. Keep interrupts disabled when reading/writing to/from the Hyper-V
    driver allocated input and output pages.
2. If you decide to allocate separate pages, make sure they are
    visible to the hypervisor in the confidential scenarios. I know
    we're not talking SNP and TDX here just yet but it would be
    a waste of time imho to build something here and scrape that
    later. The issues with allocations are:
        a) If allocating on-demand, we might fail the hypercall
           because of OOM. That's certainly bad as the whole VM
           will break down.
        b) If allocating for the whole lifetime of the VM,
           let us remember that we avoid using hypercalls
           due to their runtime cost. We'll be keeping around
           2 pages per CPU for the few times we need them.
3. Consider reading a byte from the user land buffers to make the page
    fault happen outside of disabling interrupts. There is no
    outswap (maybe could have disabling swap in Kconfig) so the page
    will stay in the memory.

If you're not changing this, feel free to keep my "Reviewed-by".

> 
> - Saurabh

-- 
Thank you,
Roman