Message-ID: <E1226897-C6D1-439C-AB3B-012F8C4A72DF@nutanix.com>
Date: Fri, 14 Nov 2025 14:53:04 +0000
From: Jon Kohler <jon@...anix.com>
To: Jason Wang <jasowang@...hat.com>
CC: "Michael S. Tsirkin" <mst@...hat.com>,
Eugenio Pérez <eperezma@...hat.com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"virtualization@...ts.linux.dev" <virtualization@...ts.linux.dev>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Borislav Petkov <bp@...en8.de>,
Sean Christopherson <seanjc@...gle.com>
Subject: Re: [PATCH net-next] vhost: use "checked" versions of get_user() and
put_user()
> On Nov 12, 2025, at 8:09 PM, Jason Wang <jasowang@...hat.com> wrote:
>
> On Thu, Nov 13, 2025 at 8:14 AM Jon Kohler <jon@...anix.com> wrote:
>>
>> vhost_get_user and vhost_put_user leverage __get_user and __put_user,
>> respectively, which were both added in 2016 by commit 6b1e6cc7855b
>> ("vhost: new device IOTLB API").
>
> It has been used even before this commit.
Ah, thanks for the pointer. I’d have to go dig to find its genesis, but
the point is that this usage existed prior to the LFENCE commit.
>
>> In a heavy UDP transmit workload on a
>> vhost-net backed tap device, these functions showed up as ~11.6% of
>> samples in a flamegraph of the underlying vhost worker thread.
>>
>> Quoting Linus from [1]:
>> Anyway, every single __get_user() call I looked at looked like
>> historical garbage. [...] End result: I get the feeling that we
>> should just do a global search-and-replace of the __get_user/
>> __put_user users, replace them with plain get_user/put_user instead,
>> and then fix up any fallout (eg the coco code).
>>
>> Switch to plain get_user/put_user in vhost, which results in a slight
>> throughput speedup. get_user now about ~8.4% of samples in flamegraph.
>>
>> Basic iperf3 test on an Intel 5416S CPU with Ubuntu 25.10 guest:
>> TX: taskset -c 2 iperf3 -c <rx_ip> -t 60 -p 5200 -b 0 -u -i 5
>> RX: taskset -c 2 iperf3 -s -p 5200 -D
>> Before: 6.08 Gbits/sec
>> After: 6.32 Gbits/sec
>
> I wonder if we need to test on archs like ARM.
Are you thinking from a performance perspective? Or a correctness one?