netdev - Re: [PATCH net-next 0/3] vhost: accelerate metadata access through vmap()

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <f6ce1fbb-b634-b17d-e9cf-36c662f49d75@redhat.com>
Date:   Mon, 24 Dec 2018 16:44:14 +0800
From:   Jason Wang <jasowang@...hat.com>
To:     "Michael S. Tsirkin" <mst@...hat.com>,
        David Miller <davem@...emloft.net>
Cc:     kvm@...r.kernel.org, virtualization@...ts.linux-foundation.org,
        netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net-next 0/3] vhost: accelerate metadata access through
 vmap()


On 2018/12/17 上午3:57, Michael S. Tsirkin wrote:
> On Sat, Dec 15, 2018 at 11:43:08AM -0800, David Miller wrote:
>> From: Jason Wang <jasowang@...hat.com>
>> Date: Fri, 14 Dec 2018 12:29:54 +0800
>>
>>> On 2018/12/14 上午4:12, Michael S. Tsirkin wrote:
>>>> On Thu, Dec 13, 2018 at 06:10:19PM +0800, Jason Wang wrote:
>>>>> Hi:
>>>>>
>>>>> This series tries to access virtqueue metadata through kernel virtual
>>>>> address instead of copy_user() friends since they had too much
>>>>> overheads like checks, spec barriers or even hardware feature
>>>>> toggling.
>>>>>
>>>>> Test shows about 24% improvement on TX PPS. It should benefit other
>>>>> cases as well.
>>>>>
>>>>> Please review
>>>> I think the idea of speeding up userspace access is a good one.
>>>> However I think that moving all checks to start is way too aggressive.
>>>
>>> So did packet and AF_XDP. Anyway, sharing address space and access
>>> them directly is the fastest way. Performance is the major
>>> consideration for people to choose backend. Compare to userspace
>>> implementation, vhost does not have security advantages at any
>>> level. If vhost is still slow, people will start to develop backends
>>> based on e.g AF_XDP.
>> Exactly, this is precisely how this kind of problem should be solved.
>>
>> Michael, I strongly support the approach Jason is taking here, and I
>> would like to ask you to seriously reconsider your objections.
>>
>> Thank you.
> Okay. Won't be the first time I'm wrong.
>
> Let's say we ignore security aspects, but we need to make sure the
> following all keep working (broken with this revision):
> - file backed memory (I didn't see where we mark memory dirty -
>    if we don't we get guest memory corruption on close, if we do
>    then host crash as https://lwn.net/Articles/774411/ seems to apply here?)


We only pin metadata pages, so I don't think they can be used for DMA. 
So it was probably not an issue. The real issue is zerocopy codes, maybe 
it's time to disable it by default?


> - THP


We will miss 2 or 4 pages for THP, I wonder whether or not it's measurable.


> - auto-NUMA


I'm not sure auto-NUMA will help for the case of IPC. It can damage the 
performance in the worst case if vhost and userspace are running in two 
different nodes. Anyway I can measure.


>
> Because vhost isn't like AF_XDP where you can just tell people "use
> hugetlbfs" and "data is removed on close" - people are using it in lots
> of configurations with guest memory shared between rings and unrelated
> data.


This series doesn't share data, only metadata is shared.


>
> Jason, thoughts on these?
>

Based on the above, I can measure the impact of THP to see how it impacts.

For unsafe variants, it can only work for when we can batch the access 
and it needs non trivial rework on the vhost codes with unexpected 
amount of work for archs other than x86. I'm not sure it's worth to try.

Thanks