Date:   Wed, 26 Dec 2018 13:43:26 +0800
From:   Jason Wang <jasowang@...hat.com>
To:     "Michael S. Tsirkin" <mst@...hat.com>
Cc:     kvm@...r.kernel.org, virtualization@...ts.linux-foundation.org,
        netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
        Jintack Lim <jintack@...columbia.edu>
Subject: Re: [PATCH net V2 4/4] vhost: log dirty page correctly


On 2018/12/26 12:25 AM, Michael S. Tsirkin wrote:
> On Tue, Dec 25, 2018 at 05:43:25PM +0800, Jason Wang wrote:
>> On 2018/12/25 1:41 AM, Michael S. Tsirkin wrote:
>>> On Mon, Dec 24, 2018 at 11:43:31AM +0800, Jason Wang wrote:
>>>> On 2018/12/14 9:20 PM, Michael S. Tsirkin wrote:
>>>>> On Fri, Dec 14, 2018 at 10:43:03AM +0800, Jason Wang wrote:
>>>>>> On 2018/12/13 10:31 PM, Michael S. Tsirkin wrote:
>>>>>>>> Just to make sure I understand this. It looks to me we should:
>>>>>>>>
>>>>>>>> - allow passing GIOVA->GPA through UAPI
>>>>>>>>
>>>>>>>> - cache GIOVA->GPA somewhere but still use GIOVA->HVA in device IOTLB for
>>>>>>>> performance
>>>>>>>>
>>>>>>>> Is this what you suggest?
>>>>>>>>
>>>>>>>> Thanks
>>>>>>> Not really. We already have GPA->HVA, so I suggested a flag to pass
>>>>>>> GIOVA->GPA in the IOTLB.
>>>>>>>
>>>>>>> This has advantages for security since then only a single table
>>>>>>> needs to be validated to ensure the guest does not corrupt
>>>>>>> QEMU memory.
>>>>>>>
>>>>>> I wonder how much we can gain from this. Currently, the qemu IOMMU gives
>>>>>> the GIOVA->GPA mapping, and the qemu vhost code translates GPA to HVA and
>>>>>> then passes GIOVA->HVA to vhost. I see no difference.
>>>>>>
>>>>>> Thanks
>>>>> The difference is in security, not in performance.  Getting a bad HVA
>>>>> corrupts QEMU memory, and it might be guest-controlled. Very risky.
>>>> How can this be controlled by the guest? The HVA is generated from qemu ram
>>>> blocks, which are entirely under the control of the qemu memory core rather
>>>> than the guest.
>>>>
>>>>
>>>> Thanks
>>> It is ultimately under guest influence, as the guest supplies IOVA->GPA
>>> translations.  qemu translates GPA->HVA and gives the translated result
>>> to the kernel.  If qemu isn't buggy and the kernel isn't buggy, it's all
>>> fine.
>>
>> If qemu provides a buggy GPA->HVA mapping, we can't work around it. And I
>> don't see why we would even want to try; buggy qemu code can crash itself in
>> many ways.
>>
>>
>>> But that's the approach that was proven not to work in the 20th century.
>>> In the 21st century we are trying a defence-in-depth approach.
>>>
>>> My point is that a single code path that is responsible for
>>> the HVA translations is better than two.
>>>
>> So the difference is whether or not we use the memory table information:
>>
>> Current:
>>
>> 1) SET_MEM_TABLE: GPA->HVA
>>
>> 2) Qemu GIOVA->GPA
>>
>> 3) Qemu GPA->HVA
>>
>> 4) IOTLB_UPDATE: GIOVA->HVA
>>
>> If I understand correctly, you want to drop step 3 on the grounds that it
>> might be buggy, even though it is just 19 lines of code in qemu
>> (vhost_memory_region_lookup()). This will end up with:
>>
>> 1) Doing the GPA->HVA translation in the IOTLB_UPDATE path (I believe we
>> won't want to do it during device IOTLB lookup).
>>
>> 2) Extra bits to enable this capability.
>>
>> So this looks like it needs more code in the kernel than what qemu does in
>> userspace.  Is this really worthwhile?
>>
>> Thanks
> So there are several points I would like to make
>
> 1. At the moment, without an iommu, it is possible to
>     change GPA->HVA mappings and everything keeps working
>     because a change in the memory tables flushes the rings.


Interesting, I didn't know this before. But when can this happen?


>     However, I don't see the iotlb cache being invalidated
>     on that path - did I miss it? If it is not there, it's
>     a related minor bug.


There might be a bug there. But here is a question: consider the case without
an IOMMU. We only update the mem table (SET_MEM_TABLE), but not the vring
addresses. Doesn't this look like a bug as well?
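
To make the stale-cache concern concrete, here is a tiny userspace model of
the invalidation question (not the real vhost code; all names are invented
for illustration): once the memory table is replaced, any translation cached
from the old table has to be dropped, otherwise the device keeps using a
stale HVA.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct mem_region { uint64_t gpa, hva, size; };

struct toy_dev {
	struct mem_region table[8];     /* stands in for SET_MEM_TABLE state      */
	size_t nregions;
	uint64_t cached_hva;            /* stands in for a cached vring/IOTLB HVA */
	int cache_valid;
};

/* Replace the memory table.  Forgetting the two invalidation lines below is
 * exactly the kind of stale-translation bug being discussed. */
static void toy_set_mem_table(struct toy_dev *d, const struct mem_region *t, size_t n)
{
	memcpy(d->table, t, n * sizeof(*t));
	d->nregions = n;
	d->cache_valid = 0;             /* drop translations derived from ... */
	d->cached_hva = 0;              /* ... the old table                  */
}

int main(void)
{
	struct toy_dev d = { .nregions = 0 };
	struct mem_region r_old = { .gpa = 0x1000, .hva = 0x7f0000001000ULL, .size = 0x1000 };
	struct mem_region r_new = { .gpa = 0x1000, .hva = 0x7f0000009000ULL, .size = 0x1000 };

	toy_set_mem_table(&d, &r_old, 1);
	d.cached_hva = r_old.hva;       /* device caches a translation */
	d.cache_valid = 1;

	toy_set_mem_table(&d, &r_new, 1); /* GPA->HVA mapping changes  */
	printf("cache valid after table change: %d\n", d.cache_valid); /* prints 0 */
	return 0;
}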


>
> 2. qemu already has a GPA. Discarding it and re-calculating it
>     when logging is on just seems wrong.
>     However, if you would like to *also* keep the HVA in the iotlb
>     to avoid doing extra translations, that sounds like a
>     reasonable optimization.


Yes, traversing the GPA->HVA mapping again seems unnecessary.
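
For reference, a minimal sketch of the reverse walk in question, i.e.
recovering the GPA from an HVA by scanning the GPA->HVA memory table. This is
illustrative userspace C with invented names, not the actual vhost code; the
point is only that the scan becomes redundant if the GPA is already carried
in the IOTLB entry.

#include <stdint.h>
#include <stddef.h>

struct mem_region { uint64_t gpa, hva, size; };

/* Return the GPA backing @hva, or UINT64_MAX if no region covers it. */
static uint64_t hva_to_gpa(const struct mem_region *regs, size_t n, uint64_t hva)
{
	for (size_t i = 0; i < n; i++) {
		if (hva >= regs[i].hva && hva - regs[i].hva < regs[i].size)
			return regs[i].gpa + (hva - regs[i].hva);
	}
	return UINT64_MAX;      /* HVA not backed by guest memory */
}

If the IOTLB entry already carried the GPA, the logging path could skip this
scan entirely.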


>
> 3. It also means that the hva->gpa translation only runs
>     when logging is enabled. That is a rarely exercised
>     path, so any bugs there will not be caught.


I wonder whether some kind of unit test might help here.
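
As a rough idea of what such a test could look like (again just a sketch with
invented names, carrying its own copy of the hva_to_gpa() helper from above
so it compiles on its own): feed a small synthetic memory table through the
translation and check both a hit and a miss, so the path gets exercised even
when logging is never enabled in normal runs.

#include <assert.h>
#include <stdint.h>
#include <stddef.h>

struct mem_region { uint64_t gpa, hva, size; };

/* Copy of the illustrative reverse-lookup helper from the sketch above. */
static uint64_t hva_to_gpa(const struct mem_region *regs, size_t n, uint64_t hva)
{
	for (size_t i = 0; i < n; i++) {
		if (hva >= regs[i].hva && hva - regs[i].hva < regs[i].size)
			return regs[i].gpa + (hva - regs[i].hva);
	}
	return UINT64_MAX;
}

int main(void)
{
	const struct mem_region table[] = {
		{ .gpa = 0x00000, .hva = 0x7f0000000000ULL, .size = 0x10000 },
		{ .gpa = 0x40000, .hva = 0x7f0000100000ULL, .size = 0x10000 },
	};

	/* hit: an HVA inside the second region maps back into its GPA range */
	assert(hva_to_gpa(table, 2, 0x7f0000100010ULL) == 0x40010);
	/* miss: an HVA outside every region must be rejected, not logged    */
	assert(hva_to_gpa(table, 2, 0x7f0000200000ULL) == UINT64_MAX);
	return 0;
}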


>
> So I really would like us, long term, to move away from
> hva->gpa translations and keep them for legacy userspace only,
> but I don't really mind how we do it.
>
> How about:
> - a new flag to pass an iotlb entry with *both* a gpa and an hva
> - for legacy userspace, calculating the gpa on iotlb update,
>    so the device then uses a shared code path
>
> What do you think?
>
>

I don't object to this idea, so I can give it a try; I just want to figure
out why it is a must.
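
To make the two bullets above concrete, here is one hypothetical shape such an
extension could take. The structure name, the extra gpa field, and the comments
are invented purely for illustration; this is not an existing interface, and
things like padding and feature negotiation are glossed over.

#include <linux/types.h>

/* Hypothetical sketch only: an IOTLB update message that carries both the
 * guest physical address and the host virtual address, so the kernel can
 * log dirty pages by GPA without an hva->gpa reverse lookup. */
struct vhost_iotlb_msg_v2 {
	__u64 iova;     /* guest IOVA (GIOVA)                            */
	__u64 size;
	__u64 uaddr;    /* host virtual address (HVA), as today          */
	__u64 gpa;      /* new: guest physical address, used for logging */
	__u8  perm;
	__u8  type;     /* e.g. a new "update with GPA" type or flag     */
};

For legacy userspace that only supplies iova/uaddr, the kernel would fill in
gpa from the memory table at IOTLB update time, so both old and new userspace
end up on the same logging path.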

Thanks

