Message-ID: <CAKgT0UfgmwkZpU2=6DCsPZOwhDPyZDkbrPGH-dhVfA3ZVusJyQ@mail.gmail.com>
Date:	Sun, 13 Dec 2015 21:46:46 -0800
From:	Alexander Duyck <alexander.duyck@...il.com>
To:	Yang Zhang <yang.zhang.wz@...il.com>
Cc:	Alexander Duyck <aduyck@...antis.com>, kvm@...r.kernel.org,
	"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
	x86@...nel.org,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	qemu-devel@...gnu.org, Lan Tianyu <tianyu.lan@...el.com>,
	"Michael S. Tsirkin" <mst@...hat.com>, konrad.wilk@...cle.com,
	"Dr. David Alan Gilbert" <dgilbert@...hat.com>,
	Alexander Graf <agraf@...e.de>,
	Alex Williamson <alex.williamson@...hat.com>
Subject: Re: [RFC PATCH 0/3] x86: Add support for guest DMA dirty page tracking

On Sun, Dec 13, 2015 at 9:22 PM, Yang Zhang <yang.zhang.wz@...il.com> wrote:
> On 2015/12/14 12:54, Alexander Duyck wrote:
>>
>> On Sun, Dec 13, 2015 at 6:27 PM, Yang Zhang <yang.zhang.wz@...il.com>
>> wrote:
>>>
>>> On 2015/12/14 5:28, Alexander Duyck wrote:
>>>>
>>>>
>>>> This patch set is meant to be the guest side code for a proof of
>>>> concept involving leaving pass-through devices in the guest during
>>>> the warm-up phase of guest live migration.  In order to accomplish
>>>> this I have added a new function called dma_mark_dirty that will
>>>> mark the pages associated with the DMA transaction as dirty in the
>>>> case of either an unmap or a sync_.*_for_cpu where the DMA direction
>>>> is either DMA_FROM_DEVICE or DMA_BIDIRECTIONAL.  The pass-through
>>>> device must still be removed before the stop-and-copy phase, however
>>>> allowing the device to be present should significantly improve the
>>>> performance of the guest during the warm-up period.
>>>>
>>>> This current implementation is very preliminary and there are a
>>>> number of items still missing.  Specifically, in order to make this
>>>> a more complete solution we need to support:
>>>> 1.  Notifying hypervisor that drivers are dirtying DMA pages received
>>>> 2.  Bypassing page dirtying when it is not needed.
>>>>
>>>
>>> Shouldn't current log dirty mechanism already cover them?
>>
>>
>> The guest has no way of currently knowing that the hypervisor is doing
>> dirty page logging, and the log dirty mechanism currently has no way
>> of tracking device DMA accesses.  This change is meant to bridge the
>> two so that the guest device driver will force the SWIOTLB DMA API to
>> mark pages written to by the device as dirty.
>
>
> OK. This is what we called the "dummy write mechanism". Actually, this is
> just a workaround until the IOMMU dirty bit is ready. Eventually, we need
> to change to use the hardware dirty bit. Besides, we may still lose data
> if DMA happens during/just before the stop-and-copy phase.

Right, this is a "dummy write mechanism" in order to allow for entry
tracking.  This only works completely if we force the hardware to
quiesce via a hot-plug event before we reach the stop-and-copy phase
of the migration.
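
For illustration only, the "dummy write" idea can be sketched in plain
user-space C.  This is not the code from the patch set: the names
sketch_mark_dirty, sketch_needs_dirty, and SKETCH_PAGE_SIZE are
hypothetical, a 4K page size is assumed, and a plain volatile
read-then-write stands in for whatever access the real implementation
uses.  It rewrites one byte per page with its existing value, and only
for directions where the device may have written to the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4K pages */

/* Hypothetical stand-ins for the DMA direction values. */
enum sketch_dma_dir {
	SKETCH_BIDIRECTIONAL,
	SKETCH_TO_DEVICE,
	SKETCH_FROM_DEVICE
};

/* Only directions where the device may write need dirtying. */
static int sketch_needs_dirty(enum sketch_dma_dir dir)
{
	return dir == SKETCH_FROM_DEVICE || dir == SKETCH_BIDIRECTIONAL;
}

/*
 * Rewrite one byte, with its current value, in every page overlapped
 * by [buf, buf + size).  Returns the number of pages touched.
 */
static size_t sketch_mark_dirty(void *buf, size_t size,
				enum sketch_dma_dir dir)
{
	uintptr_t start, end, page;
	size_t touched = 0;

	assert(buf != NULL || size == 0);
	if (size == 0 || !sketch_needs_dirty(dir))
		return 0;

	start = (uintptr_t)buf;
	end = start + size - 1;		/* last byte of the buffer */
	page = start & ~(uintptr_t)(SKETCH_PAGE_SIZE - 1);

	for (;;) {
		/* Stay inside the buffer on the first, partial page. */
		uintptr_t addr = page < start ? start : page;
		volatile unsigned char *p = (volatile unsigned char *)addr;

		*p = *p;		/* dummy write: same value back */
		touched++;

		if (end - page < SKETCH_PAGE_SIZE)
			break;		/* buffer ends within this page */
		page += SKETCH_PAGE_SIZE;
	}
	return touched;
}
```

The buffer contents are left unchanged; the only intended side effect
is a write access to each page, so that a write-based dirty log would
notice it.  A SKETCH_TO_DEVICE transfer is skipped since the device
only reads from memory in that case.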

The IOMMU dirty bit approach is likely going to have a significant
number of challenges involved.  Looking over the driver and the data
sheet, it looks like the current implementation is using a form of huge
pages in the IOMMU.  As such, we will need to tear that down and replace
it with 4K pages if we don't want to dirty large regions with each DMA
transaction, and I'm not sure that is something we can change while
DMA is active to the affected regions.  In addition, the data sheet
references the fact that the page table entries are stored in a
translation cache, and in order to sync things up you have to
invalidate the entries.  I'm not sure what the total overhead would be
for invalidating something like a half million 4K pages to migrate a
guest with just 2G of RAM, but I would think that might be a bit
expensive given the fact that IOMMU accesses aren't known for being
incredibly fast when invalidating DMA on the host.
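
The "half million" figure follows from simple division, and a small
sketch makes the contrast with larger mappings concrete.  The helper
name sketch_entries is hypothetical, and the 2M huge page size used in
the comparison is an assumption, since only "a form of huge pages" is
known from the driver and data sheet.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of page table entries needed to map ram_bytes of guest
 * memory at a given page size, rounding up for a partial page.
 */
static uint64_t sketch_entries(uint64_t ram_bytes, uint64_t page_bytes)
{
	assert(page_bytes != 0);
	return (ram_bytes + page_bytes - 1) / page_bytes;
}
```

For a guest with 2G of RAM, sketch_entries(2ULL << 30, 4096) gives
524288 entries, versus 1024 entries for sketch_entries(2ULL << 30,
2ULL << 20) under the assumed 2M page size.  Each of those 4K entries
is a candidate for invalidation when syncing dirty state.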

- Alex