[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <01a8ee12-7a95-7245-3a00-2745aa846fca@amd.com>
Date: Wed, 9 Aug 2023 08:44:44 -0700
From: Brett Creeley <bcreeley@....com>
To: Alex Williamson <alex.williamson@...hat.com>,
Brett Creeley <brett.creeley@....com>
Cc: kvm@...r.kernel.org, netdev@...r.kernel.org, jgg@...dia.com,
yishaih@...dia.com, shameerali.kolothum.thodi@...wei.com,
kevin.tian@...el.com, horms@...nel.org, shannon.nelson@....com
Subject: Re: [PATCH v14 vfio 6/8] vfio/pds: Add support for dirty page
tracking
On 8/8/2023 3:27 PM, Alex Williamson wrote:
> Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.
>
>
> On Mon, 7 Aug 2023 13:57:53 -0700
> Brett Creeley <brett.creeley@....com> wrote:
> ...
>> +static int pds_vfio_dirty_enable(struct pds_vfio_pci_device *pds_vfio,
>> + struct rb_root_cached *ranges, u32 nnodes,
>> + u64 *page_size)
>> +{
>> + struct pci_dev *pdev = pds_vfio->vfio_coredev.pdev;
>> + struct device *pdsc_dev = &pci_physfn(pdev)->dev;
>> + struct pds_vfio_dirty *dirty = &pds_vfio->dirty;
>> + u64 region_start, region_size, region_page_size;
>> + struct pds_lm_dirty_region_info *region_info;
>> + struct interval_tree_node *node = NULL;
>> + u8 max_regions = 0, num_regions;
>> + dma_addr_t regions_dma = 0;
>> + u32 num_ranges = nnodes;
>> + u32 page_count;
>> + u16 len;
>> + int err;
>> +
>> + dev_dbg(&pdev->dev, "vf%u: Start dirty page tracking\n",
>> + pds_vfio->vf_id);
>> +
>> + if (pds_vfio_dirty_is_enabled(pds_vfio))
>> + return -EINVAL;
>> +
>> + /* find if dirty tracking is disabled, i.e. num_regions == 0 */
>> + err = pds_vfio_dirty_status_cmd(pds_vfio, 0, &max_regions,
>> + &num_regions);
>> + if (err < 0) {
>> + dev_err(&pdev->dev, "Failed to get dirty status, err %pe\n",
>> + ERR_PTR(err));
>> + return err;
>> + } else if (num_regions) {
>> + dev_err(&pdev->dev,
>> + "Dirty tracking already enabled for %d regions\n",
>> + num_regions);
>> + return -EEXIST;
>> + } else if (!max_regions) {
>> + dev_err(&pdev->dev,
>> + "Device doesn't support dirty tracking, max_regions %d\n",
>> + max_regions);
>> + return -EOPNOTSUPP;
>> + }
>> +
>> + /*
>> + * Only support 1 region for now. If there are any large gaps in the
>> + * VM's address regions, then this would be a waste of memory as we are
>> + * generating 2 bitmaps (ack/seq) from the min address to the max
>> + * address of the VM's address regions. In the future, if we support
>> + * more than one region in the device/driver we can split the bitmaps
>> + * on the largest address region gaps. We can do this split up to the
>> + * max_regions times returned from the dirty_status command.
>> + */
>
> Isn't this a pretty unfortunately limitation given QEMU makes a 1TB
> hole on AMD hosts? Or maybe I misunderstand.
>
> https://gitlab.com/qemu-project/qemu/-/commit/8504f129450b909c88e199ca44facd35d38ba4de
>
> Thanks,
> Alex
>
Yes, this is currently an unfortunate limitation. However, our device is
flexible enough to support >1 regions. There has been some work in this
area, but we aren't quite there yet. The goal was to get this initial
support accepted and submit follow on work to support >1 regions.
Thanks,
Brett
Powered by blists - more mailing lists