[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230809113300.2c4b0888.alex.williamson@redhat.com>
Date: Wed, 9 Aug 2023 11:33:00 -0600
From: Alex Williamson <alex.williamson@...hat.com>
To: Brett Creeley <bcreeley@....com>
Cc: Brett Creeley <brett.creeley@....com>, kvm@...r.kernel.org,
netdev@...r.kernel.org, jgg@...dia.com, yishaih@...dia.com,
shameerali.kolothum.thodi@...wei.com, kevin.tian@...el.com,
horms@...nel.org, shannon.nelson@....com
Subject: Re: [PATCH v14 vfio 6/8] vfio/pds: Add support for dirty page
tracking
On Wed, 9 Aug 2023 08:44:44 -0700
Brett Creeley <bcreeley@....com> wrote:
> On 8/8/2023 3:27 PM, Alex Williamson wrote:
> > Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.
> >
> >
> > On Mon, 7 Aug 2023 13:57:53 -0700
> > Brett Creeley <brett.creeley@....com> wrote:
> > ...
> >> +static int pds_vfio_dirty_enable(struct pds_vfio_pci_device *pds_vfio,
> >> + struct rb_root_cached *ranges, u32 nnodes,
> >> + u64 *page_size)
> >> +{
> >> + struct pci_dev *pdev = pds_vfio->vfio_coredev.pdev;
> >> + struct device *pdsc_dev = &pci_physfn(pdev)->dev;
> >> + struct pds_vfio_dirty *dirty = &pds_vfio->dirty;
> >> + u64 region_start, region_size, region_page_size;
> >> + struct pds_lm_dirty_region_info *region_info;
> >> + struct interval_tree_node *node = NULL;
> >> + u8 max_regions = 0, num_regions;
> >> + dma_addr_t regions_dma = 0;
> >> + u32 num_ranges = nnodes;
> >> + u32 page_count;
> >> + u16 len;
> >> + int err;
> >> +
> >> + dev_dbg(&pdev->dev, "vf%u: Start dirty page tracking\n",
> >> + pds_vfio->vf_id);
> >> +
> >> + if (pds_vfio_dirty_is_enabled(pds_vfio))
> >> + return -EINVAL;
> >> +
> >> + /* find if dirty tracking is disabled, i.e. num_regions == 0 */
> >> + err = pds_vfio_dirty_status_cmd(pds_vfio, 0, &max_regions,
> >> + &num_regions);
> >> + if (err < 0) {
> >> + dev_err(&pdev->dev, "Failed to get dirty status, err %pe\n",
> >> + ERR_PTR(err));
> >> + return err;
> >> + } else if (num_regions) {
> >> + dev_err(&pdev->dev,
> >> + "Dirty tracking already enabled for %d regions\n",
> >> + num_regions);
> >> + return -EEXIST;
> >> + } else if (!max_regions) {
> >> + dev_err(&pdev->dev,
> >> + "Device doesn't support dirty tracking, max_regions %d\n",
> >> + max_regions);
> >> + return -EOPNOTSUPP;
> >> + }
> >> +
> >> + /*
> >> + * Only support 1 region for now. If there are any large gaps in the
> >> + * VM's address regions, then this would be a waste of memory as we are
> >> + * generating 2 bitmaps (ack/seq) from the min address to the max
> >> + * address of the VM's address regions. In the future, if we support
> >> + * more than one region in the device/driver we can split the bitmaps
> >> + * on the largest address region gaps. We can do this split up to the
> >> + * max_regions times returned from the dirty_status command.
> >> + */
> >
> > Isn't this a pretty unfortunately limitation given QEMU makes a 1TB
> > hole on AMD hosts? Or maybe I misunderstand.
> >
> > https://gitlab.com/qemu-project/qemu/-/commit/8504f129450b909c88e199ca44facd35d38ba4de
> >
> > Thanks,
> > Alex
> >
>
> Yes, this is currently an unfortunate limitation. However, our device is
> flexible enough to support >1 regions. There has been some work in this
> area, but we aren't quite there yet. The goal was to get this initial
> support accepted and submit follow on work to support >1 regions.
Ok, good that this is temporary.
Shameer, Kevin, Jason, Yishai, I'm hoping one or more of you can
approve this series as well. Thanks,
Alex
Powered by blists - more mailing lists