[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <BN9PR11MB52762B0ACFC46319498FE0908C5CA@BN9PR11MB5276.namprd11.prod.outlook.com>
Date: Tue, 20 Jun 2023 02:19:50 +0000
From: "Tian, Kevin" <kevin.tian@...el.com>
To: Brett Creeley <bcreeley@....com>, Brett Creeley <brett.creeley@....com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>, "netdev@...r.kernel.org"
<netdev@...r.kernel.org>, "alex.williamson@...hat.com"
<alex.williamson@...hat.com>, "jgg@...dia.com" <jgg@...dia.com>,
"yishaih@...dia.com" <yishaih@...dia.com>,
"shameerali.kolothum.thodi@...wei.com" <shameerali.kolothum.thodi@...wei.com>
CC: "shannon.nelson@....com" <shannon.nelson@....com>
Subject: RE: [PATCH v10 vfio 4/7] vfio/pds: Add VFIO live migration support
> From: Brett Creeley <bcreeley@....com>
> Sent: Saturday, June 17, 2023 12:45 PM
>
> On 6/16/2023 1:06 AM, Tian, Kevin wrote:
> > Caution: This message originated from an External Source. Use proper
> caution when opening attachments, clicking links, or responding.
> >
> >
> >> From: Brett Creeley <brett.creeley@....com>
> >> Sent: Saturday, June 3, 2023 6:03 AM
> >>
> >> +
> >> +static int pds_vfio_client_adminq_cmd(struct pds_vfio_pci_device
> *pds_vfio,
> >> + union pds_core_adminq_cmd *req,
> >> + size_t req_len,
> >> + union pds_core_adminq_comp *resp,
> >> + u64 flags)
> >> +{
> >> + union pds_core_adminq_cmd cmd = {};
> >> + size_t cp_len;
> >> + int err;
> >> +
> >> + /* Wrap the client request */
> >> + cmd.client_request.opcode = PDS_AQ_CMD_CLIENT_CMD;
> >> + cmd.client_request.client_id = cpu_to_le16(pds_vfio->client_id);
> >> + cp_len = min_t(size_t, req_len,
> >> sizeof(cmd.client_request.client_cmd));
> >
> > 'req_len' is kind of redundant. Looks all the callers use sizeof(req).
>
> It does a memcpy based on the min size between req_len and the size of
> the request.
If all the callers just pass in sizeof(union) as 'req_len', then it's pointless
to do min_t and you can just use sizeof(cmd.client_request.client_cmd) here
which is always smaller than or equal to the sizeof(union).
> >> +
> >> + err = pds_vfio_client_adminq_cmd(pds_vfio, &cmd, sizeof(cmd),
> >> &comp,
> >> + PDS_AQ_FLAG_FASTPOLL);
> >> + if (err) {
> >> + dev_err(dev, "vf%u: Suspend failed: %pe\n", pds_vfio->vf_id,
> >> + ERR_PTR(err));
> >> + return err;
> >> + }
> >> +
> >> + return pds_vfio_suspend_wait_device_cmd(pds_vfio);
> >> +}
> >
> > The logic in this function is very confusing.
> >
> > PDS_LM_CMD_SUSPEND has a completion record:
> >
> > +struct pds_lm_suspend_comp {
> > + u8 status;
> > + u8 rsvd;
> > + __le16 comp_index;
> > + union {
> > + __le64 state_size;
> > + u8 rsvd2[11];
> > + } __packed;
> > + u8 color;
> >
> > Presumably this function can look at the completion record to know
> whether
> > the suspend request succeeds.
> >
> > Why do you require another wait_device step to query the suspend status?
>
> The driver sends the initial suspend request to tell the DSC/firmware to
> suspend the VF's data/control path. The DSC/firmware will ack/nack the
> suspend request in the completion.
>
> Then the driver polls the DSC/firmware to find when the VF's
> data/control path has been fully suspended. When the DSC/firmware isn't
> done suspending yet it will return -EAGAIN. Otherwise it will return
> success/failure.
>
> I will add some comments clarifying these details.
Yes more comment is welcomed.
It's also misleading to have a ' state_size ' field in suspend_comp. In concept
the firmware cannot calculate it accurately before the VF is fully suspended.
Powered by blists - more mailing lists