lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 2 Feb 2022 14:34:52 +0000
From:   Shameerali Kolothum Thodi <shameerali.kolothum.thodi@...wei.com>
To:     Jason Gunthorpe <jgg@...dia.com>
CC:     "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-crypto@...r.kernel.org" <linux-crypto@...r.kernel.org>,
        "alex.williamson@...hat.com" <alex.williamson@...hat.com>,
        "mgurtovoy@...dia.com" <mgurtovoy@...dia.com>,
        Linuxarm <linuxarm@...wei.com>,
        liulongfang <liulongfang@...wei.com>,
        "Zengtao (B)" <prime.zeng@...ilicon.com>,
        yuzenghui <yuzenghui@...wei.com>,
        "Jonathan Cameron" <jonathan.cameron@...wei.com>,
        "Wangzhou (B)" <wangzhou1@...ilicon.com>
Subject: RE: [RFC v2 0/4] vfio/hisilicon: add acc live migration driver



> -----Original Message-----
> From: Jason Gunthorpe [mailto:jgg@...dia.com]
> Sent: 02 February 2022 13:15
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@...wei.com>
> Cc: kvm@...r.kernel.org; linux-kernel@...r.kernel.org;
> linux-crypto@...r.kernel.org; alex.williamson@...hat.com;
> mgurtovoy@...dia.com; Linuxarm <linuxarm@...wei.com>; liulongfang
> <liulongfang@...wei.com>; Zengtao (B) <prime.zeng@...ilicon.com>;
> yuzenghui <yuzenghui@...wei.com>; Jonathan Cameron
> <jonathan.cameron@...wei.com>; Wangzhou (B) <wangzhou1@...ilicon.com>
> Subject: Re: [RFC v2 0/4] vfio/hisilicon: add acc live migration driver
> 
> On Fri, Jul 02, 2021 at 10:58:45AM +0100, Shameer Kolothum wrote:
> > This series attempts to add vfio live migration support for
> > HiSilicon ACC VF devices. HiSilicon ACC VF device MMIO space
> > includes both the functional register space and migration
> > control register space. As discussed in RFCv1[0], this may create
> > security issues as these regions get shared between the Guest
> > driver and the migration driver. Based on the feedback, we tried
> > to address those concerns in this version.
> >
> > This is now based on the new vfio-pci-core framework proposal[1].
> > Understand that the framework proposal is still under discussion,
> > but really appreciate any feedback on the approach taken here
> > to mitigate the security risks.
> 
> Hi, can you look at the v6 proposal for the mlx5 implementation of the
> migration API and see if it meets hisilicon acc's needs as well?
> 
> https://lore.kernel.org/all/20220130160826.32449-1-yishaih@nvidia.com/

Yes, I saw that one. Thanks for that and is now looking into it.

> 
> There are few topics to consider:
>  - Which of the three feature sets (STOP_COPY, P2P and PRECOPY) make
>    sense for this driver?

I think it will be STOP_COPY only for now. We might have PRECOPY feature once
we have the SMMUv3 HTTU support in future.

> 
>    I see pf_qm_state_pre_save() but didn't understand why it wanted to
>    send the first 32 bytes in the PRECOPY mode? It is fine, but it
>    will add some complexity to continue to do this.

That was mainly to do a quick verification between src and dst compatibility
before we start saving the state. I think probably we can delay that check
for later.

>  - I think we discussed the P2P implementation and decided it would
>    work for this device? Can you re-read and confirm?

In our case these devices are Integrated End Point devices and doesn't have
P2P DMA capability. Hence the FSM arcs will be limited to STOP_COPY feature
I guess. Also, since we cannot guarantee a NDMA state in STOP, my
assumption currently is the onus of making sure that no MMIO access happens 
in STOP is on the user. Is that a valid assumption?

>  - Are the arcs we defined going to work here as well? The current
>    implementation in hisi_acc_vf_set_device_state() is very far away
>    from what the v1 protocol is, so I'm having a hard time guessing,
>    but..

Right. The FSM has changed a couple of times since we posted this.
I am going to rebase all that now.

>       RESUMING -> STOP
>         Probably vf_qm_state_resume()
> 
>       RUNNING -> STOP
>         vf_qm_fun_restart() - that is oddly named..
> 
>       STOP -> RESUMING
>         Seems to be a nop (likely a bug)
> 
>       STOP -> RUNNING
>          Not implemented currenty? (also a bug)
> 
>       STOP -> STOP_COPY
>          pf_qm_state_pre_save / vf_qm_state_save
> 
>       STOP_COPY -> STOP
>          NOP

I will check and verify this.

>    And the modification for the P2P/NO DMA is presumably just
>    fun_restart too since stopping the device and stopping DMA are
>    going to be the same thing here?

Yes, in our case stopping device and stopping DMA are effectively the
same thing.

> 
> The mlx5 implementation linked above is a full example you can cut and
> paste from for how to implement the state function and the how to do
> the data transfer. The f_ops read/write implementation for acc looks
> trivial as it only streams the fixed size and pre-allocated 'struct
> acc_vf_data'
> 
> It looks like it would be a short path to implement our v2 proposal
> and remove a lot of driver code, as we saw in mlx5.
> 

Ok. These are the git repo I am using for the rework,
https://github.com/jgunthorpe/qemu/commits/vfio_migration_v2
https://github.com/jgunthorpe/linux/tree/vfio_migration_v2

Please let me know if the above are not up to date.

Also, just noted that my quick prototype is now failing
with below error,

" Error: VFIO device doesn't support migration"

Do we need to set the below before the feature query?
Or am I using a wrong Qemu/kernel repo?

--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -488,6 +488,7 @@ static int vfio_migration_query_flags(VFIODevice
*vbasedev, uint64_t *mig_flags)
     struct vfio_device_feature_migration *mig = (void *)feature->data;

     feature->argsz = sizeof(buf);
+    feature->flags = VFIO_DEVICE_FEATURE_MIGRATION | VFIO_DEVICE_FEATURE_GET;
     if (ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature) != 0)
         return -EOPNOTSUPP;

Thanks,
Shameer

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ