lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CACT4oud_43SGMoZtRZxyAWfaFbVAPdJcLRMLcU84Q90d=8XOxA@mail.gmail.com>
Date:   Tue, 12 Jul 2022 10:55:59 +0200
From:   Íñigo Huguet <ihuguet@...hat.com>
To:     Íñigo Huguet <ihuguet@...hat.com>,
        Edward Cree <ecree.xilinx@...il.com>, sshah@...arflare.com,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>, netdev@...r.kernel.org,
        Yanghang Liu <yanghliu@...hat.com>
Subject: Re: [PATCH v2 net] sfc: fix use after free when disabling sriov

Hi Martin,

On Tue, Jul 12, 2022 at 9:56 AM Martin Habets <habetsm.xilinx@...il.com> wrote:
>
> On Tue, Jul 12, 2022 at 08:26:42AM +0200, Íñigo Huguet wrote:
> > Use after free is detected by kfence when disabling sriov. What was read
> > after being freed was vf->pci_dev: it was freed from pci_disable_sriov
> > and later read in efx_ef10_sriov_free_vf_vports, called from
> > efx_ef10_sriov_free_vf_vswitching.
> >
> > Set the pointer to NULL at release time to not trying to read it later.
>
> This solution just bypasses the check we have in
> efx_ef10_sriov_free_vf_vports():
>                 /* If VF is assigned, do not free the vport  */
>                 if (vf->pci_dev && pci_is_dev_assigned(vf->pci_dev))
>                         continue;
>
> If we don't want to detect this any more we should remove this
> check in stead of this patch.

It doesn't really bypass it, because sriov is disabled and vf->pci_dev
set to NULL only if there are no devices assigned: the check is done
by the `if (!vfs_assigned)` in `efx_ef10_pci_sriov_disable`. If there
are any assigned devices, SRIOV is not disabled and vf->pci_dev is not
set to NULL.

> There is another issue here, in efx_ef10_sriov_free_vf_vswitching()
> we do free the memory even if a VF was still assigned. This leads me
> to think that removing the check above is the better thing to do.

Note that `pci_is_dev_assigned` and `pci_vfs_assigned` only count VFs
assigned to Xen, but not with other methods (kvm, vfio...). That means
that we are not really able to know when a VF is actually assigned to
an VM.

Right now:
* If any VF is assigned to Xen VM: driver doesn't disable SRVIO and
doesn't free memory of assigned VFs, but it does free memory of
unassigned VFs
* If any VF is assigned to a non-Xen VM: driver disables SRVIO and
free memory of all VFs

kvm/vfio case: I don't think we can or should do anything to avoid
disabling SRIOV.

Xen case: I didn't know very well what it should be done, so I just
assumed that the driver was doing the right thing. If it's not, there
are 2 possibilities:
* Option 1: Do the same thing that in the kvm/vfio case: Free memory
anyway, as you say, but also disable SRIOV even if there are assigned
VFs.
* Option 2: Continue with the current driver's behaviour (but fixing
it): don't allow to disable SRIOV if there are assigned VFs.

For option1, I don't know what happens if VFs assigned to Xen suddenly
disappear. This option could have more unintended side effects than
option 2....

For option 2, my guess is that we shouldn't free any memory at all if
we don't disable SRIOV. So we should move the call to
`efx_ef10_sriov_free_vf_vswitching` into the `if (!vfs_assigned)`
block. Also, remove that "is_assigned" check inside
`efx_ef10_sriov_free_vf_vswitching`, as you say.

What do you think? Option 1 or option 2?

>
> Martin
>
> > Reproducer and dmesg log (note that kfence doesn't detect it every time):
> > $ echo 1 > /sys/class/net/enp65s0f0np0/device/sriov_numvfs
> > $ echo 0 > /sys/class/net/enp65s0f0np0/device/sriov_numvfs
> >
> >  BUG: KFENCE: use-after-free read in efx_ef10_sriov_free_vf_vswitching+0x82/0x170 [sfc]
> >
> >  Use-after-free read at 0x00000000ff3c1ba5 (in kfence-#224):
> >   efx_ef10_sriov_free_vf_vswitching+0x82/0x170 [sfc]
> >   efx_ef10_pci_sriov_disable+0x38/0x70 [sfc]
> >   efx_pci_sriov_configure+0x24/0x40 [sfc]
> >   sriov_numvfs_store+0xfe/0x140
> >   kernfs_fop_write_iter+0x11c/0x1b0
> >   new_sync_write+0x11f/0x1b0
> >   vfs_write+0x1eb/0x280
> >   ksys_write+0x5f/0xe0
> >   do_syscall_64+0x5c/0x80
> >   entry_SYSCALL_64_after_hwframe+0x44/0xae
> >
> >  kfence-#224: 0x00000000edb8ef95-0x00000000671f5ce1, size=2792, cache=kmalloc-4k
> >
> >  allocated by task 6771 on cpu 10 at 3137.860196s:
> >   pci_alloc_dev+0x21/0x60
> >   pci_iov_add_virtfn+0x2a2/0x320
> >   sriov_enable+0x212/0x3e0
> >   efx_ef10_sriov_configure+0x67/0x80 [sfc]
> >   efx_pci_sriov_configure+0x24/0x40 [sfc]
> >   sriov_numvfs_store+0xba/0x140
> >   kernfs_fop_write_iter+0x11c/0x1b0
> >   new_sync_write+0x11f/0x1b0
> >   vfs_write+0x1eb/0x280
> >   ksys_write+0x5f/0xe0
> >   do_syscall_64+0x5c/0x80
> >   entry_SYSCALL_64_after_hwframe+0x44/0xae
> >
> >  freed by task 6771 on cpu 12 at 3170.991309s:
> >   device_release+0x34/0x90
> >   kobject_cleanup+0x3a/0x130
> >   pci_iov_remove_virtfn+0xd9/0x120
> >   sriov_disable+0x30/0xe0
> >   efx_ef10_pci_sriov_disable+0x57/0x70 [sfc]
> >   efx_pci_sriov_configure+0x24/0x40 [sfc]
> >   sriov_numvfs_store+0xfe/0x140
> >   kernfs_fop_write_iter+0x11c/0x1b0
> >   new_sync_write+0x11f/0x1b0
> >   vfs_write+0x1eb/0x280
> >   ksys_write+0x5f/0xe0
> >   do_syscall_64+0x5c/0x80
> >   entry_SYSCALL_64_after_hwframe+0x44/0xae
> >
> > Fixes: 3c5eb87605e85 ("sfc: create vports for VFs and assign random MAC addresses")
> > Reported-by: Yanghang Liu <yanghliu@...hat.com>
> > Signed-off-by: Íñigo Huguet <ihuguet@...hat.com>
> > ---
> > v2: add missing Fixes tag
> >
> >  drivers/net/ethernet/sfc/ef10_sriov.c | 10 +++++++---
> >  1 file changed, 7 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/sfc/ef10_sriov.c b/drivers/net/ethernet/sfc/ef10_sriov.c
> > index 7f5aa4a8c451..92550c7e85ce 100644
> > --- a/drivers/net/ethernet/sfc/ef10_sriov.c
> > +++ b/drivers/net/ethernet/sfc/ef10_sriov.c
> > @@ -408,8 +408,9 @@ static int efx_ef10_pci_sriov_enable(struct efx_nic *efx, int num_vfs)
> >  static int efx_ef10_pci_sriov_disable(struct efx_nic *efx, bool force)
> >  {
> >       struct pci_dev *dev = efx->pci_dev;
> > +     struct efx_ef10_nic_data *nic_data = efx->nic_data;
> >       unsigned int vfs_assigned = pci_vfs_assigned(dev);
> > -     int rc = 0;
> > +     int i, rc = 0;
> >
> >       if (vfs_assigned && !force) {
> >               netif_info(efx, drv, efx->net_dev, "VFs are assigned to guests; "
> > @@ -417,10 +418,13 @@ static int efx_ef10_pci_sriov_disable(struct efx_nic *efx, bool force)
> >               return -EBUSY;
> >       }
> >
> > -     if (!vfs_assigned)
> > +     if (!vfs_assigned) {
> > +             for (i = 0; i < efx->vf_count; i++)
> > +                     nic_data->vf[i].pci_dev = NULL;
> >               pci_disable_sriov(dev);
> > -     else
> > +     } else {
> >               rc = -EBUSY;
> > +     }
> >
> >       efx_ef10_sriov_free_vf_vswitching(efx);
> >       efx->vf_count = 0;
> > --
> > 2.34.1
>


--
Íñigo Huguet

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ