lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAOhMmr5Z_nuzQ5EkX=tkv4v_OOyjP99ykQh3sCn9nsYgVLCxrA@mail.gmail.com>
Date:   Thu, 11 Feb 2021 00:35:25 -0600
From:   Lijun Pan <lijunp213@...il.com>
To:     Sukadev Bhattiprolu <sukadev@...ux.ibm.com>
Cc:     netdev@...r.kernel.org, Dany Madden <drt@...ux.ibm.com>,
        Lijun Pan <ljp@...ux.ibm.com>,
        Rick Lindsley <ricklind@...ux.ibm.com>, abdhalee@...ibm.com
Subject: Re: [PATCH 1/1] ibmvnic: fix a race between open and reset

On Wed, Feb 10, 2021 at 7:44 PM Sukadev Bhattiprolu
<sukadev@...ux.ibm.com> wrote:
>
> __ibmvnic_reset() currently reads the adapter->state before getting the
> rtnl and saves that state as the "target state" for the reset. If this
> read occurs when adapter is in PROBED state, the target state would be
> PROBED.
>
> Just after the target state is saved, and before the actual reset process
> is started (i.e before rtnl is acquired) if we get an ibmvnic_open() call
> we would move the adapter to OPEN state.
>
> But when the reset is processed (after ibmvnic_open()) drops the rtnl),
> it will leave the adapter in PROBED state even though we already moved
> it to OPEN.
>
> To fix this, use the RTNL to improve the serialization when reading/updating
> the adapter state. i.e determine the target state of a reset only after
> getting the RTNL. And if a reset is in progress during an open, simply
> set the target state of the adapter and let the reset code finish the
> open (like we currently do if failover is pending).
>
> One twist to this serialization is if the adapter state changes when we
> drop the RTNL to update the link state. Account for this by checking if
> there was an intervening open and update the target state for the reset
> accordingly (see new comments in the code). Note that only the reset
> functions and ibmvnic_open() can set the adapter to OPEN state and this
> must happen under rtnl.
>
> Fixes: 7d7195a026ba ("ibmvnic: Do not process device remove during device reset")
> Signed-off-by: Sukadev Bhattiprolu <sukadev@...ux.ibm.com>
> Reviewed-by: Dany Madden <drt@...ux.ibm.com>
> ---
> Changelog[v3]
>         [Jakub Kicinski] Rebase to current net and fix comment style.
>
> Changelog[v2]
>         [Jakub Kicinski] Use ASSERT_RTNL() instead of WARN_ON_ONCE()
>         and rtnl_is_locked());
> ---
>  drivers/net/ethernet/ibm/ibmvnic.c | 71 ++++++++++++++++++++++++++----
>  1 file changed, 63 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
> index a536fdbf05e1..96c2b0985484 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -1197,12 +1197,25 @@ static int ibmvnic_open(struct net_device *netdev)
>         struct ibmvnic_adapter *adapter = netdev_priv(netdev);
>         int rc;
>
> -       /* If device failover is pending, just set device state and return.
> -        * Device operation will be handled by reset routine.
> +       ASSERT_RTNL();
> +
> +       /* If device failover is pending or we are about to reset, just set
> +        * device state and return. Device operation will be handled by reset
> +        * routine.
> +        *
> +        * It should be safe to overwrite the adapter->state here. Since
> +        * we hold the rtnl, either the reset has not actually started or
> +        * the rtnl got dropped during the set_link_state() in do_reset().
> +        * In the former case, no one else is changing the state (again we
> +        * have the rtnl) and in the latter case, do_reset() will detect and
> +        * honor our setting below.
>          */
> -       if (adapter->failover_pending) {
> +       if (adapter->failover_pending || (test_bit(0, &adapter->resetting))) {
> +               netdev_dbg(netdev, "[S:%d FOP:%d] Resetting, deferring open\n",
> +                          adapter->state, adapter->failover_pending);
>                 adapter->state = VNIC_OPEN;
> -               return 0;
> +               rc = 0;
> +               goto out;
>         }
>
>         if (adapter->state != VNIC_CLOSED) {
> @@ -1221,11 +1234,12 @@ static int ibmvnic_open(struct net_device *netdev)
>         rc = __ibmvnic_open(netdev);
>
>  out:
> -       /*
> -        * If open fails due to a pending failover, set device state and
> -        * return. Device operation will be handled by reset routine.
> +       /* If open failed and there is a pending failover or in-progress reset,
> +        * set device state and return. Device operation will be handled by
> +        * reset routine. See also comments above regarding rtnl.
>          */
> -       if (rc && adapter->failover_pending) {
> +       if (rc &&
> +           (adapter->failover_pending || (test_bit(0, &adapter->resetting)))) {
>                 adapter->state = VNIC_OPEN;
>                 rc = 0;
>         }
> @@ -1939,6 +1953,14 @@ static int do_change_param_reset(struct ibmvnic_adapter *adapter,
>         netdev_dbg(adapter->netdev, "Change param resetting driver (%d)\n",
>                    rwi->reset_reason);
>
> +       /* read the state and check (again) after getting rtnl */
> +       reset_state = adapter->state;
> +
> +       if (reset_state == VNIC_REMOVING || reset_state == VNIC_REMOVED) {
> +               rc = -EBUSY;
> +               goto out;
> +       }
> +

The changes in do_change_param_reset will cause merge conflict with
net-next tree since do_change_param_reset is no longer there in
net-next tree. I suggest sending this patch after net-next merges into
mainline, which should be soon.

>         netif_carrier_off(netdev);
>         adapter->reset_reason = rwi->reset_reason;
>
> @@ -2037,6 +2059,14 @@ static int do_reset(struct ibmvnic_adapter *adapter,
>         if (rwi->reset_reason == VNIC_RESET_FAILOVER)
>                 adapter->failover_pending = false;
>
> +       /* read the state and check (again) after getting rtnl */
> +       reset_state = adapter->state;
> +
> +       if (reset_state == VNIC_REMOVING || reset_state == VNIC_REMOVED) {
> +               rc = -EBUSY;
> +               goto out;
> +       }
> +
>         netif_carrier_off(netdev);
>         adapter->reset_reason = rwi->reset_reason;
>
> @@ -2063,7 +2093,24 @@ static int do_reset(struct ibmvnic_adapter *adapter,
>                 if (rc)
>                         goto out;
>
> +               if (adapter->state == VNIC_OPEN) {
> +                       /* When we dropped rtnl, ibmvnic_open() got
> +                        * it and noticed that we are resetting and
> +                        * set the adapter state to OPEN. Update our
> +                        * new "target" state, and resume the reset
> +                        * from VNIC_CLOSING state.
> +                        */
> +                       netdev_dbg(netdev,
> +                                  "Open changed state from %d, updating.\n",
> +                                  reset_state);
> +                       reset_state = VNIC_OPEN;
> +                       adapter->state = VNIC_CLOSING;
> +               }
> +
>                 if (adapter->state != VNIC_CLOSING) {
> +                       /* If someone else changed the adapter state
> +                        * when we dropped the rtnl, fail the reset
> +                        */
>                         rc = -1;
>                         goto out;
>                 }
> @@ -2197,6 +2244,14 @@ static int do_hard_reset(struct ibmvnic_adapter *adapter,
>         netdev_dbg(adapter->netdev, "Hard resetting driver (%d)\n",
>                    rwi->reset_reason);
>
> +       /* read the state and check (again) after getting rtnl */
> +       reset_state = adapter->state;
> +
> +       if (reset_state == VNIC_REMOVING || reset_state == VNIC_REMOVED) {
> +               rc = -EBUSY;
> +               goto out;
> +       }
> +
>         netif_carrier_off(netdev);
>         adapter->reset_reason = rwi->reset_reason;
>
> --
> 2.26.2
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ