netdev - Re: [RFC] e1000e: Relax condition to trigger reset for ME workaround

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAKgT0Uf86d8wnAMSLO4hn4+mfCH5fP4e8OsAYknE0m3Y7in9gw@mail.gmail.com>
Date:   Thu, 14 May 2020 07:57:31 -0700
From:   Alexander Duyck <alexander.duyck@...il.com>
To:     Punit Agrawal <punit1.agrawal@...hiba.co.jp>
Cc:     Netdev <netdev@...r.kernel.org>,
        intel-wired-lan <intel-wired-lan@...ts.osuosl.org>,
        daniel.sangorrin@...hiba.co.jp,
        Jeff Kirsher <jeffrey.t.kirsher@...el.com>,
        "David S. Miller" <davem@...emloft.net>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC] e1000e: Relax condition to trigger reset for ME workaround

On Mon, May 11, 2020 at 9:45 PM Punit Agrawal
<punit1.agrawal@...hiba.co.jp> wrote:
>
> It's an error if the value of the RX/TX tail descriptor does not match
> what was written. The error condition is true regardless the duration
> of the interference from ME. But the code only performs the reset if
> E1000_ICH_FWSM_PCIM2PCI_COUNT (2000) iterations of 50us delay have
> transpired. The extra condition can lead to inconsistency between the
> state of hardware as expected by the driver.
>
> Fix this by dropping the check for number of delay iterations.
>
> Signed-off-by: Punit Agrawal <punit1.agrawal@...hiba.co.jp>
> Cc: Jeff Kirsher <jeffrey.t.kirsher@...el.com>
> Cc: "David S. Miller" <davem@...emloft.net>
> Cc: intel-wired-lan@...ts.osuosl.org
> Cc: netdev@...r.kernel.org
> Cc: linux-kernel@...r.kernel.org
> ---
> Hi,
>
> The issue was noticed through code inspection while backporting the
> workaround for TDT corruption. Sending it out as an RFC as I am not
> familiar with the hardware internals of the e1000e.
>
> Another unresolved question is the inherent racy nature of the
> workaround - the ME could block access again after releasing the
> device (i.e., BIT(E1000_ICH_FWSM_PCIM2PCI) clear) but before the
> driver performs the write. Has this not been a problem?
>
> Any feedback on the patch or the more information on the issues
> appreciated.
>
> Thanks,
> Punit
>
>  drivers/net/ethernet/intel/e1000e/netdev.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
> index 177c6da80c57..5ed4d7ed35b3 100644
> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
> @@ -607,11 +607,11 @@ static void e1000e_update_rdt_wa(struct e1000_ring *rx_ring, unsigned int i)
>  {
>         struct e1000_adapter *adapter = rx_ring->adapter;
>         struct e1000_hw *hw = &adapter->hw;
> -       s32 ret_val = __ew32_prepare(hw);
>
> +       __ew32_prepare(hw);
>         writel(i, rx_ring->tail);
>
> -       if (unlikely(!ret_val && (i != readl(rx_ring->tail)))) {
> +       if (unlikely(i != readl(rx_ring->tail))) {
>                 u32 rctl = er32(RCTL);
>
>                 ew32(RCTL, rctl & ~E1000_RCTL_EN);
> @@ -624,11 +624,11 @@ static void e1000e_update_tdt_wa(struct e1000_ring *tx_ring, unsigned int i)
>  {
>         struct e1000_adapter *adapter = tx_ring->adapter;
>         struct e1000_hw *hw = &adapter->hw;
> -       s32 ret_val = __ew32_prepare(hw);
>
> +       __ew32_prepare(hw);
>         writel(i, tx_ring->tail);
>
> -       if (unlikely(!ret_val && (i != readl(tx_ring->tail)))) {
> +       if (unlikely(i != readl(tx_ring->tail))) {
>                 u32 tctl = er32(TCTL);
>
>                 ew32(TCTL, tctl & ~E1000_TCTL_EN);

You are eliminating the timeout check in favor of just verifying if
the write succeeded or not. Seems pretty straight forward to me.

One other change you may consider making would be to drop the return
value from __ew32_prepare since it doesn't appear to be used anywhere,
make the function static, and maybe get rid of the prototype in
e1000.h.

Reviewed-by: Alexander Duyck <alexander.h.duyck@...ux.intel.com>