netdev - Re: [RFC] e1000e: Relax condition to trigger reset for ME workaround

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <87v9kym02s.fsf@kokedama.swc.toshiba.co.jp>
Date:   Fri, 15 May 2020 10:57:47 +0900
From:   Punit Agrawal <punit1.agrawal@...hiba.co.jp>
To:     Alexander Duyck <alexander.duyck@...il.com>
Cc:     Netdev <netdev@...r.kernel.org>,
        intel-wired-lan <intel-wired-lan@...ts.osuosl.org>,
        daniel.sangorrin@...hiba.co.jp,
        Jeff Kirsher <jeffrey.t.kirsher@...el.com>,
        "David S. Miller" <davem@...emloft.net>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC] e1000e: Relax condition to trigger reset for ME workaround

Alexander Duyck <alexander.duyck@...il.com> writes:

> On Mon, May 11, 2020 at 9:45 PM Punit Agrawal
> <punit1.agrawal@...hiba.co.jp> wrote:
>>
>> It's an error if the value of the RX/TX tail descriptor does not match
>> what was written. The error condition is true regardless the duration
>> of the interference from ME. But the code only performs the reset if
>> E1000_ICH_FWSM_PCIM2PCI_COUNT (2000) iterations of 50us delay have
>> transpired. The extra condition can lead to inconsistency between the
>> state of hardware as expected by the driver.
>>
>> Fix this by dropping the check for number of delay iterations.
>>
>> Signed-off-by: Punit Agrawal <punit1.agrawal@...hiba.co.jp>
>> Cc: Jeff Kirsher <jeffrey.t.kirsher@...el.com>
>> Cc: "David S. Miller" <davem@...emloft.net>
>> Cc: intel-wired-lan@...ts.osuosl.org
>> Cc: netdev@...r.kernel.org
>> Cc: linux-kernel@...r.kernel.org
>> ---
>> Hi,
>>
>> The issue was noticed through code inspection while backporting the
>> workaround for TDT corruption. Sending it out as an RFC as I am not
>> familiar with the hardware internals of the e1000e.
>>
>> Another unresolved question is the inherent racy nature of the
>> workaround - the ME could block access again after releasing the
>> device (i.e., BIT(E1000_ICH_FWSM_PCIM2PCI) clear) but before the
>> driver performs the write. Has this not been a problem?
>>
>> Any feedback on the patch or the more information on the issues
>> appreciated.
>>
>> Thanks,
>> Punit
>>
>>  drivers/net/ethernet/intel/e1000e/netdev.c | 8 ++++----
>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
>> index 177c6da80c57..5ed4d7ed35b3 100644
>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>> @@ -607,11 +607,11 @@ static void e1000e_update_rdt_wa(struct e1000_ring *rx_ring, unsigned int i)
>>  {
>>         struct e1000_adapter *adapter = rx_ring->adapter;
>>         struct e1000_hw *hw = &adapter->hw;
>> -       s32 ret_val = __ew32_prepare(hw);
>>
>> +       __ew32_prepare(hw);
>>         writel(i, rx_ring->tail);
>>
>> -       if (unlikely(!ret_val && (i != readl(rx_ring->tail)))) {
>> +       if (unlikely(i != readl(rx_ring->tail))) {
>>                 u32 rctl = er32(RCTL);
>>
>>                 ew32(RCTL, rctl & ~E1000_RCTL_EN);
>> @@ -624,11 +624,11 @@ static void e1000e_update_tdt_wa(struct e1000_ring *tx_ring, unsigned int i)
>>  {
>>         struct e1000_adapter *adapter = tx_ring->adapter;
>>         struct e1000_hw *hw = &adapter->hw;
>> -       s32 ret_val = __ew32_prepare(hw);
>>
>> +       __ew32_prepare(hw);
>>         writel(i, tx_ring->tail);
>>
>> -       if (unlikely(!ret_val && (i != readl(tx_ring->tail)))) {
>> +       if (unlikely(i != readl(tx_ring->tail))) {
>>                 u32 tctl = er32(TCTL);
>>
>>                 ew32(TCTL, tctl & ~E1000_TCTL_EN);
>
> You are eliminating the timeout check in favor of just verifying if
> the write succeeded or not. Seems pretty straight forward to me.
>
> One other change you may consider making would be to drop the return
> value from __ew32_prepare since it doesn't appear to be used anywhere,
> make the function static, and maybe get rid of the prototype in
> e1000.h.
>
> Reviewed-by: Alexander Duyck <alexander.h.duyck@...ux.intel.com>

Thanks! I will send out an update dropping the return and the prototype.