netdev - [RFC] e1000e: Relax condition to trigger reset for ME workaround

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-Id: <20200512044347.3810257-1-punit1.agrawal@toshiba.co.jp>
Date:   Tue, 12 May 2020 13:43:47 +0900
From:   Punit Agrawal <punit1.agrawal@...hiba.co.jp>
To:     netdev@...r.kernel.org, intel-wired-lan@...ts.osuosl.org
Cc:     daniel.sangorrin@...hiba.co.jp,
        Punit Agrawal <punit1.agrawal@...hiba.co.jp>,
        Jeff Kirsher <jeffrey.t.kirsher@...el.com>,
        "David S. Miller" <davem@...emloft.net>,
        linux-kernel@...r.kernel.org
Subject: [RFC] e1000e: Relax condition to trigger reset for ME workaround

It's an error if the value of the RX/TX tail descriptor does not match
what was written. The error condition is true regardless the duration
of the interference from ME. But the code only performs the reset if
E1000_ICH_FWSM_PCIM2PCI_COUNT (2000) iterations of 50us delay have
transpired. The extra condition can lead to inconsistency between the
state of hardware as expected by the driver.

Fix this by dropping the check for number of delay iterations.

Signed-off-by: Punit Agrawal <punit1.agrawal@...hiba.co.jp>
Cc: Jeff Kirsher <jeffrey.t.kirsher@...el.com>
Cc: "David S. Miller" <davem@...emloft.net>
Cc: intel-wired-lan@...ts.osuosl.org
Cc: netdev@...r.kernel.org
Cc: linux-kernel@...r.kernel.org
---
Hi,

The issue was noticed through code inspection while backporting the
workaround for TDT corruption. Sending it out as an RFC as I am not
familiar with the hardware internals of the e1000e.

Another unresolved question is the inherent racy nature of the
workaround - the ME could block access again after releasing the
device (i.e., BIT(E1000_ICH_FWSM_PCIM2PCI) clear) but before the
driver performs the write. Has this not been a problem?

Any feedback on the patch or the more information on the issues
appreciated.

Thanks,
Punit

 drivers/net/ethernet/intel/e1000e/netdev.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 177c6da80c57..5ed4d7ed35b3 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -607,11 +607,11 @@ static void e1000e_update_rdt_wa(struct e1000_ring *rx_ring, unsigned int i)
 {
 	struct e1000_adapter *adapter = rx_ring->adapter;
 	struct e1000_hw *hw = &adapter->hw;
-	s32 ret_val = __ew32_prepare(hw);

+	__ew32_prepare(hw);
 	writel(i, rx_ring->tail);

-	if (unlikely(!ret_val && (i != readl(rx_ring->tail)))) {
+	if (unlikely(i != readl(rx_ring->tail))) {
 		u32 rctl = er32(RCTL);

 		ew32(RCTL, rctl & ~E1000_RCTL_EN);
@@ -624,11 +624,11 @@ static void e1000e_update_tdt_wa(struct e1000_ring *tx_ring, unsigned int i)
 {
 	struct e1000_adapter *adapter = tx_ring->adapter;
 	struct e1000_hw *hw = &adapter->hw;
-	s32 ret_val = __ew32_prepare(hw);

+	__ew32_prepare(hw);
 	writel(i, tx_ring->tail);

-	if (unlikely(!ret_val && (i != readl(tx_ring->tail)))) {
+	if (unlikely(i != readl(tx_ring->tail))) {
 		u32 tctl = er32(TCTL);

 		ew32(TCTL, tctl & ~E1000_TCTL_EN);
-- 
2.26.2