Message-ID: <CAN0jFd1CpPtid7TGJcgzajRXQ5oxYN1LjLjLwK7HjQ1piuZ_XQ@mail.gmail.com>
Date: Sat, 10 Aug 2024 08:55:03 -0700
From: Daiwei Li <daiweili@...il.com>
To: Richard Cochran <richardcochran@...il.com>
Cc: Vinicius Costa Gomes <vinicius.gomes@...el.com>, intel-wired-lan@...ts.osuosl.org, 
	sasha.neftin@...el.com, kurt@...utronix.de, anthony.l.nguyen@...el.com, 
	netdev@...r.kernel.org, Przemek Kitszel <przemyslaw.kitszel@...el.com>, 
	"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, 
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH iwl-net v1] igb: Fix not clearing TimeSync interrupts for 82580

> @Daiwei Li, I don't have a 82580 handy, please confirm that the patch
> fixes the issue you are having.

Thank you for the patch! I can confirm it fixes my issue. In response to
Paul's feedback, I include below a variant of the patch that also works.

> Please also add a description of the test case

I am running ptp4l to serve PTP to a client device attached to the NIC.
To test, I am rebuilding igb.ko and reloading it.
Without this patch, I see repeatedly in the output of ptp4l:

> timed out while polling for tx timestamp
> increasing tx_timestamp_timeout or increasing kworker priority may correct
> this issue, but a driver bug likely causes it

as well as my client device failing to sync time.

> and maybe the PCI vendor and device code of your network device.

% lspci -nn | grep Network
17:00.0 Ethernet controller [0200]: Intel Corporation 82580 Gigabit Network Connection [8086:150e] (rev 01)
17:00.1 Ethernet controller [0200]: Intel Corporation 82580 Gigabit Network Connection [8086:150e] (rev 01)
17:00.2 Ethernet controller [0200]: Intel Corporation 82580 Gigabit Network Connection [8086:150e] (rev 01)
17:00.3 Ethernet controller [0200]: Intel Corporation 82580 Gigabit Network Connection [8086:150e] (rev 01)

> Bug, or was it a feature?

According to https://lore.kernel.org/all/CDCB8BE0.1EC2C%25matthew.vick@intel.com/
it was a bug. It looks like the datasheet was not updated to
acknowledge this bug:
https://www.intel.com/content/www/us/en/content-details/333167/intel-82580-eb-82580-db-gbe-controller-datasheet.html
(section 8.17.28.1).

> Is there a nicer way to write this, so `ack` is only assigned in case
> for the 82580?

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index ada42ba63549..87ec1258e22a 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -6986,6 +6986,10 @@ static void igb_tsync_interrupt(struct igb_adapter *adapter)
        struct e1000_hw *hw = &adapter->hw;
        u32 tsicr = rd32(E1000_TSICR);
        struct ptp_clock_event event;
+       const u32 mask = (TSINTR_SYS_WRAP | E1000_TSICR_TXTS |
+                          TSINTR_TT0 | TSINTR_TT1 |
+                          TSINTR_AUTT0 | TSINTR_AUTT1);
+

        if (tsicr & TSINTR_SYS_WRAP) {
                event.type = PTP_CLOCK_PPS;
@@ -7009,6 +7013,13 @@ static void igb_tsync_interrupt(struct igb_adapter *adapter)

        if (tsicr & TSINTR_AUTT1)
                igb_extts(adapter, 1);
+
+       if (hw->mac.type == e1000_82580) {
+               /* 82580 has a hardware bug that requires an explicit
+                * write to clear the TimeSync interrupt cause.
+                */
+               wr32(E1000_TSICR, tsicr & mask);
+       }
 }
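
To illustrate why the explicit write matters, here is a minimal user-space
sketch (not driver code) of a cause register that is only cleared by writing
1s to the set bits. Everything in it is hypothetical: struct fake_nic, the
fake_rd32()/fake_wr32() helpers and the CAUSE_* bit positions are made up for
illustration, and the write-1-to-clear behaviour is an assumption about the
82580 inferred from the fix above, not something taken from the datasheet.

```c
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. The real TSICR
 * bits are defined in the igb driver headers. */
#define CAUSE_SYS_WRAP (1u << 0)
#define CAUSE_TXTS     (1u << 1)
#define CAUSE_TT0      (1u << 3)
#define CAUSE_MASK     (CAUSE_SYS_WRAP | CAUSE_TXTS | CAUSE_TT0)

/* Toy model of a NIC with a single interrupt cause register. */
struct fake_nic {
	uint32_t tsicr;     /* pending cause bits */
	int clear_on_read;  /* nonzero: a read clears the register */
};

/* Read the cause register; clears it only on read-to-clear hardware. */
uint32_t fake_rd32(struct fake_nic *nic)
{
	uint32_t val = nic->tsicr;

	if (nic->clear_on_read)
		nic->tsicr = 0;
	return val;
}

/* Write-1-to-clear: every bit set in val is cleared in the register. */
void fake_wr32(struct fake_nic *nic, uint32_t val)
{
	nic->tsicr &= ~val;
}

/* Model of the handler: read the causes, optionally acknowledge them
 * with an explicit write. Returns the cause bits still pending. */
uint32_t handle_tsync(struct fake_nic *nic, int explicit_ack)
{
	uint32_t tsicr = fake_rd32(nic);

	/* ... the events named in tsicr would be handled here ... */

	if (explicit_ack)
		fake_wr32(nic, tsicr & CAUSE_MASK);
	return nic->tsicr;
}
```

With clear_on_read set to 0, calling handle_tsync(&nic, 0) leaves every cause
bit pending, while handle_tsync(&nic, 1) returns 0. Masking the value before
writing it back means only the bits the handler knows about are acknowledged.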

On Fri, Aug 9, 2024 at 10:04 PM Richard Cochran <richardcochran@...il.com> wrote:
>
> On Fri, Aug 09, 2024 at 05:23:02PM -0700, Vinicius Costa Gomes wrote:
> > It was reported that 82580 NICs have a hardware bug that makes it
> > necessary to write into the TSICR (TimeSync Interrupt Cause) register
> > to clear it.
>
> Bug, or was it a feature?
>
> Or IOW, maybe i210 changed the semantics of the TSICR?
>
> And what about the 82576?
>
> Thanks,
> Richard
