Message-ID: <87sev9wrkj.fsf@intel.com>
Date: Mon, 12 Aug 2024 10:59:08 -0700
From: Vinicius Costa Gomes <vinicius.gomes@...el.com>
To: Daiwei Li <daiweili@...il.com>, Richard Cochran <richardcochran@...il.com>
Cc: intel-wired-lan@...ts.osuosl.org, sasha.neftin@...el.com,
 kurt@...utronix.de, anthony.l.nguyen@...el.com, netdev@...r.kernel.org,
 Przemek Kitszel <przemyslaw.kitszel@...el.com>, "David S. Miller"
 <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski
 <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH iwl-net v1] igb: Fix not clearing TimeSync interrupts
 for 82580

Hi,

Daiwei Li <daiweili@...il.com> writes:

>> @Daiwei Li, I don't have a 82580 handy, please confirm that the patch
>> fixes the issue you are having.
>
> Thank you for the patch! I can confirm it fixes my issue. Below I offer a
> patch that also works in response to Paul's feedback.
>

Your patch looks better than mine. I suggest you go ahead and propose
yours for inclusion.

>> Please also add a description of the test case
>
> I am running ptp4l to serve PTP to a client device attached to the NIC.
> To test, I am rebuilding igb.ko and reloading it.
> Without this patch, I see repeatedly in the output of ptp4l:
>
>> timed out while polling for tx timestamp increasing tx_timestamp_timeout or
>> increasing kworker priority may correct this issue, but a driver bug likely
>> causes it
>
> as well as my client device failing to sync time.
>
>> and maybe the PCI vendor and device code of your network device.
>
> % lspci -nn | grep Network
> 17:00.0 Ethernet controller [0200]: Intel Corporation 82580 Gigabit
> Network Connection [8086:150e] (rev 01)
> 17:00.1 Ethernet controller [0200]: Intel Corporation 82580 Gigabit
> Network Connection [8086:150e] (rev 01)
> 17:00.2 Ethernet controller [0200]: Intel Corporation 82580 Gigabit
> Network Connection [8086:150e] (rev 01)
> 17:00.3 Ethernet controller [0200]: Intel Corporation 82580 Gigabit
> Network Connection [8086:150e] (rev 01)
>
>> Bug, or was it a feature?
>
> According to https://lore.kernel.org/all/CDCB8BE0.1EC2C%25matthew.vick@intel.com/
> it was a bug. It looks like the datasheet was not updated to
> acknowledge this bug:
> https://www.intel.com/content/www/us/en/content-details/333167/intel-82580-eb-82580-db-gbe-controller-datasheet.html
> (section 8.17.28.1).
>
>> Is there a nicer way to write this, so `ack` is only assigned in case
>> for the 82580?
>
> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
> index ada42ba63549..87ec1258e22a 100644
> --- a/drivers/net/ethernet/intel/igb/igb_main.c
> +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> @@ -6986,6 +6986,10 @@ static void igb_tsync_interrupt(struct igb_adapter *adapter)
>         struct e1000_hw *hw = &adapter->hw;
>         u32 tsicr = rd32(E1000_TSICR);
>         struct ptp_clock_event event;
> +       const u32 mask = (TSINTR_SYS_WRAP | E1000_TSICR_TXTS |
> +                          TSINTR_TT0 | TSINTR_TT1 |
> +                          TSINTR_AUTT0 | TSINTR_AUTT1);
> +
>
>         if (tsicr & TSINTR_SYS_WRAP) {
>                 event.type = PTP_CLOCK_PPS;
> @@ -7009,6 +7013,13 @@ static void igb_tsync_interrupt(struct igb_adapter *adapter)
>
>         if (tsicr & TSINTR_AUTT1)
>                 igb_extts(adapter, 1);
> +
> +       if (hw->mac.type == e1000_82580) {
> +               /* 82580 has a hardware bug that requires an explicit
> +                * write to clear the TimeSync interrupt cause.
> +                */
> +               wr32(E1000_TSICR, tsicr & mask);

Yeah, I should have thought of that: writing '1' to an interrupt cause
bit that is already cleared should be fine.

> +       }
>  }
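
To illustrate why the unconditional write-back of the handled bits is
harmless, here is a minimal user-space sketch. It assumes the cause
register behaves as write-1-to-clear, which is what the patch relies on.
Everything named demo_* or DEMO_* is hypothetical: the bit positions are
made up for illustration and are not the real TSINTR_* / E1000_TSICR_TXTS
values, and a plain struct stands in for the rd32()/wr32() register
accessors.

```c
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. The real
 * definitions live in the igb driver headers.
 */
#define DEMO_SYS_WRAP (1u << 0)
#define DEMO_TXTS     (1u << 1)
#define DEMO_TT0      (1u << 3)
#define DEMO_TT1      (1u << 4)
#define DEMO_AUTT0    (1u << 5)
#define DEMO_AUTT1    (1u << 6)

#define DEMO_MASK (DEMO_SYS_WRAP | DEMO_TXTS | DEMO_TT0 | DEMO_TT1 | \
		   DEMO_AUTT0 | DEMO_AUTT1)

/* Simulated interrupt cause register (assumed write-1-to-clear). */
struct demo_w1c_reg {
	uint32_t cause;
};

/* Reading returns the pending causes and, as on the buggy part,
 * does not clear them.
 */
static uint32_t demo_read(const struct demo_w1c_reg *reg)
{
	return reg->cause;
}

/* Writing a 1 clears that bit; writing a 0 leaves it alone. A 1
 * written to a bit that is already clear has no effect.
 */
static void demo_write(struct demo_w1c_reg *reg, uint32_t val)
{
	reg->cause &= ~val;
}

/* Same shape as the patch: read the causes, then write back only
 * the bits the handler knows about. Returns the value that was read.
 */
static uint32_t demo_ack(struct demo_w1c_reg *reg)
{
	uint32_t tsicr = demo_read(reg);

	demo_write(reg, tsicr & DEMO_MASK);
	return tsicr;
}
```

In this model, calling demo_ack() on a register with DEMO_TXTS pending
leaves it with no pending causes, a second demo_write() of DEMO_TXTS
changes nothing, and any bit outside DEMO_MASK is left untouched.
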
> On Fri, Aug 9, 2024 at 10:04 PM Richard Cochran
> <richardcochran@...il.com> wrote:
>>
>> On Fri, Aug 09, 2024 at 05:23:02PM -0700, Vinicius Costa Gomes wrote:
>> > It was reported that 82580 NICs have a hardware bug that makes it
>> > necessary to write into the TSICR (TimeSync Interrupt Cause) register
>> > to clear it.
>>
>> Bug, or was it a feature?
>>
>> Or IOW, maybe i210 changed the semantics of the TSICR?
>>
>> And what about the 82576?
>>
>> Thanks,
>> Richard

-- 
Vinicius
