Date: Mon, 1 Aug 2022 16:29:40 -0700
From: Jacob Keller <jacob.e.keller@...el.com>
To: Ilya Evenbach <ievenbach@...ora.tech>,
    Alison Chaiken <achaiken@...ora.tech>,
    Steve Payne <spayne@...ora.tech>, <jesse.brandeburg@...el.com>,
    <richardcochran@...il.com>, <netdev@...r.kernel.org>,
    <intel-wired-lan@...ts.osuosl.org>
Subject: Re: Fwd: [PATCH] Use ixgbe_ptp_reset on linkup/linkdown for X550

On 8/1/2022 4:00 PM, Ilya Evenbach wrote:
>>> -----Original Message-----
>>> From: achaiken@...ora.tech <achaiken@...ora.tech>
>>> Sent: Monday, August 01, 2022 6:38 AM
>>> To: Brandeburg, Jesse <jesse.brandeburg@...el.com>;
>>> richardcochran@...il.com
>>> Cc: spayne@...ora.tech; achaiken@...ora.tech; alison@...-devel.com;
>>> netdev@...r.kernel.org; intel-wired-lan@...ts.osuosl.org
>>> Subject: [PATCH] Use ixgbe_ptp_reset on linkup/linkdown for X550
>>>
>>> From: Steve Payne <spayne@...ora.tech>
>>>
>>> For an unknown reason, when `ixgbe_ptp_start_cyclecounter` is called
>>> from `ixgbe_watchdog_link_is_down` the PHC on the NIC jumps backward
>>> by a seemingly inconsistent amount, which causes discontinuities in
>>> time synchronization. Explicitly reset the NIC's PHC to
>>> `CLOCK_REALTIME` whenever the NIC goes up or down by calling
>>> `ixgbe_ptp_reset` instead of the bare `ixgbe_ptp_start_cyclecounter`.
>>>
>>> Signed-off-by: Steve Payne <spayne@...ora.tech>
>>> Signed-off-by: Alison Chaiken <achaiken@...ora.tech>
>>>
>>
>> Resetting PTP could be a problem if the clock was not being synchronized with the kernel CLOCK_REALTIME,
>
> That is true, but most likely not really important, as the unmitigated
> problem also introduces significant discontinuities in time.
> Basically, this patch does not make things worse.
>

Sure, but I am trying to see if I can understand *why* things get wonky.
I suspect the issue is caused because of how we're resetting the
cyclecounter.

>>
>> and does result in some loss of timer precision either way due to the delays involved with setting the time.
>
> That precision loss is negligible compared to jumps resulting from
> link down/up, and should be corrected by normal PTP operation very
> quickly.
>

Only if CLOCK_REALTIME is actually being synchronized. Yes, that is
generally true, but it's not necessarily guaranteed.

>>
>> Do you have an example of the clock jump? How much is it?
>
> 2021-02-12T09:24:37.741191+00:00 bench-12 phc2sys: [195230.451]
> CLOCK_REALTIME phc offset 61 s2 freq -36503 delay 2298
> 2021-02-12T09:24:38.741315+00:00 bench-12 phc2sys: [195231.451]
> CLOCK_REALTIME phc offset 169 s2 freq -36377 delay 2294
> 2021-02-12T09:24:39.741407+00:00 bench-12 phc2sys: [195232.451]
> CLOCK_REALTIME phc offset 195213702387037 s2 freq +100000000 delay
> 2301
> 2021-02-12T09:24:40.741489+00:00 bench-12 phc2sys: [195233.452]
> CLOCK_REALTIME phc offset 195213591220495 s2 freq +100000000 delay
> 2081
>

Thanks. I think what's actually going on is a bug in the
ixgbe_ptp_start_cyclecounter function where the system time registers
are being reset.

What hardware are you operating on? Do you know if it's an X550 board?
It looks like this has been the case since a9763f3cb54c ("ixgbe: Update
PTP to support X550EM_x devices").

The start_cyclecounter was never supposed to modify the current time
registers, but resetting it to 0 as it does for X550 devices would give
the exact behavior you're seeing.
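
To illustrate the suspected failure mode, here is a minimal userspace sketch in C. It is not the ixgbe driver code: `struct sim_phc` and every `sim_*` function are hypothetical names invented for this illustration, and the hardware time registers are modeled as a single nanosecond counter. It only contrasts a "start cyclecounter" step that zeroes the current time with one that leaves the time alone, as described above.

```c
#include <stdint.h>

/* Hypothetical stand-in for the NIC's system time registers (not driver code). */
struct sim_phc {
    uint64_t systime_ns; /* current time held by the simulated hardware */
    uint32_t incval;     /* nanoseconds added per simulated tick */
};

/* Advance the simulated clock by a number of ticks. */
static void sim_phc_tick(struct sim_phc *phc, uint64_t ticks)
{
    phc->systime_ns += ticks * phc->incval;
}

/*
 * Assumed model of the behavior suspected in the thread for X550:
 * reprogramming the increment also resets the current time to 0.
 */
static void sim_start_cyclecounter_zeroing(struct sim_phc *phc, uint32_t incval)
{
    phc->incval = incval;
    phc->systime_ns = 0;
}

/* Assumed model of the intended behavior: only the rate is reprogrammed. */
static void sim_start_cyclecounter_preserving(struct sim_phc *phc, uint32_t incval)
{
    phc->incval = incval;
}

/*
 * Run one simulated link event through the given start routine and
 * return the signed jump in the clock, in nanoseconds.
 */
static int64_t sim_link_event_jump(struct sim_phc *phc,
                                   void (*start)(struct sim_phc *, uint32_t))
{
    uint64_t before = phc->systime_ns;

    start(phc, phc->incval);
    return (int64_t)(phc->systime_ns - before);
}
```

In this model the preserving variant yields a jump of 0 across a link event, while the zeroing variant jumps backward by however much time the clock had accumulated, so the size of the jump varies from one event to the next.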