netdev - Re: [PATCH v2 1/3] can: ctucanfd: add HW timestamps to RX and error CAN frames

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220804091132.qpmiulpjmvleiqqj@pengutronix.de>
Date:   Thu, 4 Aug 2022 11:11:32 +0200
From:   Marc Kleine-Budde <mkl@...gutronix.de>
To:     Pavel Pisa <pisa@....felk.cvut.cz>
Cc:     Matej Vasilevski <matej.vasilevski@...nam.cz>,
        Ondrej Ille <ondrej.ille@...il.com>,
        Wolfgang Grandegger <wg@...ndegger.com>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        Rob Herring <robh+dt@...nel.org>,
        Krzysztof Kozlowski <krzysztof.kozlowski+dt@...aro.org>,
        linux-can@...r.kernel.org, netdev@...r.kernel.org,
        devicetree@...r.kernel.org
Subject: Re: [PATCH v2 1/3] can: ctucanfd: add HW timestamps to RX and error
 CAN frames

On 04.08.2022 10:08:15, Pavel Pisa wrote:
> > > > Can the ctucan_skb_set_timestamp() and ctucan_read_timestamp_counter()
> > > > happen concurrently? AFAICS they are all called from ctucan_rx_poll(),
> > > > right?
> > >
> > > I am not sure about which possible problem do you think.
> > > But ctucan_read_timestamp_counter() is fully reentrant
> > > and has no side effect on the core. So there is no
> > > problem.
> >
> > ctucan_read_timestamp_counter() is reentrant, but on 32 bit systems the
> > update of tc->cycle_last isn't.
> >
> > [...]
> 
> Good catch, so we probably should use atomic there or we need to add
> spinlock, but I think that atomic is optimal solution there.

Atomic look like a good solution, but not scope of this patch, as the
timercounter needs to be modified. So use spinlocks for now.

> > > The need for clock running should be released in ctucan_suspend()
> > > and regained in ctucan_resume(). I see that the most CAN drivers
> > > use there clk_disable_unprepare/clk_prepare_enable but I am not
> > > sure, if this is right. Ma be plain clk_disable/clk_enable should
> > > be used for suspend and resume because as I understand, the clock
> > > frequency can be recomputed and reset during clk_prepare which
> > > would require to recompute bitrate. Do you have some advice
> > > what is a right option there?
> >
> > For the CAN clock, add a prepare_enable to ndo_open, corresponding
> > function to ndo_stop. Or better, add these time runtime_pm.
> 
> Hmm, there is problem that we have single clock for whole design,
> so if we try to touch AXI/APB slave registers without clock setup
> then system blocks. So I think that we need to prepare and enable
> clocks in probe to allow registers access later.

ACK, enable clocks during probe, too. There are already pm_runtime calls
in the driver, but no runtime pm callback that handle the clocks.

BTW: that pm_enable_call argument to ctucan_probe_common() looks a bit
strange.

| int ctucan_probe_common(struct device *dev, void __iomem *addr, int irq, unsigned int ntxbufs,
| 			unsigned long can_clk_rate, int pm_enable_call,
| 			void (*set_drvdata_fnc)(struct device *dev, struct net_device *ndev))

I think the caller should setup the pm stuff so that the
ctucan_probe_common() can do a pm_runtime_get_sync(dev) call.

> We need it at least
> for core bus endian probe and version validation/quirks. May it be
> we can disable clocks and reenable them in open.... But it is
> a little risky play and needs to ensure that no other path
> in the closed driver can attempt to access registers.

Sure. There are basically 3 parts in the base driver to consider:
1) ctucan_probe_common()
2) ndo_open()...ndo_stop()
3) do_get_berr_counter()

> Due to use of AXI bridges and other peripherals in Zynq Programmable
> Logic (FPGA) we keep forcibly clock enabled. In the fact, this
> should be solved some way on level of APB (now renamed in Zynq DST
> to AXI) bus mapping. 
> 
> > Has system suspend/resume been tested? I think the IP core might be
> > powered off during system suspend, so the driver has to restore the
> > state of the chip. The easiest would be to run through
> > chip_start()/chip_stop().
> 
> Hmm, if we do not reconfigure FPGA then it keeps state and if we
> lose configuration then whole cycle with DTS overlay is required.

Hardware IP cores might be powered down during system suspend and lose
their configuration. So you have to configure the CAN IP core again,
i.e. setup priorities, queues, bit timing, etc... Things that the
ctucan_chip_start() does. So I think ctucan_resume() must call to
ctucan_chip_start() if the interface was up while the system was
suspended.

> So basically for runtime power down wee need to unload overlay
> which removes driver instances and then reload  overlay and instances
> again...

I've never played with softcores before, but from the theory I don't
think you have to unload overlays etc...every driver needs to implement
proper suspend/resume functions.

> If you speak about suspend to disk, then I do not expect
> to run this way on any platform bus based system in near future.
> With PCIe card on PC it is possible... So it is other area to invest
> work in future...

:)

> > For the possible change of clock rate between probe and ifup, we should
> > add a CAN driver framework wide function to re-calculate the bitrates
> > with the current clock rate after the prepare_enable.
> 
> ACK
> 
> > BTW: In an early version of the stm32mp1 device tree some graphics clock
> > and the CAN clock shared the same parent clock. The configuration of the
> > display (which happened after the probe of the CAN driver ) caused a
> > different rate in the CAN clock, resulting in broken bit timings.
> 
> So in the fact each CAN device should listen to the clock rate
> change notifier...

ACK - and strictly speaking clk_get_rate() of a clock that's not enabled
is not valid. But that's not enforced by the common clock framework.

> > > Actual omission is no problem on our systems, be the clock are used
> > > in whole FPGA system with AXI connection and has to running already
> > > and we use same for timestamping.
> > >
> > > I would prefer to allow timestamping patch as it is without clock enable
> > > and then correct clock enable, disable by another patch for both ts and
> > > core clocks.
> >
> > NACK - if the time stamping clock is added, please with proper handling.
> > The core clock can be fixed in a later patch.
> 
> OK, how is it with your timing plans. I have already stated to Matej 
> Vasilevski that slip of the patch sending after 5.19 release means
> that we would not fit in 5.20 probably.

net-next for 5.20 is closed.

> If it is so and you, then I
> expect that postpone of discussions, replies and thoughts about our
> work after 5.20 preparation calms down is preferred on your side.
> We focus on preparation of proper series for early 5.21/6.0 cycle.

I don't mind discussing things for 5.21 now. Everything for 5.20 is now
bug fixes only.

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

Download attachment "signature.asc" of type "application/pgp-signature" (489 bytes)