linux-kernel - Re: [PATCH net-next v3 4/4] net: ocelot: add FDMA support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20211129174038.gptbivgmbqzrrgtz@skbuf>
Date:   Mon, 29 Nov 2021 17:40:39 +0000
From:   Vladimir Oltean <vladimir.oltean@....com>
To:     Clément Léger <clement.leger@...tlin.com>
CC:     "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Rob Herring <robh+dt@...nel.org>,
        Claudiu Manoil <claudiu.manoil@....com>,
        Alexandre Belloni <alexandre.belloni@...tlin.com>,
        "UNGLinuxDriver@...rochip.com" <UNGLinuxDriver@...rochip.com>,
        Andrew Lunn <andrew@...n.ch>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "devicetree@...r.kernel.org" <devicetree@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Thomas Petazzoni <thomas.petazzoni@...tlin.com>,
        Denis Kirjanov <dkirjanov@...e.de>,
        Julian Wiedmann <jwi@...ux.ibm.com>
Subject: Re: [PATCH net-next v3 4/4] net: ocelot: add FDMA support

On Mon, Nov 29, 2021 at 09:19:02AM +0100, Clément Léger wrote:
> > I'm not sure why you're letting the hardware grind to a halt first,
> > before refilling? I think since the CPU is the bottleneck anyway, you
> > can stop the extraction channel at any time you want to refill.
> > A constant stream of less data might be better than a bursty one.
> > Or maybe I'm misunderstanding some of the details of the hardware.
> 
> Indeed, I can stop the extraction channel but that does not seems a
> good idea to stop the channel in a steady state. At least that's what I
> thought since it will make the receive "window" non predictable. Not
> sure how well it will play with various protocol but I will try
> implementing the refill we talked previously (ie when there an
> available threshold is reached).
(...)
> > I don't understand why you restart the injection channel from the TX
> > confirmation interrupt. It raised the interrupt to tell you that it hit
> > a NULL LLP because there's nothing left to send. If you restart it now and
> > no other transmission has happened in the meantime, won't it stop again?
> 
> Actually, it is only restarted if there is some pending packets to
> send. With this hardware, packets can't be added while the FDMA is
> running and it must be stopped everytime we want to add a packet to the
> list. To avoid that, in the TX path, if the FDMA is stopped, we set the
> llp of the packet to NULL and start the chan. However, if the FDMA TX
> channel is running, we don't stop it, we simply add the next packets to
> the ring. However, the FDMA will stop on the previous NULL LLP. So when
> we hit a LLP, we might not be at the end of the list. This is why the
> next check verifies if we hit a NULL LLP and if there is still some
> packet to send. 

Oh, is that so? That would be pretty odd if the hardware is so dumb that
it doesn't detect changes made to an LLP on the go.

The manual has this to say, and I'm not sure how to interpret it:

| It is possible to update an active channels LLP pointer and pointers in
| the DCB chains. Before changing pointers software must schedule the
| channel for disabling (by writing FDMA_CH_DISABLE.CH_DISABLE[ch]) and
| then wait for the channel to set FDMA_CH_SAFE.CH_SAFE[ch]. When the
| pointer update is complete, soft must re-activate the channel by setting
| FDMA_CH_ACTIVATE.CH_ACTIVATE[ch]. Setting activate will cancel the
| deactivate-request, or if the channel has disabled itself in the
| meantime, it will re activate the channel.

So it is possible to update an active channel's LLP pointer, but not
while it's active? Thank you very much!

If true, this will severely limit the termination performance one is
able to obtain with this switch, even with a faster CPU and PCIe.

> > > +void ocelot_fdma_netdev_init(struct ocelot_fdma *fdma, struct net_device *dev)
> > > +{
> > > +	dev->needed_headroom = OCELOT_TAG_LEN;
> > > +	dev->needed_tailroom = ETH_FCS_LEN;  
> > 
> > The needed_headroom is in no way specific to FDMA, right? Why aren't you
> > doing it for manual register-based injection too? (in a separate patch ofc)
> 
> Actually, If I switch to page based ring, This won't be useful anymore
> because the header will be written directly in the page and not anymore
> directly in the skb header.

I don't understand this comment. You set up the needed headroom and
tailroom netdev variables to avoid reallocation on TX, not for RX.
And you use half page buffers for RX, not for TX.

> > I can't help but think how painful it is that with a CPU as slow as
> > yours, insult over injury, you also need to check for each packet
> > whether the device tree had defined the "fdma" region or not, because
> > you practically keep two traffic I/O implementations due to that sole
> > reason. I think for the ocelot switchdev driver, which is strictly for
> > MIPS CPUs embedded within the device, it should be fine to introduce a
> > static key here (search for static_branch_likely in the kernel).
> 
> I thinked about it *but* did not wanted to add a key since it would be
> global. However, we could consider that there is always only one
> instance of the driver and indeed a static key is an option.
> Unfortunately, I'm not sure this will yield any noticeable performance
> improvement.

What is the concern with a static key in this driver, exactly?