Message-ID: <cbba5990-959b-476f-a3fd-6b346a7d58ee@redhat.com>
Date: Tue, 13 May 2025 16:11:05 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: Parvathi Pudi <parvathi@...thit.com>
Cc: "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [PATCH net-next v7 04/11] net: ti: prueth: Adds link detection,
RX and TX support.
Apparently you unintentionally stripped the recipients list. Let me
re-add the ML.
On 5/13/25 2:18 PM, Parvathi Pudi wrote:
>> On 5/9/25 12:21 PM, Parvathi Pudi wrote:
>>>> On 5/3/25 3:11 PM, Parvathi Pudi wrote:
>>>>> +/**
>>>>> + * icssm_emac_rx_thread - EMAC Rx interrupt thread handler
>>>>> + * @irq: interrupt number
>>>>> + * @dev_id: pointer to net_device
>>>>> + *
>>>>> + * EMAC Rx Interrupt thread handler - function to process the rx frames in an
>>>>> + * irq thread. There is only a limited buffer at the ingress to
>>>>> + * queue the frames. As the frames are to be emptied as quickly as
>>>>> + * possible to avoid overflow, an irq thread is necessary. The current
>>>>> + * implementation based on NAPI poll results in packet loss due to overflow at
>>>>> + * the ingress queues. Industrial use cases require loss-free packet
>>>>> + * processing. Tests show that with threaded irq based processing,
>>>>> + * no overflow happens when receiving at ~92Mbps for MTU sized frames, thus
>>>>> + * meeting the requirement for industrial use cases.
>>>>
>>>> The above statement is highly suspicious. On a non-idle system the
>>>> threaded irq can be delayed for an unbounded amount of time. On an idle
>>>> system napi_poll should be invoked with a latency comparable - if not
>>>> less - to the threaded irq. Possibly you tripped on some H/W-induced
>>>> latency when re-programming the ISR?
>>>>
>>>> In any case I think we need a better-argued statement for
>>>> intentionally avoiding NAPI.
>>>>
>>>> Cheers,
>>>>
>>>> Paolo
>>>
>>> The above comment was from the developer to highlight that there is an
>>> improvement in performance with IRQ compared to NAPI. The improvement in
>>> performance was observed due to the limited PRU buffer pool (it holds
>>> only 3 MTU packets). We need to service the queue as soon as a packet is
>>> written to prevent overflow. To achieve this, IRQs with the highest
>>> priority are used. We will clean up the comments in the next version.
>>
>> Do you mean 'IRQ _thread_ with the highest priority'? I'm possibly
>> missing something, but I don't see the driver setting the irq thread
>> priority.
>>
>> Still it's not clear to me why/how scheduling a thread should guarantee
>> lower latency than serving the irq in softirq context, where no
>> reschedule is needed.
>>
>> I think you should justify this statement, which sounds counter-intuitive
>> to me.
>>
>> Thanks,
>>
>> Paolo
>
> I might not have been clear in my earlier communication. The driver does not
> configure the IRQ thread to run at the highest priority. Instead, the highest
> real-time priority is assigned to the user applications using "chrt" to ensure
> timely processing of network traffic.
I don't follow. Setting real-time priority for the user-space
application thread will possibly increase the latency for the irq thread.
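Just to make sure we are discussing the same thing, the threaded-IRQ RX
pattern being described generally looks like the sketch below. The icssm_*
names and the drain helper are illustrative placeholders, not the actual
driver code. The point is that thread_fn runs in the irq/<nr>-<name> kernel
thread, which is scheduled like any other task and therefore competes with
whatever SCHED_FIFO task you boost via chrt.

	#include <linux/interrupt.h>
	#include <linux/netdevice.h>

	/* Hypothetical helper, stands in for the real queue-drain code. */
	void icssm_drain_rx_queue(struct net_device *ndev);

	static irqreturn_t icssm_rx_hardirq(int irq, void *dev_id)
	{
		/* Hard-IRQ context: only acknowledge the HW and defer the
		 * actual queue draining to the irq thread.
		 */
		return IRQ_WAKE_THREAD;
	}

	static irqreturn_t icssm_rx_thread(int irq, void *dev_id)
	{
		struct net_device *ndev = dev_id;

		/* Runs in the irq/<nr>-<name> kernel thread; a higher
		 * priority SCHED_FIFO task can delay it for as long as
		 * that task keeps the CPU busy.
		 */
		icssm_drain_rx_queue(ndev);
		return IRQ_HANDLED;
	}

	static int icssm_setup_rx_irq(struct net_device *ndev, int rx_irq)
	{
		/* IRQF_ONESHOT keeps the line masked until thread_fn
		 * completes, so the small ingress buffer is not re-filled
		 * behind the thread's back.
		 */
		return request_threaded_irq(rx_irq, icssm_rx_hardirq,
					    icssm_rx_thread, IRQF_ONESHOT,
					    "icssm-rx", ndev);
	}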
> This commit in the TI Linux kernel can be used as a reference for the transition
> from NAPI-based polling to IRQ-driven packet processing:
> https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/commit/?id=7db2c2eb0e33821c2f1abea14f079bc23b7bde56
>
> This change is based on performance limitations observed with NAPI polling under
> high UDP traffic. Specifically, during iperf tests with UDP traffic at 92 Mbps
> and MTU-sized frames, ingress queue overflows were observed. This occurs because
> the hardware ingress queues have limited buffering capacity, resulting in dropped
> frames.
>
> Based on the stated limitations with NAPI polling, we believe that switching to a
> threaded IRQ handler is a reasonable solution. I'll follow up with the developers
> in the background to gather additional context on this implementation and will try
> to collect relevant benchmarking data to help justify the approach further. In
> the meantime, do you see any concern moving forward with the use of threaded IRQs
> for now?
The IRQ thread is completely under the scheduler's control. It can
experience unbounded latency (e.g. if there are many other runnable
threads on the same CPU). You need to tune the scheduling priority for
the specific setup/host, and such tuning likely will not work well (or
at all) in other scenarios.
See commit 4cd13c21b207 and all its follow-ups to get an idea of the
things that could go wrong. Note that that commit moved the network
processing into thread context only under quite restrictive conditions.
Having the packet processing unconditionally threaded could be much worse.
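For comparison, the usual NAPI shape keeps the draining in softirq context,
bounded by the budget, and only re-enables the interrupt once the queue is
empty, so no extra scheduling step is needed. Again a rough sketch with
placeholder names (icssm_rx_one() is a hypothetical per-frame helper), not
the actual driver:

	#include <linux/interrupt.h>
	#include <linux/netdevice.h>

	struct icssm_priv {
		struct napi_struct napi;
		struct net_device *ndev;
		int rx_irq;
	};

	/* Hypothetical helper: pull one frame from the HW ingress queue,
	 * or return NULL when the queue is empty.
	 */
	static struct sk_buff *icssm_rx_one(struct icssm_priv *priv);

	static irqreturn_t icssm_rx_irq(int irq, void *dev_id)
	{
		struct icssm_priv *priv = dev_id;

		/* Mask the RX interrupt and let the softirq do the work. */
		disable_irq_nosync(irq);
		napi_schedule(&priv->napi);
		return IRQ_HANDLED;
	}

	static int icssm_rx_poll(struct napi_struct *napi, int budget)
	{
		struct icssm_priv *priv = container_of(napi, struct icssm_priv,
						       napi);
		int work_done = 0;

		/* Drain up to 'budget' frames from the ingress queue. */
		while (work_done < budget) {
			struct sk_buff *skb = icssm_rx_one(priv);

			if (!skb)
				break;
			napi_gro_receive(napi, skb);
			work_done++;
		}

		/* Re-enable the interrupt only once the queue is empty. */
		if (work_done < budget && napi_complete_done(napi, work_done))
			enable_irq(priv->rx_irq);

		return work_done;
	}

	/* setup: netif_napi_add(ndev, &priv->napi, icssm_rx_poll);
	 *        napi_enable(&priv->napi);
	 */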
/P