Message-ID: <3dd73125-7f9b-405c-b5cd-0ab172014d00@gmail.com>
Date: Fri, 15 Aug 2025 21:02:27 +0100
From: Pavel Begunkov <asml.silence@...il.com>
To: Breno Leitao <leitao@...ian.org>, Jakub Kicinski <kuba@...nel.org>
Cc: Johannes Berg <johannes@...solutions.net>, Mike Galbraith
<efault@....de>, paulmck@...nel.org, LKML <linux-kernel@...r.kernel.org>,
netdev@...r.kernel.org, boqun.feng@...il.com
Subject: Re: netconsole: HARDIRQ-safe -> HARDIRQ-unsafe lock order warning
On 8/15/25 18:29, Breno Leitao wrote:
> On Fri, Aug 15, 2025 at 09:42:17AM -0700, Jakub Kicinski wrote:
>> On Fri, 15 Aug 2025 11:44:45 +0100 Pavel Begunkov wrote:
>>> On 8/15/25 01:23, Jakub Kicinski wrote:
>>
>> I suspect disabling netconsole over WiFi may be the most sensible way out.
>
> I believe we might be facing a similar issue with virtio-net.
> Specifically, any network adapter where TX is not safe to use in IRQ
> context encounters this problem.
>
> If we want to keep netconsole enabled on all TX paths, a possible
> solution is to defer the transmission work when netconsole is called
> inside an IRQ.
>
> The idea is that netconsole first checks if it is running in an IRQ
> context using in_irq(). If so, it queues the skb without transmitting it
> immediately and schedules deferred work to handle the transmission
> later.
>
> A rough implementation could be:
>
> static void send_udp(struct netconsole_target *nt, const char *msg, int len)
> {
> 	/* Get an skb that is already populated with all the headers
> 	 * and ready to be sent.
> 	 */
> 	struct sk_buff *skb = netpoll_get_skb(&nt->np, msg, len);
>
> 	if (in_irq()) {
It's not just IRQ handlers but any context that has IRQs disabled. Since
this is nested under the irq-safe console_owner, it would always need to
be deferred or somehow moved out of the console_owner critical section.
Maybe there is printk lock trickery I don't understand, however.
> 		skb_queue_tail(&nt->np.delayed_queue, skb);
> 		schedule_delayed_work(flush_delayed_queue, 0);
> 		return;
> 	}
>
> 	__netpoll_send_skb(&nt->np, skb);
> }
>
> This approach does not require additional memory or extra data copying,
> since copying from the printk buffer to the skb must be performed
> regardless.
>
> The main drawback is a slight delay for messages sent from within an IRQ
> context, though I believe such cases are infrequent.
>
> We could potentially also perform the flush from softirq context, which
> would help reduce this latency further.
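
To make the difference concrete, here is a small userspace model of the
deferral, keyed on "IRQs disabled" rather than "in an IRQ handler". It is
only a sketch and not kernel code: irqs_off stands in for irqs_disabled(),
the fixed-size ring stands in for an skb queue, and model_send(),
model_flush() and model_xmit() are hypothetical names made up for this
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MSG_MAX   64
#define QUEUE_MAX 8

/* Fixed-size ring buffer; a stand-in for a per-target skb queue. */
struct msg_queue {
	char msgs[QUEUE_MAX][MSG_MAX];
	size_t head, count;
};

static struct msg_queue deferred;
static bool irqs_off;		/* stand-in for irqs_disabled() */
static size_t sent_inline, sent_deferred, dropped;

/* Stand-in for the real transmit path; it only counts messages. */
static void model_xmit(const char *msg, bool from_worker)
{
	(void)msg;
	if (from_worker)
		sent_deferred++;
	else
		sent_inline++;
}

/* Producer side: transmit directly only when IRQs are enabled. */
static void model_send(const char *msg)
{
	size_t tail;

	if (!irqs_off) {
		model_xmit(msg, false);
		return;
	}
	if (deferred.count == QUEUE_MAX) {
		dropped++;	/* queue full: the message is lost */
		return;
	}
	tail = (deferred.head + deferred.count) % QUEUE_MAX;
	strncpy(deferred.msgs[tail], msg, MSG_MAX - 1);
	deferred.msgs[tail][MSG_MAX - 1] = '\0';
	deferred.count++;
}

/* Worker side: must run later, from a context with IRQs enabled. */
static void model_flush(void)
{
	assert(!irqs_off);
	while (deferred.count) {
		model_xmit(deferred.msgs[deferred.head], true);
		deferred.head = (deferred.head + 1) % QUEUE_MAX;
		deferred.count--;
	}
}
```

If every caller arrives with irqs_off set, which is what nesting under
console_owner would imply, the direct branch in model_send() is never taken
and everything goes through model_flush().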
--
Pavel Begunkov