Date:   Mon, 06 Sep 2021 01:36:56 +0200
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Hillf Danton <hdanton@...a.com>,
        syzbot <syzbot+a9b681dcbc06eb2bca04@...kaller.appspotmail.com>
Cc:     linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
        syzkaller-bugs@...glegroups.com, eric.dumazet@...il.com
Subject: Re: [syzbot] INFO: task hung in __lru_add_drain_all

Hillf,

On Fri, Sep 03 2021 at 19:10, Hillf Danton wrote:
>
> See if ksoftirqd is preventing bound workqueue work from running.

What?

> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -521,6 +521,7 @@ asmlinkage __visible void __softirq_entr
>  	bool in_hardirq;
>  	__u32 pending;
>  	int softirq_bit;
> +	bool is_ksoftirqd = __this_cpu_read(ksoftirqd) == current;
>  
>  	/*
>  	 * Mask out PF_MEMALLOC as the current task context is borrowed for the
> @@ -565,6 +566,8 @@ restart:
>  		}
>  		h++;
>  		pending >>= softirq_bit;
> +		if (is_ksoftirqd && in_task())

Can you please explain how this would ever be true?

 #define in_task()	(!(in_nmi() | in_hardirq() | in_serving_softirq()))

in_task() is guaranteed to be false here, because in_serving_softirq()
is guaranteed to be true simply because this is the softirq processing
context.
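
The following user-space sketch models why that check cannot succeed. It
is not kernel code: the model_* names are hypothetical, and the bit
positions are assumptions that mirror the usual preempt_count layout
(softirq count at bit 8, hardirq at bit 16, NMI at bit 20). It also
assumes that softirq processing adds SOFTIRQ_OFFSET to the count on
entry.

```c
#include <stdbool.h>

/* Assumed bit layout of a preempt_count-like word (illustration only). */
#define SOFTIRQ_SHIFT	8
#define HARDIRQ_SHIFT	16
#define NMI_SHIFT	20

#define SOFTIRQ_OFFSET	(1U << SOFTIRQ_SHIFT)
#define HARDIRQ_MASK	(0xfU << HARDIRQ_SHIFT)
#define NMI_MASK	(0xfU << NMI_SHIFT)

/* Hypothetical stand-ins for in_nmi(), in_hardirq(), in_serving_softirq(). */
static bool model_in_nmi(unsigned int pc)
{
	return pc & NMI_MASK;
}

static bool model_in_hardirq(unsigned int pc)
{
	return pc & HARDIRQ_MASK;
}

static bool model_in_serving_softirq(unsigned int pc)
{
	return pc & SOFTIRQ_OFFSET;
}

/* Same shape as the in_task() macro quoted above. */
static bool model_in_task(unsigned int pc)
{
	return !(model_in_nmi(pc) | model_in_hardirq(pc) |
		 model_in_serving_softirq(pc));
}

/* Assumption: entering softirq processing adds SOFTIRQ_OFFSET. */
static unsigned int model_softirq_enter(unsigned int pc)
{
	return pc + SOFTIRQ_OFFSET;
}
```

In this model, model_in_task(0) is true for plain task context, while
model_in_task(model_softirq_enter(0)) is false, whether or not the
current task happens to be ksoftirqd.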

> +			cond_resched();

__do_softirq() returns after 2 msec of softirq processing whether it is
invoked on return from interrupt or in ksoftirqd context. On return from
interrupt this wakes ksoftirqd and returns. In ksoftirqd this is a
rescheduling point.

But that only works when the action handlers, e.g. net_rx_action(),
behave well and respect that limit as well.
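
A simplified user-space model of that dependency, with a simulated clock
in microseconds. Everything here is hypothetical (run_with_limit(),
well_behaved(), misbehaving(), the 500us and 50000us costs); the real
loop has more exit conditions. The point is only that the deadline is
checked between handler invocations, so it cannot cut a handler short.

```c
/* A handler advances the simulated clock by the time it "runs". */
typedef void (*handler_fn)(unsigned long *now_us);

/* Hypothetical handler that returns after 500us of work. */
static void well_behaved(unsigned long *now_us)
{
	*now_us += 500;
}

/* Hypothetical handler that ignores any limit and runs for 50ms. */
static void misbehaving(unsigned long *now_us)
{
	*now_us += 50000;
}

/*
 * Invoke the handler repeatedly until the time limit is reached.
 * The limit is only evaluated after the handler has returned.
 * Returns the simulated time at which the loop terminated.
 */
static unsigned long run_with_limit(handler_fn handler,
				    unsigned long limit_us, int max_rounds)
{
	unsigned long now_us = 0;
	int i;

	for (i = 0; i < max_rounds; i++) {
		handler(&now_us);
		if (now_us >= limit_us)
			break;
	}
	return now_us;
}
```

With a 2000us limit, run_with_limit(well_behaved, 2000, 10) stops at
2000us, while run_with_limit(misbehaving, 2000, 10) only gets to check
the limit at 50000us.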

net_rx_action() has its own time limit: netdev_budget_usecs

That defaults to: 2 * USEC_PER_SEC / HZ

The config has HZ=100, so this loop should terminate after

    2 * 1e6 / 100 = 20000us = 20ms

The provided C-reproducer does not change that default.
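
The arithmetic above as a small stand-alone sketch. The helper name
default_netdev_budget_usecs() is hypothetical, and HZ is passed as a
parameter instead of being a build-time constant so other tick rates can
be compared.

```c
#define USEC_PER_SEC	1000000UL

/* Default budget in microseconds for a given HZ: 2 * USEC_PER_SEC / HZ */
static unsigned long default_netdev_budget_usecs(unsigned long hz)
{
	return 2 * USEC_PER_SEC / hz;
}
```

For HZ=100 this yields 20000us, for HZ=250 it yields 8000us, and for
HZ=1000 it yields 2000us.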

But again this loop can only terminate if napi_poll() and the
subsequently invoked callchain behave well.

So instead of sending obviously useless "debug" patches, why are you not
grabbing the kernel config and the reproducer and figuring out what the
root cause is?

Enable tracing, add some trace_printks and let ftrace_dump_on_oops spill
it out when the problem triggers. That will pinpoint the issue.

Thanks,

        tglx

