linux-kernel - Re: [PATCH net 0/4] selftests/net/tcp_ao: A bunch of fixes for TCP-AO selftests

Message-ID: <CAJwJo6Yw4S1wCcimRVy=P8h0Ez0UDt-yw2jqSY-ph3TKsQVVGA@mail.gmail.com>
Date: Wed, 17 Apr 2024 19:47:18 +0100
From: Dmitry Safonov <0x7f454c46@...il.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Dmitry Safonov via B4 Relay <devnull+0x7f454c46.gmail.com@...nel.org>, 
	"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, 
	Paolo Abeni <pabeni@...hat.com>, Shuah Khan <shuah@...nel.org>, netdev@...r.kernel.org, 
	linux-kselftest@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net 0/4] selftests/net/tcp_ao: A bunch of fixes for TCP-AO selftests

On Tue, 16 Apr 2024 at 15:28, Jakub Kicinski <kuba@...nel.org> wrote:
>
> On Sat, 13 Apr 2024 02:42:51 +0100 Dmitry Safonov via B4 Relay wrote:
> > Started as addressing the flakiness issues in rst_ipv*, that affect
> > netdev dashboard.
>
> Thank you! :)

Jakub, you are very welcome :)
I'll keep an eye on the dashboard, but I very much encourage you to
ping me in case of any other issues with tcp_ao selftests.

I currently have a v2 of the tcp-ao tracepoints series, but I'm delaying
it while I work on a reproducer/selftest for an issue I think I have a
patch for.

BTW, do you know whether the following have been addressed, or whether
anyone is looking into them? (They come from other tcp-ao hits and do
not seem to be related to tcp-ao itself.)

1. [ 240.001391][ T833] Possible interrupt unsafe locking scenario:
[  240.001391][  T833]
[  240.001635][  T833]        CPU0                    CPU1
[  240.001797][  T833]        ----                    ----
[  240.001958][  T833]   lock(&p->alloc_lock);
[  240.002083][  T833]                                local_irq_disable();
[  240.002284][  T833]                                lock(&ndev->lock);
[  240.002490][  T833]                                lock(&p->alloc_lock);
[  240.002709][  T833]   <Interrupt>
[  240.002819][  T833]     lock(&ndev->lock);
[  240.002937][  T833]
[  240.002937][  T833]  *** DEADLOCK ***

https://netdev-3.bots.linux.dev/vmksft-tcp-ao-dbg/results/537021/14-self-connect-ipv6/stderr

2. [  251.411647][   T71] WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
[  251.411986][   T71] 6.9.0-rc1-virtme #1 Not tainted
[  251.412214][   T71] -----------------------------------------------------
[  251.412533][   T71] kworker/u16:1/71 [HC0[0]:SC0[2]:HE1:SE0] is trying to acquire:
[  251.412837][   T71] ffff888005182c28 (&p->alloc_lock){+.+.}-{2:2}, at: __get_task_comm+0x27/0x70
[  251.413214][   T71]
[  251.413214][   T71] and this task is already holding:
[  251.413527][   T71] ffff88802f83efd8 (&ul->lock){+.-.}-{2:2}, at: rt6_uncached_list_flush_dev+0x138/0x840
[  251.413887][   T71] which would create a new lock dependency:
[  251.414153][   T71]  (&ul->lock){+.-.}-{2:2} -> (&p->alloc_lock){+.+.}-{2:2}
[  251.414464][   T71]
[  251.414464][   T71] but this new dependency connects a SOFTIRQ-irq-safe lock:
[  251.414808][   T71]  (&ul->lock){+.-.}-{2:2}

https://netdev-3.bots.linux.dev/vmksft-tcp-ao-dbg/results/537201/17-icmps-discard-ipv4/stderr

3. [ 264.280734][ C3] Possible unsafe locking scenario:
[  264.280734][    C3]
[  264.280968][    C3]        CPU0                    CPU1
[  264.281117][    C3]        ----                    ----
[  264.281263][    C3]   lock((&tw->tw_timer));
[  264.281427][    C3]                                lock(&hashinfo->ehash_locks[i]);
[  264.281647][    C3]                                lock((&tw->tw_timer));
[  264.281834][    C3]   lock(&hashinfo->ehash_locks[i]);

https://netdev-3.bots.linux.dev/vmksft-tcp-ao-dbg/results/547461/19-self-connect-ipv4/stderr
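
For readers less familiar with these reports: the third scenario is a lock-order inversion, where one context takes the timer lock and then the hash lock while another takes them in the opposite order. The following is a toy Python sketch of that idea, not lockdep's actual implementation (which tracks lock classes and IRQ state in the kernel). The function name `find_lock_cycle` and the shortened lock names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def find_lock_cycle(acquisition_chains):
    """Return a list of lock names forming a cycle, or None.

    Each chain lists locks in the order one context acquires them
    while still holding the earlier ones.
    """
    # Edge A -> B means "B was taken while A was held".
    graph = defaultdict(set)
    for chain in acquisition_chains:
        for i, held in enumerate(chain):
            for later in chain[i + 1:]:
                graph[held].add(later)

    def dfs(node, path, on_path, done):
        on_path.add(node)
        path.append(node)
        for nxt in sorted(graph[node]):
            if nxt in on_path:
                # Back edge: the path from nxt to here closes a cycle.
                return path[path.index(nxt):] + [nxt]
            if nxt not in done:
                found = dfs(nxt, path, on_path, done)
                if found:
                    return found
        on_path.discard(node)
        path.pop()
        done.add(node)
        return None

    done = set()
    for start in sorted(graph):
        if start not in done:
            cycle = dfs(start, [], set(), done)
            if cycle:
                return cycle
    return None


# The two orderings from report 3 (names shortened for illustration).
cpu0 = ["tw_timer", "ehash_lock"]
cpu1 = ["ehash_lock", "tw_timer"]
print(find_lock_cycle([cpu0, cpu1]))
```

If both contexts took the locks in the same order, the dependency graph would have no cycle and the function would return `None`.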

I can spend some time on them after I verify that my fix for -stable
actually fixes the issue I think it fixes.
Seems like your automation + my selftests are bearing some fruit, hehe.

Thanks,
             Dmitry