linux-kernel - Re: linus-next: improving functional testing for to-be-merged pull requests

Message-ID: <e4ac4459-faf6-48df-851a-a5204bdee4cd@paulmck-laptop>
Date: Wed, 23 Oct 2024 11:37:31 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Christoph Hellwig <hch@...radead.org>, Sasha Levin <sashal@...nel.org>,
	Kees Cook <kees@...nel.org>, ksummit@...ts.linux.dev,
	linux-kernel@...r.kernel.org
Subject: Re: linus-next: improving functional testing for to-be-merged pull
 requests

On Wed, Oct 23, 2024 at 11:06:59AM -0700, Linus Torvalds wrote:
> On Wed, 23 Oct 2024 at 10:47, Paul E. McKenney <paulmck@...nel.org> wrote:
> >
> > > Yes, without Linus caring we're not going to get our process worked out.
> > > Not sure how a tree that probably won't have much better latency than
> > > linux-next is going to fix that, though.
> >
> > If I recall correctly, one thing Linus asked us to do earlier this year
> > (ARM Summit) is to CC him on -next failures.
> 
> Yes. I definitely care about failures in linux-next, but I often don't
> _know_ about them unless I'm told.
> 
> The linux-next automation sends notifications to the owners of the
> trees, but not to me.

OK, then I guess I was inadvertently doing the right thing by forgetting
to CC you on all -next issues my testing has located.  ;-)

[ . . . ]

> And yes, I know some people do functional testing on linux-next
> already. The message at the maintainer summit was a bit mixed with
> some people saying linux-next tends to work even for that, others
> saying it's often too broken to be useful.

Functional testing?

Me, I do rcutorture *stress* testing on -next, and it usually passes.
Yes, there is the occasional spectacular exception, like the version
last month where rcutorture found the better part of ten bugs, two of
which were ugly heisenbugs that are still being chased down.

(Full disclosure: One of those bugs is in RCU, but that bug is already
in mainline.  Much of my time over the past two weeks has gone into
moving it from a once-per-year heisenbug to more than ten failures
per hour.  This bug does not happen without lots of CPU hotplug, as in
a 50-millisecond delay between successive CPU-hotplug operations.)

What I do is to carry patches against -next, and this week I am down
from two to one of them.  But most of the time, my list of -next-fix
patches is of length zero.

Now, I freely admit that rcutorture doesn't hammer all that much of
the kernel: mostly just RCU, timers, the scheduler, and CPU hotplug.
So I could easily imagine that -next testing of other parts of the kernel
might be more failure-prone.  Except that the correct answer to that is
*more* -next testing, not less of it.

							Thanx, Paul