Message-ID: <a5avbx7c6pilz4bnp3iv7ivuxtth7udo6ypepemhumsxvuawrw@qa7kec5sxyhp>
Date: Fri, 7 Mar 2025 09:33:11 -0500
From: Kent Overstreet <kent.overstreet@...ux.dev>
To: Theodore Ts'o <tytso@....edu>
Cc: Hector Martin <marcan@...can.st>,
syzbot <syzbot+4364ec1693041cad20de@...kaller.appspotmail.com>, broonie@...nel.org, joel.granados@...nel.org, kees@...nel.org,
linux-bcachefs@...r.kernel.org, linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] [bcachefs?] general protection fault in proc_sys_compare
On Fri, Mar 07, 2025 at 08:31:26AM -0500, Theodore Ts'o wrote:
> On Fri, Mar 07, 2025 at 06:51:23AM -0500, Kent Overstreet wrote:
> >
> > Better bisection algorithm? Standard bisect does really badly when fed
> > noisy data, but it wouldn't be hard to fix that: after N successive
> > passes or fails - which is unlikely, since bisect tests should be
> > coin flips - backtrack and gather more data in the part of the commit
> > history where you don't have much.
>
> My general approach when handling some test failure is to try running
> the reproducer 5-10 times on the original commit where the failure was
> detected, to see if the reproducer is reliable. Once it's been
> established whether the failure reproduces 100% of the time, or some
> fraction of the time, say 25% of the time, then we can establish how
> many times we should try running the reproducer before we can conclude
> that a particular commit is "good" --- and the first time we detect a
> failure, we can declare the commit is "bad", even if it happens on the
> 2nd out of the 25 tries that we might need to run a test if it is
> particularly flaky.
That does sound like a nice trick. I think we'd probably want both
approaches, though; I've seen cases where a test starts out failing
perhaps 5% of the time and then jumps up to 40% later on - some other
behavioural change makes your race or what have you easier to hit.
Really what we're trying to do is determine the shape of an unknown
function by sampling; we hope it's just a single stepwise change, but
if not we need to keep gathering more data until we get a clear enough
picture (and we need a way to present that data, too).
>
> Maybe this is something Syzbot could implement?
Wouldn't it be better to have it in 'git bisect'?
> And if someone is familiar with the Go language, patches to implement
> this in gce-xfstests's ltm server would be great! It's something I've
> wanted to do, but I haven't gotten around to implementing it yet so it
> can be fully automated. Right now, ltm's git branch watcher reruns
> any failing test 5 times, so I get an idea of whether a failure is
> flaky or not. I'll then manually run a potentially flaky test 30
> times, and based on how reliable or flaky the test failure happens to
> be, I then tell gce-xfstests to do a bisect running each test N times,
> without having it stop once the test fails. It wastes a bit of test
> resources, but since it doesn't block my personal time (results land
> in my inbox when the bisect completes), it hasn't risen to the top of
> my todo list.
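The "run each test N times without stopping at the first failure" half
of that could be a tiny `git bisect run` wrapper - a sketch, with the
reproducer path and run count as hypothetical placeholders:

```python
import subprocess

def bisect_verdict(cmd, runs):
    """Run the reproducer `runs` times, counting failures instead of
    stopping at the first one, so the failure *rate* at each commit
    ends up in the bisect log.  Returns a git-bisect-run exit status:
    0 for "good", 1 for "bad"."""
    failures = sum(subprocess.run(cmd).returncode != 0 for _ in range(runs))
    print(f"{failures}/{runs} runs failed")
    return 1 if failures else 0

# Hypothetical usage from a two-line check script:
#   git bisect run ./check.py
# where check.py ends with:
#   sys.exit(bisect_verdict(["./reproducer.sh"], runs=17))
```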
If only we had interns and grad students for this sort of thing :)