[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Y0tafD7qI2x5xzTc@yury-laptop>
Date: Sat, 15 Oct 2022 18:12:28 -0700
From: Yury Norov <yury.norov@...il.com>
To: Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
Cc: Borislav Petkov <bp@...en8.de>,
Linus Torvalds <torvalds@...ux-foundation.org>,
syzbot <syzbot+d0fd2bf0dd6da72496dd@...kaller.appspotmail.com>,
syzkaller-bugs@...glegroups.com,
Andrew Jones <ajones@...tanamicro.com>, netdev@...r.kernel.org,
"David S . Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Menglong Dong <imagedong@...cent.com>,
Kuniyuki Iwashima <kuniyu@...zon.com>,
Petr Machata <petrm@...dia.com>,
Guo Ren <guoren@...ux.alibaba.com>,
"Michael S . Tsirkin" <mst@...hat.com>,
Alexander Gordeev <agordeev@...ux.ibm.com>,
andriy.shevchenko@...ux.intel.com, linux@...musvillemoes.dk,
caraitto@...gle.com, willemb@...gle.com, jonolson@...gle.com,
amritha.nambiar@...el.com, linux-kernel@...r.kernel.org
Subject: Re: [syzbot] WARNING in c_start
On Sun, Oct 16, 2022 at 09:24:57AM +0900, Tetsuo Handa wrote:
> On 2022/10/16 5:44, Yury Norov wrote:
> > Add people from other threads discussing this.
> >
> > On Sat, Oct 15, 2022 at 01:53:19PM +0200, Borislav Petkov wrote:
> >> On Sat, Oct 15, 2022 at 08:39:19PM +0900, Tetsuo Handa wrote:
> >>> That's an invalid command line. The correct syntax is:
> >>>
> >>> #syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> >>
> >> The fix is not in Linus' tree yet.
> >>
> >>> Andrew Jones proposed a fix for x86 and riscv architectures [2]. But
> >>> other architectures have the same problem. And fixing all callers will
> >>> not be in time for this merge window.
> >>
> >> Why won't there be time? That's why the -rcs are for.
> >>
> >> Also, that thing fires only when CONFIG_DEBUG_PER_CPU_MAPS is enabled.
> >>
> >> So no, we will take Andrew's fixes for all arches in time for 6.1.
> >
> > Summarizing things:
> >
> > 1. cpumask_check() was introduced to make sure that the cpu number
> > passed into cpumask API belongs to a valid range. But the check is
> > broken for a very long time. And because of that there are a lot of
> > places where cpumask API is used wrongly.
> >
> > 2. Underlying bitmap functions handle that correctly - when user
> > passes out-of-range CPU index, the nr_cpu_ids is returned, and this is
> > what expected by client code. So if DEBUG_PER_CPU_MAPS config is off,
> > everything is working smoothly.
> >
> > 3. I fixed all warnings that I was aware at the time of submitting the
> > patch. 2 follow-up series are on review: "[PATCH v2 0/4] net: drop
> > netif_attrmask_next*()" and "[PATCH 0/9] lib/cpumask: simplify
> > cpumask_next_wrap()". Also, Andrew Jones, Alexander Gordeev and Guo Ren
> > proposed fixes for c_start() in arch code.
> >
> > 4. The code paths mentioned above are all known to me that violate
> > cpumask_check() rules. (Did I miss something?)
> >
> > With all that, I agree with Borislav. Unfortunately, syzcall didn't CC
> > me about this problem with c_start(). But I don't like the idea to revert
> > cpumask_check() fix. This way we'll never clean that mess.
> >
> > If for some reason those warnings are unacceptable for -rcs (and like
> > Boris, I don't understand why), than instead of reverting commits, I'd
> > suggest moving cpumask sanity check from DEBUG_PER_CPU_MAPS under a new
> > config, say CONFIG_CPUMASK_DEBUG, which will be inactive until people will
> > fix their code. I can send a patch shortly, if we'll decide going this way.
> >
> > How people would even realize that they're doing something wrong if
> > they will not get warned about it?
>
> I'm asking you not to use BUG_ON()/WARN_ON() etc. which breaks syzkaller.
It's not me who added WARN_ON() in the cpumask_check(). You're asking
a wrong person.
What for do we have WARN_ON(), if we can't use it?
> Just printing messages (without "BUG:"/"WARNING:" string which also breaks
> syzkaller) like below diff is sufficient for people to realize that they're
> doing something wrong.
>
> Again, please do revert "cpumask: fix checking valid cpu range" immediately.
The revert is already in Jakub's batch for -rc2, AFAIK.
Thanks,
Yury
Powered by blists - more mailing lists