Message-ID: <20170918004845.y67x3p3sz2bhusgv@wfg-t540p.sh.intel.com>
Date: Mon, 18 Sep 2017 08:48:45 +0800
From: Fengguang Wu <fengguang.wu@...el.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: LKML <linux-kernel@...r.kernel.org>
Subject: Re: [linus:master] BUILD REGRESSION 2bd6bf03f4c1c59381d62c61d03f6cc3fe71f66e
Hi Linus,
On Sun, Sep 17, 2017 at 08:31:56AM -0700, Linus Torvalds wrote:
>Fengguang,
> it looks like the kernel build robot _only_ tests the actual rc
>kernels, and doesn't bisect down where the error started.
Nah, that's an illusion. :)
It's a per-branch summary report _in addition to_ per-bisect reports.
The former shows all active (not-yet-fixed) errors/warnings in the
current branch HEAD; the latter shows the result of one bisect.
Typically, for every error message shown in this summary report,
individual bisect reports have already been sent out to the relevant
authors and committers. I'll give concrete examples at the bottom.
>Any change that when it notices an error, it would bisect it, like it
>does for linux-next?
It should already be so -- otherwise it's a bug in the 0day robot. In
fact your tree _implicitly_ receives many more tests than linux-next,
and linux-next receives more tests than other individual developer trees.
It works like this:
The robot will normally test all pushed branch HEADs of all git trees.
IOW, each git push by you (or anyone else) will trigger tests -- except
when the robot occasionally cannot keep up.
The RC kernels will effectively receive _many more_ tests, since
developers typically base their git branches on RC releases. So
whenever they do git push, the triggered tests on their branch HEAD
will automatically cover its base RC kernel.
Whenever an error is found in a commit (typically the branch HEAD),
the robot will traverse backwards in its git history and test these
critical points until a GOOD point is found for starting the bisect:
- the branch's BASE commit (typically an RC kernel)
- the official releases (eg. 4.14 => 4.13 => 4.12 => ...)
We'll give up when the bug is found to exist in too old a kernel, since
old bugs are likely either uninteresting (no one cares to fix them) or
hard to bisect.
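A minimal sketch of that search, for illustration only. The function
name, its parameters and the max_releases cut-off are hypothetical and
are not taken from the robot's actual code; is_good stands in for
"build/test this commit and report whether the error is absent".

```python
from typing import Callable, List, Optional, Sequence


def find_good_start(
    bad_commit: str,
    branch_base: Optional[str],
    releases: Sequence[str],
    is_good: Callable[[str], bool],
    max_releases: int = 3,
) -> Optional[str]:
    """Return the first GOOD commit to start a bisect from, or None.

    Candidates are tried in the order described above: the branch's
    BASE commit first, then official releases from newest to oldest.
    Only the newest `max_releases` releases are tried (an assumed
    cut-off); if none is GOOD, the bug is treated as too old and we
    give up.
    """
    candidates: List[str] = []
    if branch_base and branch_base != bad_commit:
        candidates.append(branch_base)
    candidates.extend(releases[:max_releases])

    for commit in candidates:
        if is_good(commit):
            # A bisect can now run between this commit and bad_commit.
            return commit
    return None  # bug exists in every tested point: give up
```

For example, with a branch based on v4.14-rc1 and releases given as
["v4.13", "v4.12", "v4.11"], an error present in v4.14-rc1 and v4.13
but absent in v4.12 would yield v4.12 as the GOOD starting point.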
>On Sat, Sep 16, 2017 at 11:02 PM, kbuild test robot
><fengguang.wu@...el.com> wrote:
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
>> 2bd6bf03f4c1c59381d62c61d03f6cc3fe71f66e Linux 4.14-rc1
>>
>> arch/alpha/include/asm/mmu_context.h:160:24: error: invalid type argument of '->' (have 'int')
Error ids grouped by kconfigs:
recent_errors
├── alpha-allmodconfig
│ ├── arch-alpha-include-asm-mmu_context.h:error:implicit-declaration-of-function-task_thread_info
│ └── arch-alpha-include-asm-mmu_context.h:error:invalid-type-argument-of-(have-int-)
The bisect report was sent here:
https://lkml.org/lkml/2017/9/16/187
And a fix was freshly posted here:
https://patchwork.kernel.org/patch/9954963/
├── cris-allyesconfig
│ ├── drivers-tty-serial-8250_core.c:error:unrecognizable-insn:
│ └── drivers-tty-serial-8250_core.c:internal-compiler-error:in-extract_insn-at-recog.c
Bisected and reported here:
https://www.spinics.net/lists/linux-serial/msg27175.html
├── ia64-allmodconfig
│ ├── drivers-clocksource-timer-of.h:error:field-clkevt-has-incomplete-type
│ └── include-linux-kernel.h:error:dereferencing-pointer-to-incomplete-type-struct-clock_event_device
Reported here:
https://www.spinics.net/lists/kernel/msg2556450.html
which may be fixed by this RFC patch:
https://patchwork.kernel.org/patch/9939191/
├── ia64-allyesconfig
│ ├── drivers-clocksource-timer-of.h:error:field-clkevt-has-incomplete-type
│ └── include-linux-kernel.h:error:dereferencing-pointer-to-incomplete-type-struct-clock_event_device
Ditto.
├── mips-jmr3927_defconfig
│ ├── arch-mips-vdso-elf.S:error:march-r3900-requires-mfp32
│ ├── arch-mips-vdso-gettimeofday.c:error:march-r3900-requires-mfp32
│ ├── arch-mips-vdso-sigreturn.S:error:march-r3900-requires-mfp32
│ └── cc1:error:march-r3900-requires-mfp32
That's a rather old bug that I gave up on reporting repeatedly:
https://www.linux-mips.org/archives/linux-mips/2016-03/msg00215.html
├── parisc-allmodconfig
│ └── ERROR:__cmpxchg_u64-drivers-net-ethernet-intel-i40e-i40e.ko-undefined
Reported here:
https://lkml.org/lkml/2017/9/10/100
├── sparc64-allmodconfig
│ ├── arch-sparc-include-asm-mmu_context_64.h:error:implicit-declaration-of-function-per_cpu
│ ├── arch-sparc-include-asm-mmu_context_64.h:error:implicit-declaration-of-function-smp_processor_id
│ ├── arch-sparc-include-asm-mmu_context_64.h:error:per_cpu_secondary_mm-undeclared-(first-use-in-this-function)
│ └── arch-sparc-include-asm-mmu_context_64.h:error:unknown-type-name-per_cpu_secondary_mm
Reported here:
https://lists.01.org/pipermail/kbuild-all/2017-August/037613.html
https://lists.01.org/pipermail/kbuild-all/2017-September/037968.html
And recently fixed here:
https://patchwork.kernel.org/patch/9946375/
└── x86_64-randconfig-s4-09170918
  └── net-netfilter-nf_nat_core.c:note:in-expansion-of-macro-if
Reported here:
https://lkml.org/lkml/2017/9/16/203
As you may see, all the errors mentioned in this summary report have
been individually bisected and reported somewhere before.
Regards,
Fengguang