Message-ID: <20170918004845.y67x3p3sz2bhusgv@wfg-t540p.sh.intel.com>
Date:   Mon, 18 Sep 2017 08:48:45 +0800
From:   Fengguang Wu <fengguang.wu@...el.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     LKML <linux-kernel@...r.kernel.org>
Subject: Re: [linus:master] BUILD REGRESSION
 2bd6bf03f4c1c59381d62c61d03f6cc3fe71f66e

Hi Linus,

On Sun, Sep 17, 2017 at 08:31:56AM -0700, Linus Torvalds wrote:
>Fengguang,
> it looks like the kernel build robot _only_ tests the actual rc
>kernels, and doesn't bisect down where the error started.

Nah, that's an illusion. :)

It's a per-branch summary report _in addition to_ per-bisect reports.
The former shows all active (not-yet-fixed) errors/warnings in the
current branch HEAD; the latter shows the result of one bisect.

Typically, for every error message shown in this summary report,
individual bisect reports have already been sent out to the relevant
authors and committers. I'll give concrete examples at the bottom.

>Any change that when it notices an error, it would bisect it, like it
>does for linux-next?

It should already be so -- otherwise it's a bug in the 0day robot. In
fact your tree _implicitly_ receives many more tests than linux-next,
and linux-next receives more tests than other individual developer
trees. It works like this:

The robot will normally test all pushed branch HEADs of all git trees.
IOW, each of your (and others') git pushes will trigger tests -- except
on the occasions when the robot cannot catch up.

The RC kernels will effectively receive _many more_ tests, since
developers typically base their git branches on RC releases. So
whenever they do a git push, the tests triggered on their branch HEAD
will automatically cover its base RC kernel.

Whenever an error is found in a commit (typically the branch HEAD),
the robot will traverse backwards in its git history and test these
critical points until a GOOD point is found for starting the bisect:

        - the branch's BASE commit (typically an RC kernel)
        - the official releases (eg. 4.14 => 4.13 => 4.12 => ...)

We'll give up when the bug is found to exist in too old a kernel, since
old bugs are likely either uninteresting (no one cares to fix them) or
hard to bisect.
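
The search for a GOOD starting point described above could be sketched
roughly as follows. This is only an illustration, not the robot's
actual code: the function name find_good_start, the is_good callback
and the max_releases cutoff are all hypothetical, and the real
threshold for "too old" is not stated here.

```python
from typing import Callable, Iterable, Optional


def find_good_start(
    bad_commit: str,
    base_commit: Optional[str],
    releases: Iterable[str],
    is_good: Callable[[str], bool],
    max_releases: int = 3,
) -> Optional[str]:
    """Walk backwards from a BAD commit and return the first GOOD point.

    Candidates are tried in this order: the branch's BASE commit, then
    official releases from newest to oldest. Returns None when no GOOD
    point is found within max_releases releases, i.e. the bug is treated
    as too old to be worth bisecting. max_releases is an assumed cutoff.
    """
    candidates = []
    # The BASE commit (typically an RC kernel) is the nearest candidate.
    if base_commit is not None and base_commit != bad_commit:
        candidates.append(base_commit)
    # Then official releases, newest first, up to the assumed cutoff.
    candidates.extend(list(releases)[:max_releases])

    for commit in candidates:
        if is_good(commit):  # build/test the commit; True if error is absent
            return commit
    return None  # give up: bug exists in every tested point
```

The returned commit and the BAD commit would then be the two endpoints
handed to something like "git bisect start <bad> <good>".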

>On Sat, Sep 16, 2017 at 11:02 PM, kbuild test robot
><fengguang.wu@...el.com> wrote:
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git  master
>> 2bd6bf03f4c1c59381d62c61d03f6cc3fe71f66e  Linux 4.14-rc1
>>
>> arch/alpha/include/asm/mmu_context.h:160:24: error: invalid type argument of '->' (have 'int')

Error ids grouped by kconfigs:

recent_errors
├── alpha-allmodconfig
│   ├── arch-alpha-include-asm-mmu_context.h:error:implicit-declaration-of-function-task_thread_info
│   └── arch-alpha-include-asm-mmu_context.h:error:invalid-type-argument-of-(have-int-)

The bisect report was sent here:

https://lkml.org/lkml/2017/9/16/187

And a fix was freshly posted here:

https://patchwork.kernel.org/patch/9954963/

├── cris-allyesconfig
│   ├── drivers-tty-serial-8250_core.c:error:unrecognizable-insn:
│   └── drivers-tty-serial-8250_core.c:internal-compiler-error:in-extract_insn-at-recog.c

Bisected and reported here:

https://www.spinics.net/lists/linux-serial/msg27175.html

├── ia64-allmodconfig
│   ├── drivers-clocksource-timer-of.h:error:field-clkevt-has-incomplete-type
│   └── include-linux-kernel.h:error:dereferencing-pointer-to-incomplete-type-struct-clock_event_device

Reported here:

https://www.spinics.net/lists/kernel/msg2556450.html

which may be fixed by this RFC patch:

https://patchwork.kernel.org/patch/9939191/

├── ia64-allyesconfig
│   ├── drivers-clocksource-timer-of.h:error:field-clkevt-has-incomplete-type
│   └── include-linux-kernel.h:error:dereferencing-pointer-to-incomplete-type-struct-clock_event_device

Ditto.

├── mips-jmr3927_defconfig
│   ├── arch-mips-vdso-elf.S:error:march-r3900-requires-mfp32
│   ├── arch-mips-vdso-gettimeofday.c:error:march-r3900-requires-mfp32
│   ├── arch-mips-vdso-sigreturn.S:error:march-r3900-requires-mfp32
│   └── cc1:error:march-r3900-requires-mfp32

That's a rather old bug that I gave up reporting repeatedly:

https://www.linux-mips.org/archives/linux-mips/2016-03/msg00215.html

├── parisc-allmodconfig
│   └── ERROR:__cmpxchg_u64-drivers-net-ethernet-intel-i40e-i40e.ko-undefined

Reported here:

https://lkml.org/lkml/2017/9/10/100

├── sparc64-allmodconfig
│   ├── arch-sparc-include-asm-mmu_context_64.h:error:implicit-declaration-of-function-per_cpu
│   ├── arch-sparc-include-asm-mmu_context_64.h:error:implicit-declaration-of-function-smp_processor_id
│   ├── arch-sparc-include-asm-mmu_context_64.h:error:per_cpu_secondary_mm-undeclared-(first-use-in-this-function)
│   └── arch-sparc-include-asm-mmu_context_64.h:error:unknown-type-name-per_cpu_secondary_mm

Reported here:

https://lists.01.org/pipermail/kbuild-all/2017-August/037613.html
https://lists.01.org/pipermail/kbuild-all/2017-September/037968.html

And recently fixed here:

https://patchwork.kernel.org/patch/9946375/

└── x86_64-randconfig-s4-09170918
    └── net-netfilter-nf_nat_core.c:note:in-expansion-of-macro-if

Reported here:

https://lkml.org/lkml/2017/9/16/203

As you may see, all the errors mentioned in this summary report have
been individually bisected and reported somewhere before.

Regards,
Fengguang
