Date:   Sun, 25 Apr 2021 09:48:16 +0800
From:   Oliver Sang <oliver.sang@...el.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     "Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com>,
        Harish Sriram <harish@...ux.ibm.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        kernel test robot <lkp@...el.com>
Subject: Re: [mm/vunmap] e47110e905: WARNING:at_mm/vmalloc.c:#__vunmap

Hi Linus,

On Fri, Apr 23, 2021 at 10:18:18AM -0700, Linus Torvalds wrote:
> On Thu, Apr 22, 2021 at 11:15 PM kernel test robot
> <oliver.sang@...el.com> wrote:
> >
> > commit: e47110e90584a22e9980510b00d0dfad3a83354e ("mm/vunmap: add cond_resched() in vunmap_pmd_range")
> 
> Funky. That commit doesn't seem to have anything to do with the oops.
> 
> The oops is odd too:
> 
> > [  198.731223] WARNING: CPU: 0 PID: 1948 at mm/vmalloc.c:2247 __vunmap (kbuild/src/consumer/mm/vmalloc.c:2247 (discriminator 1))
> 
> That's the warning for an unaligned vunmap():
> 
>   2247          if (WARN(!PAGE_ALIGNED(addr), "Trying to vfree() bad address (%p)\n",
>   2248                          addr))
>   2249                  return;
> 
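
For reference, the check that fires here amounts to a mask test on the low bits of the address. Below is a minimal userspace sketch, not the kernel code itself: SKETCH_PAGE_SIZE and sketch_page_aligned() are hypothetical names, and the 4 KiB page size is an assumption for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The kernel uses the architecture's PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Hypothetical userspace stand-in for the kernel's PAGE_ALIGNED():
 * an address is page aligned when all bits below the page size are zero.
 */
static bool sketch_page_aligned(const void *addr)
{
	return ((uintptr_t)addr & (SKETCH_PAGE_SIZE - 1)) == 0;
}
```

With this sketch, an address such as 0x1000 passes the test and 0x1008 does not, so a base pointer that has had even a few low bits disturbed would trigger the warning.
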
> > [  198.744933] Call Trace:
> > [  198.745229] free_module (kbuild/src/consumer/kernel/module.c:2251)
> 
>   2248          /* This may be empty, but that's OK */
>   2249          module_arch_freeing_init(mod);
>   2250          module_memfree(mod->init_layout.base);
>   2251          kfree(mod->args);
> 
> That's the "module_memfree()" - the return address points to the
> return point, which is the next line.
> 
> And as far as I can tell, the only thing that assigns anything but
> NULL to that init_layout.base is
> 
>                 ptr = module_alloc(mod->init_layout.size);
> 
> which uses __vmalloc_node_range() for the allocation.
> 
> So absolutely nothing in this report makes sense to me. I suspect it's
> some odd memory corruption.
> 
> Oliver - how reliable is that bisection?

We will check further whether there is any issue in our test environment.

In the bot's automated tests, we saw 12 instances of the issue in 74 runs of
this commit, but none in 100 runs of its parent.

f3f99d63a8156c7a e47110e90584a22e9980510b00d
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
          1:100         -1%            :74    dmesg.BUG:kernel_reboot-without-warning_in_test_stage
          2:100          0%           2:74    dmesg.BUG:unable_to_handle_page_fault_for_address
           :100         12%          12:74    dmesg.Kernel_panic-not_syncing:Fatal_exception
          2:100          0%           2:74    dmesg.Oops:#[##]
          1:100         -1%            :74    dmesg.RIP:__is_module_percpu_address
           :100         12%          12:74    dmesg.RIP:__vunmap  <-----
           :100         12%          12:74    dmesg.RIP:kfree
           :100          1%           1:74    dmesg.RIP:kobject_add_internal
          2:100         -1%           1:74    dmesg.RIP:print_modules
          1:100         -1%            :74    dmesg.RIP:skip_spaces
          1:100         -1%            :74    dmesg.RIP:usercopy_abort
           :100          1%           1:74    dmesg.WARNING:at_lib/kobject.c:#kobject_add_internal
           :100         12%          12:74    dmesg.WARNING:at_mm/vmalloc.c:#__vunmap
          3:100         10%          13:74    dmesg.boot_failures
          1:100         -1%            :74    dmesg.canonical_address#:#[##]
          2:100         -2%            :74    dmesg.invalid_opcode:#[##]
          2:100         -2%            :74    dmesg.kernel_BUG_at_mm/usercopy.c
           :100         11%          11:74    dmesg.stack_segment:#[##]



> 
> Does anybody else see what might be up?
> 
>             Linus
