Date: Fri, 26 Jan 2024 14:35:44 +0800
From: Oliver Sang <oliver.sang@...el.com>
To: Catalin Marinas <catalin.marinas@....com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>, Waiman Long <longman@...hat.com>,
	<linux-mm@...ck.org>, <oliver.sang@...el.com>
Subject: Re: [linus:master] [kmemleak]  39042079a0:
 kernel-selftests.kvm.memslot_perf_test.fail

Hi Catalin,

On Thu, Jan 25, 2024 at 12:01:56PM +0000, Catalin Marinas wrote:
> On Thu, Jan 25, 2024 at 03:34:37PM +0800, kernel test robot wrote:
> > kernel test robot noticed "kernel-selftests.kvm.memslot_perf_test.fail" on:
> > 
> > commit: 39042079a0c241d09fa6fc3bb67c2ddf60011d0f ("kmemleak: avoid RCU stalls when freeing metadata for per-CPU pointers")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> [...]
> > The kernel config and materials to reproduce are available at:
> > https://download.01.org/0day-ci/archive/20240125/202401251429.d3dea02b-oliver.sang@intel.com
> [...]
> > # Testing RW performance with 1 runs, 5 seconds each
> > #
> > not ok 71 selftests: kvm: memslot_perf_test # TIMEOUT 120 seconds
> 
> I'm not sure how this relates to kmemleak, especially the commit above.
> It might as well be that the above kmemleak commit increases the system
> load and you trip over the timeout above. Is it still reproducible with
> a larger timeout (unfortunately I don't have the hardware to try to
> reproduce the problem)?
> 
> I can see a lockdep warning in the dmesg but that doesn't look related
> to kmemleak.

Some details below, FYI.

(1) We used the same timeout for 39042079a0 and its parent. The test always
passes on the parent (20 out of 20 runs) but almost always fails on 39042079a0
(18 out of 20 runs fail).

(2) While running the tests, we use a kmemleak monitor.
For the parent runs it reports no problem, but for the 39042079a0 runs it
captures the log below:

$ cat kmemleak
unreferenced object 0xff110001198a1720 (size 16):
  comm "swapper/0", pid 1, jiffies 4294783051
  hex dump (first 16 bytes):
    41 43 50 49 20 48 4d 41 54 00 6b 6b 6b 6b 6b a5  ACPI HMAT.kkkkk.
  backtrace (crc 45396b86):
    [<ffffffff81a97e28>] __kmem_cache_alloc_node+0x228/0x310
    [<ffffffff81936de5>] __kmalloc_node_track_caller+0x55/0xf0
    [<ffffffff819160b4>] kstrdup+0x34/0x60
    [<ffffffff81aada3b>] mt_set_default_dram_perf+0x3cb/0x460
    [<ffffffff8658a9a6>] hmat_init+0x2b6/0x3a0
    [<ffffffff81002c58>] do_one_initcall+0xd8/0x410
    [<ffffffff864aaa4a>] do_initcalls+0x1ca/0x390
    [<ffffffff864ab53e>] kernel_init_freeable+0x8ce/0xc80
    [<ffffffff83beed8f>] kernel_init+0x1f/0x1d0
    [<ffffffff810a07c1>] ret_from_fork+0x31/0x70
    [<ffffffff81006341>] ret_from_fork_asm+0x11/0x20
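
For anyone who wants to compare such dumps across runs, here is a minimal
sketch of splitting a kmemleak report into one record per unreferenced object.
It is an illustration only, not the monitor we run: the function name
parse_kmemleak and the record fields are hypothetical, and the regular
expressions simply assume the line layout seen in the dump above.

```python
import re

# Header line of one record, e.g.
# "unreferenced object 0xff110001198a1720 (size 16):"
HEADER = re.compile(r"^unreferenced object (0x[0-9a-f]+) \(size (\d+)\):")
# Task line, e.g. 'comm "swapper/0", pid 1, jiffies 4294783051'
COMM = re.compile(r'comm "([^"]*)", pid (\d+)')
# Backtrace frame, e.g. "[<ffffffff819160b4>] kstrdup+0x34/0x60"
FRAME = re.compile(r"\[<[0-9a-f]+>\]\s+(\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+")


def parse_kmemleak(text):
    """Split kmemleak report text into one dict per unreferenced object.

    Hypothetical helper: assumes the report layout shown in the dump above.
    """
    records = []
    for raw in text.splitlines():
        line = raw.strip()
        m = HEADER.match(line)
        if m:
            # A header starts a new record; later lines attach to it.
            records.append({"addr": m.group(1), "size": int(m.group(2)),
                            "comm": None, "pid": None, "frames": []})
            continue
        if not records:
            continue  # ignore anything before the first record
        m = COMM.search(line)
        if m:
            records[-1]["comm"] = m.group(1)
            records[-1]["pid"] = int(m.group(2))
            continue
        m = FRAME.search(line)
        if m:
            # Keep only the symbol name so runs can be compared by call path.
            records[-1]["frames"].append(m.group(1))
    return records
```

The input text would be whatever was read from the kmemleak debugfs file
(/sys/kernel/debug/kmemleak on a kernel built with CONFIG_DEBUG_KMEMLEAK);
grouping the records by their frames list gives a quick way to see whether
every run reports the same call path.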

(3) We tried disabling the kmemleak monitor for 39042079a0; memslot_perf_test
then passes, and there is no kmemleak dump.

(4) We also noticed a test that now passes on 39042079a0 but always fails on
the parent, whether the kmemleak monitor is enabled or disabled.
(Normally we do not report a commit for making a test start to pass, so this
information is not in the original report.)

f67f8d4a8c1e1ebc 39042079a0c241d09fa6fc3bb67
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
           :20         150%          30:30    kernel-selftests.kvm.rseq_test.pass

> 
> -- 
> Catalin
