Date:   Fri, 2 Sep 2022 16:46:18 +0100
From:   Daniel Dao <dqminh@...udflare.com>
To:     acme@...nel.org, linux-perf-users@...r.kernel.org,
        kernel-team <kernel-team@...udflare.com>,
        linux-kernel <linux-kernel@...r.kernel.org>, jolsa@...hat.com,
        adrian.hunter@...el.com, namhyung@...nel.org
Subject: perf record --kcore does not work when /proc/modules changed during copy

Hi Perf tools maintainers,

`perf record --kcore` frequently fails on a moderately busy system.
For example:

  sudo perf record --kcore -- sleep 1
  ERROR: Failed to copy kcore

Using strace to look at the invocation, the failure looks like:

  ...
  openat(AT_FDCWD, "/proc/modules", O_RDONLY) = 56
  openat(AT_FDCWD, "perf.data/kcore_dir/modules", O_RDONLY) = 57
  read(56, "mpls_gso 16384 0 - Live 0xffffff"..., 4096) = 4070
  read(57, "mpls_gso 16384 0 - Live 0xffffff"..., 4070) = 4070
  read(56, "xt_conntrack 24576 22 - Live 0xf"..., 4096) = 3738
  read(57, "xt_conntrack 24576 22 - Live 0xf"..., 3738) = 3738
  close(57)                               = 0
  close(56)                               = 0
  close(55)                               = 0
  unlink("perf.data/kcore_dir/kcore")     = 0
  close(54)                               = 0
  unlink("perf.data/kcore_dir/modules")   = 0
  unlink("perf.data/kcore_dir/kallsyms")  = 0
  write(2, "ERROR: Failed to copy kcore\n", 28ERROR: Failed to copy kcore
  ...

We can see that the verification of /proc/modules failed because the
/proc/modules output changed after we copied kcore. When I looked into it, the
differences were caused by changes in module refcounts, which seem expected on
a busy system, such as

  < tcp_bbr 40960 12644 - Live 0x0000000000000000
  ---
  > tcp_bbr 40960 12678 - Live 0x0000000000000000

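To illustrate the point, here is a minimal Python sketch (not perf code) that
compares two /proc/modules snapshots with and without the refcount column. It
assumes the usual field order of name, size, refcount, dependencies, state,
address; the helper names are hypothetical and exist only for this example.

```python
def strip_refcount(line):
    """Drop the third field (the refcount) from one /proc/modules line."""
    fields = line.split()
    # Assumed field order: name, size, refcount, dependencies, state, address
    return tuple(fields[:2] + fields[3:])


def same_modules_ignoring_refcount(before, after):
    """True if both snapshots list the same modules apart from refcounts."""
    a = [strip_refcount(l) for l in before.splitlines() if l.strip()]
    b = [strip_refcount(l) for l in after.splitlines() if l.strip()]
    return a == b


# The two lines from the diff above
before = "tcp_bbr 40960 12644 - Live 0x0000000000000000\n"
after = "tcp_bbr 40960 12678 - Live 0x0000000000000000\n"

# A byte-for-byte comparison sees a difference
print(before == after)
# Ignoring the refcount column, the snapshots match
print(same_modules_ignoring_refcount(before, after))
```

A byte-for-byte comparison reports a mismatch for these two lines, while the
refcount-insensitive comparison treats them as identical.
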
Any suggestions on how to make this work are much appreciated.

Cheers,
Daniel.
