Date:   Sun, 26 Nov 2017 20:31:33 +0100
From:   Jiri Olsa <jolsa@...hat.com>
To:     Milind Chabbi <chabbi.milind@...il.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Namhyung Kim <namhyung@...nel.org>,
        linux-kernel@...r.kernel.org,
        Michael Kerrisk-manpages <mtk.manpages@...il.com>,
        linux-man@...r.kernel.org, Michael Ellerman <mpe@...erman.id.au>,
        Andi Kleen <ak@...ux.intel.com>,
        Kan Liang <kan.liang@...el.com>,
        Hari Bathini <hbathini@...ux.vnet.ibm.com>,
        Sukadev Bhattiprolu <sukadev@...ux.vnet.ibm.com>,
        Jin Yao <yao.jin@...ux.intel.com>
Subject: Re: [PATCH] perf/core: fast breakpoint modification via
 _IOC_MODIFY_BREAKPOINT

On Mon, Nov 13, 2017 at 12:02:56AM -0800, Milind Chabbi wrote:
> SNIP
> 
> On Sun, Nov 12, 2017 at 11:46 PM, Jiri Olsa <jolsa@...hat.com> wrote:
> 
> > but you closed fd4 before opening fd5..?
> 
> Yes, that is correct. I closed fd4. The reason is that by closing fd4 we
> have a total of 3 hardware breakpoints active, but we make the software
> counting in the kernel think that four TYPE_DATA breakpoints are active.
> The counting should have disallowed us from creating fd5 as per the
> following logic in the kernel:
> 
> static int __reserve_bp_slot(struct perf_event *bp)
> 
> {
>  ....
> 
>         /* Flexible counters need to keep at least one slot */
>         if (slots.pinned + (!!slots.flexible) > nr_slots[type])
>                 return -ENOSPC;
> ....
> }

So the issue is with the cpu pinned breakpoints, because we keep
their slot counts for both breakpoint types. For task breakpoints
we don't keep the slot count; we just count it every time we need it.

The issue does not show up on x86, because both breakpoint types
share the same slot count (CONFIG_HAVE_MIXED_BREAKPOINTS_REGS).

I'm seeing the issue on an arm machine (with 4 watchpoints and 6 breakpoints)

creating 4 watchpoints:
	2028  perf_event_open(0xffffdb232bd0, -1, 0, -1, PERF_FLAG_FD_CLOEXEC) = 3
	2028  perf_event_open(0xffffdb232c40, -1, 0, -1, PERF_FLAG_FD_CLOEXEC) = 4
	2028  perf_event_open(0xffffdb232cb0, -1, 0, -1, PERF_FLAG_FD_CLOEXEC) = 5
	2028  perf_event_open(0xffffdb232d20, -1, 0, -1, PERF_FLAG_FD_CLOEXEC) = 6

changing last one to breakpoint:
	2028  ioctl(6, _IOC(_IOC_WRITE, 0x24, 0x0a, 0x08), 0xffffdb232e08) = 0

and trying to create one more watchpoint:
	2028  perf_event_open(0xffffdb232d90, -1, 0, -1, PERF_FLAG_FD_CLOEXEC) = -1 ENOSPC (No space left on device)

after this, we have slot counts:
	get_bp_info(0, TYPE_DATA)->cpu_pinned = 4
	get_bp_info(0, TYPE_INST)->cpu_pinned = 0

now when we close all of them:
	close(3)
	close(4)
	close(5)
	close(6)

we get the slot counts messed up, because fd 6 has different type now:
	get_bp_info(0, TYPE_DATA)->cpu_pinned = 1
	get_bp_info(0, TYPE_INST)->cpu_pinned = -1
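
To make the arithmetic above easier to follow, here is a rough userspace
model of the accounting. It is only a sketch, not the kernel code: the
names bp_create/bp_modify/bp_close/replay and the cpu_pinned[] array are
made up for illustration, the array just stands in for
get_bp_info(0, type)->cpu_pinned, and the flexible-counter term of the
__reserve_bp_slot check is left out. The slot limits are the ones from
the arm machine (6 breakpoints, 4 watchpoints).

```c
#include <errno.h>

/* Hypothetical userspace model of the slot accounting, not kernel code. */
enum bp_type { TYPE_INST, TYPE_DATA, TYPE_MAX };

/* Limits taken from the arm machine above: 6 breakpoints, 4 watchpoints. */
static const int nr_slots[TYPE_MAX] = { [TYPE_INST] = 6, [TYPE_DATA] = 4 };

/* Stand-in for get_bp_info(0, type)->cpu_pinned. */
static int cpu_pinned[TYPE_MAX];

struct bp { enum bp_type type; };

/* Create: account one pinned slot for the given type, or fail. */
static int bp_create(struct bp *bp, enum bp_type type)
{
	if (cpu_pinned[type] + 1 > nr_slots[type])
		return -ENOSPC;
	cpu_pinned[type]++;
	bp->type = type;
	return 0;
}

/* Modify: switches the type but leaves the counters alone (the bug). */
static void bp_modify(struct bp *bp, enum bp_type type)
{
	bp->type = type;
}

/* Close: releases a slot under whatever type the event has now. */
static void bp_close(struct bp *bp)
{
	cpu_pinned[bp->type]--;
}

/* Replays the sequence above; returns the result of the 5th create. */
static int replay(void)
{
	struct bp fd[4], extra;
	int i, ret;

	for (i = 0; i < 4; i++)
		bp_create(&fd[i], TYPE_DATA);	/* 4 watchpoints */
	bp_modify(&fd[3], TYPE_INST);		/* last one becomes a breakpoint */
	ret = bp_create(&extra, TYPE_DATA);	/* one more watchpoint */
	for (i = 0; i < 4; i++)
		bp_close(&fd[i]);		/* close all of them */
	return ret;
}
```

Calling replay() from a small main() and then looking at cpu_pinned[]
should give the same picture as above: the extra watchpoint is refused
because the TYPE_DATA count was never dropped on modify, and the last
close decrements the TYPE_INST count that was never incremented.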


I put together a fix and pushed it here:
  https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
  perf/bp

if you could please run your tests on it, and if it's all
good I'll post it

thanks,
jirka
