[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2166dd91-7838-4ba9-9a5f-43b7ba4a5ce2@linux.intel.com>
Date: Thu, 14 Aug 2025 10:10:45 -0700
From: "Liang, Kan" <kan.liang@...ux.intel.com>
To: Yunseong Kim <ysk@...lloc.com>, Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>, Arnaldo Carvalho de Melo <acme@...nel.org>,
Namhyung Kim <namhyung@...nel.org>, Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>, Ian Rogers <irogers@...gle.com>
Cc: Adrian Hunter <adrian.hunter@...el.com>,
James Clark <james.clark@...aro.org>, Collin Funk <collin.funk1@...il.com>,
Ravi Bangoria <ravi.bangoria@....com>, linux-kernel@...r.kernel.org,
linux-perf-users@...r.kernel.org
Subject: Re: [PATCH] perf test: Add test case for event group throttling with
inactive events
On 2025-08-13 9:32 a.m., Yunseong Kim wrote:
> A recent UBSAN shift-out-of-bounds report was identified when throttling
> event groups that included inactive (PERF_EVENT_STATE_OFF) events.
> This occurred because pmu->start()/stop() could be called on these events,
> potentially leaving event->hw.idx at -1. This leads to undefined behavior
> when PMU code later uses this negative index as a shift exponent.
>
> The issue need to ensuring perf_event_throttle() and
> perf_event_unthrottle() skip inactive events entirely.
>
> Introduce a new perf test suite, "event group throttle", to verify this
> fix and prevent regressions.
>
> The test sets up a scenario designed to trigger frequent throttling:
> 1. A parent event (leader) is created with sample_period = 1.
> 2. A child event is created in the same group but initialized with
> disabled = 1 (inactive).
>
> A process opens these events and runs in a tight loop. The frequent
> sampling of the leader causes the entire group, including the inactive
> child event, to be rapidly throttled and unthrottled by the kernel.
>
> The test monitors /dev/kmsg during execution, looking for "UBSAN",
> "Invalid PMEV" "WARNING:", or "BUG:" messages.
I don't think the test case should focus on the error messages. It can
easily be found by many other ways.
Also, the current bug can trigger errors in kmsg. You cannot guarantee
that all the failing cases trigger an error message.
I think the test should make sure that the closed event cannot be
accidentally reopened by the throttling mechanism.
If so, I think it would be better to compare the results of the disabled
event before and after throttling. There should be noting count.
The enabled event should be count normally.
Thanks,
Kan>
> To ensure robustness and avoid false positives from unrelated prior kernel
> messages, the test opens /dev/kmsg and uses lseek(SEEK_END) to skip all
> existing log entries before starting the test loop. If /dev/kmsg cannot be
> accessed or seeked (e.g., lack of CAP_SYSLOG), the test handles it
> appropriately by skipping or failing.
>
> Related Reproducer by Mark Rutland
> Link: https://lore.kernel.org/lkml/aIEePonPatjNrJVk@J2N7QTR9R3/
>
> Related Kernel Fix
> Link: https://lore.kernel.org/lkml/20250812012722.127646-1-ysk@kzalloc.com/
> Cc: Mark Rutland <mark.rutland@....com>
> Signed-off-by: Yunseong Kim <ysk@...lloc.com>
> ---
> tools/perf/tests/Build | 1 +
> tools/perf/tests/builtin-test.c | 1 +
> tools/perf/tests/event_group_throttle.c | 132 ++++++++++++++++++++++++
> tools/perf/tests/tests.h | 1 +
> 4 files changed, 135 insertions(+)
> create mode 100644 tools/perf/tests/event_group_throttle.c
>
> diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
> index 3e8394be15ae..e22e2f285500 100644
> --- a/tools/perf/tests/Build
> +++ b/tools/perf/tests/Build
> @@ -70,6 +70,7 @@ perf-test-y += util.o
> perf-test-y += hwmon_pmu.o
> perf-test-y += tool_pmu.o
> perf-test-y += subcmd-help.o
> +perf-test-y += event_group_throttle.o
>
> ifeq ($(SRCARCH),$(filter $(SRCARCH),x86 arm arm64 powerpc))
> perf-test-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o
> diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
> index 85142dfb3e01..d302bf9d1535 100644
> --- a/tools/perf/tests/builtin-test.c
> +++ b/tools/perf/tests/builtin-test.c
> @@ -141,6 +141,7 @@ static struct test_suite *generic_tests[] = {
> &suite__symbols,
> &suite__util,
> &suite__subcmd_help,
> + &suite__event_group_throttle,
> NULL,
> };
>
> diff --git a/tools/perf/tests/event_group_throttle.c b/tools/perf/tests/event_group_throttle.c
> new file mode 100644
> index 000000000000..7d5191d7e812
> --- /dev/null
> +++ b/tools/perf/tests/event_group_throttle.c
> @@ -0,0 +1,132 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <stdlib.h>
> +#include <unistd.h>
> +#include <time.h>
> +#include <sys/wait.h>
> +#include <signal.h>
> +#include <sys/syscall.h>
> +#include <sys/types.h>
> +#include <string.h>
> +#include <stdio.h>
> +#include <fcntl.h>
> +#include <unistd.h>
> +#include <errno.h>
> +#include <linux/perf_event.h>
> +#include "perf-sys.h"
> +#include "tests.h"
> +#include "debug.h"
> +
> +static struct perf_event_attr attr_parent = {
> + .type = PERF_TYPE_HARDWARE,
> + .size = sizeof(attr_parent),
> + .config = PERF_COUNT_HW_CPU_CYCLES,
> + .sample_period = 1,
> + .exclude_kernel = 1,
> +};
> +
> +static struct perf_event_attr attr_child = {
> + .type = PERF_TYPE_HARDWARE,
> + .size = sizeof(attr_child),
> + .config = PERF_COUNT_HW_CPU_CYCLES,
> + .exclude_kernel = 1,
> + .disabled = 1,
> +};
> +
> +static pid_t run_event_group_throttle(void)
> +{
> + pid_t pid = fork();
> +
> + if (pid == 0) {
> + int parent, child;
> +
> + parent = sys_perf_event_open(&attr_parent, 0, -1, -1, 0);
> + if (parent < 0) {
> + pr_debug("Unable to create event: %d\n", parent);
> + exit(-1);
> + }
> +
> + child = sys_perf_event_open(&attr_child, 0, -1, parent, 0);
> + if (child < 0) {
> + pr_debug("Unable to create event: %d\n", child);
> + exit(-1);
> + }
> +
> + for (;;)
> + asm("" ::: "memory");
> +
> + _exit(0);
> + }
> + return pid;
> +}
> +
> +static bool is_kmsg_err(int fd)
> +{
> + char buf[1024];
> + ssize_t len;
> +
> + while ((len = read(fd, buf, sizeof(buf) - 1)) > 0) {
> + buf[len] = '\0';
> +
> + if (strstr(buf, "UBSAN") || strstr(buf, "WARNING:") ||
> + strstr(buf, "BUG:") || strstr(buf, "Invalid PMEV")) {
> + pr_debug("Kernel log error detected: %s", buf);
> + return true;
> + }
> + }
> +
> + if (len < 0 && errno != EAGAIN) {
> + pr_debug("Error reading /dev/kmsg: %s\n", strerror(errno));
> + return true;
> + }
> +
> + return false;
> +}
> +
> +static int test__event_group_throttle(struct test_suite *test __maybe_unused,
> + int subtest __maybe_unused)
> +{
> + time_t start;
> + pid_t pid;
> + int fd;
> +
> + fd = open("/dev/kmsg", O_RDONLY | O_NONBLOCK);
> + if (fd < 0) {
> + /*
> + * If /dev/kmsg cannot be opened (e.g. permission denied), skip the test
> + * as we cannot verify the absence of kernel errors.
> + */
> + pr_debug("Failed to open /dev/kmsg: %s. Skipping test.\n", strerror(errno));
> + return TEST_SKIP;
> + }
> +
> + /*
> + * Seek to the end to ignore past events (like EFI boot warnings).
> + * This typically requires CAP_SYSLOG.
> + */
> + if (lseek(fd, 0, SEEK_END) < 0) {
> + pr_debug("Failed to seek to end of /dev/kmsg: %s\n", strerror(errno));
> + return TEST_FAIL;
> + }
> +
> + start = time(NULL);
> + do {
> + pr_debug("Starting event group throttling...\n");
> + pid = run_event_group_throttle();
> +
> + sleep(8);
> +
> + pr_debug("event group throttler(PID=%d)\n", pid);
> + kill(pid, SIGKILL);
> + waitpid(pid, NULL, 0);
> +
> + /* Check for errors during the run */
> + if (is_kmsg_err(fd)) {
> + close(fd);
> + return TEST_FAIL;
> + }
> + } while (time(NULL) - start < 10);
> +
> + return TEST_OK;
> +}
> +
> +DEFINE_SUITE("event group throttle", event_group_throttle);
> diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
> index 97e62db8764a..031856a710b2 100644
> --- a/tools/perf/tests/tests.h
> +++ b/tools/perf/tests/tests.h
> @@ -179,6 +179,7 @@ DECLARE_SUITE(event_groups);
> DECLARE_SUITE(symbols);
> DECLARE_SUITE(util);
> DECLARE_SUITE(subcmd_help);
> +DECLARE_SUITE(event_group_throttle);
>
> /*
> * PowerPC and S390 do not support creation of instruction breakpoints using the
Powered by blists - more mailing lists