[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <D1D50EA0-EFC3-4FAB-A898-A61BD081B527@optimatist.com>
Date: Tue, 19 Aug 2025 21:31:46 +0800
From: "Chengdong Li(李成栋)" <chengdongli@...imatist.com>
To: Mark Rutland <mark.rutland@....com>
CC: <peterz@...radead.org>,
<mingo@...hat.com>,
<acme@...nel.org>,
<namhyung@...nel.org>,
<alexander.shishkin@...ux.intel.com>,
<jolsa@...nel.org>,
<irogers@...gle.com>,
<adrian.hunter@...el.com>,
<kan.liang@...ux.intel.com>,
<linux-perf-users@...r.kernel.org>,
<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] perf/core: Improve arguments checking of inherited
per-task counters when sampling.
Hi Mark,
On 2025/8/19, 18:26, "Mark Rutland" <mark.rutland@....com <mailto:mark.rutland@....com>> wrote:
>On Thu, Aug 14, 2025 at 07:06:25PM +0800, Chengdong Li wrote:
>> It's not allowed to mmap() of inherited per-task counters with CPU ==
>> -1, this would create a performance issue. But it is not friendly to
>> developers as current implementation postponed the arguments checking to
>> perf_mmap(), developer can get an -EINVAL from mmap() but without
>> any previous error returned from perf_event_open().
>>
>> This patch improves it by moving the arguments checking from perf_mmap()
>> to perf_event_open().
>
>
>Why is that an improvement?
>
>
>IIUC before this patch, it would be possible to read() the event,
>whereas now the event cannot be opened at all.
That's true, could you provide a use case that using sampling mode but without
ring buffer? From my best knowledge, I think counting mode is more suitable
for read() only.
>
>
>AFAICT this is removing functionality people could legitimately use, so
>this doesn't seem like an improvement.
The problem is that using inherit per-task counter with sampling mode would lead
developer hard to debug why mmap() returns -EINVAL. There is not error returned from
perf_event_open(), everything done seems right but failed at mmap(), and there is
no clue in man open_event_open() as well.
Thanks,
Chengdong
>
>
>Mark.
>
>
>>
>> Signed-off-by: Chengdong Li <chengdongli@...imatist.com <mailto:chengdongli@...imatist.com>>
>> ---
>> kernel/events/core.c | 20 ++++++++++++--------
>> 1 file changed, 12 insertions(+), 8 deletions(-)
>>
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index 8060c2857bb2..f102adb395ec 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -6941,14 +6941,6 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma)
>> int ret, flags = 0;
>> mapped_f mapped;
>>
>> - /*
>> - * Don't allow mmap() of inherited per-task counters. This would
>> - * create a performance issue due to all children writing to the
>> - * same rb.
>> - */
>> - if (event->cpu == -1 && event->attr.inherit)
>> - return -EINVAL;
>> -
>> if (!(vma->vm_flags & VM_SHARED))
>> return -EINVAL;
>>
>> @@ -13392,6 +13384,18 @@ SYSCALL_DEFINE5(perf_event_open,
>> return -EACCES;
>> }
>>
>> + /*
>> + * Don't allow perf_event_open() of inherited per-task counters
>> + * with cpu == 1 when sampling. Otherwise, this would create a
>> + * performance issue due to all children writing to the same mmap()
>> + * created ring buffer.
>> + *
>> + * We recommend to call perf_event_open() for all cpus when sampling on
>> + * inherited per-task counters.
>> + */
>> + if (attr.sample_freq && attr.inherit && cpu == -1)
>> + return -EINVAL;
>> +
>> if (attr.freq) {
>> if (attr.sample_freq > sysctl_perf_event_sample_rate)
>> return -EINVAL;
>> --
>> 2.43.0
>>
Powered by blists - more mailing lists