Message-ID: <8d6030a4-ff2c-230c-c36e-d0a8c68832ac@linux.intel.com>
Date: Tue, 21 Jul 2020 16:06:34 +0300
From: Alexey Budankov <alexey.budankov@...ux.intel.com>
To: Arnaldo Carvalho de Melo <acme@...nel.org>
Cc: Peter Zijlstra <peterz@...radead.org>,
Ravi Bangoria <ravi.bangoria@...ux.ibm.com>,
Alexei Starovoitov <ast@...nel.org>,
Ingo Molnar <mingo@...hat.com>,
James Morris <jmorris@...ei.org>,
Namhyung Kim <namhyung@...nel.org>,
Serge Hallyn <serge@...lyn.com>, Jiri Olsa <jolsa@...hat.com>,
Song Liu <songliubraving@...com>,
Andi Kleen <ak@...ux.intel.com>,
Stephane Eranian <eranian@...gle.com>,
Igor Lubashev <ilubashe@...mai.com>,
Thomas Gleixner <tglx@...utronix.de>,
linux-kernel <linux-kernel@...r.kernel.org>,
"linux-security-module@...r.kernel.org"
<linux-security-module@...r.kernel.org>,
"selinux@...r.kernel.org" <selinux@...r.kernel.org>,
"intel-gfx@...ts.freedesktop.org" <intel-gfx@...ts.freedesktop.org>,
"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
linux-man@...r.kernel.org
Subject: Re: [PATCH v8 00/12] Introduce CAP_PERFMON to secure system
performance monitoring and observability
On 13.07.2020 21:51, Arnaldo Carvalho de Melo wrote:
> Em Mon, Jul 13, 2020 at 03:37:51PM +0300, Alexey Budankov escreveu:
>>
>> On 13.07.2020 15:17, Arnaldo Carvalho de Melo wrote:
>>> Em Mon, Jul 13, 2020 at 12:48:25PM +0300, Alexey Budankov escreveu:
>>>>
>>>> On 10.07.2020 20:09, Arnaldo Carvalho de Melo wrote:
>>>>> Em Fri, Jul 10, 2020 at 05:30:50PM +0300, Alexey Budankov escreveu:
>>>>>> On 10.07.2020 16:31, Ravi Bangoria wrote:
>>>>>>>> Currently access to perf_events, i915_perf and other performance
>>>>>>>> monitoring and observability subsystems of the kernel is open only for
>>>>>>>> a privileged process [1] with CAP_SYS_ADMIN capability enabled in the
>>>>>>>> process effective set [2].
>>>
>>>>>>>> This patch set introduces CAP_PERFMON capability designed to secure
>>>>>>>> system performance monitoring and observability operations so that
>>>>>>>> CAP_PERFMON would assist CAP_SYS_ADMIN capability in its governing role
>>>>>>>> for performance monitoring and observability subsystems of the kernel.
>>>
>>>>>>> I'm seeing an issue with CAP_PERFMON when I try to record data for a
>>>>>>> specific target. I don't know whether this is sort of a regression or
>>>>>>> an expected behavior.
>>>
>>>>>> Thanks for reporting and root-causing this case. The behavior is more or
>>>>>> less expected, since CAP_PERFMON currently takes over only the related part
>>>>>> of the CAP_SYS_ADMIN credentials. The Perf security docs [1] actually say
>>>>>> that access control is also subject to CAP_SYS_PTRACE credentials.
>>>
>>>>> I think that stating that in the error message would be helpful, after
>>>>> all, who reads docs? 8-)
>>>
>>>> At least those who write it :D ...
>>>
>>> Everybody should read it, sure :-)
>>>
>>>>> I.e., this:
>>>>>
>>>>> $ ./perf stat ls
>>>>> Error:
>>>>> Access to performance monitoring and observability operations is limited.
>>>>> $
>>>>>
>>>>> Could become:
>>>>>
>>>>> $ ./perf stat ls
>>>>> Error:
>>>>> Access to performance monitoring and observability operations is limited.
>>>>> Right now only CAP_PERFMON is granted, you may need CAP_SYS_PTRACE.
>>>>> $
>>>>
>>>> It would be better to provide a reference to the perf security docs in the tool output.
>>>
>>> So add a 3rd line:
>>>
>>> $ ./perf stat ls
>>> Error:
>>> Access to performance monitoring and observability operations is limited.
>>> Right now only CAP_PERFMON is granted, you may need CAP_SYS_PTRACE.
>>> Please read the 'Perf events and tool security' document:
>>> https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
>
>> If the kernel had the patch below, the message change would not be required.
>
> Sure, but the tool should continue to work and provide useful messages
> when running on kernels without that change. Pointing to the document is
> valid and should be done, that is an agreed point. But the tool can do
> some checks, narrow down the possible causes for the error message and
> provide something that in most cases will make the user make progress.
>
>> However, these two sentences at the end of the whole message would still be useful:
>> "Please read the 'Perf events and tool security' document:
>> https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html"
>
> We're in violent agreement here. :-)
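
The point above about the tool narrowing down the likely cause could be
sketched roughly as follows. This is only an illustration, not the actual
perf code: pick_hint(), parse_cap_eff() and the hint codes are hypothetical
names, and the effective capability mask is assumed to come from the
"CapEff:" line of /proc/self/status. The capability numbers are the ones
from include/uapi/linux/capability.h.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Capability numbers as defined in include/uapi/linux/capability.h. */
#define CAP_NR_SYS_PTRACE 19
#define CAP_NR_SYS_ADMIN  21
#define CAP_NR_PERFMON    38

/* Hypothetical hint codes, for illustration only. */
enum access_hint {
	HINT_NONE,            /* capabilities alone do not explain the failure */
	HINT_NEED_SYS_PTRACE, /* only CAP_PERFMON is granted */
	HINT_NEED_PERFMON,    /* neither CAP_PERFMON nor CAP_SYS_ADMIN is granted */
};

/* Test one bit of an effective capability mask. */
static int has_cap(uint64_t cap_eff, int nr)
{
	return (int)((cap_eff >> nr) & 1);
}

/*
 * Extract the hexadecimal mask from the "CapEff:" line of the text of
 * /proc/<pid>/status. Returns 0 on success, -1 if the line is missing.
 */
static int parse_cap_eff(const char *status_text, uint64_t *cap_eff)
{
	const char *p = strstr(status_text, "CapEff:");
	unsigned long long v;

	if (!p || sscanf(p, "CapEff: %llx", &v) != 1)
		return -1;
	*cap_eff = v;
	return 0;
}

/* Decide which extra line, if any, the error message should carry. */
static enum access_hint pick_hint(uint64_t cap_eff)
{
	if (has_cap(cap_eff, CAP_NR_SYS_ADMIN))
		return HINT_NONE;
	if (!has_cap(cap_eff, CAP_NR_PERFMON))
		return HINT_NEED_PERFMON;
	if (!has_cap(cap_eff, CAP_NR_SYS_PTRACE))
		return HINT_NEED_SYS_PTRACE;
	return HINT_NONE;
}
```

With only CAP_PERFMON in the mask this yields HINT_NEED_SYS_PTRACE, which
corresponds to the "you may need CAP_SYS_PTRACE" line proposed earlier in
the thread.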
Here is the message draft mentioning a) CAP_SYS_PTRACE, for kernels prior to
v5.8, and b) the Perf security document link. The plan is to send a patch that
extends the ptrace_may_access() check in perf_events with a CAP_PERFMON check [1]
and extends the tool with this message.
"Access to performance monitoring and observability operations is limited.
Enforced MAC policy settings (SELinux) can limit access to performance
monitoring and observability operations. Inspect system audit records for
more perf_event access control information and adjust the policy accordingly.
Consider adjusting /proc/sys/kernel/perf_event_paranoid setting to open
access to performance monitoring and observability operations for processes
without CAP_PERFMON, CAP_SYS_PTRACE or CAP_SYS_ADMIN Linux capability.
More information can be found in the 'Perf events and tool security' document:
https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
perf_event_paranoid setting is -1:
-1: Allow use of (almost) all events by all users
Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
>= 0: Disallow raw and ftrace function tracepoint access
>= 1: Disallow CPU event access
>= 2: Disallow kernel profiling
To make the adjusted perf_event_paranoid setting permanent, preserve it
in /etc/sysctl.conf (e.g. kernel.perf_event_paranoid = <setting>)"
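
The perf_event_paranoid levels listed in the draft are cumulative, which
could be expressed along these lines. This is a minimal sketch and not the
tool's code: the flag names and both helper functions are made up for
illustration, and only the level semantics are taken from the draft above.

```c
#include <stdio.h>

/* Restriction flags, one per level line of the message draft. */
#define RESTRICT_RAW_FTRACE  (1u << 0) /* >= 0: raw and ftrace function tracepoint access */
#define RESTRICT_CPU_EVENTS  (1u << 1) /* >= 1: CPU event access */
#define RESTRICT_KERNEL_PROF (1u << 2) /* >= 2: kernel profiling */

/* Map a perf_event_paranoid level to the restrictions it implies. */
static unsigned int paranoid_restrictions(int level)
{
	unsigned int mask = 0;

	if (level >= 0)
		mask |= RESTRICT_RAW_FTRACE;
	if (level >= 1)
		mask |= RESTRICT_CPU_EVENTS;
	if (level >= 2)
		mask |= RESTRICT_KERNEL_PROF;
	return mask;
}

/*
 * Read the current level from a sysctl file such as
 * /proc/sys/kernel/perf_event_paranoid.
 * Returns 0 on success, -1 if the file cannot be opened or parsed.
 */
static int read_paranoid_level(const char *path, int *level)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%d", level) == 1);
	fclose(f);
	return ok ? 0 : -1;
}
```

A level of -1 produces an empty mask, matching "Allow use of (almost) all
events by all users", while a level of 2 sets all three flags.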
Alexei
[1] https://lore.kernel.org/lkml/20200713121746.GA7029@kernel.org/
>
>>>
>>>> Looks like extending ptrace_may_access() check for perf_events with CAP_PERFMON
>>>
>>> You mean the following?
>>
>> Exactly that.
>
> Sure, let's then wait for others to chime in and then you can go ahead
> and submit that patch.
>
> Peter?
>
> - Arnaldo
>
>>>
>>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>>> index 856d98c36f56..a2397f724c10 100644
>>> --- a/kernel/events/core.c
>>> +++ b/kernel/events/core.c
>>> @@ -11595,7 +11595,7 @@ SYSCALL_DEFINE5(perf_event_open,
>>>  		 * perf_event_exit_task() that could imply).
>>>  		 */
>>>  		err = -EACCES;
>>> -		if (!ptrace_may_access(task, PTRACE_MODE_READ_REALCREDS))
>>> +		if (!perfmon_capable() && !ptrace_may_access(task, PTRACE_MODE_READ_REALCREDS))
>>>  			goto err_cred;
>>>  	}
>>>
>>>> makes monitoring simpler and even more secure to use, since the Perf tool
>>>> does not need to start/stop/single-step, read/write registers and memory,
>>>> and so on, like a debugger or strace-like tool. What do you think?
>>>
>>> I tend to agree, Peter?
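
The effect of the one-line change in the diff above can be modelled in
userspace as a small truth table. This is a sketch only: the two functions
and their boolean parameters are hypothetical stand-ins for
perfmon_capable() and ptrace_may_access(), which exist only inside the
kernel.

```c
#include <errno.h>
#include <stdbool.h>

/* Check before the change: only ptrace access to the target is considered. */
static int attach_check_before(bool ptrace_ok)
{
	return ptrace_ok ? 0 : -EACCES;
}

/*
 * Check after the change: a perfmon-capable caller passes without the
 * ptrace check; everyone else is still subject to it.
 */
static int attach_check_after(bool perfmon, bool ptrace_ok)
{
	if (!perfmon && !ptrace_ok)
		return -EACCES;
	return 0;
}
```

The reported case, a process holding CAP_PERFMON but failing the ptrace
check, is the one input where the two functions differ: -EACCES before the
change, 0 after it.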
>>>
>>>> Alexei
>>>>
>>>>>
>>>>> - Arnaldo
>>
>> Alexei
>