Message-ID: <Y/+wlg5L8A1iebya@cmpxchg.org>
Date: Wed, 1 Mar 2023 15:07:50 -0500
From: Johannes Weiner <hannes@...xchg.org>
To: Suren Baghdasaryan <surenb@...gle.com>
Cc: tj@...nel.org, lizefan.x@...edance.com, peterz@...radead.org,
johunt@...mai.com, mhocko@...e.com, keescook@...omium.org,
quic_sudaraja@...cinc.com, cgroups@...r.kernel.org,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/1] psi: remove 500ms min window size limitation for triggers
On Wed, Mar 01, 2023 at 11:34:03AM -0800, Suren Baghdasaryan wrote:
> Current 500ms min window size for psi triggers limits polling interval
> to 50ms to prevent polling threads from using too much cpu bandwidth by
> polling too frequently. However the number of cgroups with triggers is
> unlimited, so this protection can be defeated by creating multiple
> cgroups with psi triggers (triggers in each cgroup are served by a single
> "psimon" kernel thread).
> Instead of limiting min polling period, which also limits the latency of
> psi events, it's better to limit psi trigger creation to authorized users
> only, like we do for system-wide psi triggers (/proc/pressure/* files can
> be written only by processes with CAP_SYS_RESOURCE capability). This also
> makes access rules for cgroup psi files consistent with system-wide ones.
> Add a CAP_SYS_RESOURCE capability check for cgroup psi file writers and
> remove the psi window min size limitation.
>
> Suggested-by: Sudarshan Rajagopalan <quic_sudaraja@...cinc.com>
> Link: https://lore.kernel.org/all/cover.1676067791.git.quic_sudaraja@quicinc.com/
> Signed-off-by: Suren Baghdasaryan <surenb@...gle.com>
> ---
> kernel/cgroup/cgroup.c | 10 ++++++++++
> kernel/sched/psi.c | 4 +---
> 2 files changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
> index 935e8121b21e..b600a6baaeca 100644
> --- a/kernel/cgroup/cgroup.c
> +++ b/kernel/cgroup/cgroup.c
> @@ -3867,6 +3867,12 @@ static __poll_t cgroup_pressure_poll(struct kernfs_open_file *of,
> return psi_trigger_poll(&ctx->psi.trigger, of->file, pt);
> }
>
> +static int cgroup_pressure_open(struct kernfs_open_file *of)
> +{
> + return (of->file->f_mode & FMODE_WRITE && !capable(CAP_SYS_RESOURCE)) ?
> + -EPERM : 0;
> +}

I agree with the change, but it's a bit unfortunate that this check is
duplicated between the system-wide and cgroup code paths.

What do you think about psi_trigger_create() taking the file and
checking FMODE_WRITE and CAP_SYS_RESOURCE against file->f_cred?
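
A rough userspace model of that idea, only to illustrate the shape of a
single shared check. This is not kernel code: struct mock_file, struct
mock_cred, MOCK_FMODE_WRITE and psi_trigger_check_access() are hypothetical
stand-ins, and the boolean field stands in for a real capability lookup
against file->f_cred.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's FMODE_WRITE bit. */
#define MOCK_FMODE_WRITE 0x2u

/* Hypothetical stand-in for struct cred: one flag instead of a capability set. */
struct mock_cred {
	bool cap_sys_resource;
};

/* Hypothetical stand-in for struct file: open mode plus the opener's creds. */
struct mock_file {
	unsigned int f_mode;
	const struct mock_cred *f_cred;
};

/*
 * One shared check that both the system-wide and the cgroup entry points
 * could call, instead of each carrying its own copy. A file opened for
 * writing by an opener without the capability is rejected; everything
 * else is allowed.
 */
int psi_trigger_check_access(const struct mock_file *file)
{
	if ((file->f_mode & MOCK_FMODE_WRITE) && !file->f_cred->cap_sys_resource)
		return -EPERM;
	return 0;
}
```

Because the check reads the credentials stored on the file, the result
depends on who opened it rather than on which task happens to issue the
write later.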