lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <CVSVH3ARQBRC.1QUTEQE3YNN5T@qgv27q77ld-mac>
Date:   Tue, 26 Sep 2023 16:10:30 +0300
From:   "Jarkko Sakkinen" <jarkko@...nel.org>
To:     "Haitao Huang" <haitao.huang@...ux.intel.com>,
        <dave.hansen@...ux.intel.com>, <tj@...nel.org>,
        <linux-kernel@...r.kernel.org>, <linux-sgx@...r.kernel.org>,
        <x86@...nel.org>, <cgroups@...r.kernel.org>, <tglx@...utronix.de>,
        <mingo@...hat.com>, <bp@...en8.de>, <hpa@...or.com>,
        <sohil.mehta@...el.com>
Cc:     <zhiquan1.li@...el.com>, <kristen@...ux.intel.com>,
        <seanjc@...gle.com>, <zhanb@...rosoft.com>,
        <anakrish@...rosoft.com>, <mikko.ylinen@...ux.intel.com>,
        <yangjie@...rosoft.com>
Subject: Re: [PATCH v5 01/18] cgroup/misc: Add per resource callbacks for
 CSS events

On Tue Sep 26, 2023 at 6:04 AM EEST, Haitao Huang wrote:
> Hi Jarkko
>
> On Mon, 25 Sep 2023 12:09:21 -0500, Jarkko Sakkinen <jarkko@...nel.org>  
> wrote:
>
> > On Sat Sep 23, 2023 at 6:06 AM EEST, Haitao Huang wrote:
> >> From: Kristen Carlson Accardi <kristen@...ux.intel.com>
> >>
> >> The misc cgroup controller (subsystem) currently does not perform
> >> resource type specific action for Cgroups Subsystem State (CSS) events:
> >> the 'css_alloc' event when a cgroup is created and the 'css_free' event
> >> when a cgroup is destroyed, or in event of user writing the max value to
> >> the misc.max file to set the usage limit of a specific resource
> >> [admin-guide/cgroup-v2.rst, 5-9. Misc].
> >>
> >> Define callbacks for those events and allow resource providers to
> >> register the callbacks per resource type as needed. This will be
> >> utilized later by the EPC misc cgroup support implemented in the SGX
> >> driver:
> >> - On css_alloc, allocate and initialize necessary structures for EPC
> >> reclaiming, e.g., LRU list, work queue, etc.
> >> - On css_free, cleanup and free those structures created in alloc.
> >> - On max_write, trigger EPC reclaiming if the new limit is at or below
> >> current usage.
> >>
> >> Signed-off-by: Kristen Carlson Accardi <kristen@...ux.intel.com>
> >> Signed-off-by: Haitao Huang <haitao.huang@...ux.intel.com>
> >> ---
> >> V5:
> >> - Remove prefixes from the callback names (tj)
> >> - Update commit message (Jarkko)
> >>
> >> V4:
> >> - Moved this to the front of the series.
> >> - Applies on cgroup/for-6.6 with the overflow fix for misc.
> >>
> >> V3:
> >> - Removed the released() callback
> >> ---
> >>  include/linux/misc_cgroup.h |  5 +++++
> >>  kernel/cgroup/misc.c        | 32 +++++++++++++++++++++++++++++---
> >>  2 files changed, 34 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/include/linux/misc_cgroup.h b/include/linux/misc_cgroup.h
> >> index e799b1f8d05b..96a88822815a 100644
> >> --- a/include/linux/misc_cgroup.h
> >> +++ b/include/linux/misc_cgroup.h
> >> @@ -37,6 +37,11 @@ struct misc_res {
> >>  	u64 max;
> >>  	atomic64_t usage;
> >>  	atomic64_t events;
> >> +
> >> +	/* per resource callback ops */
> >> +	int (*alloc)(struct misc_cg *cg);
> >> +	void (*free)(struct misc_cg *cg);
> >> +	void (*max_write)(struct misc_cg *cg);
> >>  };
> >>
> >>  /**
> >> diff --git a/kernel/cgroup/misc.c b/kernel/cgroup/misc.c
> >> index 79a3717a5803..62c9198dee21 100644
> >> --- a/kernel/cgroup/misc.c
> >> +++ b/kernel/cgroup/misc.c
> >> @@ -276,10 +276,13 @@ static ssize_t misc_cg_max_write(struct  
> >> kernfs_open_file *of, char *buf,
> >>
> >>  	cg = css_misc(of_css(of));
> >>
> >> -	if (READ_ONCE(misc_res_capacity[type]))
> >> +	if (READ_ONCE(misc_res_capacity[type])) {
> >>  		WRITE_ONCE(cg->res[type].max, max);
> >> -	else
> >> +		if (cg->res[type].max_write)
> >> +			cg->res[type].max_write(cg);
> >> +	} else {
> >>  		ret = -EINVAL;
> >> +	}
> >>
> >>  	return ret ? ret : nbytes;
> >>  }
> >> @@ -383,23 +386,39 @@ static struct cftype misc_cg_files[] = {
> >>  static struct cgroup_subsys_state *
> >>  misc_cg_alloc(struct cgroup_subsys_state *parent_css)
> >>  {
> >> +	struct misc_cg *parent_cg;
> >>  	enum misc_res_type i;
> >>  	struct misc_cg *cg;
> >> +	int ret;
> >>
> >>  	if (!parent_css) {
> >>  		cg = &root_cg;
> >> +		parent_cg = &root_cg;
> >>  	} else {
> >>  		cg = kzalloc(sizeof(*cg), GFP_KERNEL);
> >>  		if (!cg)
> >>  			return ERR_PTR(-ENOMEM);
> >> +		parent_cg = css_misc(parent_css);
> >>  	}
> >>
> >>  	for (i = 0; i < MISC_CG_RES_TYPES; i++) {
> >>  		WRITE_ONCE(cg->res[i].max, MAX_NUM);
> >>  		atomic64_set(&cg->res[i].usage, 0);
> >> +		if (parent_cg->res[i].alloc) {
> >> +			ret = parent_cg->res[i].alloc(cg);
> >> +			if (ret)
> >> +				goto alloc_err;
> >> +		}
> >>  	}
> >>
> >>  	return &cg->css;
> >> +
> >> +alloc_err:
> >> +	for (i = 0; i < MISC_CG_RES_TYPES; i++)
> >> +		if (parent_cg->res[i].free)
> >> +			cg->res[i].free(cg);
> >> +	kfree(cg);
> >> +	return ERR_PTR(ret);
> >>  }
> >>
> >>  /**
> >> @@ -410,7 +429,14 @@ misc_cg_alloc(struct cgroup_subsys_state  
> >> *parent_css)
> >>   */
> >>  static void misc_cg_free(struct cgroup_subsys_state *css)
> >>  {
> >> -	kfree(css_misc(css));
> >> +	struct misc_cg *cg = css_misc(css);
> >> +	enum misc_res_type i;
> >> +
> >> +	for (i = 0; i < MISC_CG_RES_TYPES; i++)
> >> +		if (cg->res[i].free)
> >> +			cg->res[i].free(cg);
> >> +
> >> +	kfree(cg);
> >>  }
> >>
> >>  /* Cgroup controller callbacks */
> >> --
> >> 2.25.1
> >
> > Since the only existing client feature requires all callbacks, should
> > this not have that as an invariant?
> >
> > I.e. it might be better to fail unless *all* ops are non-nil (e.g. to
> > catch issues in the kernel code).
> >
>
> These callbacks are chained from cgroup_subsys, and they are defined  
> separately so it'd be better follow the same pattern.  Or change together  
> with cgroup_subsys if we want to do that. Reasonable?

I noticed this one later:

It would better to create a separate ops struct and declare the instance
as const at minimum.

Then there is no need for dynamic assigment of ops and all that is in
rodata. This is improves both security and also allows static analysis
bit better.

Now you have to dynamically trace the struct instance, e.g. in case of
a bug. If this one done, it would be already in the vmlinux.

BR, Jarkko

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ