linux-kernel - Re: [PATCH 3/7] x86/intel_rdt: Add support for cache bit mask management

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.11.1505182143150.4225@nanos>
Date:	Mon, 18 May 2015 22:15:23 +0200 (CEST)
From:	Thomas Gleixner <tglx@...utronix.de>
To:	Vikas Shivappa <vikas.shivappa@...el.com>
cc:	Vikas Shivappa <vikas.shivappa@...ux.intel.com>, x86@...nel.org,
	linux-kernel@...r.kernel.org, hpa@...or.com, mingo@...nel.org,
	tj@...nel.org, peterz@...radead.org,
	Matt Fleming <matt.fleming@...el.com>,
	"Auld, Will" <will.auld@...el.com>, peter.zijlstra@...el.com,
	h.peter.anvin@...el.com,
	"Juvva, Kanaka D" <kanaka.d.juvva@...el.com>, mtosatti@...hat.com
Subject: Re: [PATCH 3/7] x86/intel_rdt: Add support for cache bit mask
 management

On Mon, 18 May 2015, Vikas Shivappa wrote:
> On Fri, 15 May 2015, Thomas Gleixner wrote:
> > > +/*
> > > + * Mask of CPUs for writing CBM values. We only need one per-socket.
> > 
> > One mask? One CPU? One what? Please add comments which are
> > understandable and useful for people who did NOT write that code.
> 
> when there is already code in the upstream code which has comments exactly
> consistent - when that made sense , this should ?
> 
> /*
>  * Mask of CPUs for reading CQM values. We only need one per-socket.
>  */
> static cpumask_t cqm_cpumask;

And just because there is a lousy comment upstream it does not become
better by copying it.
 
> > > +/*
> > > + * cbm_update_msrs() - Updates all the existing IA32_L3_MASK_n MSRs
> > > + * which are one per CLOSid, except IA32_L3_MASK_0 on the current
> > > package.
> > > + * @cpu : the cpu on which the mask is updated.
> > > + */
> > > +static inline void cbm_update_msrs(int cpu)
> > > +{
> > > +	int maxid = boot_cpu_data.x86_rdt_max_closid;
> > > +	unsigned int i;
> > > +
> > > +	if (WARN_ON(cpu != smp_processor_id()))
> > > +		return;
> > > +
> > > +	for (i = 1; i < maxid; i++) {
> > 
> > Lacks a comment why this starts with 1 and not with 0 as one would expect.
> 
> Its in the function comment just above - "except IA32_L3_MASK_0 on the current
> package."

No. That comment explains WHAT it does not WHY. Asided of that that
comment is not correct KernelDoc format.
 
> > > +/*
> > > + * intel_cache_alloc_cbm_write() - Validates and writes the
> > > + * cache bit mask(cbm) to the IA32_L3_MASK_n
> > > + * and also store the same in the ccmap.
> > > + *
> > > + * CLOSids are reused for cgroups which have same bitmask.
> > > + * - This helps to use the scant CLOSids optimally.
> > > + * - This also implies that at context switch write
> > > + * to PQR-MSR is done only when a task with a
> > > + * different bitmask is scheduled in.
> > > + */
> > > +static int intel_cache_alloc_cbm_write(struct cgroup_subsys_state *css,
> > 
> > Can you please drop these intel_ prefixes? They provide no value and
> > just eat space.
> 
> well , i got ample comments which wanted me to me to specify intel as this
> feature is specific to intel.

It's fine for public interfaces, but for functions which are only used
inside a file which contains only intel specific code it's just silly.

> the cache monitoring code has similar prefixes as well - is that pref specific

Sigh. There is so much crap in the kernel, you'll find an example for
everything.

> then ? I just dont want people to come back and ask to make it clear that
> things are intel specific

The file name tells everyone its INTEL specific, right? And following
your argumentation you should have named your helper functions
intel_get_closid() and intel_put_closid() as well.

If your code would have been in a proper shape otherwise, I would
probably have just let it slip.

> > > +				 struct cftype *cft, u64 cbmvalue)
> > > +{
> > > +	u32 max_cbm = boot_cpu_data.x86_rdt_max_cbm_len;
> > > +	struct intel_rdt *ir = css_rdt(css);
> > > +	unsigned long cache_mask, max_mask;
> > > +	unsigned long *cbm_tmp;
> > > +	unsigned int closid;
> > > +	ssize_t err = 0;
> > > +
> > > +	if (ir == &rdt_root_group)
> > > +		return -EPERM;
> > > +	bitmap_set(&max_mask, 0, max_cbm);
> > > +
> > > +	/*
> > > +	 * Need global mutex as cbm write may allocate a closid.
> > > +	 */
> > > +	mutex_lock(&rdt_group_mutex);
> > > +	bitmap_and(&cache_mask, (unsigned long *)&cbmvalue, &max_mask,
> > > max_cbm);
> > > +	cbm_tmp = &ccmap[ir->clos].cache_mask;
> > > +
> > > +	if (bitmap_equal(&cache_mask, cbm_tmp, MAX_CBM_LENGTH))
> > > +		goto out;
> > > +
> > 
> > Sigh, what's the point of this bitmap shuffling? Nothing as far as I
> > can tell, because you first set all bits of maxmask and then AND
> > cbmvalue.
> 
> Again , this contraditcs your earlier comment where you asked me to use the
> bit manipulation APIs to make it more *readable*. So do you want more readable
> now or is that not clear now ?

> https://lkml.org/lkml/2014/11/19/1145

 
> +	/* Reset the least significant bit.*/
> +	i = var & (var - 1);
> +
> +	if (i != (var & (var << 1)))
> +		return false;
> 
> was replaced with the
> 
> 	if (bitmap_weight(&var, maxcbm) < min_bitmask_len)
> 		return false;
> 
> 	first_bit = find_next_bit(&var, maxcbm, 0);
> 	zero_bit = find_next_zero_bit(&var, maxcbm, first_bit);
> 
> 	if (find_next_bit(&var, maxcbm, zero_bit) < maxcbm)
> 		return false;
> 
> 3 lines replaced with more to make it more readable ! so does this come back
> then ?

3 lines? 

> +static inline bool cbm_minbits(unsigned long var)
> +{
> +
> +	unsigned long i;
> +
> +	/*Minimum of 2 bits must be set.*/
> +
> +	i = var & (var - 1);
> +	if (!i || !var)
> +	   return false;
> +
> +	return true;
> +
> +}

These 14 lines are replaced by a two lines:

	if (bitmap_weight(&var, maxcbm) < min_bitmask_len)
		return false;

> static inline bool cbm_iscontiguous(unsigned long var)
> +{
> +
> +	unsigned long i;
> +
> +	/* Reset the least significant bit.*/
> +	i = var & (var - 1);
> +
> +	/*
> +	 * We would have a set of non-contiguous bits when
> +	 * there is at least one zero
> +	 * between the most significant 1 and least significant 1.
> +	 * In the below '&' operation,(var <<1) would have zero in
> +	 * at least 1 bit position in var apart from least
> +	 * significant bit if it does not have contiguous bits.
> +	 * Multiple sets of contiguous bits wont succeed in the below
> +	 * case as well.
> +	 */
> +
> +	if (i != (var & (var << 1)))
> +	   return false;
> +
> +	return true;
> +
> +}

And these 25 lines are replaced by 5 lines.

 	first_bit = find_next_bit(&var, maxcbm, 0);
 	zero_bit = find_next_zero_bit(&var, maxcbm, first_bit);
 
 	if (find_next_bit(&var, maxcbm, zero_bit) < maxcbm)
 		return false;


I asked you to use it for cbm_minbits and cbm_iscontiguous() instead
of the horrible open coded mess you had. But I certainly did not ask
you to blindly use bitmaps just everywhere.

> > All you achieve is, that you truncate cbmvalue to max_cbm bits. So if
> > cbmvalue has any of the invalid bits set, you happily ignore that.
> 
> taskset ignores the extra bits. is that a problem here ?

I do not care, what taskset does. I care what this code does.

It tells the user that: 0x100ffff is fine, even if the valid bits are
just 0xffff.

And taskset is a different story due to cpu hotplug.

But for CAT the maskbits are defined by hardware and cannot change
ever. So why would we allow to set invalid ones and claim that its
correct?

If there is a reason to do so, then it needs a proper comment in the
code and a proper explanation in Documentation/.....

> > > +static inline bool intel_rdt_update_cpumask(int cpu)
> > > +{
> > > +	int phys_id = topology_physical_package_id(cpu);
> > > +	struct cpumask *mask = &rdt_cpumask;
> > > +	int i;
> > > +
> > > +	for_each_cpu(i, mask) {
> > > +		if (phys_id == topology_physical_package_id(i))
> > > +			return false;
> > > +	}
> > 
> > You must be kidding.
> 
> the rapl and cqm use similar code. You want me to keep a seperate package mask
> for this code which not would be that frequent at all ?

You find for everything a place where you copied your stuff from
without thinking about it, right?

Other people dessperately try to fix the cpu online times which are
more and more interesting the larger the systems become. So it might
be a good idea to come up with a proper fast implementation which can
be used everywhere instead of blindly copying code.
 
> for c-state transitions only idle_notifier gets called - so this is only when
> a new package is physically added. and we dont need anything for cache alloc
> for idle transitions. not really frequent ?

Crap. It's called for every cpu which comes online and goes offline,
not for new packages.
 
Thanks,

	tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/