linux-kernel - Re: [PATCH 2/7] x86/intel_rdt: Adds support for Class of service management

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.10.1505181232440.6658@vshiva-Udesk>
Date:	Mon, 18 May 2015 12:44:33 -0700 (PDT)
From:	Vikas Shivappa <vikas.shivappa@...el.com>
To:	Thomas Gleixner <tglx@...utronix.de>
cc:	Vikas Shivappa <vikas.shivappa@...el.com>,
	Vikas Shivappa <vikas.shivappa@...ux.intel.com>,
	x86@...nel.org, linux-kernel@...r.kernel.org, hpa@...or.com,
	mingo@...nel.org, tj@...nel.org, peterz@...radead.org,
	Matt Fleming <matt.fleming@...el.com>,
	"Auld, Will" <will.auld@...el.com>, peter.zijlstra@...el.com,
	h.peter.anvin@...el.com,
	"Juvva, Kanaka D" <kanaka.d.juvva@...el.com>, mtosatti@...hat.com
Subject: Re: [PATCH 2/7] x86/intel_rdt: Adds support for Class of service
 management



On Mon, 18 May 2015, Thomas Gleixner wrote:

> On Mon, 18 May 2015, Vikas Shivappa wrote:
>>>> +static void __clos_get(unsigned int closid)
>>>
>>> What's the purpose of these double underscores?
>>>
>>
>> Similar to __get_rmid.
>
> Errm. We use double underscore for such cases:
>
> __bla()
> {
> 	do_stuff();
> }
>
> bla()
> {
> 	lock();
> 	__bla();
> 	unlock();
> }
>
> So you have two sorts of callers. Locked and unlocked. But I don't see
> something like this. It's just random underscores for no value.

Ok - I saw this .. will remove underscores then since cpuset also dont use these 
anyways.

>
>>>> +{
>>>> +	struct clos_cbm_map *ccm = &ccmap[closid];
>>>> +
>>>> +	lockdep_assert_held(&rdt_group_mutex);
>>>
>>> Do we really need all these lockdep asserts for a handfull of call
>>> sites?
>>
>> Well, we still have some calls some may be frequent depending on the usage..
>> should the assert decided based on number of times its called ?
>
> We add these things for complex locking scenarios, but for single
> callsites where locking is obvious its not really valuable. But that's
> the least of my worries.

Ok. was just wanting to make sure help debug just in case as its not used by 
too many people as yet.

>
>>>> +static struct cgroup_subsys_state *
>>>> +intel_rdt_css_alloc(struct cgroup_subsys_state *parent_css)
>>>> +{
>>>> +	struct intel_rdt *parent = css_rdt(parent_css);
>>>> +	struct intel_rdt *ir;
>>>> +
>>>> +	/*
>>>> +	 * Cannot return failure on systems with no Cache Allocation
>>>> +	 * as the cgroup_init does not handle failures gracefully.
>>>
>>> This comment is blatantly wrong. 5 lines further down you return
>>> -ENOMEM. Now what?
>>>
>>
>> Well , this is for cgroup_init called with parent=null which is when its
>> initializing the cgroup subsystem. I cant return error in this case but I can
>> otherwise.
>
> And then you run into the same BUG_ON ...

May be I misled you - I pasted the code from cgroup.c which is not part of cache 
alloc below to show cgroup_init doesnt handle failure.. (cgroup_init_subsys ). 
The cache allocation code doesnt return BUG_ON. So I never run into BUG_ON.. 
(right?)

>
>> static void __init cgroup_init_subsys(struct cgroup_subsys *ss, bool early)
>> {
>>
>> 	/* Create the root cgroup state for this subsystem */
>> 	ss->root = &cgrp_dfl_root;
>> 	css = ss->css_alloc(cgroup_css(&cgrp_dfl_root.cgrp, ss));
>> 	/* We don't handle early failures gracefully */
>> 	BUG_ON(IS_ERR(css));
>
> I know. But still the comment is misleading.
>
>  	/*
> 	 * cgroup_init() expects  valid css pointer. Return
> 	 * the rdt_root_group.css instead of a failure code.
> 	 */
>
> Can you see the difference?

ok can fix the comment. Reminds me of my middle school days when English was my 
hardest subject.

>
>>>> +	 */
>>>> +	if (!parent)
>>>> +		return &rdt_root_group.css;
>>>> +
>>>> +	ir = kzalloc(sizeof(struct intel_rdt), GFP_KERNEL);
>>>> +	if (!ir)
>>>> +		return ERR_PTR(-ENOMEM);
>
> And why can't you return something useful here instead of letting the
> thing run into a BUG?

This is a return for no memory - this is not for the call from 
cgroup_init_subsys where failure turns into a BUG.. this is when user is trying 
to create new cgroups..

>
>>>>  	maxid = c->x86_rdt_max_closid;
>>>>  	max_cbm_len = c->x86_rdt_max_cbm_len;
>>>>
>>>> +	sizeb = BITS_TO_LONGS(maxid) * sizeof(long);
>>>> +	rdtss_info.closmap = kzalloc(sizeb, GFP_KERNEL);
>>>
>>> What's the point of this bitmap? In later patches you have a
>>> restriction to 64bits and the SDM says that CBM_LEN is limited to 32.
>>>
>>> So where i the point for allocating a bitmap?
>>
>> The cache bitmask is really a bitmask where every bit represents a cache way ,
>> so its way based mapping . Although currently the max is 32 bits we still need
>> to treat it as a bitmask.
>>
>> In the first patch version you are the one who suggested to use all the bitmap
>> functions to check the presence of contiguous bits (although I was already
>> using the bitmaps, I had not used for some cases). So i made the change and
>> other similar code was changed to using bitmap/bit manipulation APIs. There
>> are scenarios where in we need to check for subset of bitmasks and other cases
>> but agree they can all be done without the bitmask APIs as well. But now your
>> comment contradicts the old comment ?
>
> Oh well. You can still use bitmap functions on an u64 where
> appropriate.

I see - I did not notice this bitmap is not for the CBM though.. this is the 
closmap which manages the closids. the cbm is just kept as a unsigned long - i 
dont do bitmap allocation for that. So we are on same page i guess.

>
>>>
>>>> +	if (!rdtss_info.closmap) {
>>>> +		err = -ENOMEM;
>>>> +		goto out_err;
>>>> +	}
>>>> +
>>>> +	sizeb = maxid * sizeof(struct clos_cbm_map);
>>>> +	ccmap = kzalloc(sizeb, GFP_KERNEL);
>>>> +	if (!ccmap) {
>>>> +		kfree(rdtss_info.closmap);
>>>> +		err = -ENOMEM;
>>>> +		goto out_err;
>>>
>>> What's wrong with return -ENOMEM? We only use goto if we have to
>>> cleanup stuff, certainly not for just returning err.
>>
>> This was due to PeterZ's feedback that the return paradigm needs to not have
>> too many exit points or return a lot of times from the middle of code..
>> Both contradict now ?
>
> There are different opinions about that. But again, that's the least
> of my worries.

Great.

>
>>>
>>>> +	}
>>>> +
>>>> +	set_bit(0, rdtss_info.closmap);
>>>> +	rdt_root_group.clos = 0;
>>>> +	ccm = &ccmap[0];
>>>> +	bitmap_set(&ccm->cache_mask, 0, max_cbm_len);
>>>> +	ccm->clos_refcnt = 1;
>>>> +
>>>>  	pr_info("Max bitmask length:%u,Max ClosIds: %u\n", max_cbm_len,
>>>> maxid);
>>>
>>> We surely do not want to sprinkle these all over dmesg.
>>
>> This is just printed once! how is that sprinke all over?   - we have a dmsg
>> print for Cache monitoring as well when cqm is enabled.
>
> Sorry, mapped that to the wrong function. Though the message itself is
> horrible.
>
> 	  "Max bitmask length:32,Max ClosIds: 16"
>
> With some taste and formatting applied this would read:
>
>     	  "Max. bitmask length: 32, max. closids: 16"
>
> Can you spot the difference?

Will fix comment.

Thanks,
Vikas

>
> Thanks,
>
> 	tglx
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/