[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.10.1505201009590.6658@vshiva-Udesk>
Date: Wed, 20 May 2015 10:18:11 -0700 (PDT)
From: Vikas Shivappa <vikas.shivappa@...el.com>
To: Thomas Gleixner <tglx@...utronix.de>
cc: Vikas Shivappa <vikas.shivappa@...ux.intel.com>,
vikas.shivappa@...el.com, x86@...nel.org,
linux-kernel@...r.kernel.org, hpa@...or.com, mingo@...nel.org,
tj@...nel.org, peterz@...radead.org, matt.fleming@...el.com,
will.auld@...el.com, peter.zijlstra@...el.com,
h.peter.anvin@...el.com, kanaka.d.juvva@...el.com,
mtosatti@...hat.com
Subject: Re: [PATCH 5/7] x86/intel_rdt: Software Cache for IA32_PQR_MSR
On Fri, 15 May 2015, Thomas Gleixner wrote:
> On Mon, 11 May 2015, Vikas Shivappa wrote:
>
>> Signed-off-by: Vikas Shivappa <vikas.shivappa@...ux.intel.com>
>>
>> Conflicts:
>> arch/x86/kernel/cpu/perf_event_intel_cqm.c
>
> And that's interesting for what?
Will remove this, fixed some conflicts as this code changes both cqm and cache
allocation.
>
>> --- /dev/null
>> +++ b/arch/x86/include/asm/rdt_common.h
>
>> @@ -0,0 +1,13 @@
>> +#ifndef _X86_RDT_H_
>> +#define _X86_RDT_H_
>> +
>> +#define MSR_IA32_PQR_ASSOC 0x0c8f
>> +
>> +struct intel_pqr_state {
>> + raw_spinlock_t lock;
>> + int rmid;
>> + int clos;
>
> Those want to be u32. We have no type checkes there, but u32 is the
> proper data type for wrmsr.
will fix.
>
>> + int cnt;
>> +};
>> +
>
> What's wrong with having this in intel_rdt.h?
intel_rdt.h is specific for only cache allocation and rdt features in the
cgroup. well, thats a criteria to seperate them , isnt it ? cqm wont
use most of the things (and growing) in intel_rdt.h.
It seems you're a big
> fan of sprinkling stuff all over the place so reading becomes a
> chasing game.
>
>> {
>> struct task_struct *task = current;
>> struct intel_rdt *ir;
>> + struct intel_pqr_state *state = this_cpu_ptr(&pqr_state);
>> + unsigned long flags;
>>
>> + raw_spin_lock_irqsave(&state->lock, flags);
>
> finish_arch_switch() is called with interrupts disabled already ...
Ok , I somehow could not locate this cli when I thought about it. So if this is
interrupt disabled and no preempt ,
then we dont need both the spin lock and rcu
lock (since rcu lock sync would obviously wait for the preempt disable..).
will remove both of them.
>
>> rcu_read_lock();
>
> So we now have a spin_lock() and rcu_read_lock() and no explanation
> what is protecting what.
>
>> ir = task_rdt(task);
>> - if (ir->clos == clos) {
>> + if (ir->clos == state->clos) {
>
> And of course taking the spin lock for checking state->clos is
> complete and utter nonsense.
>
> state->clos can only be changed by this code and the only reason why
> we need the lock is to protect against concurrent modification of
> state->rmid.
>
> So the check for ir->clos == state->clos can be done lockless.
>
> And I seriously doubt, that state->lock is necessary at all.
>
> Q: What is it protecting?
> A: state->clos, state->rmid, state->cnt
>
> Q: What's the context?
> A: Per cpu context. The per cpu pqr state is NEVER modified from a
> remote cpu.
>
> Q: What is the lock for?
> A: Nothing.
>
> Q: Why?
> A: Because interrupt disabled regions protect per cpu state perfectly
> fine and there is is no memory ordering issue which would require a
> lock or barrier either.
>
> Peter explained it to you several times already that context switch is
> one the most sensitive hot pathes where we care about every cycle. But
> you just go ahead and mindlessly create pointless overhead.
>
>> + /*
>> + * PQR has closid in high 32 bits and CQM-RMID
>> + * in low 10 bits. Rewrite the exsting rmid from
>> + * software cache.
>
> This comment wouldnt be necessary if you would have proper documented
> struct pqr_state.
Will fix, that would be lot better.
>
>> + */
>> + wrmsr(MSR_IA32_PQR_ASSOC, state->rmid, ir->clos);
>> + state->clos = ir->clos;
>> rcu_read_unlock();
>> + raw_spin_unlock_irqrestore(&state->lock, flags);
>> +
>> }
>
>> -static DEFINE_PER_CPU(struct intel_cqm_state, cqm_state);
>> +DEFINE_PER_CPU(struct intel_pqr_state, pqr_state);
>
> With CONFIG_PERF=n and CONFIG_CGROUP_RDT=y the linker will fail.
copy from Makefile below -
obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_intel_rapl.o
perf_event_intel_cqm.o
should work with CONFIG_PERF_EVENTS=n and CGROUP_RDT=y ?
Thanks,
Vikas
>
> Thanks,
>
> tglx
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists