linux-kernel - Re: [PATCH 01/14] x86/cqm: Intel Resource Monitoring Documentation


Message-ID: <alpine.DEB.2.10.1612271206460.5815@vshiva-Udesk>
Date:   Tue, 27 Dec 2016 12:21:44 -0800 (PST)
From:   Shivappa Vikas <vikas.shivappa@...el.com>
To:     Andi Kleen <andi@...stfloor.org>
cc:     Shivappa Vikas <vikas.shivappa@...el.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Vikas Shivappa <vikas.shivappa@...ux.intel.com>,
        linux-kernel@...r.kernel.org, x86@...nel.org, tglx@...utronix.de,
        ravi.v.shankar@...el.com, tony.luck@...el.com,
        fenghua.yu@...el.com, davidcc@...gle.com, eranian@...gle.com,
        hpa@...or.com
Subject: Re: [PATCH 01/14] x86/cqm: Intel Resource Monitoring Documentation



On Tue, 27 Dec 2016, Andi Kleen wrote:

> Shivappa Vikas <vikas.shivappa@...el.com> writes:
>>
>> Ok, looks like the interface is the problem. Will try to fix
>> this. We are just trying to have a lightweight monitoring
>> option so that it's reasonable to monitor for a
>> very long time (like the lifetime of a process etc.). Mainly to not have all
>> the perf scheduling overhead.
>
> That seems like an odd reason to define a completely new user interface.
> This is to avoid one MSR write for a RMID change per context switch
> in/out cgroup or is it other code too?
>
> Is there some number you can put to the overhead?
> Or is there some other overhead other than the MSR write
> you're concerned about?

Yes, the interface of having a file does seem odd, as even Peterz thinks.

It's actually the perf overhead that we are trying to avoid.

The MSR writes (which are really driver/cqm overhead, not perf overhead) we
try to optimize by keeping a per-cpu cache, grouping the RMIDs, and having a
common write for the RMID and CLOSID.
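
To illustrate the per-cpu cache idea, here is a minimal user-space model, not
the actual driver code. All names (pqr_state, pqr_update, fake_wrmsr, NR_CPUS)
are made up for this sketch, and the MSR write is replaced by a counter. It
only shows the intent: skip the write when the RMID/CLOSID pair on a cpu has
not changed, and write both values together when it has.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 4  /* hypothetical; kept small for the model */

/* Per-cpu cache of the last RMID/CLOSID pair written (model only). */
struct pqr_state {
    uint32_t rmid;
    uint32_t closid;
};

static struct pqr_state pqr_cache[NR_CPUS];  /* zero-initialized */
static unsigned long msr_writes;             /* counts simulated MSR writes */

/* Stand-in for the real MSR write: one write carries both values. */
static void fake_wrmsr(uint32_t rmid, uint32_t closid)
{
    (void)rmid;
    (void)closid;
    msr_writes++;
}

/* Would run on context switch: write only if the cached pair changed. */
static void pqr_update(unsigned int cpu, uint32_t rmid, uint32_t closid)
{
    assert(cpu < NR_CPUS);
    struct pqr_state *st = &pqr_cache[cpu];

    if (st->rmid == rmid && st->closid == closid)
        return;  /* same pair as last time on this cpu, nothing to write */

    st->rmid = rmid;
    st->closid = closid;
    fake_wrmsr(rmid, closid);
}
```

In this model, switching between two tasks that share an RMID/CLOSID pair on
the same cpu costs no write at all, while a change in either value costs
exactly one.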

The perf overhead I was thinking of, at least, is the one during the context
switch, which is the more constant overhead (the event creation is just a
one-time cost).

I was trying to see an alternative where:
1. The user specifies continuous monitoring with the perf attr in open.
2. The driver allocates the task/cgroup RMID and stores the RMID in the cgroup
   or task_struct.
3. The event is turned off, hence no perf ctx switch overhead? We don't need
   any of the perf hook calls for start/stop/add. (I am still finding out
   whether this route works, i.e. whether turning off the event really leaves
   minimal overhead for the event and no start/stop/add calls for it.)
4. During switch_to the driver still writes the RMID MSR, so we still monitor.
5. read -> calls the driver -> the driver just returns the count by reading
   the RMID.
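
The five steps above can be sketched as a small user-space model. This is
only an illustration of the proposed flow under my assumptions, not kernel
code: struct task, monitor_open, model_switch_to, model_hw_activity and
monitor_read are all hypothetical names, the RMID MSR is a plain variable, and
the hardware counters are an array indexed by RMID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_RMID 8  /* hypothetical; RMID 0 means "not monitored" here */

struct task {
    uint32_t rmid;       /* step 2: RMID stored with the task */
    bool event_enabled;  /* step 3: event is off after setup */
};

static uint32_t next_rmid = 1;        /* trivial allocator for the model */
static uint32_t active_rmid;          /* models the RMID MSR on one cpu */
static uint64_t occupancy[MAX_RMID];  /* models per-RMID hardware counts */

/* Steps 1-3: open with the continuous flag, allocate RMID, turn event off. */
static int monitor_open(struct task *t, bool continuous)
{
    if (!continuous || next_rmid >= MAX_RMID)
        return -1;
    t->rmid = next_rmid++;
    t->event_enabled = false;  /* no perf start/stop/add calls from here on */
    return 0;
}

/* Step 4: the switch path writes the RMID even though the event is off. */
static void model_switch_to(const struct task *next)
{
    active_rmid = next->rmid;
}

/* Models the hardware charging activity to whichever RMID is active. */
static void model_hw_activity(uint64_t bytes)
{
    occupancy[active_rmid] += bytes;
}

/* Step 5: read just returns the count for the task's RMID. */
static uint64_t monitor_read(const struct task *t)
{
    assert(t->rmid < MAX_RMID);
    return occupancy[t->rmid];
}
```

The point of the model is that nothing on the switch path depends on the
event being enabled: counts keep accumulating per RMID, and the read path only
needs the RMID that was stored at open time.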

>
> Do you have an ftrace or better PT trace with the overhead before-after?
>
> Perhaps some optimization could be done in the code to make it faster,
> then the new interface wouldn't be needed.
>
> FWIW there are some pending changes to context switch that will
> eliminate at least one common MSR write [1]. If that was fixed
> you could do the RMID MSR write "for free"

I see, that's good to know.

Thanks,
Vikas

>
> -Andi
>
> [1] https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/log/?h=x86/fsgsbase
>
>