Date:   Mon, 4 Sep 2017 13:46:45 +0300
From:   Alexey Budankov <alexey.budankov@...ux.intel.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Andi Kleen <ak@...ux.intel.com>,
        Kan Liang <kan.liang@...el.com>,
        Dmitri Prokhorov <Dmitry.Prohorov@...el.com>,
        Valery Cherepennikov <valery.cherepennikov@...el.com>,
        Mark Rutland <mark.rutland@....com>,
        Stephane Eranian <eranian@...gle.com>,
        David Carrillo-Cisneros <davidcc@...gle.com>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Vince Weaver <vince@...ter.net>,
        Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [RFC][PATCH] perf: Rewrite enabled/running timekeeping

Hi,
On 31.08.2017 20:18, Peter Zijlstra wrote:
> On Wed, Aug 23, 2017 at 11:54:15AM +0300, Alexey Budankov wrote:
>> On 22.08.2017 23:47, Peter Zijlstra wrote:
>>> On Thu, Aug 10, 2017 at 06:57:43PM +0300, Alexey Budankov wrote:
>>>> The key thing in the patch is explicit updating of tstamp fields for
>>>> INACTIVE events in update_event_times().
>>>
>>>> @@ -1405,6 +1426,9 @@ static void update_event_times(struct perf_event *event)
>>>>  	    event->group_leader->state < PERF_EVENT_STATE_INACTIVE)
>>>>  		return;
>>>>  
>>>> +	if (event->state == PERF_EVENT_STATE_INACTIVE)
>>>> +		perf_event_tstamp_update(event);
>>>> +
>>>>  	/*
>>>>  	 * in cgroup mode, time_enabled represents
>>>>  	 * the time the event was enabled AND active
>>>
>>> But why!? I thought the whole point was to not need to do this.
>>
>> update_event_times() is not called from timer interrupt handler 
>> thus it is not on the critical path which is optimized in this patch set.
>>
>> But update_event_times() is called in the context of read() syscall so
>> this is the place where we may update event times for INACTIVE events 
>> instead of timer interrupt.
>>
>> Also update_event_times() is called on thread context switch out so
>> we get event times also updated when the thread migrates to other CPU.
>>
>>>
>>> The thing I outlined earlier would only need to update timestamps when
>>> events change state and at no other point in time.
>>
>> But we still may request times while event is in INACTIVE state 
>> thru read() syscall and event timings need to be up-to-date. 
> 
> Sure, read() also updates.
> 
> So the below completely rewrites timekeeping (and probably breaks
> world) but does away with the need to touch events that don't get
> scheduled.

We still need to, and do, iterate through all events at some points, e.g. on
context switches.

> 
> Esp the cgroup stuff is entirely untested since I simply don't know how
> to operate that. I did run Vince's tests on it, and I think it doesn't
> regress, but I'm near a migraine so I can't really see straight atm.
> 
> Vince, Stephane, could you guys have a peek?
> 
> (There's a few other bits in, I'll break up into patches and write
> comments and Changelogs later, I think its can be split in some 5
> patches).
> 
> The basic idea is really simple, we have a single timestamp and
> depending on the state we update enabled/running. This obviously only
> requires updates when we change state and when we need up-to-date
> timestamps (read).

I would prefer to have this rework done as an FSM similar to the one below,
so that the state transitions and the corresponding tstamp,
total_time_enabled and total_time_running manipulation logic are
consolidated in one place, in adjacent lines of code.

From the table below, the event->state FSM is not as simple as it may seem
at first sight, so in order to avoid regressions after the rework we had
better keep that in mind and explicitly implement the allowed and
disallowed state transitions.

     A           I           O           E           X           D           U

A    Te+,Tr+,    Te+,Tr+,    Te+,Tr+,    Te+,Tr+,    Te+,Tr+,    Te+,Tr+,    ---
     ts          ts          ts          ts          ts          ts

I    Te+,ts      Te+,ts      Te+,ts      Te+,ts      Te+,ts      Te+,ts      ---

O    Te=0,Tr=0,  Te=0,Tr=0,  Te=0,Tr=0,  Te=0,Tr=0,  Te=0,Tr=0,  Te=0,Tr=0,  ---
     ts          ts          ts          ts          ts          ts

E    Te=0,Tr=0,  Te=0,Tr=0,  Te=0,Tr=0,  Te=0,Tr=0,  Te=0,Tr=0,  Te=0,Tr=0,  ---
     ts          ts          ts          ts          ts          ts

X    ---         ---         ---         ---         ---         ---         ---

D    ---         ---         ---         ---         ---         ---         ---

U    ---         Te=0,Tr=0,  Te=0,Tr=0,  ---         ---         ---         ---
                 ts          ts

LEGEND:

U - allocation, A - ACTIVE, I - INACTIVE, O - OFF,
E - ERROR, X - EXIT, D - DEAD

Te=0  - event->total_time_enabled  = 0
Te+   - event->total_time_enabled += delta

Tr=0  - event->total_time_running  = 0
Tr+   - event->total_time_running += delta

ts    - event->tstamp = perf_event_time(event)
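
To make "explicitly implement allowed and disallowed state transitions"
concrete, here is a minimal user-space sketch of how the table could be
encoded as a lookup matrix. It assumes the rows are the current state and
the columns the requested state (the only reading consistent with the U
row), and it treats every "---" cell as disallowed. The names fsm_state,
fsm_allowed and fsm_transition_allowed are hypothetical, made up for the
illustration; they are not kernel identifiers.

```c
#include <stdbool.h>

/* Hypothetical model states, named after the legend (not the kernel enum). */
enum fsm_state {
	FSM_A,		/* ACTIVE */
	FSM_I,		/* INACTIVE */
	FSM_O,		/* OFF */
	FSM_E,		/* ERROR */
	FSM_X,		/* EXIT */
	FSM_D,		/* DEAD */
	FSM_U,		/* allocation */
	FSM_COUNT
};

/*
 * fsm_allowed[from][to] is true where the table lists an action
 * and false where it shows "---".
 */
static const bool fsm_allowed[FSM_COUNT][FSM_COUNT] = {
	/*            A      I      O      E      X      D      U    */
	[FSM_A] = { true,  true,  true,  true,  true,  true,  false },
	[FSM_I] = { true,  true,  true,  true,  true,  true,  false },
	[FSM_O] = { true,  true,  true,  true,  true,  true,  false },
	[FSM_E] = { true,  true,  true,  true,  true,  true,  false },
	[FSM_X] = { false, false, false, false, false, false, false },
	[FSM_D] = { false, false, false, false, false, false, false },
	[FSM_U] = { false, true,  true,  false, false, false, false },
};

/* Returns true if the table permits moving from 'from' to 'to'. */
static bool fsm_transition_allowed(enum fsm_state from, enum fsm_state to)
{
	if ((unsigned int)from >= FSM_COUNT || (unsigned int)to >= FSM_COUNT)
		return false;

	return fsm_allowed[from][to];
}
```

With this encoding, fsm_transition_allowed(FSM_A, FSM_I) is true while
fsm_transition_allowed(FSM_X, FSM_A) and fsm_transition_allowed(FSM_U, FSM_A)
are false, so a disallowed request can be rejected (or warned about) in one
place before any time accounting is touched.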

static void
perf_event_change_state(struct perf_event *event, enum perf_event_state state)
{
	u64 now = perf_event_time(event);
	u64 delta = now - event->tstamp;

	event->tstamp = now;

	switch (event->state) {
	case A:
		switch (state) {
		case A:
			...
			break;
		case I:
			event->total_time_enabled += delta;
			event->total_time_running += delta;
			event->state = state;
			break;
		case O:
			...
			break;
		case E:
			...
	...
	case I:
		...
		break;
	...
	}
}
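
For reference, the accounting in the table can also be modelled outside the
kernel. The sketch below is only an illustration under assumptions: it uses
plain uint64_t counters and takes "now" as a parameter instead of calling
perf_event_time(), and the names tk_state, tk_event and tk_change_state are
hypothetical. It relies on the observation that, in the table, the action
depends only on the row, i.e. on the state the event is leaving.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model states, named after the legend (not the kernel enum). */
enum tk_state { TK_A, TK_I, TK_O, TK_E, TK_X, TK_D, TK_U };

/* User-space stand-in for the three fields the table manipulates. */
struct tk_event {
	enum tk_state state;
	uint64_t tstamp;
	uint64_t total_time_enabled;
	uint64_t total_time_running;
};

/*
 * Apply one transition as read from the table. Returns false and leaves
 * the event untouched for the cells marked "---".
 */
static bool tk_change_state(struct tk_event *ev, enum tk_state next,
			    uint64_t now)
{
	uint64_t delta = now - ev->tstamp;

	/* The U column is "---" in every row: nothing goes back to allocation. */
	if (next == TK_U)
		return false;

	switch (ev->state) {
	case TK_A:			/* Te+, Tr+ */
		ev->total_time_enabled += delta;
		ev->total_time_running += delta;
		break;
	case TK_I:			/* Te+ */
		ev->total_time_enabled += delta;
		break;
	case TK_U:			/* only U -> I and U -> O are listed */
		if (next != TK_I && next != TK_O)
			return false;
		/* fall through */
	case TK_O:
	case TK_E:			/* Te=0, Tr=0 */
		ev->total_time_enabled = 0;
		ev->total_time_running = 0;
		break;
	default:			/* X and D rows are all "---" */
		return false;
	}

	ev->tstamp = now;		/* ts */
	ev->state = next;
	return true;
}
```

Starting from U at time 0, the sequence U->I at 100, I->A at 150 and A->I at
180 leaves total_time_enabled at 80 and total_time_running at 30 in this
model: the event was enabled for 50 units while INACTIVE and for a further
30 while ACTIVE, and it was running only during the last 30.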

---

Regards,
Alexey
