lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Fri, 21 Mar 2008 00:18:46 -0700 (PDT)
From:	Roland McGrath <roland@...hat.com>
To:	Frank Mayhar <fmayhar@...gle.com>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: posix-cpu-timers revamp

Sorry for the delay.

> Please take a look and let me know what you think.  In the meantime I'll
> be working on a similar patch to 2.6-head that has optimizations for
> uniprocessor and two-CPU operation, to avoid the overhead of the percpu
> functions when they are unneeded.

My mention of a 2-CPU special case was just an off-hand idea.  I don't
really have any idea if that would be optimal given the tradeoff of
increaing signal_struct size.  The performance needs be analyzed.

> 	disappeared entirely and the arm_timer() routine merely fills
> 	p->signal->it_*_expires from timer->it.cpu.expires.*.  The
> 	cpu_clock_sample_group_locked() loses its summing loops, using the
> 	the shared structure instead.  Finally, set_process_cpu_timer() sets
> 	tsk->signal->it_*_expires directly rather than calling the deleted
> 	rebalance routine.

I think I misled you about the use of the it_*_expires fields, sorry.
The task_struct.it_*_expires fields are used solely as a cache of the
head of cpu_timers[].  Despite the poor choice of the same name, the
signal_struct.it_*_expires fields serve a different purpose.  For an
analogous cache of the soonest timer to expire, you need to add new
fields.  The signal_struct.it_{prof,virt}_{expires,incr} fields hold
the setitimer settings for ITIMER_{PROF,VTALRM}.  You can't change
those in arm_timer.  For a quick cache you need a new field that is
the sooner of it_foo_expires or the head cpu_timers[foo] expiry time.

The shared_utime_sum et al names are somewhat oblique to anyone who
hasn't just been hacking on exactly this thing like you and I have.
Things like thread_group_*time make more sense.

There are now several places where you call both shared_utime_sum and
shared_stime_sum.  It looks simple because they're nicely encapsulated.
But now you have two loops through all CPUs, and three loops in
check_process_timers.

I think what we want instead is this:

	struct task_cputime
	{
		cputime_t utime;
		cputime_t stime;
		unsigned long long schedtime;
	};

Use one in task_struct to replace the utime, stime, and sum_sched_runtime
fields, and another to replace it_*_expires.  Use a single inline function
thread_group_cputime() that fills a sum struct task_cputime using a single
loop.  For the places only one or two of the sums is actually used, the
compiler should optimize away the extra summing from the loop.

Don't use __cacheline_aligned on this struct type itself, because most of
the uses don't need that.  When using alloc_percpu, you can rely on it to
take care of those needs--that's what it's for.  If you implement a
variant that uses a flat array, you can use a wrapper struct with
__cacheline_aligned for that.


Thanks,
Roland
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ