linux-kernel - Re: [PATCH v4 5/6] timerfd: Add support for deferrable timers

Message-ID: <CALCETrVxvCaLUyeMoaEHXvUzOgj_531HENu1G90_WKnS3dE4zA@mail.gmail.com>
Date:	Tue, 4 Mar 2014 16:42:57 -0800
From:	Andy Lutomirski <luto@...capital.net>
To:	Thomas Gleixner <tglx@...utronix.de>
Cc:	Alexey Perevalov <a.perevalov@...sung.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	John Stultz <john.stultz@...aro.org>,
	Anton Vorontsov <anton@...msg.org>,
	Kyungmin Park <kyungmin.park@...sung.com>,
	cw00.choi@...sung.com, Andrew Morton <akpm@...ux-foundation.org>,
	Anton Vorontsov <anton.vorontsov@...aro.org>
Subject: Re: [PATCH v4 5/6] timerfd: Add support for deferrable timers

On Tue, Mar 4, 2014 at 4:10 PM, Thomas Gleixner <tglx@...utronix.de> wrote:
> On Tue, 4 Mar 2014, Andy Lutomirski wrote:
>> On Tue, Mar 4, 2014 at 2:11 PM, Thomas Gleixner <tglx@...utronix.de> wrote:
>> > We do not add another random special case syscall for timerfd just
>> > because timerfd is Linux specific.
>>
>> What syscalls?  I can think of exactly two timer interfaces that
>> actually accept a clock id and flags: clock_nanosleep and
>> timerfd_settime.
>
> Sure, and what you can think of is reality?
>
>  sys_timer_settime(), which relies on sys_timer_create(), is outside
>  your universe, right?
>

Sigh, I forgot about those.  I would argue that there is no real
reason to make timer_create any fancier.  That kind of sucks.

> Aside from that, if you want to make the slack thing useful on a per
> call basis then you want to add it to a lot of other interfaces like
> poll.

Same with deferrable timers.  And things that want MONOTONIC *and*
REALTIME.  Etc.

>
> And you are completely ignoring the fact that the slack works
> completely differently:
>
> A slacked timer still gets enqueued into the main timer queue. It just
> relies on the fact that it gets batched with some other expiring
> timer. But that's completely different from the deferrable approach.
>
>        start_timer(timer, expiry, slack);
>
>            timer.hard_expiry = expiry + slack;
>            timer.soft_expiry = expiry;
>            enqueue_timer(timer, timer.hard_expiry);
>
> The enqueueing code puts it into the queue by looking at the
> hard_expiry value. And the expiry code looks at the timer.soft_expiry
> value to expire a timer early.
>
> Now assume the following:
>
>        start_timer(timer, +100ms, 100s);
>
> So that puts that timer into the hard expiry line of 100.1 sec from
> now. So if the CPU is busy and is firing a lot of timers then your
> timer could be delayed up to the hard expiry time, i.e. 100.1 seconds
> from now, which has completely different semantics than the
> deferrable timers.

Erk.  I didn't realize that.  Is that really the desired behavior?  I
assumed that a timer with slack would fire at the earliest time after
the soft timeout at which the system wasn't idle.  The idea is to
batch wakeups, right?
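
A toy Python model of the queueing described above might look like the
following. It is only a sketch of that description, not the hrtimer
code: TimerQueue, next_event and run_expired are made-up names, times
are integer milliseconds, and the event source is assumed to be
programmed for the earliest hard expiry.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Timer:
    # Only hard_expiry takes part in ordering, mirroring
    # enqueue_timer(timer, timer.hard_expiry) in the pseudocode.
    hard_expiry: int
    soft_expiry: int = field(compare=False)
    name: str = field(compare=False)


class TimerQueue:
    """Hypothetical queue ordered by hard expiry (illustration only)."""

    def __init__(self):
        self._heap = []

    def start_timer(self, name, expiry, slack=0):
        timer = Timer(hard_expiry=expiry + slack, soft_expiry=expiry, name=name)
        heapq.heappush(self._heap, timer)

    def next_event(self):
        # Assumed: the next wakeup is programmed for the earliest hard expiry.
        return self._heap[0].hard_expiry if self._heap else None

    def run_expired(self, now):
        # Walk the queue in hard-expiry order and stop at the first timer
        # whose soft expiry has not been reached yet.
        fired = []
        while self._heap and self._heap[0].soft_expiry <= now:
            fired.append(heapq.heappop(self._heap).name)
        return fired


q = TimerQueue()
q.start_timer("slacked", 100, 100_000)  # +100ms with 100s of slack
q.start_timer("tick", 50_000)           # ordinary timer, no slack
early = q.run_expired(1_000)            # "tick" heads the queue and is not due
later = q.run_expired(50_000)           # both are due now
```

In this model the slacked timer sits behind every timer with an earlier
hard expiry, so it is not even looked at while one of those is still
pending. With nothing else queued, the wakeup is programmed for 100100
ms.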

>
> The deferrable timer is guaranteed to expire (more or less) on time when
> the system is active and does not prevent the system from going idle,
> but it expires right away when the system comes back out of idle.
>
> The slack timers are just a batching mechanism to align expiry times
> of non-deferrable timers to a common time.
>
> So how do you map those together?

By thinking of what semantics are actually useful for userspace developers.

I think that most userspace developers probably want the semantics
that I thought that timer slack had: I want to do work between time A
and time B.  Before A is too early, but I'm willing to wait until time
B if it improves power consumption.
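
As a rough sketch, those "between A and B" semantics could be written
down like this. Everything here is an assumption for illustration:
window_fire_time is a made-up helper, idle_periods is a list of
non-overlapping (start, end) pairs with end exclusive, and nothing like
it exists in the kernel.

```python
def window_fire_time(a, b, idle_periods):
    """Return when a timer with window [a, b] fires in this toy model.

    idle_periods: non-overlapping (start, end) pairs, end exclusive,
    during which the system is assumed to be idle.
    """
    if b < a:
        raise ValueError("window end must not precede window start")
    for start, end in idle_periods:
        if start <= a < end:
            # Idle at time A: wait for the system to wake up on its own,
            # but never later than B.
            return min(end, b)
    # Awake at time A: fire right away.
    return a
```

For a window of [100, 500], an idle period of (50, 300) gives 300, an
idle period of (50, 900) gives 500, and no idle period at all gives 100.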

Presumably, if the kernel chooses *not* to fire the timer just after
time A even if the system is awake, then it's risking an unnecessary
wakeup at time B.

(I admit that I don't really understand the hrtimer code.  I guess
that two indexes on the list of timers would be needed.)

>> > But we cannot do that right now as we cannot whip up several dozen
>> > new syscalls just because we want to add slack/deferrable whatever
>> > properties.
>
>> Two syscalls, right?
>
> It does not matter at all how many syscalls this affects. We are not
> adding any random new syscalls just because we can.
>
>> Once we agree on a solution to the Y2038 issue on 32-bit with a unified
>> 32/64-bit syscall interface which simply gets rid of the timespec/val
>> nonsense and takes a simple u64 nsec value we can add the slack
>> property to that without any further inconvenience.
>
> Ignoring this won't get you anywhere.

I'm not entirely sure why per-timer slack can't be added without
simultaneously fixing Y2038 (and presumably leap seconds, too) but a
new flag can be.

--Andy