linux-kernel - Re: [PATCH 4/5] select: make select() use schedule

Message-ID: <alpine.LFD.1.10.0808291009450.3300@nehalem.linux-foundation.org>
Date:	Fri, 29 Aug 2008 10:26:02 -0700 (PDT)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Alan Cox <alan@...rguk.ukuu.org.uk>
cc:	Arjan van de Ven <arjan@...radead.org>,
	linux-kernel@...r.kernel.org, mingo@...e.hu, tglx@...x.de
Subject: Re: [PATCH 4/5] select: make select() use schedule_hrtimeout()

On Fri, 29 Aug 2008, Alan Cox wrote:
>
> > "schedule_timeout()", there's a big difference between asking for two 
> > ticks and asking for two seconds. The latter should probably try to round 
> > to a nice timer tick basis for power reasons).
> 
> I disagree - that is fixing the problem in the wrong place. The timer
> structure needs an accuracy field of some form that the existing timer
> functions initialise to 0.

I do agree that we could do that too, but you miss one big issue: even if 
we were to add an accuracy field inside the kernel, there is no such field 
in the user interfaces.

We just pass timevals (and sometimes timespecs) around, and no, they don't 
have any way to specify accuracy.

Yeah, we could use the high bits in the usec/nsec words, but then older 
kernels would basically do random things, so that would be a horrible 
interface.
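
To make the "random things" point concrete, here is a minimal sketch of what packing a hint into the high bits could look like. The helper names, the 20-bit split, and the meaning of the hint are all assumptions made for illustration; nothing like this exists in any kernel interface. A valid microsecond count is below 1000000 and so fits in 20 bits, which leaves the bits above free for a hint, but any reader that does not know about the encoding just sees a much larger microsecond value.

```c
#include <sys/time.h>

/* Hypothetical encoding: low 20 bits hold usec (0..999999), bits above hold a hint. */
#define USEC_BITS 20
#define USEC_MASK ((1L << USEC_BITS) - 1)

/* Pack a small accuracy hint above the real microsecond value. */
static long pack_usec_hint(long usec, long hint)
{
	return (hint << USEC_BITS) | (usec & USEC_MASK);
}

/* What a reader aware of the encoding would recover. */
static long unpack_usec(long packed)
{
	return packed & USEC_MASK;
}

static long unpack_hint(long packed)
{
	return packed >> USEC_BITS;
}

/* What a reader unaware of the encoding sees: the raw field, taken as usec. */
static long naive_usec(long packed)
{
	return packed;
}
```

With usec = 500 and hint = 3, the aware reader gets 500 back, while the unaware one sees 3146228 microseconds, which is roughly three seconds instead of half a millisecond.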

The other thing to do would be to just add totally new system calls with 
totally new interfaces, but (a) nobody would use them anyway and (b) it's 
simply not worth it.

So given that reality, and _if_ we want to support nice high-resolution 
sleeping by select/poll, the only reasonable thing to do is to estimate 
some kind of expected accuracy from the existing timeval/timespec.

And the only reasonable way to do that is to just look at the range. You 
can probably do something fairly trivial with

	/* Estimate expected accuracy in ns from a timeval */
	unsigned long estimate_accuracy(struct timeval *tv)
	{
		/*
		 * Tens of ms if we're looking at seconds, even
		 * more for 10s+ sleeping
		 */
		if (tv->tv_sec) {
			/* Tenths of seconds for long sleeps */
			if (tv->tv_sec > 10)
				return 100000000;
			/*
			 * Tens of ms for second-granularity sleeps. This,
			 * btw, is the historical Linux 100Hz timer range.
			 */
			return 10000000;
		}

		/* Single msecs if we're looking at milliseconds */
		if (tv->tv_usec > 1000)
			return 1000000;

		/* Aim for tenths of msecs otherwise */
		return 100000;
	}

and yes, it's just a heuristic, but it's probably not a horribly stupid 
one or a very unreasonable one. 
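
As a rough sketch of how such an estimate might be consumed, the snippet below rounds a requested timeout up to a multiple of the estimated accuracy, so that nearby wakeups can land on the same boundary. The round_timeout_ns() helper and the round-up policy are assumptions for illustration, not part of the patch under discussion; estimate_accuracy() is repeated from above (with a const-qualified argument) only so the sketch compiles on its own.

```c
#include <stdint.h>
#include <sys/time.h>

/* Same heuristic as above: estimate expected accuracy in ns from a timeval. */
static unsigned long estimate_accuracy(const struct timeval *tv)
{
	if (tv->tv_sec) {
		/* Tenths of seconds for long sleeps */
		if (tv->tv_sec > 10)
			return 100000000;
		/* Tens of ms for second-granularity sleeps */
		return 10000000;
	}

	/* Single msecs if we're looking at milliseconds */
	if (tv->tv_usec > 1000)
		return 1000000;

	/* Tenths of msecs otherwise */
	return 100000;
}

/*
 * Hypothetical helper: convert the timeout to ns and round it up to the
 * next multiple of the estimated accuracy. Rounding up never wakes the
 * caller earlier than requested.
 */
static uint64_t round_timeout_ns(const struct timeval *tv)
{
	uint64_t ns = (uint64_t)tv->tv_sec * 1000000000ULL +
		      (uint64_t)tv->tv_usec * 1000ULL;
	uint64_t slack = estimate_accuracy(tv);

	return (ns + slack - 1) / slack * slack;
}
```

For example, a 2500 usec timeout falls in the millisecond bucket and is rounded up to 3000000 ns, while a 500 usec timeout is already a multiple of 100000 ns and stays at 500000 ns.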

			Linus