Date:	Wed, 07 Mar 2007 17:19:44 -0600
From:	"Chris Friesen" <cfriesen@...tel.com>
To:	Robert Love <rml@...ell.com>, Ingo Molnar <mingo@...e.hu>,
	Linus Torvalds <torvalds@...l.org>,
	Linux kernel <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...l.org>, Con Kolivas <kernel@...ivas.org>
Subject: resend: KERNEL BUG: nice level should not affect SCHED_RR timeslice

I still haven't seen any replies, so I'm resending with a few more 
people directly in the TO list.

The timeslice of a SCHED_RR process currently varies with nice level the 
same way that it does for SCHED_OTHER.  I've included a small app below 
that demonstrates the issue.  So while niceness doesn't affect the 
priority of a SCHED_RR task, it does change how much CPU time the task 
gets relative to other SCHED_RR tasks at the same priority.
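
To make the size of the effect concrete, here is a rough userspace model 
of how a nice level maps to a timeslice.  It is only a sketch: the 
constants and the scaling rule are assumptions based on a reading of the 
static_prio_timeslice() logic in 2.6-era kernel/sched.c, expressed in 
milliseconds as if HZ were 1000, and every model_* name is made up for 
illustration.

```c
/* Assumed constants, modelled on 2.6-era kernel/sched.c (HZ=1000). */
#define MODEL_MAX_PRIO		140	/* one past the lowest priority */
#define MODEL_MAX_USER_PRIO	40	/* number of nice levels */
#define MODEL_DEF_SLICE_MS	100	/* timeslice at nice 0 */
#define MODEL_MIN_SLICE_MS	5	/* floor for any timeslice */

/* nice -20..19 maps to static priority 100..139 */
static int model_nice_to_prio(int nice)
{
	return 120 + nice;
}

/* Scale a base timeslice linearly with the distance from MAX_PRIO. */
static unsigned int model_scale(unsigned int base_ms, int static_prio)
{
	unsigned int ms = base_ms * (MODEL_MAX_PRIO - static_prio)
			  / (MODEL_MAX_USER_PRIO / 2);

	return ms > MODEL_MIN_SLICE_MS ? ms : MODEL_MIN_SLICE_MS;
}

/* Negative nice levels scale from a 4x larger base than positive ones. */
unsigned int model_timeslice_ms(int nice)
{
	int prio = model_nice_to_prio(nice);

	if (prio < model_nice_to_prio(0))
		return model_scale(MODEL_DEF_SLICE_MS * 4, prio);
	return model_scale(MODEL_DEF_SLICE_MS, prio);
}
```

Under these assumptions model_timeslice_ms(-19) gives 780 and 
model_timeslice_ms(19) gives 5, so two SCHED_RR tasks at the same 
realtime priority could end up with very different shares of the CPU.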

SUSv3 indicates, "Any processes or threads using SCHED_FIFO or SCHED_RR 
shall be unaffected by a call to setpriority()."

In addition, the code in set_user_nice() has a comment that leads me to 
believe the current behaviour is accidental (although I think the "not" 
in the last line of the comment isn't meant to be there):

/*
  * The RT priorities are set via sched_setscheduler(), but we still
  * allow the 'normal' nice value to be set - but as expected
  * it wont have any effect on scheduling until the task is
  * not SCHED_NORMAL/SCHED_BATCH:
  */

It appears that the desired behaviour is to allow setting the nice level 
of a realtime task, but to not have it affect anything until (and 
unless) it drops that realtime status.  This seems reasonable, but 
doesn't match current behaviour.
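
One way to picture that desired behaviour is the small userspace model 
below.  It is a hypothetical sketch, not kernel code: struct model_task 
and the model_* functions are invented for illustration.  The nice value 
is always stored, but the scheduler-facing value stays neutral while the 
task has a realtime policy.

```c
#include <sched.h>

/* Hypothetical userspace model of a task; not a kernel structure. */
struct model_task {
	int policy;	/* SCHED_OTHER, SCHED_FIFO or SCHED_RR */
	int nice;	/* stored nice value, -20..19 */
};

static int model_is_realtime(const struct model_task *t)
{
	return t->policy == SCHED_FIFO || t->policy == SCHED_RR;
}

/* Setting nice is always allowed and always remembered. */
void model_set_nice(struct model_task *t, int nice)
{
	t->nice = nice;
}

/* Nice value the scheduler should act on: neutral (0) for realtime
 * tasks, the stored value once the task is no longer realtime. */
int model_effective_nice(const struct model_task *t)
{
	return model_is_realtime(t) ? 0 : t->nice;
}
```

With this model, a SCHED_RR task reniced to 19 still reports an effective 
nice of 0, and the stored 19 only takes effect after its policy is 
switched back to SCHED_OTHER.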

Chris


#define _GNU_SOURCE	/* for cpu_set_t, CPU_ZERO and CPU_SET in <sched.h> */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <sched.h>
#include <errno.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/time.h>
#include <sys/resource.h>

#define THRESHOLD_USEC 2000

/* current wall-clock time in microseconds */
unsigned long long stamp(void)
{
	struct timeval tv;
	gettimeofday(&tv, 0);
	return (unsigned long long) tv.tv_usec +
		((unsigned long long) tv.tv_sec) * 1000000;
}

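/* Spin forever, reporting every gap longer than THRESHOLD_USEC,
 * i.e. every stretch during which this task was scheduled out. */
void chewcpu(int cpu)
{
	unsigned long long thresh_usec = THRESHOLD_USEC;
	unsigned long long cur, last;

	(void) cpu;	/* unused; affinity is set in main() */

	last = stamp();
	while (1) {
		cur = stamp();
		unsigned long long delta = cur - last;
		if (delta > thresh_usec) {
			printf("pid %d, out for %llu ms\n", getpid(), delta/1000);
			/* don't count the time spent in printf() */
			cur = stamp();
		}
		last = cur;
	}
}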


int main(void)
{
	int cpu = 0;	/* both tasks are pinned to CPU 0 below */
	cpu_set_t cpumask;
	CPU_ZERO(&cpumask);
	CPU_SET(0, &cpumask);

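	int kidpid = fork();
	if (kidpid < 0) {
		perror("fork");
		return 1;
	}

	/* parent and child both become SCHED_RR at the same RT priority */
	struct sched_param p;
	p.sched_priority = 1;
	if (sched_setscheduler(0, SCHED_RR, &p) < 0)
		perror("sched_setscheduler");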

	struct timespec ts;
	
	if (kidpid) {
		/* parent: lowest nice priority */
		setpriority(PRIO_PROCESS, 0, 19);
		printf("pid %d, prio of %d\n", getpid(), getpriority(PRIO_PROCESS, 0));
		sched_rr_get_interval(0, &ts);
		printf("pid %d, interval of %ld nsec\n", getpid(), (long) ts.tv_nsec);
	} else {
		/* child: near-highest nice priority */
		setpriority(PRIO_PROCESS, 0, -19);
		printf("pid %d, prio of %d\n", getpid(), getpriority(PRIO_PROCESS, 0));
		sched_rr_get_interval(0, &ts);
		printf("pid %d, interval of %ld nsec\n", getpid(), (long) ts.tv_nsec);
	}
		
	/* pin both tasks to the same CPU so they compete for it */
	int rc = syscall(__NR_sched_setaffinity, 0, sizeof(cpumask), &cpumask);
	if (rc < 0)
		printf("unable to set affinity: %m\n");
	

	sleep(1);
	
	chewcpu(cpu);
	return 0;
}