[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5331CF58.8000909@oracle.com>
Date: Tue, 25 Mar 2014 12:47:52 -0600
From: Khalid Aziz <khalid.aziz@...cle.com>
To: Andi Kleen <andi@...stfloor.org>
CC: tglx@...utronix.de, mingo@...hat.com, hpa@...or.com,
peterz@...radead.org, akpm@...ux-foundation.org,
andi.kleen@...el.com, rob@...dley.net, viro@...iv.linux.org.uk,
oleg@...hat.com, gnomes@...rguk.ukuu.org.uk, riel@...hat.com,
snorcht@...il.com, dhowells@...hat.com, luto@...capital.net,
daeseok.youn@...il.com, ebiederm@...ssion.com,
linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org
Subject: Re: [PATCH v2] Pre-emption control for userspace
On 03/25/2014 12:20 PM, Andi Kleen wrote:
> Khalid Aziz <khalid.aziz@...cle.com> writes:
>
> First it would be nice to have some standard reference lock library that
> uses this. What would it take to support this in glibc?
I am not sure if it would be practical and useful to integrate this into
any of the standard locking interfaces, but I have not looked into it
much either. My initial intent is to let individual apps decide if they
could benefit from this interface and code it in if so since the
interface is meant to be very simple. Do you see any of the standard
locking interfaces where it would make sense to integrate this feature
in, or are you thinking of creating a new interface?
>
>> +==================================
>> +Using the preemption delay feature
>> +==================================
>> +
>> +This feature is enabled in the kernel by setting
>> +CONFIG_SCHED_PREEMPT_DELAY in kernel configuration. Once this feature is
>> +enabled, the userspace process communicates with the kernel using a
>> +4-byte memory location in its address space. It first gives the kernel
>> +address for this memory location by writing its address to
>> +/proc/<tgid>/task/<tid>/sched_preempt_delay. This memory location is
>> +interpreted as a sequence of 4 bytes:
>> +
>> + byte[0] = flag to request preemption delay
>> + byte[1] = flag from kernel indicating preemption delay was granted
>> + byte[2] = reserved for future use
>> + byte[3] = reserved for future use
>
> Should reserve more bytes (64, 128?) and rename the proc flag
> to a more generic name. I could well assume other things
> using such a mechanism in the future. Also please add a flag
> word with feature bits (similar to the perf mmap page)
I am reluctant to make it too big since reading larger quantities from
userspace will take longer and start to impact performance. Keeping
shared data limited to 32-bits allows us to move it between userspace
and kernel with one instruction.
>
> How about alignment? x86 will not care, but other architectures
> may.
>
You are right. These 4-bytes need to be aligned to a 4-byte boundary. I
will look into adding an alignment check at the time this address is
passed to kernel.
>> #endif
>> +#ifdef CONFIG_SCHED_PREEMPT_DELAY
>> + REG("sched_preempt_delay", S_IRUGO|S_IWUSR,
>> proc_tid_preempt_delay_ops),
>
> This shouldn't be readable by group/other, as it exposes the address space,
> so could help exploits.
I like Oleg's suggestion of using prctl() instead of procfs to pass the
userspace address to kernel. Above problem will disappear when I switch
to prctl() in v3 of this patch.
>
>> @@ -2061,6 +2069,13 @@ extern u64 scheduler_tick_max_deferment(void);
>> static inline bool sched_can_stop_tick(void) { return false; }
>> #endif
>>
>> +#if defined(CONFIG_SCHED_PREEMPT_DELAY) && defined(CONFIG_PROC_FS)
>> +extern void sched_preempt_delay_show(struct seq_file *m,
>> + struct task_struct *task);
>> +extern void sched_preempt_delay_set(struct task_struct *task,
>> + unsigned char *val);
>> +#endif
>
> Prototypes don't need to be ifdefed.
>
> -Andi
>
Ah, you are right. Thanks, Andi!
--
Khalid
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists