Message-ID: <87bkzqz75q.fsf@mid.deneb.enyo.de>
Date:   Tue, 01 Feb 2022 21:03:13 +0100
From:   Florian Weimer <fw@...eb.enyo.de>
To:     Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
        "Paul E . McKenney" <paulmck@...nel.org>,
        Boqun Feng <boqun.feng@...il.com>,
        "H . Peter Anvin" <hpa@...or.com>, Paul Turner <pjt@...gle.com>,
        linux-api@...r.kernel.org,
        Christian Brauner <christian.brauner@...ntu.com>,
        David.Laight@...LAB.COM, carlos@...hat.com,
        Peter Oskolkov <posk@...k.io>
Subject: Re: [RFC PATCH 2/3] rseq: extend struct rseq with per thread group
 vcpu id

* Mathieu Desnoyers:

> If a thread group has fewer threads than cores, or is limited to run on
> few cores concurrently through sched affinity or cgroup cpusets, the
> virtual cpu ids will be values close to 0, thus allowing efficient use
> of user-space memory for per-cpu data structures.

From a userspace programmer's perspective, what's a good way to obtain a
reasonable upper bound for the possible tg_vcpu_id values?

I believe not all users of cgroup cpusets change the affinity mask.
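
For concreteness, a minimal userspace sketch (mine, not from the patch
series) of what such a bound would be used for: sizing a flat per-vcpu
array that is then indexed by the proposed tg_vcpu_id.  Here
current_tg_vcpu_id() is a hypothetical accessor standing in for however
the thread reads the field from its registered struct rseq area, and the
configured CPU count is the only obviously safe bound, i.e. exactly what
the question above tries to improve on:

#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical: returns the tg_vcpu_id from this thread's struct rseq. */
extern uint32_t current_tg_vcpu_id(void);

static uint64_t *per_vcpu_counters;
static long nr_vcpu_slots;

static int per_vcpu_counters_init(void)
{
	/* Conservative bound: every configured CPU. */
	nr_vcpu_slots = sysconf(_SC_NPROCESSORS_CONF);
	per_vcpu_counters = calloc(nr_vcpu_slots, sizeof(*per_vcpu_counters));
	return per_vcpu_counters ? 0 : -1;
}

static void per_vcpu_count(void)
{
	uint32_t vcpu = current_tg_vcpu_id();

	/* Only safe if vcpu is guaranteed to stay below the bound used above. */
	if (vcpu < nr_vcpu_slots)
		per_vcpu_counters[vcpu]++;
}

With a bound close to the number of vcpu ids actually handed out, the
array (and any per-slot cache-line padding) shrinks accordingly.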

> diff --git a/kernel/rseq.c b/kernel/rseq.c
> index 13f6d0419f31..37b43735a400 100644
> --- a/kernel/rseq.c
> +++ b/kernel/rseq.c
> @@ -86,10 +86,14 @@ static int rseq_update_cpu_node_id(struct task_struct *t)
>  	struct rseq __user *rseq = t->rseq;
>  	u32 cpu_id = raw_smp_processor_id();
>  	u32 node_id = cpu_to_node(cpu_id);
> +	u32 tg_vcpu_id = task_tg_vcpu_id(t);
>  
>  	if (!user_write_access_begin(rseq, t->rseq_len))
>  		goto efault;
>  	switch (t->rseq_len) {
> +	case offsetofend(struct rseq, tg_vcpu_id):
> +		unsafe_put_user(tg_vcpu_id, &rseq->tg_vcpu_id, efault_end);
> +		fallthrough;
>  	case offsetofend(struct rseq, node_id):
>  		unsafe_put_user(node_id, &rseq->node_id, efault_end);
>  		fallthrough;

Is the switch really useful?  I suspect it's faster to just write as
much as possible all the time.  The switch should be well-predictable
if running uniform userspace, but still …
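
For illustration, one way to read that (a sketch of my reading, not
anything from the posted patch) would be, inside
rseq_update_cpu_node_id(), to gate each new field on the registered
length instead of switching on exact sizes:

	/* Sketch only: write everything the registered area can hold. */
	if (t->rseq_len >= offsetofend(struct rseq, tg_vcpu_id))
		unsafe_put_user(tg_vcpu_id, &rseq->tg_vcpu_id, efault_end);
	if (t->rseq_len >= offsetofend(struct rseq, node_id))
		unsafe_put_user(node_id, &rseq->node_id, efault_end);
	/* ... the existing unconditional stores follow as before ... */

Whether that actually beats the switch presumably depends on how uniform
t->rseq_len is across the workload.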
