Message-ID: <77543974-cd67-3999-103e-6714d04f0e5e@efficios.com>
Date:   Fri, 23 Sep 2022 09:46:15 -0400
From:   Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To:     Chris Kennelly <ckennelly@...gle.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Paul Turner <pjt@...gle.com>, Peter Oskolkov <posk@...k.io>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        "carlos@...hat.com" <carlos@...hat.com>,
        Florian Weimer <fw@...eb.enyo.de>,
        "linux-api@...r.kernel.org" <linux-api@...r.kernel.org>
Subject: Re: [PATCH v4 00/25] RSEQ node id and virtual cpu id extensions

On 2022-09-22 16:10, Chris Kennelly wrote:
> Hi,
> 
> I still need to update the code in TCMalloc to cooperate with the new 
> glibc ABI/convention.  One concern I have is that it looks like I might 
> need to add an extra memory dereference (or two) to get the early 
> initialized offsets provided by glibc folded into the read of the cpu_id 
> field.

If you have a concrete example of this, I'd be happy to help and perhaps 
we can improve your usage pattern.

> 
> I think I can avoid this by using %gs to point to the address of the 
> cpu_id field itself (which I think could be used to select between vCPUs 
> or not*), but %gs is a global piece of state that all of the libraries 
> in the program need to cooperate on.

I think what we are all looking for here is a scheme that would allow us 
the fastest per-vcpu data structure accesses possible from userspace.

I think we could do something similar to what is done in the Linux 
kernel for that, but in userspace. Here are some random ideas I have on 
this topic:

We could introduce a new prctl(2) PR_{SET,GET}_GS_MODE on x86-64. This 
would take as arguments the indexing mode and offset multiplier we want 
to be applied to the GS segment selector on return to userspace:

enum gs_index_mode {
	GS_INDEX_MODE_MM_VCPU,
};

struct prctl_set_gs_mode {
	enum gs_index_mode index_mode;
	u64 stride;
};

For a memory space which has this gs mode set, the return to userspace 
code would populate the GS segment selector register with:

   stride * current->mm_vcpu_id

The "stride" would be the virtual address space size allowed for 
per-vcpu-data. This could be decided by the libc, with a tunable to 
increase or decrease this size. Another libc tunable could 
disable populating the GS segment selector altogether (e.g. for 
compatibility with applications like Wine which AFAIK use it).
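
To make the stride arithmetic concrete, here is a minimal userspace sketch 
of it. It assumes the value computed on return to userspace ends up as the 
base applied to %gs-relative accesses. NR_VCPUS, PER_VCPU_STRIDE, 
per_vcpu_area, gs_base_for_vcpu() and per_vcpu_ptr() are hypothetical names 
chosen for illustration, and the 4096-byte stride is an arbitrary example 
value, not something the proposal fixes:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example values; the proposal leaves these to the libc. */
enum { NR_VCPUS = 4 };
#define PER_VCPU_STRIDE ((uint64_t)4096)

/* Backing store standing in for the per-vCPU virtual address range. */
static unsigned char per_vcpu_area[NR_VCPUS * 4096];

/* Value the kernel would compute on return to userspace: stride * vcpu id. */
static uint64_t gs_base_for_vcpu(uint64_t stride, uint32_t mm_vcpu_id)
{
	return stride * mm_vcpu_id;
}

/* Emulates a "%gs:offset" access by doing the base addition by hand. */
static void *per_vcpu_ptr(uint32_t mm_vcpu_id, size_t offset)
{
	return per_vcpu_area
		+ gs_base_for_vcpu(PER_VCPU_STRIDE, mm_vcpu_id)
		+ offset;
}
```

With these example values, vcpu 3 would see its data start 12288 bytes 
into the area, and no load of a cpu_id field is needed in userspace to 
find it.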

With this in place, I hope we could then do per-vcpu data access by 
simply prefixing memory access instructions with a %gs: segment 
selector prefix.
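
As a rough illustration of what such an access could look like from C, 
here is a sketch using GCC/Clang inline assembly. gs_load_u64() and 
demo_slot are hypothetical names. The sketch assumes a Linux x86-64 
process whose GS base has been left at its default of zero, so the offset 
passed in is effectively an absolute address; under the proposed mode the 
same instruction would resolve relative to stride * mm_vcpu_id instead:

```c
#include <stdint.h>

/*
 * Hypothetical helper: load a 64-bit value at `offset` relative to the
 * GS base, using a %gs segment prefix on the load instruction.
 */
static inline uint64_t gs_load_u64(uintptr_t offset)
{
#if defined(__x86_64__) && defined(__linux__)
	uint64_t value;

	__asm__ volatile("movq %%gs:(%1), %0"
			 : "=r"(value)
			 : "r"(offset)
			 : "memory");
	return value;
#else
	/* Fallback for other targets: plain load, as if the GS base were 0. */
	return *(const volatile uint64_t *)offset;
#endif
}

/* Example slot; with a zero GS base, its address doubles as its offset. */
static uint64_t demo_slot = 42;
```

The point is that the per-vcpu selection is folded into the addressing 
mode of a single instruction, rather than a separate load of the vcpu id 
followed by pointer arithmetic.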

Thoughts?

Thanks,

Mathieu


> 
> Thanks,
> Chris
> 
> * TCMalloc is already paying a load+pointer arithmetic to select between 
> cpu_id versus vcpu_id, so this would actually make things a little bit 
> faster.
> 
> On Thu, Sep 22, 2022 at 3:21 PM Mathieu Desnoyers 
> <mathieu.desnoyers@...icios.com> 
> wrote:
> 
>     Hi Chris,
> 
>     Sorry it looks like I forgot to CC you on this series. If you can give
>     it a spin with tcmalloc I would be very much interested in the result.
> 
>     Thanks,
> 
>     Mathieu
> 


-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
