lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CADU+-uAw8wLbY6E0msuwZFgW9hDXz8a6oeDJBTSURUfS3aO1gw@mail.gmail.com>
Date:	Fri, 19 Sep 2014 20:46:41 -0700
From:	Filipe Brandenburger <filbranden@...gle.com>
To:	Andy Lutomirski <luto@...capital.net>
Cc:	"H. Peter Anvin" <hpa@...or.com>, Greg Thelen <gthelen@...gle.com>,
	X86 ML <x86@...nel.org>, Thomas Gleixner <tglx@...utronix.de>,
	Michael Davidson <md@...gle.com>,
	Ingo Molnar <mingo@...hat.com>,
	Richard Larocque <rlarocque@...gle.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Linux API <linux-api@...r.kernel.org>
Subject: Re: [PATCH] x86/vdso: Add prctl to set per-process VDSO load

On Fri, Sep 19, 2014 at 8:27 PM, Andy Lutomirski <luto@...capital.net> wrote:
>> I thought of doing that from the prctl but AFAICT remap_pfn_range
>> requires that it's unmapped before the call (remap_pte_range has
>> BUG_ON(!pte_none(*pte));) and doing a zap_page_range followed by
>> remap_pfn_range might incur a race condition if another thread of the
>> same process is accessing the vvar page at that time... Am I wrong
>> about that race?
>
> No, you're right.  Ugh.
>
> It might pay to add an explicit .fault callback to vm_special_mapping,
> but your approach will work, too.  The main downside is more memory
> overhead per mm.

One way to limit that overhead was to keep only two static
vm_special_mapping structs (one for the real vvar page and one for the
zero page) and then change the vma->vm_private_data to point to one or
the other when the prctl is called. I just thought of that recently
and haven't tried it yet so not sure if it will cause other
problems...

>>> Changing out the text is a whole can of worms involving self-modifying
>>> code, although it may be completely safe if done through the page
>>> tables.  But I don't think you can't use remap_pfn_range for that.
>>
>> No, I'm not planning to change the text, just replacing the vvar page
>> swapping the one where the vsyscall_gtod_data lives with a zero page
>> (and back depending on the parameter of the prctl.)
>
> Hmm.  This will break the _COARSE timing functions as well as
> __vdso_time.  That'll need fixing, either by swapping out the code
> (yuck!) or by adding a branch to all of those code paths.
>
> Maybe there's a non-branchy way, though.  Let me think.

So this is what I was thinking of extending the vvar region to include
one extra page (right now it's two pages, the one with the
vsyscall_gtod_data and the one with the hrtimer I/O mapping, so this
would be a third page.) This third page would be initially mapped to
the zero page, so it wouldn't cost an extra page of memory per process
or even globally.

The prctl would swap that page with another page allocated as the
prctl is called. That one page that's allocated with the prctl would
be inherited over forks and execs, so if you have a bunch of processes
that need the same offset then they'd all share that one single page.

The vdso clock routines would get the base clock information from
vsyscall_gtod_data, hpet, tsc, etc. and would simply add the offsets
from the task_vvar page (not too branch-y, just a few extra adds.) In
the regular case, that would be a zero page so it would be simply
adding zero. If we need to apply offsets, then we replace that page
with one that has a struct with the right offsets for each clock type.

When running in kernel mode, the clock code (and timer code, etc.)
would find that page through the mm_struct and find the clock offsets
that way.

I was working on this approach before you suggested swapping the
actual vvar page (and not sure if you were actually looking at
hot-swapping the vdso itself.) I can go back to working on it and send
you a few commits since I still have some of that code around.

Cheers,
Filipe
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ