Message-ID: <9edbdf8b-b5fb-5a82-43b4-b639f5ec8484@gmail.com>
Date:   Tue, 30 Oct 2018 22:43:14 +0200
From:   Igor Stoppa <igor.stoppa@...il.com>
To:     Matthew Wilcox <willy@...radead.org>,
        Tycho Andersen <tycho@...ho.ws>
Cc:     Andy Lutomirski <luto@...capital.net>,
        Kees Cook <keescook@...omium.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Mimi Zohar <zohar@...ux.vnet.ibm.com>,
        Dave Chinner <david@...morbit.com>,
        James Morris <jmorris@...ei.org>,
        Michal Hocko <mhocko@...nel.org>,
        Kernel Hardening <kernel-hardening@...ts.openwall.com>,
        linux-integrity <linux-integrity@...r.kernel.org>,
        linux-security-module <linux-security-module@...r.kernel.org>,
        Igor Stoppa <igor.stoppa@...wei.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Jonathan Corbet <corbet@....net>,
        Laura Abbott <labbott@...hat.com>,
        Randy Dunlap <rdunlap@...radead.org>,
        Mike Rapoport <rppt@...ux.vnet.ibm.com>,
        "open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH 10/17] prmem: documentation

On 30/10/2018 21:20, Matthew Wilcox wrote:
> On Tue, Oct 30, 2018 at 12:28:41PM -0600, Tycho Andersen wrote:
>> On Tue, Oct 30, 2018 at 10:58:14AM -0700, Matthew Wilcox wrote:
>>> On Tue, Oct 30, 2018 at 10:06:51AM -0700, Andy Lutomirski wrote:
>>>>> On Oct 30, 2018, at 9:37 AM, Kees Cook <keescook@...omium.org> wrote:
>>>> I support the addition of a rare-write mechanism to the upstream kernel.
>>>> And I think that there is only one sane way to implement it: using an
>>>> mm_struct. That mm_struct, just like any sane mm_struct, should only
>>>> differ from init_mm in that it has extra mappings in the *user* region.
>>>
>>> I'd like to understand this approach a little better.  In a syscall path,
>>> we run with the user task's mm.  What you're proposing is that when we
>>> want to modify rare data, we switch to rare_mm which contains a
>>> writable mapping to all the kernel data which is rare-write.
>>>
>>> So the API might look something like this:
>>>
>>> 	void *p = rare_alloc(...);	/* writable pointer */
>>> 	p->a = x;
>>> 	q = rare_protect(p);		/* read-only pointer */

With pools and memory allocated from vmap_areas, I was able to say

protect(pool)

and that would do a sweep over all the pages currently in use.
In the SELinux policyDB, for example, one doesn't really want to 
individually protect each allocation.

The loading phase usually happens at boot, when the system can be
assumed to be sane (one might even preload a bare-bones set of rules from
initramfs and later replace it with the full-blown set).

There is no need to process each of these tens of thousands of
allocations and initializations as write-rare.

Would it be possible to do the same here?
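
To make the pool idea concrete, here is a minimal user-space analogy,
not the prmem API: everything named demo_* is hypothetical, and
mmap()/mprotect() merely stand in for what the kernel would do with
vmap_areas. Allocations are plain writes during the loading phase, and a
single call then write-protects the whole pool.

```c
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical user-space pool; an analogy only, not the prmem API. */
struct demo_pool {
	unsigned char *base;	/* start of the page-aligned region */
	size_t size;		/* total size of the region, in bytes */
	size_t used;		/* bytes handed out so far */
};

/* Reserve 'pages' pages of read-write anonymous memory for the pool. */
static int demo_pool_init(struct demo_pool *p, size_t pages)
{
	size_t size = pages * (size_t)sysconf(_SC_PAGESIZE);
	void *base = mmap(NULL, size, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (base == MAP_FAILED)
		return -1;
	p->base = base;
	p->size = size;
	p->used = 0;
	return 0;
}

/* Bump allocator: no per-allocation protection state is kept. */
static void *demo_pool_alloc(struct demo_pool *p, size_t len)
{
	size_t aligned = (len + 15) & ~(size_t)15;
	void *ret;

	if (aligned < len || aligned > p->size - p->used)
		return NULL;
	ret = p->base + p->used;
	p->used += aligned;
	return ret;
}

/* One call write-protects every page of the pool. */
static int demo_pool_protect(struct demo_pool *p)
{
	return mprotect(p->base, p->size, PROT_READ);
}

static void demo_pool_destroy(struct demo_pool *p)
{
	munmap(p->base, p->size);
}

/*
 * Try a write to 'addr' in a forked child, so that a fault cannot kill
 * the caller. Returns 0 if the write completed, 1 if the child did not
 * exit cleanly (for example because it faulted), -1 on error.
 */
static int demo_write_faults(void *addr)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		*(volatile unsigned char *)addr = 1;
		_exit(0);
	}
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return !(WIFEXITED(status) && WEXITSTATUS(status) == 0);
}
```

A caller would run demo_pool_init(), fill the pool through
demo_pool_alloc() with ordinary stores, and call demo_pool_protect()
once loading is complete; demo_write_faults() on any allocation should
then report that the write no longer goes through.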

>>>
>>> To subsequently modify q,
>>>
>>> 	p = rare_modify(q);
>>> 	q->a = y;
>>
>> Do you mean
>>
>>      p->a = y;
>>
>> here? I assume the intent is that q isn't writable ever, but that's
>> the one we have in the structure at rest.
> 
> Yes, that was my intent, thanks.
> 
> To handle the list case that Igor has pointed out, you might want to
> do something like this:
> 
> 	list_for_each_entry(x, &xs, entry) {
> 		struct foo *writable = rare_modify(entry);

Would it be impossible for other cores to spoof this mapping?

I'm asking this because, from what I understand, local interrupts are
enabled here, so an attacker could freeze the core performing the
write-rare operation, while another core scrapes the memory.

But blocking interrupts for the entire body of the loop would make RT 
latency unpredictable.
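
As a rough user-space illustration of the window in question (an
analogy only; struct rare_obj and the demo_* helpers are hypothetical
and are not the rare_modify()/rare_protect() discussed in this thread),
the same page can be mapped twice: a permanent read-only view and a
short-lived writable alias. In this sketch the alias lives in the one
shared address space, so any thread that learns its address can use it
until it is unmapped. Whether a separate mm closes that window for
other cores is the open question.

```c
#define _DEFAULT_SOURCE		/* for mkstemp()/ftruncate() under strict -std= modes */
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

struct foo {
	int a;
};

/* Hypothetical pair of views onto the same backing page. */
struct rare_obj {
	const struct foo *ro;	/* the pointer kept "at rest" */
	struct foo *rw;		/* temporary writable alias, or NULL */
	int fd;			/* backing file shared by both views */
	size_t len;		/* length of each mapping */
};

/* Create the backing page and the permanent read-only view. */
static int demo_init(struct rare_obj *o)
{
	char path[] = "/tmp/rare-demo-XXXXXX";
	void *ro;

	o->len = (size_t)sysconf(_SC_PAGESIZE);
	o->rw = NULL;
	o->fd = mkstemp(path);
	if (o->fd < 0)
		return -1;
	unlink(path);		/* the file lives on only through the fd */
	if (ftruncate(o->fd, (off_t)o->len) < 0) {
		close(o->fd);
		return -1;
	}
	ro = mmap(NULL, o->len, PROT_READ, MAP_SHARED, o->fd, 0);
	if (ro == MAP_FAILED) {
		close(o->fd);
		return -1;
	}
	o->ro = ro;
	return 0;
}

/* Open the window: map a second, writable view of the same page. */
static struct foo *demo_rare_modify(struct rare_obj *o)
{
	void *rw = mmap(NULL, o->len, PROT_READ | PROT_WRITE,
			MAP_SHARED, o->fd, 0);

	if (rw == MAP_FAILED)
		return NULL;
	o->rw = rw;
	return o->rw;
}

/* Close the window: drop the writable view; the read-only view stays. */
static void demo_rare_protect(struct rare_obj *o)
{
	if (o->rw) {
		munmap(o->rw, o->len);
		o->rw = NULL;
	}
}
```

A store made through the pointer returned by demo_rare_modify() is
visible through o->ro, and after demo_rare_protect() only the read-only
view remains mapped.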

> 		kref_get(&writable->ref);
> 		rare_protect(writable);
> 	}
> 
> but we'd probably wrap it in list_for_each_rare_entry(), just to be nicer.

This seems suspiciously close to the duplication of kernel interfaces 
that I was roasted for :-)

--
igor
