lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGXu5jKLgL0Kt5xCWv-3ZUX94m1DNXLqsEDQKHoq7T=m6P7tvQ@mail.gmail.com>
Date:	Fri, 6 Nov 2015 11:08:06 -0800
From:	Kees Cook <keescook@...omium.org>
To:	Laura Abbott <labbott@...hat.com>
Cc:	Russell King - ARM Linux <linux@....linux.org.uk>,
	Laura Abbott <labbott@...oraproject.org>,
	Catalin Marinas <catalin.marinas@....com>,
	Will Deacon <will.deacon@....com>,
	"linux-arm-kernel@...ts.infradead.org" 
	<linux-arm-kernel@...ts.infradead.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Linux-MM <linux-mm@...ck.org>
Subject: Re: [PATCH] arm: Use kernel mm when updating section permissions

On Fri, Nov 6, 2015 at 10:44 AM, Laura Abbott <labbott@...hat.com> wrote:
> On 11/05/2015 05:15 PM, Kees Cook wrote:
>>
>> On Thu, Nov 5, 2015 at 5:05 PM, Laura Abbott <labbott@...hat.com> wrote:
>>>
>>> On 11/05/2015 08:27 AM, Russell King - ARM Linux wrote:
>>>>
>>>>
>>>> On Thu, Nov 05, 2015 at 08:20:42AM -0800, Laura Abbott wrote:
>>>>>
>>>>>
>>>>> On 11/05/2015 01:46 AM, Russell King - ARM Linux wrote:
>>>>>>
>>>>>>
>>>>>> On Wed, Nov 04, 2015 at 05:00:39PM -0800, Laura Abbott wrote:
>>>>>>>
>>>>>>>
>>>>>>> Currently, read only permissions are not being applied even
>>>>>>> when CONFIG_DEBUG_RODATA is set. This is because section_update
>>>>>>> uses current->mm for adjusting the page tables. current->mm
>>>>>>> need not be equivalent to the kernel version. Use pgd_offset_k
>>>>>>> to get the proper page directory for updating.
>>>>>>
>>>>>>
>>>>>>
>>>>>> What are you trying to achieve here?  You can't use these functions
>>>>>> at run time (after the first thread has been spawned) to change
>>>>>> permissions, because there will be multiple copies of the kernel
>>>>>> section mappings, and those copies will not get updated.
>>>>>>
>>>>>> In any case, this change will probably break kexec and ftrace, as
>>>>>> the running thread will no longer see the updated page tables.
>>>>>>
>>>>>
>>>>> I think I was hitting that exact problem with multiple copies
>>>>> not getting updated. The section_update code was being called
>>>>> and I was seeing the tables get updated but nothing was being
>>>>> applied when I tried to write to text or check the debugfs
>>>>> page table. The current flow is:
>>>>>
>>>>> rest_init -> kernel_thread(kernel_init) and from that thread
>>>>> mark_rodata_ro. So mark_rodata_ro is always going to happen
>>>>> in a thread.
>>>>>
>>>>> Do we need to update for both init_mm and the first running
>>>>> thread?
>>>>
>>>>
>>>>
>>>> The "first running thread" is merely coincidental for things like kexec.
>>>>
>>>> Hmm.  Actually, I think the existing code _should_ be fine.  At the
>>>> point where mark_rodata_ro() is, we should still be using init_mm, so
>>>> updating the current threads page tables should actually be updating
>>>> the swapper_pg_dir.
>>>
>>>
>>>
>>> That doesn't seem to hold true. Based on what I'm seeing, we lose
>>> the the guarantee of init_mm after the first exec. If usermodehelper
>>> gets called to load a module, that triggers an exec and the kernel
>>> thread is no longer using init_mm after that. I'm testing with the
>>> multi-v7 defconfig which uses the smsc911x driver which loads a
>>> module during initcall. That gets called before mark_rodata_ro so
>>> the init_mm is never updated. I verified that disabling smsc911x
>>> makes things work as expected. I suspect the testing was never done
>>> with a driver that tried to call usermodehelper during init time.
>>
>>
>> Ooooh. Nice catch. Yeah, my testing didn't include that case.
>>
>>> I got as far as narrowing it down that it happens after the
>>> usermodehelper
>>> but I wasn't able to pinpoint where exactly the switch happened. It seems
>>> like we need to have the page tables set up before any initcalls
>>> happen otherwise we risk having an exec create stray processes which we
>>> can't update.
>>
>>
>> Can we just make mark_rodata_ro() a no-op and do the RO setting
>> earlier when we do the NX setting?
>>
>
> Unfortunately no. The time we are doing the nx setting is before we've
> finished
> with the initmem so we need the initmem to be finished and freed before we
> can
> mark anything RO.
>
> More importantly, the NX settings are also not getting set. Compare before:
>
> ---[ Kernel Mapping ]---
> 0xc0000000-0xc0300000           3M     RW NX
> 0xc0300000-0xc1300000          16M     RW x
> 0xc1300000-0xcc000000         173M     RW NX
> 0xcc000000-0xcc040000         256K     RW NX     MEM/BUFFERABLE/WC
> 0xcc040000-0xcc100000         768K     RW NX     MEM/CACHED/WBRA
> 0xcc100000-0xcc280000        1536K     RW NX     MEM/BUFFERABLE/WC
> 0xcc280000-0xd0000000       62976K     RW NX     MEM/CACHED/WBRA
> 0xd0000000-0xd0200000           2M     RW NX
>
> and after
>
> ---[ Kernel Mapping ]---
> 0xc0000000-0xc0300000           3M     RW NX
> 0xc0300000-0xc0c00000           9M     ro x
> 0xc0c00000-0xc1100000           5M     ro NX
> 0xc1100000-0xcc000000         175M     RW NX
> 0xcc000000-0xcc040000         256K     RW NX     MEM/BUFFERABLE/WC
> 0xcc040000-0xcc100000         768K     RW NX     MEM/CACHED/WBRA
> 0xcc100000-0xcc280000        1536K     RW NX     MEM/BUFFERABLE/WC
> 0xcc280000-0xd0000000       62976K     RW NX     MEM/CACHED/WBRA
> 0xd0000000-0xd0200000           2M     RW NX
>
>
> with my test patch. I think setting both current->active_mm and &init_mm
> is sufficient. Maybe explicitly setting swapper_pg_dir would be cleaner?
>
> Is there a test that should be running in a CI somewhere to catch cases like
> this where the permissions are not working as expected

I wrote these for lkdtm -- actually just mentioned it here:
http://lwn.net/Articles/663531/

EXEC_DATA
EXEC_STACK
EXEC_KMALLOC
EXEC_VMALLOC
EXEC_USERSPACE
ACCESS_USERSPACE
WRITE_RO
WRITE_KERN

Each of those should Oops the kernel if things are working correctly.
I'm not aware of a public CI that currently handles checking for
expected Oops via lkdtm.

The other tests I ran when building this were to turn ftrace on and
off. If that works for you, then this patch seems fine. (AIUI, the
code would be unchanged from original when running ftrace, so I would
expect this to work.)

-Kees

>
> My test patch that seems to be working:
>
> ----8<-----
>
> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> index 8a63b4c..6276b234 100644
> --- a/arch/arm/mm/init.c
> +++ b/arch/arm/mm/init.c
> @@ -627,12 +627,10 @@ static struct section_perm ro_perms[] = {
>   * safe to be called with preemption disabled, as under stop_machine().
>   */
>  static inline void section_update(unsigned long addr, pmdval_t mask,
> -                                 pmdval_t prot)
> +                                 pmdval_t prot, struct mm_struct *mm)
>  {
> -       struct mm_struct *mm;
>         pmd_t *pmd;
>  -      mm = current->active_mm;
>         pmd = pmd_offset(pud_offset(pgd_offset(mm, addr), addr), addr);
>   #ifdef CONFIG_ARM_LPAE
> @@ -656,7 +654,7 @@ static inline bool arch_has_strict_perms(void)
>         return !!(get_cr() & CR_XP);
>  }
>  -#define set_section_perms(perms, field)       {
> \
> +#define set_section_perms(perms, field, all)   {                       \
>         size_t i;                                                       \
>         unsigned long addr;                                             \
>                                                                         \
> @@ -674,31 +672,35 @@ static inline bool arch_has_strict_perms(void)
>                                                                         \
>                 for (addr = perms[i].start;                             \
>                      addr < perms[i].end;                               \
> -                    addr += SECTION_SIZE)                              \
> +                    addr += SECTION_SIZE) {                            \
>                         section_update(addr, perms[i].mask,             \
> -                                      perms[i].field);                 \
> +                                      perms[i].field, current->active_mm);
> \
> +                       if (all)                                        \
> +                               section_update(addr, perms[i].mask,     \
> +                                       perms[i].field, &init_mm);      \
> +               }                                                       \
>         }                                                               \
>  }
>  -static inline void fix_kernmem_perms(void)
> +void fix_kernmem_perms(void)
>  {
> -       set_section_perms(nx_perms, prot);
> +       set_section_perms(nx_perms, prot, true);
>  }
>   #ifdef CONFIG_DEBUG_RODATA
>  void mark_rodata_ro(void)
>  {
> -       set_section_perms(ro_perms, prot);
> +       set_section_perms(ro_perms, prot, true);
>  }
>   void set_kernel_text_rw(void)
>  {
> -       set_section_perms(ro_perms, clear);
> +       set_section_perms(ro_perms, clear, false);
>  }
>   void set_kernel_text_ro(void)
>  {
> -       set_section_perms(ro_perms, prot);
> +       set_section_perms(ro_perms, prot, false);
>  }
>  #endif /* CONFIG_DEBUG_RODATA */
>
>
>
>
>



-- 
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ