lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <43f35219-ec39-810b-ebfd-16c14e7b6150@suse.com>
Date:   Fri, 20 Dec 2019 09:43:44 +0100
From:   Jürgen Groß <jgross@...e.com>
To:     Sergey Dyasli <sergey.dyasli@...rix.com>, xen-devel@...ts.xen.org,
        kasan-dev@...glegroups.com, linux-kernel@...r.kernel.org
Cc:     Andrey Ryabinin <aryabinin@...tuozzo.com>,
        Alexander Potapenko <glider@...gle.com>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        Stefano Stabellini <sstabellini@...nel.org>,
        George Dunlap <george.dunlap@...rix.com>,
        Ross Lagerwall <ross.lagerwall@...rix.com>
Subject: Re: [RFC PATCH 1/3] x86/xen: add basic KASAN support for PV kernel

On 19.12.19 17:42, Sergey Dyasli wrote:
> On 18/12/2019 09:24, Jürgen Groß wrote:
>> On 17.12.19 15:08, Sergey Dyasli wrote:
>>> This enables to use Outline instrumentation for Xen PV kernels.
>>>
>>> KASAN_INLINE and KASAN_VMALLOC options currently lead to boot crashes
>>> and hence disabled.
>>>
>>> Rough edges in the patch are marked with XXX.
>>>
>>> Signed-off-by: Sergey Dyasli <sergey.dyasli@...rix.com>
>>> ---
>>>    arch/x86/mm/init.c          | 14 ++++++++++++++
>>>    arch/x86/mm/kasan_init_64.c | 28 ++++++++++++++++++++++++++++
>>>    arch/x86/xen/Makefile       |  7 +++++++
>>>    arch/x86/xen/enlighten_pv.c |  3 +++
>>>    arch/x86/xen/mmu_pv.c       | 13 +++++++++++--
>>>    arch/x86/xen/multicalls.c   | 10 ++++++++++
>>>    drivers/xen/Makefile        |  2 ++
>>>    kernel/Makefile             |  2 ++
>>>    lib/Kconfig.kasan           |  3 ++-
>>>    9 files changed, 79 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
>>> index e7bb483557c9..0c98a45eec6c 100644
>>> --- a/arch/x86/mm/init.c
>>> +++ b/arch/x86/mm/init.c
>>> @@ -8,6 +8,8 @@
>>>    #include <linux/kmemleak.h>
>>>    #include <linux/sched/task.h>
>>>    +#include <xen/xen.h>
>>> +
>>>    #include <asm/set_memory.h>
>>>    #include <asm/e820/api.h>
>>>    #include <asm/init.h>
>>> @@ -835,6 +837,18 @@ void free_kernel_image_pages(const char *what, void *begin, void *end)
>>>        unsigned long end_ul = (unsigned long)end;
>>>        unsigned long len_pages = (end_ul - begin_ul) >> PAGE_SHIFT;
>>>    +    /*
>>> +     * XXX: skip this for now. Otherwise it leads to:
>>> +     *
>>> +     * (XEN) mm.c:2713:d157v0 Bad type (saw 8c00000000000001 != exp e000000000000000) for mfn 36f40 (pfn 02f40)
>>> +     * (XEN) mm.c:1043:d157v0 Could not get page type PGT_writable_page
>>> +     * (XEN) mm.c:1096:d157v0 Error getting mfn 36f40 (pfn 02f40) from L1 entry 8010000036f40067 for l1e_owner d157, pg_owner d157
>>> +     *
>>> +     * and further #PF error: [PROT] [WRITE] in the kernel.
>>> +     */
>>> +    if (xen_pv_domain() && IS_ENABLED(CONFIG_KASAN))
>>> +        return;
>>> +
>>
>> I guess this is related to freeing some kasan page tables without
>> unpinning them?
> 
> Your guess was correct. Turned out that early_top_pgt which I pinned and made RO
> is located in .init section and that was causing issues. Unpinning it and making
> RW again right after kasan_init() switches to use init_top_pgt seem to fix this
> issue.
> 
>>
>>>        free_init_pages(what, begin_ul, end_ul);
>>>          /*
>>> diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
>>> index cf5bc37c90ac..caee2022f8b0 100644
>>> --- a/arch/x86/mm/kasan_init_64.c
>>> +++ b/arch/x86/mm/kasan_init_64.c
>>> @@ -13,6 +13,8 @@
>>>    #include <linux/sched/task.h>
>>>    #include <linux/vmalloc.h>
>>>    +#include <xen/xen.h>
>>> +
>>>    #include <asm/e820/types.h>
>>>    #include <asm/pgalloc.h>
>>>    #include <asm/tlbflush.h>
>>> @@ -20,6 +22,9 @@
>>>    #include <asm/pgtable.h>
>>>    #include <asm/cpu_entry_area.h>
>>>    +#include <xen/interface/xen.h>
>>> +#include <asm/xen/hypervisor.h>
>>> +
>>>    extern struct range pfn_mapped[E820_MAX_ENTRIES];
>>>      static p4d_t tmp_p4d_table[MAX_PTRS_PER_P4D] __initdata __aligned(PAGE_SIZE);
>>> @@ -305,6 +310,12 @@ static struct notifier_block kasan_die_notifier = {
>>>    };
>>>    #endif
>>>    +#ifdef CONFIG_XEN
>>> +/* XXX: this should go to some header */
>>> +void __init set_page_prot(void *addr, pgprot_t prot);
>>> +void __init pin_pagetable_pfn(unsigned cmd, unsigned long pfn);
>>> +#endif
>>> +
>>
>> Instead of exporting those, why don't you ...
>>
>>>    void __init kasan_early_init(void)
>>>    {
>>>        int i;
>>> @@ -332,6 +343,16 @@ void __init kasan_early_init(void)
>>>        for (i = 0; pgtable_l5_enabled() && i < PTRS_PER_P4D; i++)
>>>            kasan_early_shadow_p4d[i] = __p4d(p4d_val);
>>>    +    if (xen_pv_domain()) {
>>> +        /* PV page tables must have PAGE_KERNEL_RO */
>>> +        set_page_prot(kasan_early_shadow_pud, PAGE_KERNEL_RO);
>>> +        set_page_prot(kasan_early_shadow_pmd, PAGE_KERNEL_RO);
>>> +        set_page_prot(kasan_early_shadow_pte, PAGE_KERNEL_RO);
>>
>> add a function doing that to mmu_pv.c (e.g. xen_pv_kasan_early_init())?
> 
> Sounds like a good suggestion, but new functions still need some header for
> declarations (xen/xen.h?). And kasan_map_early_shadow() will need exporting

xen/xen-ops.h

> through kasan.h as well, but that's probably not an issue.

You could let the new function return (pgd_t *)xen_start_info->pt_base
and use that here, e.g.:

if (xen_pv_domain()) {
     pgd_t *pgd;

     pgd = xen_kasan_early_init();
     kasan_map_early_shadow(pgd);
}

> 
>>
>>> +
>>> +        /* Add mappings to the initial PV page tables */
>>> +        kasan_map_early_shadow((pgd_t *)xen_start_info->pt_base);
>>> +    }
>>> +
>>>        kasan_map_early_shadow(early_top_pgt);
>>>        kasan_map_early_shadow(init_top_pgt);
>>>    }
>>> @@ -369,6 +390,13 @@ void __init kasan_init(void)
>>>                    __pgd(__pa(tmp_p4d_table) | _KERNPG_TABLE));
>>>        }
>>>    +    if (xen_pv_domain()) {
>>> +        /* PV page tables must be pinned */
>>> +        set_page_prot(early_top_pgt, PAGE_KERNEL_RO);
>>> +        pin_pagetable_pfn(MMUEXT_PIN_L4_TABLE,
>>> +                  PFN_DOWN(__pa_symbol(early_top_pgt)));
>>
>> and another one like xen_pv_kasan_init() here.
> 
> Now there needs to be a 3rd function to unpin early_top_pgt.

Not if you do the load_cr3 in the xen pv case in the new function:

if (xen_pv_domain())
     xen_kasan_load_cr3(early_top_pgt);
else
     load_cr3(early_top_pgt);

> 
>>
>>> +    }
>>> +
>>>        load_cr3(early_top_pgt);
>>>        __flush_tlb_all();
>>>    diff --git a/arch/x86/xen/Makefile b/arch/x86/xen/Makefile
>>> index 084de77a109e..102fad0b0bca 100644
>>> --- a/arch/x86/xen/Makefile
>>> +++ b/arch/x86/xen/Makefile
>>> @@ -1,3 +1,10 @@
>>> +KASAN_SANITIZE_enlighten_pv.o := n
>>> +KASAN_SANITIZE_enlighten.o := n
>>> +KASAN_SANITIZE_irq.o := n
>>> +KASAN_SANITIZE_mmu_pv.o := n
>>> +KASAN_SANITIZE_p2m.o := n
>>> +KASAN_SANITIZE_multicalls.o := n
>>> +
>>>    # SPDX-License-Identifier: GPL-2.0
>>>    OBJECT_FILES_NON_STANDARD_xen-asm_$(BITS).o := y
>>>    diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
>>> index ae4a41ca19f6..27de55699f24 100644
>>> --- a/arch/x86/xen/enlighten_pv.c
>>> +++ b/arch/x86/xen/enlighten_pv.c
>>> @@ -72,6 +72,7 @@
>>>    #include <asm/mwait.h>
>>>    #include <asm/pci_x86.h>
>>>    #include <asm/cpu.h>
>>> +#include <asm/kasan.h>
>>>      #ifdef CONFIG_ACPI
>>>    #include <linux/acpi.h>
>>> @@ -1231,6 +1232,8 @@ asmlinkage __visible void __init xen_start_kernel(void)
>>>        /* Get mfn list */
>>>        xen_build_dynamic_phys_to_machine();
>>>    +    kasan_early_init();
>>> +
>>>        /*
>>>         * Set up kernel GDT and segment registers, mainly so that
>>>         * -fstack-protector code can be executed.
>>> diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c
>>> index c8dbee62ec2a..eaf63f1f26af 100644
>>> --- a/arch/x86/xen/mmu_pv.c
>>> +++ b/arch/x86/xen/mmu_pv.c
>>> @@ -1079,7 +1079,7 @@ static void xen_exit_mmap(struct mm_struct *mm)
>>>      static void xen_post_allocator_init(void);
>>>    -static void __init pin_pagetable_pfn(unsigned cmd, unsigned long pfn)
>>> +void __init pin_pagetable_pfn(unsigned cmd, unsigned long pfn)
>>>    {
>>>        struct mmuext_op op;
>>>    @@ -1767,7 +1767,7 @@ static void __init set_page_prot_flags(void *addr, pgprot_t prot,
>>>        if (HYPERVISOR_update_va_mapping((unsigned long)addr, pte, flags))
>>>            BUG();
>>>    }
>>> -static void __init set_page_prot(void *addr, pgprot_t prot)
>>> +void __init set_page_prot(void *addr, pgprot_t prot)
>>>    {
>>>        return set_page_prot_flags(addr, prot, UVMF_NONE);
>>>    }
>>> @@ -1943,6 +1943,15 @@ void __init xen_setup_kernel_pagetable(pgd_t *pgd, unsigned long max_pfn)
>>>        if (i && i < pgd_index(__START_KERNEL_map))
>>>            init_top_pgt[i] = ((pgd_t *)xen_start_info->pt_base)[i];
>>>    +#ifdef CONFIG_KASAN
>>> +    /*
>>> +     * Copy KASAN mappings
>>> +     * ffffec0000000000 - fffffbffffffffff (=44 bits) kasan shadow memory (16TB)
>>> +     */
>>> +    for (i = 0xec0 >> 3; i < 0xfc0 >> 3; i++)
>>> +        init_top_pgt[i] = ((pgd_t *)xen_start_info->pt_base)[i];
>>> +#endif
>>> +
>>>        /* Make pagetable pieces RO */
>>>        set_page_prot(init_top_pgt, PAGE_KERNEL_RO);
>>>        set_page_prot(level3_ident_pgt, PAGE_KERNEL_RO);
>>> diff --git a/arch/x86/xen/multicalls.c b/arch/x86/xen/multicalls.c
>>> index 07054572297f..5e4729efbbe2 100644
>>> --- a/arch/x86/xen/multicalls.c
>>> +++ b/arch/x86/xen/multicalls.c
>>> @@ -99,6 +99,15 @@ void xen_mc_flush(void)
>>>                    ret++;
>>>        }
>>>    +    /*
>>> +     * XXX: Kasan produces quite a lot (~2000) of warnings in a form of:
>>> +     *
>>> +     *     (XEN) mm.c:3222:d155v0 mfn 3704b already pinned
>>> +     *
>>> +     * during kasan_init(). They are benign, but silence them for now.
>>> +     * Otherwise, booting takes too long due to printk() spam.
>>> +     */
>>> +#ifndef CONFIG_KASAN
>>
>> It might be interesting to identify the problematic page tables.
>>
>> I guess this would require some hacking to avoid the multicalls in order
>> to identify which page table should not be pinned again.
> 
> I tracked this down to xen_alloc_ptpage() in mmu_pv.c:
> 
> 			if (level == PT_PTE && USE_SPLIT_PTE_PTLOCKS)
> 				__pin_pagetable_pfn(MMUEXT_PIN_L1_TABLE, pfn);
> 
> kasan_populate_early_shadow() is doing lots pmd_populate_kernel() with
> kasan_early_shadow_pte (mfn of which is reported by Xen). Currently I'm not
> sure how to fix that. Is it possible to check that pfn has already been pinned
> from Linux kernel? xen_page_pinned() seems to be an incorrect way to check that.

Right, xen_page_pinned() is not yet working at this stage of booting.

But using pmd_populate_kernel() with the same page table multiple times
is just wrong. Doing so the first time is fine, all the other cases
should just use set_pmd().


Juergen

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ