Message-ID: <CAOSf1CHjkyX2NTex7dc1AEHXSDcWA_UGYX8NoSyHpb5s_RkwXQ@mail.gmail.com>
Date:   Thu, 28 Feb 2019 20:40:29 +1100
From:   Oliver <oohall@...il.com>
To:     "Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com>,
        Dan Williams <dan.j.williams@...el.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
        Jan Kara <jack@...e.cz>, Michael Ellerman <mpe@...erman.id.au>,
        Ross Zwisler <zwisler@...nel.org>,
        Linux MM <linux-mm@...ck.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linuxppc-dev <linuxppc-dev@...ts.ozlabs.org>
Subject: Re: [PATCH 2/2] mm/dax: Don't enable huge dax mapping by default

On Thu, Feb 28, 2019 at 7:35 PM Aneesh Kumar K.V
<aneesh.kumar@...ux.ibm.com> wrote:
>
> Add a flag to indicate the ability to do huge page dax mapping. On architectures
> like ppc64, the hypervisor can disable huge page support in the guest. In
> such a case, we should not enable huge page dax mapping. This patch adds
> a flag which the architecture code will update to indicate huge page
> dax mapping support.

*groan*
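
For reference, a rough sketch of the arch-side hook the commit message
implies. The function name and the idea of calling it from early MMU
setup are assumptions, not something in this patch:

	/*
	 * Hypothetical sketch only: the patch leaves it to arch code to
	 * clear the new flag when the hypervisor disallows hugepages.
	 */
	static void __init arch_disable_dax_hugepages(void)
	{
		if (!has_transparent_hugepage())
			clear_bit(TRANSPARENT_HUGEPAGE_DAX_FLAG,
				  &transparent_hugepage_flags);
	}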

> Architectures mostly set transparent_hugepage_flags = 0 if they can't
> do hugepages. That also takes care of disabling dax hugepage mapping
> with this change.
>
> Without this patch we get the below error with kvm on ppc64.
>
> [  118.849975] lpar: Failed hash pte insert with error -4
>
> NOTE: With this patch,
>
> echo never > /sys/kernel/mm/transparent_hugepage/enabled
>
> also disables dax huge page mapping.
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@...ux.ibm.com>
> ---
> TODO:
> * Add Fixes: tag
>
>  include/linux/huge_mm.h | 4 +++-
>  mm/huge_memory.c        | 4 ++++
>  2 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index 381e872bfde0..01ad5258545e 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -53,6 +53,7 @@ vm_fault_t vmf_insert_pfn_pud(struct vm_area_struct *vma, unsigned long addr,
>                         pud_t *pud, pfn_t pfn, bool write);
>  enum transparent_hugepage_flag {
>         TRANSPARENT_HUGEPAGE_FLAG,
> +       TRANSPARENT_HUGEPAGE_DAX_FLAG,
>         TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG,
>         TRANSPARENT_HUGEPAGE_DEFRAG_DIRECT_FLAG,
>         TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_FLAG,
> @@ -111,7 +112,8 @@ static inline bool __transparent_hugepage_enabled(struct vm_area_struct *vma)
>         if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_FLAG))
>                 return true;
>
> -       if (vma_is_dax(vma))
> +       if (vma_is_dax(vma) &&
> +           (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_DAX_FLAG)))
>                 return true;

Forcing PTE-sized faults should be fine for fsdax, but it'll break
devdax. The devdax driver requires that the fault size be >= the
namespace alignment, since devdax tries to guarantee hugepage mappings
will be used and PMD alignment is the default. We can probably have
devdax fall back to the largest size the hypervisor has made available,
but it does run contrary to the design. Ah well, I suppose it's better
off being degraded rather than unusable.
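
To illustrate that fallback idea, a sketch of how the devdax fault path
could degrade instead of refusing the fault. This is not the existing
driver code; dev_dax_pmd_fault()/dev_dax_pte_fault() below are made-up
stand-ins for the real handlers:

	static vm_fault_t dev_dax_fault_sketch(struct vm_fault *vmf)
	{
		/* Use PMD faults only when huge dax mappings are allowed. */
		if (transparent_hugepage_flags &
		    (1 << TRANSPARENT_HUGEPAGE_DAX_FLAG))
			return dev_dax_pmd_fault(vmf);	/* hypothetical */

		/* Otherwise fall back to PTE-sized faults. */
		return dev_dax_pte_fault(vmf);		/* hypothetical */
	}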

>         if (transparent_hugepage_flags &
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index faf357eaf0ce..43d742fe0341 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -53,6 +53,7 @@ unsigned long transparent_hugepage_flags __read_mostly =
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE_MADVISE
>         (1<<TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG)|
>  #endif
> +       (1 << TRANSPARENT_HUGEPAGE_DAX_FLAG) |
>         (1<<TRANSPARENT_HUGEPAGE_DEFRAG_REQ_MADV_FLAG)|
>         (1<<TRANSPARENT_HUGEPAGE_DEFRAG_KHUGEPAGED_FLAG)|
>         (1<<TRANSPARENT_HUGEPAGE_USE_ZERO_PAGE_FLAG);
> @@ -475,6 +476,8 @@ static int __init setup_transparent_hugepage(char *str)
>                           &transparent_hugepage_flags);
>                 clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG,
>                           &transparent_hugepage_flags);
> +               clear_bit(TRANSPARENT_HUGEPAGE_DAX_FLAG,
> +                         &transparent_hugepage_flags);
>                 ret = 1;
>         }
>  out:

> @@ -753,6 +756,7 @@ static void insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr,
>         spinlock_t *ptl;
>
>         ptl = pmd_lock(mm, pmd);
> +       /* should we check for none here again? */

VM_WARN_ON() maybe? If THP is disabled and we're here, then something
has gone wrong.
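
Concretely, an untested sketch of where such a warning could sit,
reusing the helper this patch already touches:

	ptl = pmd_lock(mm, pmd);
	/*
	 * Getting here with huge dax mappings disabled would be a caller
	 * bug, so warn rather than quietly re-checking.
	 */
	VM_WARN_ON(!__transparent_hugepage_enabled(vma));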

>         entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));
>         if (pfn_t_devmap(pfn))
>                 entry = pmd_mkdevmap(entry);
> --
> 2.20.1
>
