Message-ID: <1350903750.2768.90.camel@twins>
Date: Mon, 22 Oct 2012 13:02:30 +0200
From: Peter Zijlstra <a.p.zijlstra@...llo.nl>
To: linux-kernel@...r.kernel.org, mingo@...nel.org, hpa@...or.com,
torvalds@...ux-foundation.org, riel@...hat.com,
akpm@...ux-foundation.org, aarcange@...hat.com, tglx@...utronix.de
Cc: linux-tip-commits@...r.kernel.org
Subject: Re: [tip:numa/core] x86, mm: Prevent gcc to re-read the pagetables

On Sun, 2012-10-21 at 05:56 -0700, tip-bot for Andrea Arcangeli wrote:
> In get_user_pages_fast() the TLB shootdown code can clear the pagetables
> before firing any TLB flush (the page can't be freed until the TLB
> flushing IPI has been delivered but the pagetables will be cleared well
> before sending any TLB flushing IPI).

I think we want to do this for all gup_fast() implementations. When I
reported this issue I also proposed adding something like
page_table_deref(), which we could use throughout. Not sure we want to,
but at least all archs need an audit for this.
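
For concreteness, here is a minimal userspace sketch of what such a helper
could look like. Everything in it is an assumption for illustration: the
thread only names page_table_deref(), so the signature, the toy_* names and
the entry layout (present flag in bit 0, pfn from bit 12) are hypothetical
and do not match any arch's real format. ACCESS_ONCE() is redefined locally
using the GCC/Clang __typeof__ extension, and the volatile load is done on
the scalar member rather than on the struct.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel macro: the volatile-qualified access
 * makes the compiler emit the load once and not re-read the location. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Toy entry type (assumption); the real pmd_t is arch-specific. */
typedef struct { uint64_t val; } pmd_t;

static_assert(sizeof(pmd_t) == sizeof(uint64_t),
	      "toy entry is expected to be a single 64-bit word");

/* Hypothetical bit layout, for illustration only. */
#define TOY_PRESENT   0x1ULL
#define TOY_PFN_SHIFT 12

/* Hypothetical helper: sample the entry once and return the snapshot. */
static inline pmd_t page_table_deref(pmd_t *pmdp)
{
	pmd_t snap = { .val = ACCESS_ONCE(pmdp->val) };
	return snap;
}

/* Toy walker step: the check and the use both operate on the snapshot,
 * so a concurrent clear of *pmdp cannot be seen between the two. */
static inline int toy_gup_pmd(pmd_t *pmdp, uint64_t *pfn)
{
	pmd_t pmd = page_table_deref(pmdp);

	if (!(pmd.val & TOY_PRESENT))
		return 0;		/* entry is empty or was cleared */
	*pfn = pmd.val >> TOY_PFN_SHIFT;
	return 1;
}
```

With this shape the callers never dereference pmdp directly, so there is no
second load for the compiler to introduce between the check and the use.
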
> ---
> arch/x86/mm/gup.c | 23 ++++++++++++++++++++---
> mm/memory.c | 2 +-
> 2 files changed, 21 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/mm/gup.c b/arch/x86/mm/gup.c
> index dd74e46..6dc9921 100644
> --- a/arch/x86/mm/gup.c
> +++ b/arch/x86/mm/gup.c
> @@ -150,7 +150,13 @@ static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
>
> pmdp = pmd_offset(&pud, addr);
> do {
> - pmd_t pmd = *pmdp;
> + /*
> + * With THP and hugetlbfs the pmd can change from
> + * under us and it can be cleared as well by the TLB
> + * shootdown, so read it with ACCESS_ONCE to do all
> + * computations on the same sampling.
> + */
> + pmd_t pmd = ACCESS_ONCE(*pmdp);
>
> next = pmd_addr_end(addr, end);
> /*
> @@ -220,7 +226,13 @@ static int gup_pud_range(pgd_t pgd, unsigned long addr, unsigned long end,
>
> pudp = pud_offset(&pgd, addr);
> do {
> - pud_t pud = *pudp;
> + /*
> + * With hugetlbfs giga pages the pud can change from
> + * under us and it can be cleared as well by the TLB
> + * shootdown, so read it with ACCESS_ONCE to do all
> + * computations on the same sampling.
> + */
> + pud_t pud = ACCESS_ONCE(*pudp);
>
> next = pud_addr_end(addr, end);
> if (pud_none(pud))
> @@ -280,7 +292,12 @@ int __get_user_pages_fast(unsigned long start, int nr_pages, int write,
> local_irq_save(flags);
> pgdp = pgd_offset(mm, addr);
> do {
> - pgd_t pgd = *pgdp;
> + /*
> + * The pgd could be cleared by the TLB shootdown from
> + * under us so read it with ACCESS_ONCE to do all
> + * computations on the same sampling.
> + */
> + pgd_t pgd = ACCESS_ONCE(*pgdp);
>
> next = pgd_addr_end(addr, end);
> if (pgd_none(pgd))
> diff --git a/mm/memory.c b/mm/memory.c
> index cc8e280..c0de477 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3555,7 +3555,7 @@ int handle_pte_fault(struct mm_struct *mm,
> pte_t entry;
> spinlock_t *ptl;
>
> - entry = *pte;
> + entry = ACCESS_ONCE(*pte);
> if (!pte_present(entry)) {
> if (pte_none(entry)) {
> if (vma->vm_ops) {