lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 07 Jan 2013 21:14:54 -0600
From:	Simon Jeons <simon.jeons@...il.com>
To:	Shaohua Li <shli@...nel.org>
Cc:	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	akpm@...ux-foundation.org, mingo@...hat.com, hpa@...or.com,
	riel@...hat.com
Subject: Re: [RFC]x86: clearing access bit don't flush tlb

Hi Shaohua,

On Mon, 2013-01-07 at 16:12 +0800, Shaohua Li wrote:
> We use access bit to age a page at page reclaim. When clearing pte access bit,

Who sets this flag to pte? mmu? tlb?

> we could skip tlb flush for the virtual address. The side effect is if the pte
> is in tlb and pte access bit is unset, when cpu access the page again, cpu will
> not set pte's access bit. So next time page reclaim can reclaim hot pages
> wrongly, but this doesn't corrupt anything. And according to intel manual, tlb
> has less than 1k entries, which coverers < 4M memory. In today's system,
> several giga byte memory is normal. After page reclaim clears pte access bit
> and before cpu access the page again, it's quite unlikely this page's pte is
> still in TLB. Skiping the tlb flush for this case sounds ok to me.
> 

If one page is accessed more frequently than the other page before page
reclaim, page reclaim treat them the same hot according to access flag
since the flag used to age page just at page reclaim. How to handle this
issue?

> And in some workloads, TLB flush overhead is very heavy. In my simple
> multithread app with a lot of swap to several pcie SSD, removing the tlb flush
> gives about 20% ~ 30% swapout speedup.
> 
> Signed-off-by: Shaohua Li <shli@...ionio.com>
> ---
>  arch/x86/mm/pgtable.c |    7 +------
>  1 file changed, 1 insertion(+), 6 deletions(-)
> 
> Index: linux/arch/x86/mm/pgtable.c
> ===================================================================
> --- linux.orig/arch/x86/mm/pgtable.c	2012-12-17 16:54:37.847770807 +0800
> +++ linux/arch/x86/mm/pgtable.c	2013-01-07 14:59:40.898066357 +0800
> @@ -376,13 +376,8 @@ int pmdp_test_and_clear_young(struct vm_
>  int ptep_clear_flush_young(struct vm_area_struct *vma,
>  			   unsigned long address, pte_t *ptep)
>  {
> -	int young;
>  
> -	young = ptep_test_and_clear_young(vma, address, ptep);
> -	if (young)
> -		flush_tlb_page(vma, address);
> -
> -	return young;
> +	return ptep_test_and_clear_young(vma, address, ptep);
>  }
>  
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@...ck.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@...ck.org"> email@...ck.org </a>


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ