Date:   Tue, 15 May 2018 08:57:56 +0800
From:   "Huang, Ying" <ying.huang@...el.com>
To:     Andrew Morton <akpm@...ux-foundation.org>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        Huang Ying <ying.huang@...el.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Andi Kleen <andi.kleen@...el.com>, Jan Kara <jack@...e.cz>,
        Michal Hocko <mhocko@...e.com>,
        Matthew Wilcox <mawilcox@...rosoft.com>,
        Hugh Dickins <hughd@...gle.com>,
        Minchan Kim <minchan@...nel.org>, Shaohua Li <shli@...com>,
        Christopher Lameter <cl@...ux.com>,
        "Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
        Punit Agrawal <punit.agrawal@....com>,
        Anshuman Khandual <khandual@...ux.vnet.ibm.com>
Subject: [PATCH -mm] mm, hugetlb: Pass fault address to no page handler

From: Huang Ying <ying.huang@...el.com>

This is to take better advantage of the huge page clearing
optimization (c79b57e462b5d, "mm: hugetlb: clear target sub-page last
when clearing huge page"), which clears the to-be-accessed sub-page
last so that its cache lines are not evicted while the other
sub-pages are being cleared.  This requires the address of the
sub-page to be accessed, that is, the fault address inside the huge
page, so the hugetlb no-page fault handler is changed to pass that
information down.  This benefits workloads which do not access the
beginning of the huge page after the page fault.
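
For illustration, the following is a minimal userspace sketch of the
clearing order that c79b57e462b5d establishes; the names and the
simplified ordering are illustrative only, not the kernel's actual
clear_huge_page() implementation (which clears outward toward the
target sub-page rather than in a single pass):

#include <stddef.h>
#include <string.h>

#define SUBPAGE_SIZE 4096UL

/*
 * Clear a huge page made up of n_subpages sub-pages, leaving the
 * sub-page containing fault_addr for last so that its cache lines
 * are the most recently touched when the faulting thread resumes.
 */
static void clear_huge_page_sketch(char *huge_page, size_t n_subpages,
				   unsigned long fault_addr)
{
	size_t target = (fault_addr - (unsigned long)huge_page) / SUBPAGE_SIZE;
	size_t i;

	/*
	 * Clearing the other sub-pages first may evict their cache
	 * lines; the target sub-page, cleared last, stays warm.
	 */
	for (i = 0; i < n_subpages; i++) {
		if (i == target)
			continue;
		memset(huge_page + i * SUBPAGE_SIZE, 0, SUBPAGE_SIZE);
	}
	memset(huge_page + target * SUBPAGE_SIZE, 0, SUBPAGE_SIZE);
}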

With this patch, throughput increases ~28.1% in the vm-scalability
anon-w-seq test case with 88 processes on a 2-socket Xeon E5-2699 v4
system (44 cores, 88 threads).  The test case creates 88 processes;
each process mmaps a big anonymous memory area and writes to it from
the end to the beginning.  From the point of view of each process,
the other processes act as a background workload generating heavy
cache pressure.  At the same time, the cache miss rate is reduced
from ~36.3% to ~25.6%, the IPC (instructions per cycle) increases
from 0.3 to 0.37, and the time spent in user space is reduced by
~19.3%.
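
The access pattern can be reproduced with a small standalone program
along these lines (a hypothetical sketch of the pattern described
above, not the vm-scalability source; the 1 GiB size is arbitrary):

#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

int main(void)
{
	size_t size = 1UL << 30;	/* 1 GiB anonymous area */
	char *buf = mmap(NULL, size, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (buf == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	/*
	 * Writing from the end to the beginning means the first byte
	 * touched in each huge page is near its end; clearing that
	 * sub-page last keeps the about-to-be-used lines cached.
	 */
	for (size_t i = size; i-- > 0; )
		buf[i] = 1;
	munmap(buf, size);
	return 0;
}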

Signed-off-by: "Huang, Ying" <ying.huang@...el.com>
Cc: Andrea Arcangeli <aarcange@...hat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Cc: Andi Kleen <andi.kleen@...el.com>
Cc: Jan Kara <jack@...e.cz>
Cc: Michal Hocko <mhocko@...e.com>
Cc: Matthew Wilcox <mawilcox@...rosoft.com>
Cc: Hugh Dickins <hughd@...gle.com>
Cc: Minchan Kim <minchan@...nel.org>
Cc: Shaohua Li <shli@...com>
Cc: Christopher Lameter <cl@...ux.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>
Cc: Punit Agrawal <punit.agrawal@....com>
Cc: Anshuman Khandual <khandual@...ux.vnet.ibm.com>
---
 mm/hugetlb.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 129088710510..3de6326abf39 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3677,7 +3677,7 @@ int huge_add_to_page_cache(struct page *page, struct address_space *mapping,
 
 static int hugetlb_no_page(struct mm_struct *mm, struct vm_area_struct *vma,
 			   struct address_space *mapping, pgoff_t idx,
-			   unsigned long address, pte_t *ptep, unsigned int flags)
+			   unsigned long faddress, pte_t *ptep, unsigned int flags)
 {
 	struct hstate *h = hstate_vma(vma);
 	int ret = VM_FAULT_SIGBUS;
@@ -3686,6 +3686,7 @@ static int hugetlb_no_page(struct mm_struct *mm, struct vm_area_struct *vma,
 	struct page *page;
 	pte_t new_pte;
 	spinlock_t *ptl;
+	unsigned long address = faddress & huge_page_mask(h);
 
 	/*
 	 * Currently, we are forced to kill the process in the event the
@@ -3749,7 +3750,7 @@ static int hugetlb_no_page(struct mm_struct *mm, struct vm_area_struct *vma,
 				ret = VM_FAULT_SIGBUS;
 			goto out;
 		}
-		clear_huge_page(page, address, pages_per_huge_page(h));
+		clear_huge_page(page, faddress, pages_per_huge_page(h));
 		__SetPageUptodate(page);
 		set_page_huge_active(page);
 
@@ -3871,7 +3872,7 @@ u32 hugetlb_fault_mutex_hash(struct hstate *h, struct mm_struct *mm,
 #endif
 
 int hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
-			unsigned long address, unsigned int flags)
+			unsigned long faddress, unsigned int flags)
 {
 	pte_t *ptep, entry;
 	spinlock_t *ptl;
@@ -3883,8 +3884,7 @@ int hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	struct hstate *h = hstate_vma(vma);
 	struct address_space *mapping;
 	int need_wait_lock = 0;
-
-	address &= huge_page_mask(h);
+	unsigned long address = faddress & huge_page_mask(h);
 
 	ptep = huge_pte_offset(mm, address, huge_page_size(h));
 	if (ptep) {
@@ -3914,7 +3914,7 @@ int hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 
 	entry = huge_ptep_get(ptep);
 	if (huge_pte_none(entry)) {
-		ret = hugetlb_no_page(mm, vma, mapping, idx, address, ptep, flags);
+		ret = hugetlb_no_page(mm, vma, mapping, idx, faddress, ptep, flags);
 		goto out_mutex;
 	}
 
-- 
2.16.1
