Message-Id: <1382033352-21225-1-git-send-email-damien.ramonda@intel.com>
Date:	Thu, 17 Oct 2013 20:09:12 +0200
From:	Damien Ramonda <damien.ramonda@...el.com>
To:	linux-mm@...ck.org
Cc:	linux-kernel@...r.kernel.org, damien.ramonda@...el.com,
	pierre.tardy@...el.com, fengguang.wu@...el.com,
	david.a.cohen@...el.com
Subject: [PATCH] readahead: fix sequential read cache miss detection

The kernel's readahead algorithm sometimes interprets random read
accesses as sequential and triggers unnecessary data prefetching
from the storage device (increasing average random read latency).

In order to identify sequential cache read misses, the readahead
algorithm intends to check whether offset - previous offset == 1
(trivial sequential reads) or offset - previous offset == 0
(sequential reads not aligned on page boundary):

if (offset - (ra->prev_pos >> PAGE_CACHE_SHIFT) <= 1UL)

The current offset is stored in the "offset" variable of type
"pgoff_t" (unsigned long), while the previous offset is stored in
"ra->prev_pos" of type "loff_t" (long long). Therefore, the
operands of the if statement are implicitly converted to type
long long. Consequently, when previous offset > current offset
(which happens with a random access pattern), the difference is
negative, the if condition is true, and the access is wrongly
interpreted as sequential. Unnecessary data prefetching is
triggered, impacting the average random read latency.

Storing the previous offset value in a "pgoff_t" variable
(unsigned long) fixes the sequential read detection logic.
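
The conversion issue described above can be reproduced outside the
kernel. The following is only an illustrative sketch, not kernel
code: demo_pgoff_t, demo_loff_t, DEMO_PAGE_SHIFT and the two helper
functions are hypothetical names, the page shift of 12 is an
assumption, and uint32_t is used to model an unsigned long that is
narrower than long long (as on 32-bit builds, which is where the
conversion to long long takes place).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins, chosen so the demo behaves the same on any host. */
#define DEMO_PAGE_SHIFT 12          /* assumed 4 KiB pages */
typedef uint32_t demo_pgoff_t;      /* models a 32-bit unsigned long */
typedef int64_t  demo_loff_t;       /* models long long */

/* Old check: offset is converted to the signed 64-bit type, so a
 * backward jump yields a negative difference, which is <= 1. */
static int seq_miss_buggy(demo_pgoff_t offset, demo_loff_t prev_pos)
{
	return offset - (prev_pos >> DEMO_PAGE_SHIFT) <= (demo_pgoff_t)1;
}

/* Check in the style of the patch: the previous page index is stored
 * in the unsigned page-offset type first, so a backward jump wraps to
 * a large unsigned value instead of going negative. */
static int seq_miss_fixed(demo_pgoff_t offset, demo_loff_t prev_pos)
{
	demo_pgoff_t prev_offset =
		(demo_pgoff_t)((uint64_t)prev_pos >> DEMO_PAGE_SHIFT);

	return offset - prev_offset <= (demo_pgoff_t)1;
}

/* Small self-check; the previous read ended in page 100. */
void demo_self_check(void)
{
	demo_loff_t prev = (demo_loff_t)100 << DEMO_PAGE_SHIFT;

	assert(seq_miss_buggy(10, prev));    /* backward jump: false positive */
	assert(!seq_miss_fixed(10, prev));   /* backward jump: rejected */
	assert(seq_miss_fixed(101, prev));   /* next page: sequential */
	assert(seq_miss_fixed(100, prev));   /* same page: unaligned sequential */
	assert(!seq_miss_fixed(500, prev));  /* forward jump: random */
}
```

Both helpers agree on forward accesses; they differ only when the
current page index is lower than the previous one, which is the
random-read case this patch addresses.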

Signed-off-by: Damien Ramonda <damien.ramonda@...el.com>
Reviewed-by: Fengguang Wu <fengguang.wu@...el.com>
Acked-by: Pierre Tardy <pierre.tardy@...el.com>
Acked-by: David Cohen <david.a.cohen@...ux.intel.com>
---
 mm/readahead.c |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index e4ed041..5b637b5 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -401,6 +401,7 @@ ondemand_readahead(struct address_space *mapping,
 		   unsigned long req_size)
 {
 	unsigned long max = max_sane_readahead(ra->ra_pages);
+	pgoff_t prev_offset;
 
 	/*
 	 * start of file
@@ -452,8 +453,11 @@ ondemand_readahead(struct address_space *mapping,
 
 	/*
 	 * sequential cache miss
+	 * trivial case: (offset - prev_offset) == 1
+	 * unaligned reads: (offset - prev_offset) == 0
 	 */
-	if (offset - (ra->prev_pos >> PAGE_CACHE_SHIFT) <= 1UL)
+	prev_offset = (unsigned long long)ra->prev_pos >> PAGE_CACHE_SHIFT;
+	if (offset - prev_offset <= 1UL)
 		goto initial_readahead;
 
 	/*
-- 
1.7.0.4
