Message-ID: <1646803679-11433-1-git-send-email-quic_charante@quicinc.com>
Date:   Wed, 9 Mar 2022 10:57:59 +0530
From:   Charan Teja Kalla <quic_charante@...cinc.com>
To:     <akpm@...ux-foundation.org>, <yuehaibing@...wei.com>,
        <minchan@...nel.org>, <sfr@...b.auug.org.au>,
        <rientjes@...gle.com>, <edgararriaga@...gle.com>, <mhocko@...e.com>
CC:     <linux-mm@...ck.org>, <linux-kernel@...r.kernel.org>,
        Charan Teja Kalla <quic_charante@...cinc.com>
Subject: [PATCH] mm: madvise: return correct bytes advised with process_madvise

The process_madvise() system call returns an error even after it has
processed some of the VMAs passed in the 'struct iovec' vector list,
which leaves the user unable to tell where to restart the advice. This
also contradicts the syscall's man page[1], which states that the
"return value may be less than the total number of requested bytes, if
an error occurred after some iovec elements were already processed.".

Consider a user who passes 10 VMAs in the 'struct iovec' vector list, of
which the first 9 are processed and the last one fails. The syscall then
returns only the error from the failed VMA, even though the first 9 VMAs
were processed, so the user cannot tell which VMA failed. Returning the
number of bytes processed instead lets the user identify the VMA that
failed and retry or skip the advice on it.
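
As an illustration of how a caller could use the byte count, here is a
minimal user-space sketch. The helper name first_unadvised_iov() is
hypothetical and not part of this patch. It assumes the kernel only ever
counts whole iovec elements, as the loop in mm/madvise.c does, so the
returned byte count is the sum of the lengths of the elements that were
advised.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical helper (not part of the patch): map the byte count
 * returned by process_madvise() back to the index of the first iovec
 * element that was not advised. Returns vlen when every element was
 * processed.
 */
static size_t first_unadvised_iov(const struct iovec *vec, size_t vlen,
				  ssize_t advised)
{
	size_t done = 0, i;

	assert(advised >= 0);
	assert(vec != NULL || vlen == 0);

	for (i = 0; i < vlen; i++) {
		/* Stop at the first element not fully covered by 'advised'. */
		if (done + vec[i].iov_len > (size_t)advised)
			break;
		done += vec[i].iov_len;
	}
	return i;
}
```

With 10 equally sized elements and a return value covering 9 of them,
the helper returns index 9, so the caller can retry or skip that element
and resubmit the rest of the vector.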

[1] https://man7.org/linux/man-pages/man2/process_madvise.2.html

Fixes: ecb8ac8b1f14 ("mm/madvise: introduce process_madvise() syscall: an external memory hinting API")
Signed-off-by: Charan Teja Kalla <quic_charante@...cinc.com>
---
 mm/madvise.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index 38d0f51..d3b49b3 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1426,15 +1426,21 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
 
 	while (iov_iter_count(&iter)) {
 		iovec = iov_iter_iovec(&iter);
+		/*
+		 * Even when [start, end) passed to do_madvise() covers
+		 * some unmapped addresses, it continues processing and
+		 * returns -ENOMEM at the end. Thus consider the range
+		 * as processed when do_madvise() returns -ENOMEM.
+		 * This makes process_madvise() never return -ENOMEM.
+		 */
 		ret = do_madvise(mm, (unsigned long)iovec.iov_base,
 					iovec.iov_len, behavior);
-		if (ret < 0)
+		if (ret < 0 && ret != -ENOMEM)
 			break;
 		iov_iter_advance(&iter, iovec.iov_len);
 	}
 
-	if (ret == 0)
-		ret = total_len - iov_iter_count(&iter);
+	ret = (total_len - iov_iter_count(&iter)) ? : ret;
 
 release_mm:
 	mmput(mm);
-- 
2.7.4
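
For readers who want to check the return-value rule outside the kernel,
the following is a rough user-space model of the patched loop, not the
kernel code itself. The function model_process_madvise() and its
lens/rets arrays are made up for illustration: rets[i] stands in for
what do_madvise() would return for element i, and the empty-vector case
simply yields 0 here. The GNU "a ?: b" shorthand used in the patch is
spelled out as a full conditional.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical model of the patched loop: lens[i] is the length of
 * iovec element i and rets[i] is the simulated do_madvise() result
 * for that element.
 */
static ssize_t model_process_madvise(const size_t *lens, const int *rets,
				     size_t n)
{
	size_t total = 0, remaining, i;
	ssize_t ret = 0;

	for (i = 0; i < n; i++)
		total += lens[i];
	remaining = total;

	for (i = 0; i < n; i++) {
		ret = rets[i];
		/* -ENOMEM counts as processed; any other error stops. */
		if (ret < 0 && ret != -ENOMEM)
			break;
		remaining -= lens[i];
	}

	/* Same as: ret = (total - remaining) ? : ret; */
	return (total - remaining) ? (ssize_t)(total - remaining) : ret;
}
```

In this model an error on the first element is still reported as an
error, while an error on a later element yields the number of bytes
advised before it.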
