Message-Id: <1446662406-4590-8-git-send-email-jsimmons@infradead.org>
Date: Wed, 4 Nov 2015 13:40:04 -0500
From: James Simmons <jsimmons@...radead.org>
To: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
devel@...verdev.osuosl.org, Oleg Drokin <oleg.drokin@...el.com>,
Andreas Dilger <andreas.dilger@...el.com>
Cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
lustre-devel@...ts.lustre.org, Fan Yong <fan.yong@...el.com>
Subject: [PATCH 08/10] staging: lustre: race condition for check/use cfs_fail_val
From: Fan Yong <fan.yong@...el.com>
There are race conditions between checking and using cfs_fail_val.
For example, consider this failure-injection stub for the LFSCK test:
764         if (OBD_FAIL_CHECK(OBD_FAIL_LFSCK_DELAY2) &&
765             cfs_fail_val > 0) {
766                 struct l_wait_info lwi;
767
768                 lwi = LWI_TIMEOUT(cfs_time_seconds(cfs_fail_val),
769                                   NULL, NULL);
770                 l_wait_event(thread->t_ctl_waitq,
771                              !thread_is_running(thread),
772                              &lwi);
773
774                 if (unlikely(!thread_is_running(thread))) {
775                         CDEBUG(D_LFSCK, "%s: scan dir exit for engine "
776                                "stop, parent "DFID", cookie "LPX64"\n",
777                                lfsck_lfsck2name(lfsck),
778                                PFID(lfsck_dto2fid(dir)),
779                                lfsck->li_cookie_dir);
780                         RETURN(0);
781                 }
782         }
Another thread may reset "cfs_fail_val" to zero after the check at
line 765 but before the value is used at line 768. The LFSCK engine
then falls into "wait" until someone runs "lfsck_stop".

Skip the sleep in __cfs_fail_timeout_set() unless ms is positive.
Signed-off-by: Fan Yong <fan.yong@...el.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6146
Reviewed-on: http://review.whamcloud.com/13481
Reviewed-by: Lai Siyao <lai.siyao@...el.com>
Reviewed-by: Andreas Dilger <andreas.dilger@...el.com>
---
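The check-then-use race described above can be avoided by reading the
shared value only once. The following is a minimal userspace sketch of
that idea, not the Lustre code: fake_fail_val and fail_delay_seconds()
are hypothetical names, and C11 atomics stand in for whatever the kernel
side would use. The patch below takes a different, smaller step and only
refuses to sleep when ms is not positive.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for cfs_fail_val: a delay that another thread
 * may reset to 0 at any time. */
static atomic_uint fake_fail_val;

/* Hypothetical helper: decide how long to wait, in seconds.
 * The shared value is loaded exactly once, so the value that passes the
 * "> 0" check is the same value that is handed back for use as the
 * timeout. A concurrent reset to 0 can no longer slip in between. */
static unsigned int fail_delay_seconds(bool fail_hit)
{
	unsigned int val = atomic_load(&fake_fail_val); /* single snapshot */

	if (!fail_hit || val == 0)
		return 0; /* never wait on a zero delay */
	return val;
}
```

A caller would wait only when fail_delay_seconds() returns a nonzero
value and would pass that returned value, not a second read of the
shared variable, to the timeout.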
drivers/staging/lustre/lustre/libcfs/fail.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/drivers/staging/lustre/lustre/libcfs/fail.c b/drivers/staging/lustre/lustre/libcfs/fail.c
index d39fece..ea059b0 100644
--- a/drivers/staging/lustre/lustre/libcfs/fail.c
+++ b/drivers/staging/lustre/lustre/libcfs/fail.c
@@ -126,7 +126,7 @@ int __cfs_fail_timeout_set(__u32 id, __u32 value, int ms, int set)
 	int ret;
 
 	ret = __cfs_fail_check_set(id, value, set);
-	if (ret) {
+	if (ret && likely(ms > 0)) {
 		CERROR("cfs_fail_timeout id %x sleeping for %dms\n",
 		       id, ms);
 		set_current_state(TASK_UNINTERRUPTIBLE);
--
1.7.1