Message-ID: <20241106134741.26948-1-othacehe@gnu.org>
Date: Wed,  6 Nov 2024 14:47:40 +0100
From: Mathieu Othacehe <othacehe@....org>
To: Theodore Ts'o <tytso@....edu>,
	Andreas Dilger <adilger.kernel@...ger.ca>
Cc: linux-ext4@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	lukas.skupinski@...disgyr.com,
	anton.reding@...disgyr.com,
	Mathieu Othacehe <othacehe@....org>
Subject: [PATCH 0/1] ext4: Prevent an infinite loop in the lazyinit thread.

Hello,

Under the following conditions, the lazyinit thread can reschedule itself
indefinitely without doing any work, consuming a large share of system
resources:

In the ext4_run_li_request function, a start_time timestamp is taken. Right
before elr->lr_timeout is computed, in the same function, the system clock is
updated from userspace, from the Unix Epoch to the current time. As a result,
elr->lr_timeout takes a very large value, and elr->lr_next_sched is then set
to a value far in the future.

/*
 * Away from jiffies because of a time jump when computing
 * elr->lr_timeout.
 */
elr->lr_next_sched = jiffies + elr->lr_timeout;
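
To make the effect of the clock jump concrete, here is a minimal userspace
sketch. It assumes the timeout is the elapsed time between two clock readings,
scaled by a wait multiplier and converted to jiffies. The HZ value, the
multiplier and the timestamps are hypothetical and chosen only for
illustration; this is not the kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants: neither HZ nor the multiplier is given above. */
#define SKETCH_HZ            100ULL
#define SKETCH_NSEC_PER_SEC  1000000000ULL
#define SKETCH_WAIT_MULT     10ULL

/* Timeout in jiffies derived from two clock readings, both in nanoseconds. */
static uint64_t sketch_timeout_jiffies(uint64_t start_ns, uint64_t end_ns)
{
	uint64_t elapsed_ns = end_ns - start_ns;

	return (elapsed_ns * SKETCH_WAIT_MULT) /
	       (SKETCH_NSEC_PER_SEC / SKETCH_HZ);
}

static void sketch_timeout_self_check(void)
{
	/* Clock steady: 2 ms of work gives a 2-jiffy timeout. */
	assert(sketch_timeout_jiffies(5000000000ULL, 5002000000ULL) == 2);

	/*
	 * Clock stepped from 5 s after the Epoch to a date in 2024 between
	 * the two readings: the "elapsed" time spans decades, and the
	 * timeout no longer even fits in 32 bits.
	 */
	assert(sketch_timeout_jiffies(5000000000ULL,
				      1730900000000000000ULL) > UINT32_MAX);
}
```

With a clock that cannot be stepped, the second case cannot occur, because
both readings advance at the same rate.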

Back in ext4_lazyinit_thread, which called ext4_run_li_request, the
following condition can be false:

// elr->lr_next_sched > next_wakeup
if (time_before(elr->lr_next_sched, next_wakeup))
        next_wakeup = elr->lr_next_sched;

so that next_wakeup is not updated. Assuming that next_wakeup was not updated
above and still has the MAX_JIFFY_OFFSET value, the following condition will
be true:

// next_wakeup == MAX_JIFFY_OFFSET
if ((time_after_eq(cur, next_wakeup)) ||
    (MAX_JIFFY_OFFSET == next_wakeup)) {
	cond_resched();
	continue;
}

causing us to process the li_request_list again. If we now have jiffies <
elr->lr_next_sched, and since we already have elr->lr_next_sched > next_wakeup,
we will just continue without updating next_wakeup,

// jiffies < elr->lr_next_sched && elr->lr_next_sched > next_wakeup
if (time_before(jiffies, elr->lr_next_sched)) {
	if (time_before(elr->lr_next_sched, next_wakeup))
		next_wakeup = elr->lr_next_sched;
	continue;
}

and again, we will call cond_resched because next_wakeup is not updated, and
we now have an infinite loop.

This was observed with the following values:

jiffies = 4294938821
elr->lr_next_sched = 1966790060
next_wakeup = 1073741822 (MAX_JIFFY_OFFSET)

on a 32-bit armv7 system without an RTC, when the system clock was updated
while the lazyinit thread was running.
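
The three comparisons can be checked against these values in userspace. The
sketch below uses 32-bit stand-ins for the wraparound-safe jiffies comparisons
(the helper names are hypothetical, not kernel API) and asserts the outcome of
each condition quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 32-bit stand-ins for time_before() and time_after_eq(). */
static bool before32(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

static bool after_eq32(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

/* Value of MAX_JIFFY_OFFSET reported above for the 32-bit system. */
#define SKETCH_MAX_JIFFY_OFFSET 1073741822u

static void check_reported_values(void)
{
	const uint32_t jiffies = 4294938821u;
	const uint32_t next_sched = 1966790060u;
	const uint32_t next_wakeup = SKETCH_MAX_JIFFY_OFFSET;

	/* The request is not due yet, so it is skipped ... */
	assert(before32(jiffies, next_sched));
	/* ... but it is not earlier than next_wakeup, which stays untouched. */
	assert(!before32(next_sched, next_wakeup));
	/* The current time is not past next_wakeup either ... */
	assert(!after_eq32(jiffies, next_wakeup));
	/* ... so only the sentinel test sends the thread around again. */
	assert(next_wakeup == SKETCH_MAX_JIFFY_OFFSET);
}
```

The second assertion is the key one: the difference between lr_next_sched and
MAX_JIFFY_OFFSET is positive as a signed 32-bit value, so the request can
never replace the sentinel.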

Fix that by using ktime_get_ns instead of ktime_get_real_ns, so that the
timeout is computed from the monotonic clock, and by using a boolean instead
of MAX_JIFFY_OFFSET to determine whether the next_wakeup value has been set.
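
As a rough illustration of the second part, the sketch below tracks whether
next_wakeup has been set with an explicit flag. It is a userspace model with
hypothetical names (wakeup_state, note_request, must_rescan), not the patch
itself, and it reuses the 32-bit values reported above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static bool before32(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

static bool after_eq32(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

/* Hypothetical state: the flag replaces the MAX_JIFFY_OFFSET sentinel. */
struct wakeup_state {
	uint32_t next_wakeup;
	bool valid;
};

/* Record a request's schedule time; the first one is always accepted. */
static void note_request(struct wakeup_state *ws, uint32_t next_sched)
{
	if (!ws->valid || before32(next_sched, ws->next_wakeup)) {
		ws->next_wakeup = next_sched;
		ws->valid = true;
	}
}

/* Rescan only if nothing was recorded or the wakeup time has passed. */
static bool must_rescan(const struct wakeup_state *ws, uint32_t cur)
{
	return !ws->valid || after_eq32(cur, ws->next_wakeup);
}

static void check_flag_variant(void)
{
	struct wakeup_state ws = { .next_wakeup = 0, .valid = false };

	note_request(&ws, 1966790060u);
	assert(ws.valid && ws.next_wakeup == 1966790060u);
	/* With the reported jiffies value the thread would now sleep. */
	assert(!must_rescan(&ws, 4294938821u));
}
```

Because the first request always sets next_wakeup, a schedule time that
compares as later than any sentinel value can no longer leave it unset.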

Thanks,

Mathieu

Mathieu Othacehe (1):
  ext4: Prevent an infinite loop in the lazyinit thread.

 fs/ext4/super.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

-- 
2.46.0

