Message-ID: <8cd026a0-ada6-9ae5-9ea1-a685b482173c@kernel.dk>
Date:   Mon, 1 Mar 2021 18:35:45 -0700
From:   Jens Axboe <axboe@...nel.dk>
To:     "Alex Xu (Hello71)" <alex_y_xu@...oo.ca>,
        linux-kernel@...r.kernel.org, linux-block@...r.kernel.org
Subject: Re: 5.12-rc1 regression: freezing iou-mgr/wrk failed

On 3/1/21 6:25 PM, Jens Axboe wrote:
> On 3/1/21 6:11 PM, Jens Axboe wrote:
>> On 3/1/21 5:57 PM, Alex Xu (Hello71) wrote:
>>> Hi,
>>>
>>> On Linux 5.12-rc1, I am unable to suspend to RAM. The system freezes for 
>>> about 40 seconds and then continues operation. The following messages 
>>> are printed to the kernel log:
>>>
>>> [  240.650300] PM: suspend entry (deep)
>>> [  240.650748] Filesystems sync: 0.000 seconds
>>> [  240.725605] Freezing user space processes ...
>>> [  260.739483] Freezing of tasks failed after 20.013 seconds (3 tasks refusing to freeze, wq_busy=0):
>>> [  260.739497] task:iou-mgr-446     state:S stack:    0 pid:  516 ppid:   439 flags:0x00004224
>>> [  260.739504] Call Trace:
>>> [  260.739507]  ? sysvec_apic_timer_interrupt+0xb/0x81
>>> [  260.739515]  ? pick_next_task_fair+0x197/0x1cde
>>> [  260.739519]  ? sysvec_reschedule_ipi+0x2f/0x6a
>>> [  260.739522]  ? asm_sysvec_reschedule_ipi+0x12/0x20
>>> [  260.739525]  ? __schedule+0x57/0x6d6
>>> [  260.739529]  ? del_timer_sync+0xb9/0x115
>>> [  260.739533]  ? schedule+0x63/0xd5
>>> [  260.739536]  ? schedule_timeout+0x219/0x356
>>> [  260.739540]  ? __next_timer_interrupt+0xf1/0xf1
>>> [  260.739544]  ? io_wq_manager+0x73/0xb1
>>> [  260.739549]  ? io_wq_create+0x262/0x262
>>> [  260.739553]  ? ret_from_fork+0x22/0x30
>>> [  260.739557] task:iou-mgr-517     state:S stack:    0 pid:  522 ppid:   439 flags:0x00004224
>>> [  260.739561] Call Trace:
>>> [  260.739563]  ? sysvec_apic_timer_interrupt+0xb/0x81
>>> [  260.739566]  ? pick_next_task_fair+0x16f/0x1cde
>>> [  260.739569]  ? sysvec_apic_timer_interrupt+0xb/0x81
>>> [  260.739571]  ? asm_sysvec_apic_timer_interrupt+0x12/0x20
>>> [  260.739574]  ? __schedule+0x5b7/0x6d6
>>> [  260.739578]  ? del_timer_sync+0x70/0x115
>>> [  260.739581]  ? schedule_timeout+0x211/0x356
>>> [  260.739585]  ? __next_timer_interrupt+0xf1/0xf1
>>> [  260.739588]  ? io_wq_check_workers+0x15/0x11f
>>> [  260.739592]  ? io_wq_manager+0x69/0xb1
>>> [  260.739596]  ? io_wq_create+0x262/0x262
>>> [  260.739600]  ? ret_from_fork+0x22/0x30
>>> [  260.739603] task:iou-wrk-517     state:S stack:    0 pid:  523 ppid:   439 flags:0x00004224
>>> [  260.739607] Call Trace:
>>> [  260.739609]  ? __schedule+0x5b7/0x6d6
>>> [  260.739614]  ? schedule+0x63/0xd5
>>> [  260.739617]  ? schedule_timeout+0x219/0x356
>>> [  260.739621]  ? __next_timer_interrupt+0xf1/0xf1
>>> [  260.739624]  ? task_thread.isra.0+0x148/0x3af
>>> [  260.739628]  ? task_thread_unbound+0xa/0xa
>>> [  260.739632]  ? task_thread_bound+0x7/0x7
>>> [  260.739636]  ? ret_from_fork+0x22/0x30
>>> [  260.739647] OOM killer enabled.
>>> [  260.739648] Restarting tasks ... done.
>>> [  260.740077] PM: suspend exit
>>>
>>> and then a set of similar messages except with s2idle instead of deep.
>>>
>>> Reverting 5695e51619 ("Merge tag 'io_uring-worker.v3-2021-02-25' of 
>>> git://git.kernel.dk/linux-block") appears to resolve the issue. I have 
>>> not yet bisected further. Let me know which troubleshooting steps I 
>>> should perform next.
>>
>> Can you try and pull in:
>>
>> git://git.kernel.dk/linux-block io_uring-5.12
>>
>> and see if that resolves it? I usually run -git on my laptop as
>> well, but something broke it in the merge window so I need to figure
>> out what that is first...
>>
>> What distro are you running?
> 
> You probably want this on top...

And if you've verified that that one works OK, can you try this variant
instead?

diff --git a/fs/io-wq.c b/fs/io-wq.c
index 1fdb2b621b51..fe004cf93c4b 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -16,6 +16,7 @@
 #include <linux/rculist_nulls.h>
 #include <linux/cpu.h>
 #include <linux/tracehook.h>
+#include <linux/freezer.h>
 
 #include "../kernel/sched/sched.h"
 #include "io-wq.h"
@@ -480,6 +481,7 @@ static int io_wqe_worker(void *data)
 	io_worker_start(worker);
 
 	while (!test_bit(IO_WQ_BIT_EXIT, &wq->state)) {
+		try_to_freeze();
 		set_current_state(TASK_INTERRUPTIBLE);
 loop:
 		raw_spin_lock_irq(&wqe->lock);
@@ -731,6 +733,7 @@ static int io_wq_manager(void *data)
 		set_current_state(TASK_INTERRUPTIBLE);
 		io_wq_check_workers(wq);
 		schedule_timeout(HZ);
+		try_to_freeze();
 		if (fatal_signal_pending(current))
 			set_bit(IO_WQ_BIT_EXIT, &wq->state);
 	} while (!test_bit(IO_WQ_BIT_EXIT, &wq->state));
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 2757675ab417..03c42f1f9862 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -74,13 +74,11 @@
 #include <linux/fsnotify.h>
 #include <linux/fadvise.h>
 #include <linux/eventpoll.h>
-#include <linux/fs_struct.h>
 #include <linux/splice.h>
 #include <linux/task_work.h>
 #include <linux/pagemap.h>
 #include <linux/io_uring.h>
-#include <linux/blk-cgroup.h>
-#include <linux/audit.h>
+#include <linux/freezer.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/io_uring.h>
@@ -6744,6 +6748,7 @@ static int io_sq_thread(void *data)
 				io_ring_set_wakeup_flag(ctx);
 
 			schedule();
+			try_to_freeze();
 			list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
 				io_ring_clear_wakeup_flag(ctx);
 		}

-- 
Jens Axboe
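
Note on the pattern in the patch: the log in the report shows the suspend
path giving up after about 20 seconds because the iou-mgr and iou-wrk
threads never parked, and the diff adds try_to_freeze() calls inside their
loops. The block below is a minimal userspace sketch of that cooperative
idea using pthreads. It is not kernel code and not the freezer's actual
implementation: struct sim_freezer, the sim_* functions and freeze_demo()
are hypothetical names made up for illustration, and unlike the kernel
this sketch waits without a timeout.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for freezer state; not a kernel structure. */
struct sim_freezer {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool freezing;	/* set by the "suspend" side */
	bool exiting;	/* plays the role of IO_WQ_BIT_EXIT */
	int frozen;	/* workers currently parked */
};

/* Analogue of try_to_freeze(): park here while a freeze is requested. */
static bool sim_try_to_freeze(struct sim_freezer *f)
{
	bool was_frozen = false;

	pthread_mutex_lock(&f->lock);
	if (f->freezing) {
		was_frozen = true;
		f->frozen++;
		pthread_cond_broadcast(&f->cond);	/* tell the freezer we parked */
		while (f->freezing)
			pthread_cond_wait(&f->cond, &f->lock);
		f->frozen--;
	}
	pthread_mutex_unlock(&f->lock);
	return was_frozen;
}

/* Worker loop: it only freezes because it polls at a safe point. */
static void *sim_worker(void *data)
{
	struct sim_freezer *f = data;
	bool done;

	do {
		sim_try_to_freeze(f);
		sched_yield();	/* stand-in for one unit of work or a sleep */
		pthread_mutex_lock(&f->lock);
		done = f->exiting;
		pthread_mutex_unlock(&f->lock);
	} while (!done);
	return NULL;
}

/* "Suspend" side: request a freeze and wait until n workers have parked. */
static void sim_freeze(struct sim_freezer *f, int n)
{
	pthread_mutex_lock(&f->lock);
	f->freezing = true;
	while (f->frozen < n)
		pthread_cond_wait(&f->cond, &f->lock);
	pthread_mutex_unlock(&f->lock);
}

/* "Resume" side: optionally ask workers to exit, then release them. */
static void sim_thaw(struct sim_freezer *f, bool then_exit)
{
	pthread_mutex_lock(&f->lock);
	f->exiting = then_exit;
	f->freezing = false;
	pthread_cond_broadcast(&f->cond);
	pthread_mutex_unlock(&f->lock);
}

/* Start nworkers threads, freeze them, and return how many were parked. */
int freeze_demo(int nworkers)
{
	struct sim_freezer f = { .freezing = false, .exiting = false, .frozen = 0 };
	pthread_t tids[16];
	int i, rc, parked;

	if (nworkers < 1 || nworkers > 16)
		return -1;
	pthread_mutex_init(&f.lock, NULL);
	pthread_cond_init(&f.cond, NULL);
	for (i = 0; i < nworkers; i++) {
		rc = pthread_create(&tids[i], NULL, sim_worker, &f);
		assert(rc == 0);
		(void)rc;
	}
	sim_freeze(&f, nworkers);
	pthread_mutex_lock(&f.lock);
	parked = f.frozen;
	pthread_mutex_unlock(&f.lock);
	sim_thaw(&f, true);
	for (i = 0; i < nworkers; i++)
		pthread_join(tids[i], NULL);
	pthread_cond_destroy(&f.cond);
	pthread_mutex_destroy(&f.lock);
	return parked;
}
```

Calling freeze_demo(3) starts three workers and returns once all of them
have parked and been released. If sim_worker() never called
sim_try_to_freeze(), sim_freeze() would wait indefinitely, which is the
userspace counterpart of the "tasks refusing to freeze" message in the
report.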
