Date:   Wed, 19 Aug 2020 14:12:24 +0100
From:   Jim Baxter <jim_baxter@...tor.com>
To:     <linux-fsdevel@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        <linux-mm@...ck.org>, <linux-block@...r.kernel.org>
CC:     "Resch Carsten (CM/ESO6)" <Carsten.Resch@...bosch.com>,
        "Rosca, Eugeniu (ADITG/ESB)" <erosca@...adit-jv.com>
Subject: Re: PROBLEM: Long Workqueue delays V2

Adding the linux-block list, which may also be relevant to this issue.

-------- Original Message --------
Subject: PROBLEM: Long Workqueue delays V2
From: Jim Baxter <jim_baxter@...tor.com>
To: linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org, linux-mm@...ck.org
CC: "Resch Carsten (CM/ESO6)" <Carsten.Resch@...bosch.com>, "Rosca, Eugeniu (ADITG/ESB)" <erosca@...adit-jv.com>
Date: Tue, 18 Aug 2020 12:58:13 +0100

> I am asking this question again to include the fs-devel list.
> 
> 
> We have an issue with the kernel workqueue overloading CPU 0 when we
> disconnect a USB stick.
> 
> This results in other items on the shared workqueue being delayed by
> around 6.5 seconds with a default kernel configuration and 2.3 seconds
> on a config tailored for our RCar embedded platform.
> 
> 
> 
> We first noticed this issue on custom hardware and have recreated it
> on an RCar Starter Kit using a test module [1] to replicate the
> behaviour. The test module reports any delay greater than 9 ms.
> 
> To run the test we place a 4 GB random file on a USB stick, mount the
> stick read-only, and copy data from it as follows:
> 
> - Remount the stick read-only:
> # mount -o ro,remount /dev/sda1
> 
> - Load the Module:
> # taskset -c 0 modprobe latency-mon
> 
> - Copy a large amount of data from the stick:
> # dd if=/run/media/sda1/sample.txt of=/dev/zero
> [ 1437.517603] DELAY: 10
> 8388607+1 records in
> 8388607+1 records out
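> 
> (For reference: dd's default block size is 512 bytes, so
> 8388608 records * 512 bytes = 4294967296 bytes = 4 GiB, i.e. the whole
> sample file is read before the stick is disconnected.)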
> 
> 
> - Disconnect the USB stick:
> [ 1551.796792] usb 2-1: USB disconnect, device number 2
> [ 1558.625517] DELAY: 6782
> 
> 
> The DELAY value printed above (6782) is in milliseconds.
> 
> 
> 
> Unmounting the stick with umount before disconnecting stops the issue
> from occurring, but unfortunately a clean unmount is not guaranteed in
> our particular system.
> 
> 
> From my analysis, the kworker/0:1+usb thread that runs the hub_event
> work uses around 98% of the CPU.
> 
> I have traced the workqueue:workqueue_queue_work tracepoint while
> unplugging the USB stick, and no particular work function is executed
> much more than the others on the kworker/0:1+usb thread.
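> 
> For reference, the tracepoint was enabled through the standard ftrace
> interface, roughly as follows (assuming tracefs is mounted at
> /sys/kernel/tracing):
> 
> # echo 1 > /sys/kernel/tracing/events/workqueue/workqueue_queue_work/enable
> # cat /sys/kernel/tracing/trace_pipe > /tmp/wq-trace.log &
>   ... unplug the USB stick ...
> # echo 0 > /sys/kernel/tracing/events/workqueue/workqueue_queue_work/enable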
> 
> 
> Using perf I identified that the hub_event work was spending a lot of
> time in invalidate_partition(). I have included a cut-down version of
> the captured perf data in [2], which shows the functions where the
> kworker spends most of its time.
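> 
> (The perf data was captured along these lines; the exact options may
> have differed:)
> 
> # perf record -g -C 0 -- sleep 30    # sample CPU 0 with call graphs
>   ... unplug the USB stick while perf record is running ...
> # perf report --stdio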
> 
> 
> I am aware there will be delays on the shared workqueue; are the
> delays we are seeing considered normal?
> 
> 
> Is there any way to mitigate the delay, or to identify exactly where
> it occurs? I am unsure if this is a memory or filesystem subsystem
> issue.
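> 
> One mitigation we are considering for our own latency-sensitive work
> is a dedicated workqueue, so that it is not queued behind hub_event on
> the CPU 0 pool of the shared system workqueue. A minimal sketch
> (module boilerplate omitted; the names are placeholders, and this
> works around, rather than fixes, the hub_event stall):
> 
> #include <linux/init.h>
> #include <linux/workqueue.h>
> 
> static struct workqueue_struct *latency_wq;
> static struct delayed_work mon_work;
> 
> static void mon_cb(struct work_struct *work)
> {
> 	/* latency-sensitive work goes here */
> 	queue_delayed_work(latency_wq, &mon_work, msecs_to_jiffies(100));
> }
> 
> static int __init mon_init(void)
> {
> 	/* WQ_UNBOUND: workers are not pinned to the queueing CPU;
> 	 * WQ_HIGHPRI: served by the high-priority worker pool. */
> 	latency_wq = alloc_workqueue("latency_mon", WQ_UNBOUND | WQ_HIGHPRI, 0);
> 	if (!latency_wq)
> 		return -ENOMEM;
> 	INIT_DELAYED_WORK(&mon_work, mon_cb);
> 	queue_delayed_work(latency_wq, &mon_work, msecs_to_jiffies(100));
> 	return 0;
> }
> 
> Would something along these lines be considered a reasonable
> workaround here?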
> 
> 
> Thank you for your help.
> 
> Thanks,
> Jim Baxter
> 
> [1] Test Module:
> // SPDX-License-Identifier: GPL-2.0
> /*
>  * Simple WQ latency monitoring
>  *
>  * Copyright (C) 2020 Advanced Driver Information Technology.
>  */
> 
> #include <linux/init.h>
> #include <linux/ktime.h>
> #include <linux/module.h>
> #include <linux/workqueue.h>
> 
> /* The work item reschedules itself every PERIOD_MS ms */
> #define PERIOD_MS 100
> 
> static struct delayed_work wq;
> static u64 us_save;	/* time (us) of the previous callback run */
> 
> static void wq_cb(struct work_struct *work)
> {
> 	u64 us = ktime_to_us(ktime_get());
> 	u64 us_diff = us - us_save;
> 	u64 ms_late = 0;
> 
> 	/* Skip the very first run: no previous timestamp to compare to */
> 	if (!us_save)
> 		goto skip_print;
> 
> 	/* Extra delay in ms beyond the scheduled PERIOD_MS interval */
> 	ms_late = us_diff / 1000 - PERIOD_MS;
> 	if (ms_late > 9)
> 		pr_crit("DELAY: %llu\n", ms_late);
> 
> skip_print:
> 	us_save = us;
> 	schedule_delayed_work(&wq, msecs_to_jiffies(PERIOD_MS));
> }
> 
> static int __init latency_mon_init(void)
> {
> 	us_save = 0;
> 	INIT_DELAYED_WORK(&wq, wq_cb);
> 	schedule_delayed_work(&wq, msecs_to_jiffies(PERIOD_MS));
> 
> 	return 0;
> }
> 
> static void __exit latency_mon_exit(void)
> {
> 	cancel_delayed_work_sync(&wq);
> 	pr_info("%s\n", __func__);
> }
> 
> module_init(latency_mon_init);
> module_exit(latency_mon_exit);
> MODULE_AUTHOR("Eugeniu Rosca <erosca@...adit-jv.com>");
> MODULE_LICENSE("GPL");
> 
> 
> [2] perf trace:
>     95.22%     0.00%  kworker/0:2-eve  [kernel.kallsyms]
>     |
>     ---ret_from_fork
>        kthread
>        worker_thread
>        |          
>         --95.15%--process_one_work
> 		  |          
> 		   --94.99%--hub_event
> 			 |          
> 			  --94.99%--usb_disconnect
> 			  <snip>
> 				|  
> 				--94.90%--invalidate_partition
> 				   __invalidate_device
> 				   |          
> 				   |--64.55%--invalidate_bdev
> 				   |  |          
> 				   |   --64.13%--invalidate_mapping_pages
> 				   |     |          
> 				   |     |--24.09%--invalidate_inode_page
> 				   |     |   |          
> 				   |     |   --23.44%--remove_mapping
> 				   |     |     |          
> 				   |     |      --23.20%--__remove_mapping
> 				   |     |        |          
> 				   |     |         --21.90%--arch_local_irq_restore
> 				   |     |          
> 				   |     |--22.44%--arch_local_irq_enable
> 				   |          
> 					--30.35%--shrink_dcache_sb 
> 					<snip>
> 					  |      
> 					  --30.17%--truncate_inode_pages_range
> 
