Date:   Tue, 19 Mar 2019 11:02:44 -0400
From:   "Theodore Ts'o" <tytso@....edu>
To:     "Darrick J. Wong" <darrick.wong@...cle.com>
Cc:     Paul Menzel <pmenzel@...gen.mpg.de>, linux-ext4@...r.kernel.org
Subject: Re: New service e2scrub_reap

On Mon, Mar 18, 2019 at 04:32:38PM -0700, Darrick J. Wong wrote:
> That's ... interesting.  On my developer workstations (Ubuntu 16.04 and
> 18.04) it generally takes 1/10th the amount of time to run
> e2scrub_all.
> 
> Even on my aging ~2010 era server that only has disks it takes 0.3s:
> 
> # time e2scrub_all -A -r
> 
> real    0m0.280s
> user    0m0.160s
> sys     0m0.126s
> 
> I wonder what's different between our computers?  Do you have a
> lvm2-lvmetad service running?

No, I don't.  I do have lvm2-lvmpolld service active, but I don't
think that's used by lvs.

What I can see from running the script under bash -vx is that
e2scrub_all is calling:

	lvs --nameprefixes -o vg_name,lv_name,lv_role --noheadings <dev>

for each device returned by lsblk (whether or not it is a LVM device).
At least on my system, it takes around a seventh of a second to run:

# sudo time lvs --nameprefixes -o vg_name,lv_name,lv_role --noheadings /dev/lambda/root 
  LVM2_VG_NAME='lambda' LVM2_LV_NAME='root' LVM2_LV_ROLE='public'
0.01user 0.01system 0:00.14elapsed 16%CPU (0avgtext+0avgdata 13528maxresident)k
8704inputs+0outputs (0major+1056minor)pagefaults 0swaps

# sudo time lvs --nameprefixes -o vg_name,lv_name,lv_role --noheadings /dev/nvmen0
  Volume group "nvmen0" not found
  Cannot process volume group nvmen0
Command exited with non-zero status 5
0.02user 0.01system 0:00.18elapsed 25%CPU (0avgtext+0avgdata 13648maxresident)k
8704inputs+0outputs (0major+1554minor)pagefaults 0swaps

"e2scrub_all -A -r" is running the lvs command ten times.  So that's
around 1.5 to 2 seconds of the five second run.

Looking at the strace output of the lvs command, I don't see it doing
anything that would take a long time, but it *is* doing open/fstat/close
on a huge number of sysfs files.  (Essentially, it's
searching all block devices looking for a match for the given LVM
volume.)  So the e2scrub_all -A script is going to be opening O(N**2)
sysfs files where N is the number of block devices in the system.

> However, since e2scrub is tied to lvm, Ted is right that calling lvs in
> the outer loop would be far more efficient.  I'll have a look at
> reworking this.

We can also use lvm's selection criteria so we don't have to call eval
on the output of the lvs unnecessarily.
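
A minimal sketch of that idea, assuming the goal is one lvs call that
returns only active, public LVs as device paths.  The function name and
the LVS override variable are hypothetical, and the selection fields
(lv_active, lv_role) should be checked against "lvs -S help" on the
target system:

```shell
#!/bin/sh
# Sketch only: ask lvs ONCE for every active, public LV and let its
# selection option (-S) do the filtering, so neither a per-device lvs
# call nor an eval of --nameprefixes output is needed.
# LVS is a hypothetical override hook so the function can be exercised
# on a machine without LVM.
: "${LVS:=lvs}"

# Print one LV path per line, with lvs's leading whitespace and any
# blank lines stripped.
list_scrub_candidates() {
	"$LVS" --noheadings -o lv_path \
		-S 'lv_active=active,lv_role=public' 2>/dev/null |
		awk 'NF { print $1 }'
}
```

Because lvs is invoked once, it walks the block devices in sysfs once
rather than once per lsblk entry, and its output is read as plain text
rather than evaluated as shell.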

Something else that I noticed --- I don't think lvs --nameprefixes
escapes shell magic characters.  Fortunately lvm only allows names to
contain characters from the set [a-zA-Z0-9+_.-], and I don't *think*
there is a way for userspace to trick lvm into returning a device
pathname that might include the string "/dev/bobby/$(rm -rf /)".
But we might want to take a second, more paranoid look at whether
we are sure it's safe.
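
One way to be more paranoid, sketched under the assumption that any
name headed for eval should first be checked against the character set
lvm(8) documents for VG and LV names (a-z A-Z 0-9 + _ . -).  The
function name is hypothetical and is not part of e2scrub_all:

```shell
#!/bin/sh
# Sketch only: reject any VG/LV name that is empty or contains a
# character outside the documented set, before the name is passed to
# eval or interpolated into a command line.
is_safe_lvm_name() {
	case "$1" in
		'' | *[!a-zA-Z0-9+_.-]*) return 1 ;;	# empty, or has a forbidden char
		*) return 0 ;;
	esac
}
```

A caller could skip (and log) any device whose VG or LV name fails the
check.  Running the script under LC_ALL=C keeps the a-z and A-Z ranges
limited to ASCII.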

       	    	 	 	     	  	     - Ted

P.S. Obligatory xkcd reference: https://xkcd.com/327/

P.P.S.  Just for yucks, it might also be worth testing to see whether
or not the automounts that some desktops use, based on the volume label
on the USB stick, are doing shell escape sanitization....
