Message-ID: <CAJB88a2J+HY_n8=7nJHp9GQ0jVDJdYLw5TPrBxN-ySqnBXNi9g@mail.gmail.com>
Date:   Wed, 19 Jul 2017 23:07:08 -0700
From:   Brian Malehorn <bmalehorn@...il.com>
To:     linux-ext4@...r.kernel.org
Subject: write() hangs during flush

Hi,

I'm debugging an issue where write() can sometimes take several
seconds to complete. I'm looking for general guidance on why this
might happen, and what I can do about it.

While I originally encountered the problem on an embedded device
writing to an MMC, I can also reproduce it on my laptop (Ubuntu
16.04):

    pv -L 100m /dev/zero |
      strace -s 8 -T -e trace=write dd bs=32k of=./zero 2>&1 |
      awk 'substr($NF, 2, length($NF)-2) + 0 > 0.1'

    write(1, "\0\0\0\0\0\0\0\0"..., 32768) = 32768 <0.145895>
    write(1, "\0\0\0\0\0\0\0\0"..., 32768) = 32768 <0.673575>
    write(1, "\0\0\0\0\0\0\0\0"..., 32768) = 32768 <0.126722>
    write(1, "\0\0\0\0\0\0\0\0"..., 32768) = 32768 <1.284791>

The pipeline above writes zeros at 100 MiB/s (throttled by pv) in 32 KiB
blocks and prints any write() that took longer than 100 ms. The slowest
write in this output took 1.28 seconds. I can provide more details about
the setup if needed.
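
For readers who want to reuse the filter with a different threshold, here
is a minimal sketch of the same awk expression wrapped in a shell function.
The name slow_calls and the sample input are illustrative assumptions: the
first sample line is an invented fast write, and the other two are copied
from the output above.

```shell
# Hypothetical helper: print "strace -T" lines whose elapsed time exceeds
# a threshold in seconds (default 0.1). Not part of strace itself.
slow_calls() {
    threshold="${1:-0.1}"
    # $NF is the last field, e.g. "<0.145895>"; strip the angle brackets
    # and compare the remainder numerically against the threshold.
    awk -v t="$threshold" 'substr($NF, 2, length($NF)-2) + 0 > t'
}

# Sample input: line 1 is a made-up fast write, lines 2-3 come from the trace.
sample='write(1, "\0\0\0\0\0\0\0\0"..., 32768) = 32768 <0.000021>
write(1, "\0\0\0\0\0\0\0\0"..., 32768) = 32768 <0.145895>
write(1, "\0\0\0\0\0\0\0\0"..., 32768) = 32768 <1.284791>'

# Prints only the two writes slower than 100 ms.
printf '%s\n' "$sample" | slow_calls 0.1
```

In the original pipeline this would replace the final awk stage, for
example "... 2>&1 | slow_calls 0.5" to show only writes slower than 500 ms.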

I believe this is happening:

  1. lots of dirty data accumulates in the page cache
  2. the filesystem decides to flush
  3. nobody can write while the flush is in progress
  4. the flush writes out "a lot" of data, maybe everything

Do my guesses align with reality, or is there another explanation?
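
One possible way to test these guesses is to watch how much dirty data is
pending while the reproduction runs. The sketch below assumes a Linux
system with /proc/meminfo; dirty_stats is a hypothetical helper name, and
whether the counters actually line up with the slow writes is exactly what
would need to be observed.

```shell
# Hypothetical helper: print the Dirty and Writeback counters from a
# meminfo-style file (defaults to /proc/meminfo on Linux).
dirty_stats() {
    grep -E '^(Dirty|Writeback):' "${1:-/proc/meminfo}"
}

# Usage (in a second terminal while the dd pipeline runs):
#   while sleep 1; do dirty_stats; done
#
# The kernel thresholds that govern when writeback starts and when
# writers get throttled can be read with:
#   sysctl vm.dirty_background_ratio vm.dirty_ratio
```

If Dirty climbs steadily and then drops sharply at about the same moments
as the slow write() calls, that would be consistent with the sequence
described above, though it would not prove it.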

Does anybody have ideas on how to make this more "smooth"? Ideally,
I'd like each write() to be slowed down a little bit, rather than
99.99% of writes completing instantly and 0.01% taking over a second.

Thanks,
Brian
