Message-ID: <20140306184532.GA15459@xanadu.blop.info>
Date:	Thu, 6 Mar 2014 19:45:32 +0100
From:	Lucas Nussbaum <lucas.nussbaum@...ia.fr>
To:	Theodore Ts'o <tytso@....edu>
Cc:	linux-ext4@...r.kernel.org,
	Emmanuel Jeanvoine <emmanuel.jeanvoine@...ia.fr>
Subject: Re: [PATCH, RFC] jbd2: don't write non-commit blocks synchronously

On 06/03/14 at 13:27 -0500, Theodore Ts'o wrote:
> Hmm... OK, let me make sure I understand what is going on.  So you
> have a single file system which is mounted read/write,

Yes

> and you are
> doing a huge number of copies into the file system, which is keeping
> it busy.

On that minimal system, I'm just copying ~300 MB of data. I wouldn't
qualify it as huge.

> You are then running a huge number of "mount -o remount" on
> that same file system,

at the same time as the data copy, yes.

> which should effectively be no-op's, since the
> remount isn't actually changing the read/only or read/write or any other
> mount options.  Is that right?

Yes

> Why were you doing the remount in your actual production workload,
> anyway?

We are booting a large number (hundreds) of LXC containers in order to
set up an experimental environment. Those LXC containers simply use
subdirectories on the ext4 filesystem as their root directories.
What we saw is that the boot of LXC containers "deadlocked".

We later discovered that:
- this was caused by Debian's /etc/init.d/checkroot.sh that calls
  mount -o remount,defaults,rw 
- it was not a deadlock, but rather something looking like severe lock
  contention. After a seemingly random amount of time (2 to 15 minutes),
  the boot of LXC containers finishes.
- it was possible to reproduce the problem outside of LXC, using the
  "write data and do lots of remounts at the same time" setup I
  described earlier.
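
For reference, the "write data and do lots of remounts at the same time"
setup could be sketched roughly as below. This is only an illustration,
not the script that was actually used: the helper names (remount,
write_data, reproduce) and the file name remount-test.dat are made up,
the writer emits zeros instead of copying real data, and the real remount
needs root and an ext4 mount point. Only the mount invocation itself is
taken from checkroot.sh.

```python
import os
import subprocess
import threading
import time


def remount(mountpoint):
    """Issue the same kind of remount as checkroot.sh (needs root)."""
    subprocess.run(
        ["mount", "-o", "remount,defaults,rw", mountpoint], check=True
    )


def write_data(directory, total_bytes, chunk=1 << 20):
    """Write total_bytes of zeros to a file in directory (assumed workload)."""
    path = os.path.join(directory, "remount-test.dat")  # hypothetical name
    with open(path, "wb") as f:
        remaining = total_bytes
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(b"\0" * n)
            remaining -= n
        f.flush()
        os.fsync(f.fileno())
    return path


def reproduce(mountpoint, total_bytes=300 * 1024 * 1024, remount_fn=remount):
    """Remount in a loop while a writer thread is busy; return latencies."""
    writer = threading.Thread(target=write_data, args=(mountpoint, total_bytes))
    latencies = []
    writer.start()
    while True:
        start = time.monotonic()
        remount_fn(mountpoint)
        latencies.append(time.monotonic() - start)
        if not writer.is_alive():
            break
    writer.join()
    return latencies
```

Calling reproduce("/mnt/test") as root (the path is a placeholder) returns
one latency per remount, so a remount that stalls behind the writer would
show up as an unusually large entry.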
-- 
| Lucas Nussbaum      Assistant professor @ Univ. de Lorraine |
| lucas.nussbaum@...ia.fr                   LORIA / AlGorille |
| http://www.loria.fr/~lnussbau/            +33 3 54 95 86 19 |