
Message-ID: <alpine.LNX.2.01.1008300949460.4381@fgnk.ybpnyqbznva>
Date:	Mon, 30 Aug 2010 10:13:13 +0100 (BST)
From:	Mark Hills <mark@...o.org.uk>
To:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
cc:	Daisuke Nishimura <nishimura@....nes.nec.co.jp>,
	linux-kernel@...r.kernel.org, balbir@...ux.vnet.ibm.com
Subject: Re: cgroup: rmdir() does not complete

On Fri, 27 Aug 2010, KAMEZAWA Hiroyuki wrote:

> On Fri, 27 Aug 2010 12:39:48 +0900
> Daisuke Nishimura <nishimura@....nes.nec.co.jp> wrote:
> 
> > On Fri, 27 Aug 2010 11:35:06 +0900
> > KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com> wrote:
> > 
> > > On Fri, 27 Aug 2010 09:56:39 +0900
> > > Daisuke Nishimura <nishimura@....nes.nec.co.jp> wrote:
> > > 
> > > > > Or is it likely to be some other cause, and how best to find it?
> > > > > 
> > > > What cgroup subsystem did you mount where the directory existed you tried
> > > > to rmdir() first ?
> > > > If you mounted several subsystems on the same hierarchy, can you mount them
> > > > separately to narrow down the cause ?
> > > > 
> > > 
> > > It seems I can reproduce the issue on mmotm-0811, too.
> > > 
> > > try this.
> > > 
> > > Here, memory cgroup is mounted at /cgroups.
> > > ==
> > > #!/bin/bash -x
> > > 
> > > while sleep 1; do
> > >         date
> > >         mkdir /cgroups/test
> > >         echo 0 > /cgroups/test/tasks
> > >         echo 300M > /cgroups/test/memory.limit_in_bytes
> > >         cat /proc/self/cgroup
> > >         dd if=/dev/zero of=./tmpfile bs=4096 count=100000
> > >         echo 0 > /cgroups/tasks
> > >         cat /proc/self/cgroup
> > >         rmdir /cgroups/test
> > >         rm ./tmpfile
> > > done
> > > ==
> > > 
> > > hangs at rmdir. I'm now investigating force_empty.
> > > 
> > Thank you very much for your information.
> > 
> > Some questions.
> > 
> > Is "tmpfile" created on a normal filesystem(e.g. ext3) or tmpfs ?
> on ext4.
> 
> > And, how long is it likely to take to cause this problem ?
> 
> very soon. 10-20 loop.

The test case I was running is similar to the above. With the Lustre 
filesystem the problem takes 4 hours or more to show itself. Recently I 
ran 4 threads for over 24 hours without it being seen -- I suspect some 
external factor is involved.

I also tried NFS, and did not see a problem after 8 hours or so, but this 
is inconclusive.

The combination of the Fedora kernel and the Lustre filesystem is not 
a satisfactory setup for tracing the bug. Until I have a test case which 
is more readily reproducible, I cannot reasonably start changing 
variables.

It is interesting that you see the problem so readily on ext4; I will test 
that soon (it is currently a holiday weekend in the UK). I hope it will 
give me the test case I am looking for.
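
For hunting an intermittent hang like this, it may help to bound each
rmdir so that the loop stops at the first stall instead of blocking for
good. Below is a minimal sketch of the quoted script with that change,
assuming coreutils timeout is available and that the stuck rmdir still
responds to signals (if it sleeps uninterruptibly, timeout cannot help).
The names rmdir_with_timeout and hunt_rmdir_hang and the 60-second limit
are illustrative; they are not part of the original test case.

```shell
#!/bin/bash
# Sketch only: function names and the default 60 s limit are assumptions.

# Run rmdir on $2, giving up after $1 seconds.
# Exit status: 0 on success, 124 if timeout had to stop rmdir (coreutils
# convention), otherwise rmdir's own failure status.
rmdir_with_timeout() {
        local secs=$1 dir=$2
        timeout "$secs" rmdir "$dir"
}

# Repeat the quoted test until rmdir stalls or fails.
# $1: mount point of the memory cgroup hierarchy (e.g. /cgroups)
# $2: seconds to wait for each rmdir (default 60)
hunt_rmdir_hang() {
        local root=$1 secs=${2:-60} n=0 rc
        while sleep 1; do
                n=$((n + 1))
                date
                mkdir "$root/test" || return 1
                echo 0 > "$root/test/tasks"     # move this shell into the group
                echo 300M > "$root/test/memory.limit_in_bytes"
                dd if=/dev/zero of=./tmpfile bs=4096 count=100000
                echo 0 > "$root/tasks"          # move back to the root group
                rc=0
                rmdir_with_timeout "$secs" "$root/test" || rc=$?
                rm -f ./tmpfile
                if [ "$rc" -ne 0 ]; then
                        echo "iteration $n: rmdir exited with status $rc" >&2
                        return "$rc"
                fi
        done
}
```

Sourced into a root shell and called as "hunt_rmdir_hang /cgroups 60", a
return status of 124 would indicate that rmdir was still running at the
deadline, and the iteration count gives a rough measure of how readily
the problem reproduces on a given filesystem.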

Thanks

-- 
Mark