linux-ext4 - ext4 hang and per-memcg dirty throttling


Message-ID: <20180912001054.bu3x3xwukusnsa26@US-160370MP2.local>
Date:   Tue, 11 Sep 2018 17:10:55 -0700
From:   Liu Bo <bo.liu@...ux.alibaba.com>
To:     linux-ext4@...r.kernel.org
Cc:     fengguang.wu@...el.com, tj@...nel.org, jack@...e.cz,
        cgroups@...r.kernel.org, gthelen@...gle.com, linux-mm@...ck.org,
        yang.shi@...ux.alibaba.com
Subject: ext4 hang and per-memcg dirty throttling

Hi,

With ext4 in data=ordered mode and blk-throttle configured on the
underlying device, we can easily run into a hang:

1.
mount /dev/sdc /mnt -odata=ordered
2.
mkdir /sys/fs/cgroup/unified/cg
3.
echo "+io" > /sys/fs/cgroup/unified/cgroup.subtree_control
4.
echo "`cat /sys/block/sdc/dev` wbps=$((1 << 20))" > /sys/fs/cgroup/unified/cg/io.max
5.
echo $$ >  /sys/fs/cgroup/unified/cg/cgroup.procs
6.
# background dirtier
xfs_io -f -c "pwrite 0 1G" /mnt/dummy &
7.
echo $$ > /sys/fs/cgroup/unified/cgroup.procs
8.
# issue synchronous IO
for i in $(seq 1 100);
do
    xfs_io -f -s -c "pwrite 0 4k" /mnt/foo > /dev/null
done

The hang looks like this:

      [jbd2-sdc]
jbd2_journal_commit_transaction                              
  journal_submit_data_buffers
    # file 'dummy' has been written by writeback kthread
  journal_finish_inode_data_buffers
    # wait on page's writeback

After that, every ext4 operation that needs to start a journal
transaction has to wait until the committing transaction completes.
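To give a rough sense of scale, here is a back-of-the-envelope sketch.
It assumes the worst case, in which all of the 1G written in step 6 is
still under writeback at the 1 MiB/s limit from step 4 when the commit
starts waiting. The variable names are only for illustration.

```shell
# Worst-case estimate: how long jbd2 could wait on page writeback if
# the whole file from step 6 must drain through the step 4 limit.
size_bytes=$((1 << 30))   # "pwrite 0 1G" from step 6
wbps=$((1 << 20))         # wbps=$((1 << 20)) from step 4
wait_secs=$((size_bytes / wbps))
echo "worst-case wait: ${wait_secs}s (~$((wait_secs / 60)) min)"
```

In practice the wait depends on how much of the file is attached to the
committing transaction, so this is an upper bound and not a measurement.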

Since there is no per-memcg dirty throttling, such as a dirty ratio or
dirty bytes limit, balance_dirty_pages() may not slow down the
background dirtier task as expected.
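One way to watch this while the reproducer runs is to sample the
system-wide Dirty and Writeback counters in /proc/meminfo. The sketch
below is only an observation aid. The helper names dirty_kb and
watch_dirty are made up for this example, and the counters are global,
not per-cgroup.

```shell
# Sum the Dirty and Writeback counters (in kB) from a meminfo-style
# file; defaults to /proc/meminfo.
dirty_kb() {
    awk '$1 == "Dirty:" || $1 == "Writeback:" { sum += $2 }
         END { print sum + 0 }' "${1:-/proc/meminfo}"
}

# Print one sample per second, N times (default 10).
watch_dirty() {
    n=${1:-10}
    while [ "$n" -gt 0 ]; do
        echo "$(date +%T) dirty+writeback: $(dirty_kb) kB"
        n=$((n - 1))
        sleep 1
    done
}
```

Running "watch_dirty 60" in a second shell during step 6 should show
whether the amount of dirty data keeps growing while the device is
limited to 1 MiB/s.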

I searched a bit and found that Greg did related work[1] back in 2011,
but it seems the patch set did not make it into the kernel.

Now that we have cgroup-aware writeback, is there any plan to push the
patch set again, or are there any alternative solutions or suggestions?

[1]: https://lwn.net/Articles/455341/

thanks,
-liubo