Message-ID: <20190211213417.uhfiz5iqwkfrvk25@macbook-pro-91.dhcp.thefacebook.com>
Date:   Mon, 11 Feb 2019 16:34:19 -0500
From:   Josef Bacik <josef@...icpanda.com>
To:     Andrea Righi <righi.andrea@...il.com>
Cc:     Josef Bacik <josef@...icpanda.com>,
        Paolo Valente <paolo.valente@...aro.org>,
        Tejun Heo <tj@...nel.org>, Li Zefan <lizefan@...wei.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Jens Axboe <axboe@...nel.dk>, Vivek Goyal <vgoyal@...hat.com>,
        Dennis Zhou <dennis@...nel.org>, cgroups@...r.kernel.org,
        linux-block@...r.kernel.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH v2] blkcg: prevent priority inversion problem during
 sync()

On Mon, Feb 11, 2019 at 09:40:29PM +0100, Andrea Righi wrote:
> On Mon, Feb 11, 2019 at 10:39:34AM -0500, Josef Bacik wrote:
> > On Sat, Feb 09, 2019 at 03:07:49PM +0100, Andrea Righi wrote:
> > > This is an attempt to mitigate the priority inversion problem of a
> > > high-priority blkcg issuing a sync() and being forced to wait for the
> > > completion of all the writeback I/O generated by any other low-priority
> > > blkcg, causing massive latencies to processes that shouldn't be
> > > I/O-throttled at all.
> > > 
> > > The idea is to save a list of blkcg's that are waiting for writeback:
> > > every time a sync() is executed the current blkcg is added to the list.
> > > 
> > > Then, when I/O is throttled, if there's a blkcg waiting for writeback
> > > different than the current blkcg, no throttling is applied (we can
> > > probably refine this logic later, i.e., a better policy could be to
> > > adjust the throttling I/O rate using the blkcg with the highest speed
> > > from the list of waiters - priority inheritance, kinda).
> > > 
> > > This topic has been discussed here:
> > > https://lwn.net/ml/cgroups/20190118103127.325-1-righi.andrea@gmail.com/
> > > 
> > > But we didn't come up with any definitive solution.
> > > 
> > > This patch is not a definitive solution either, but it's an attempt to
> > > continue addressing this issue and handling the priority inversion
> > > problem with sync() in a better way.
> > > 
> > > Signed-off-by: Andrea Righi <righi.andrea@...il.com>
> > 
> > Talked with Tejun about this some and we agreed the following is probably the
> > best way forward:
> 
> First of all thanks for the update!
> 
> > 
> > 1) Track the submitter of the wb work to the writeback code.
> 
> Are we going to track the cgroup that originated the dirty pages (or
> maybe dirty inodes) or do you have any idea in particular?
> 

The task doing the sync(), so that we can accomplish #3.  But really this is
an implementation detail; however you want to accomplish it is fine by me.

> > 2) Sync() defaults to the root cg, and it writes all the things as the root
> >    cg.
> 
> OK.
> 
> > 3) Add a flag to the cgroups that would make sync()'ers in that group only be
> >    allowed to write out things that belong to its group.
> 
> So, IIUC, when this flag is enabled a cgroup that is doing sync() would
> trigger the writeback of the pages that belong to that cgroup only and
> it waits only for these pages to be sync-ed, right? In this case
> writeback can still go at cgroup's speed.
> 
> Instead when the flag is disabled, sync() would trigger writeback I/O
> globally, as usual, and it goes at full speed (root cgroup's speed).
> 
> Am I understanding correctly?
> 

Yup that's exactly it.  Thanks,

Josef
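
The semantics agreed in points 1-3 above can be modelled with a small userspace sketch in C. This is not kernel code and not the eventual patch: every name in it (`cg_model`, `sync_own_only`, `plan_sync`, `count_sync_targets`, `ROOT_CG_ID`) is hypothetical and chosen only for illustration. It assumes each dirty inode is attributed to exactly one cgroup and that the writeback work carries the identity of the cgroup that called sync().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: id 0 stands for the root cgroup. */
#define ROOT_CG_ID 0

struct cg_model {
    int id;              /* cgroup issuing the sync() (point 1: the submitter) */
    bool sync_own_only;  /* hypothetical per-cgroup flag from point 3 */
};

struct dirty_inode_model {
    int owner_cg;        /* cgroup the dirty data is attributed to */
};

struct sync_plan {
    int issue_as_cg;     /* cgroup whose I/O limits apply to the writeback */
    bool own_only;       /* true: write only the submitter's own inodes */
};

/* Decide how a sync() from 'submitter' should be carried out. */
struct sync_plan plan_sync(const struct cg_model *submitter)
{
    struct sync_plan plan;

    if (submitter->sync_own_only) {
        /* Point 3: restricted to its own data, at its own speed. */
        plan.issue_as_cg = submitter->id;
        plan.own_only = true;
    } else {
        /* Point 2: default, write everything as the root cgroup. */
        plan.issue_as_cg = ROOT_CG_ID;
        plan.own_only = false;
    }
    return plan;
}

/* Count how many dirty inodes this sync() would write and wait on. */
size_t count_sync_targets(const struct cg_model *submitter,
                          const struct dirty_inode_model *inodes, size_t n)
{
    struct sync_plan plan = plan_sync(submitter);
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (!plan.own_only || inodes[i].owner_cg == plan.issue_as_cg)
            count++;
    }
    return count;
}
```

In this model a cgroup with the flag set waits only for its own inodes and stays subject to its own limits, while a cgroup without the flag triggers a global writeback issued as root, so it is never held back by another cgroup's throttling.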
