Message-ID: <20190815143404.GK14313@quack2.suse.cz>
Date: Thu, 15 Aug 2019 16:34:04 +0200
From: Jan Kara <jack@...e.cz>
To: Tejun Heo <tj@...nel.org>
Cc: axboe@...nel.dk, jack@...e.cz, hannes@...xchg.org,
mhocko@...nel.org, vdavydov.dev@...il.com, cgroups@...r.kernel.org,
linux-mm@...ck.org, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org, kernel-team@...com, guro@...com,
akpm@...ux-foundation.org
Subject: Re: [PATCH 4/4] writeback, memcg: Implement foreign dirty flushing
On Sat 03-08-19 07:01:55, Tejun Heo wrote:
> +void mem_cgroup_track_foreign_dirty_slowpath(struct page *page,
> +					     struct bdi_writeback *wb)
> +{
> +	struct mem_cgroup *memcg = page->mem_cgroup;
> +	struct memcg_cgwb_frn *frn;
> +	u64 now = jiffies_64;
> +	u64 oldest_at = now;
> +	int oldest = -1;
> +	int i;
> +
> +	/*
> +	 * Pick the slot to use.  If there is already a slot for @wb, keep
> +	 * using it.  If not replace the oldest one which isn't being
> +	 * written out.
> +	 */
> +	for (i = 0; i < MEMCG_CGWB_FRN_CNT; i++) {
> +		frn = &memcg->cgwb_frn[i];
> +		if (frn->bdi_id == wb->bdi->id &&
> +		    frn->memcg_id == wb->memcg_css->id)
> +			break;
> +		if (frn->at < oldest_at && atomic_read(&frn->done.cnt) == 1) {
> +			oldest = i;
> +			oldest_at = frn->at;
> +		}
> +	}
> +
> +	if (i < MEMCG_CGWB_FRN_CNT) {
> +		unsigned long update_intv =
> +			min_t(unsigned long, HZ,
> +			      msecs_to_jiffies(dirty_expire_interval * 10) / 8);
> +		/*
> +		 * Re-using an existing one.  Let's update timestamp lazily
> +		 * to avoid making the cacheline hot.
> +		 */
> +		if (frn->at < now - update_intv)
> +			frn->at = now;
> +	} else if (oldest >= 0) {
> +		/* replace the oldest free one */
> +		frn = &memcg->cgwb_frn[oldest];
> +		frn->bdi_id = wb->bdi->id;
> +		frn->memcg_id = wb->memcg_css->id;
> +		frn->at = now;
> +	}

I have to say I'm a bit nervous about the completely lockless handling
here. I understand that garbage in cgwb_frn will only result in this
mechanism not working and possibly flushing the wrong wbs, but it still
seems a bit fragile. I don't see any cheap way of synchronizing this,
though, so I guess let's see how it works in practice.
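
To make the slot-selection policy easier to reason about, here is a rough
userspace sketch of the same logic. It is not the patch itself: frn_slot,
frn_track(), FRN_CNT and the busy flag are hypothetical stand-ins for
struct memcg_cgwb_frn, the slowpath function, MEMCG_CGWB_FRN_CNT and the
done.cnt check, and timestamps are plain integers rather than jiffies. It
runs single-threaded, so it shows only the policy, not the race; the
comment in the replace branch marks where concurrent callers could
interleave their stores.

```c
#include <stdint.h>

#define FRN_CNT 4	/* hypothetical stand-in for MEMCG_CGWB_FRN_CNT */

/* Hypothetical userspace stand-in for struct memcg_cgwb_frn. */
struct frn_slot {
	int bdi_id;
	int memcg_id;
	uint64_t at;	/* last-seen timestamp, in fake jiffies */
	int busy;	/* nonzero while a flush of this slot is in flight */
};

/*
 * Same policy as the quoted loop: reuse a slot that already matches
 * (bdi_id, memcg_id); otherwise take over the oldest idle slot.
 * Returns the index used, or -1 if no slot could be claimed.
 */
static int frn_track(struct frn_slot *tbl, int bdi_id, int memcg_id,
		     uint64_t now, uint64_t update_intv)
{
	uint64_t oldest_at = now;
	int oldest = -1;
	int i;

	for (i = 0; i < FRN_CNT; i++) {
		struct frn_slot *s = &tbl[i];

		if (s->bdi_id == bdi_id && s->memcg_id == memcg_id)
			break;
		if (s->at < oldest_at && !s->busy) {
			oldest = i;
			oldest_at = s->at;
		}
	}

	if (i < FRN_CNT) {
		/* Existing slot: refresh the timestamp only when it is stale. */
		if (tbl[i].at < now - update_intv)
			tbl[i].at = now;
		return i;
	}

	if (oldest >= 0) {
		/*
		 * Three independent stores. Without a lock, two callers
		 * picking the same slot could leave bdi_id from one and
		 * memcg_id from the other.
		 */
		tbl[oldest].bdi_id = bdi_id;
		tbl[oldest].memcg_id = memcg_id;
		tbl[oldest].at = now;
	}
	return oldest;
}
```

With a table whose idle slots have timestamps 100, 70 and 900 and whose
remaining slot (timestamp 50) is busy, a lookup for an untracked wb takes
over the slot with timestamp 70, since the older one is still in flight.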
Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR