lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150630143100.GL7252@quack.suse.cz>
Date:	Tue, 30 Jun 2015 16:31:00 +0200
From:	Jan Kara <jack@...e.cz>
To:	Tejun Heo <tj@...nel.org>
Cc:	axboe@...nel.dk, linux-kernel@...r.kernel.org, jack@...e.cz,
	hch@...radead.org, hannes@...xchg.org,
	linux-fsdevel@...r.kernel.org, vgoyal@...hat.com,
	lizefan@...wei.com, cgroups@...r.kernel.org, linux-mm@...ck.org,
	mhocko@...e.cz, clm@...com, fengguang.wu@...el.com,
	david@...morbit.com, gthelen@...gle.com, khlebnikov@...dex-team.ru
Subject: Re: [PATCH 26/51] writeback: let balance_dirty_pages() work on the
 matching cgroup bdi_writeback

On Fri 22-05-15 17:13:40, Tejun Heo wrote:
> Currently, balance_dirty_pages() always work on bdi->wb.  This patch
> updates it to work on the wb (bdi_writeback) matching memcg and blkcg
> of the current task as that's what the inode is being dirtied against.
> 
> balance_dirty_pages_ratelimited() now pins the current wb and passes
> it to balance_dirty_pages().
> 
> As no filesystem has FS_CGROUP_WRITEBACK yet, this doesn't lead to
> visible behavior differences.
...
>  void balance_dirty_pages_ratelimited(struct address_space *mapping)
>  {
> -	struct backing_dev_info *bdi = inode_to_bdi(mapping->host);
> -	struct bdi_writeback *wb = &bdi->wb;
> +	struct inode *inode = mapping->host;
> +	struct backing_dev_info *bdi = inode_to_bdi(inode);
> +	struct bdi_writeback *wb = NULL;
>  	int ratelimit;
>  	int *p;
>  
>  	if (!bdi_cap_account_dirty(bdi))
>  		return;
>  
> +	if (inode_cgwb_enabled(inode))
> +		wb = wb_get_create_current(bdi, GFP_KERNEL);
> +	if (!wb)
> +		wb = &bdi->wb;
> +

So this effectively adds a radix tree lookup (of wb belonging to memcg) for
every set_page_dirty() call. That seems relatively costly to me. And all
that just to check wb->dirty_exceeded. Cannot we just use inode_to_wb()
instead? I understand results may be different if multiple memcgs share an
inode and that's the reason why you use wb_get_create_current(), right?
But for dirty_exceeded check it may be good enough?

								Honza

>  	ratelimit = current->nr_dirtied_pause;
>  	if (wb->dirty_exceeded)
>  		ratelimit = min(ratelimit, 32 >> (PAGE_SHIFT - 10));
> @@ -1616,7 +1622,9 @@ void balance_dirty_pages_ratelimited(struct address_space *mapping)
>  	preempt_enable();
>  
>  	if (unlikely(current->nr_dirtied >= ratelimit))
> -		balance_dirty_pages(mapping, current->nr_dirtied);
> +		balance_dirty_pages(mapping, wb, current->nr_dirtied);
> +
> +	wb_put(wb);
>  }
>  EXPORT_SYMBOL(balance_dirty_pages_ratelimited);
>  
> -- 
> 2.4.0
> 
-- 
Jan Kara <jack@...e.cz>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ