lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Zfriwb03HCRWJ24q@bfoster>
Date: Wed, 20 Mar 2024 09:21:05 -0400
From: Brian Foster <bfoster@...hat.com>
To: Kemeng Shi <shikemeng@...weicloud.com>
Cc: akpm@...ux-foundation.org, tj@...nel.org, linux-mm@...ck.org,
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
	willy@...radead.org, jack@...e.cz, dsterba@...e.com,
	mjguzik@...il.com, dhowells@...hat.com, peterz@...radead.org
Subject: Re: [PATCH 1/6] writeback: collect stats of all wb of bdi in
 bdi_debug_stats_show

On Wed, Mar 20, 2024 at 07:02:17PM +0800, Kemeng Shi wrote:
> /sys/kernel/debug/bdi/xxx/stats is supposed to show writeback information
> of whole bdi, but only writeback information of bdi in root cgroup is
> collected. So writeback information in non-root cgroup are missing now.
> To be more specific, considering following case:
> 
> /* create writeback cgroup */
> cd /sys/fs/cgroup
> echo "+memory +io" > cgroup.subtree_control
> mkdir group1
> cd group1
> echo $$ > cgroup.procs
> /* do writeback in cgroup */
> fio -name test -filename=/dev/vdb ...
> /* get writeback info of bdi */
> cat /sys/kernel/debug/bdi/xxx/stats
> The cat result unexpectedly implies that there is no writeback on target
> bdi.
> 
> Fix this by collecting stats of all wb in bdi instead of only wb in
> root cgroup.
> 
> Signed-off-by: Kemeng Shi <shikemeng@...weicloud.com>
> ---
>  mm/backing-dev.c | 93 ++++++++++++++++++++++++++++++++++++------------
>  1 file changed, 70 insertions(+), 23 deletions(-)
> 
> diff --git a/mm/backing-dev.c b/mm/backing-dev.c
> index 5f2be8c8df11..788702b6c5dd 100644
> --- a/mm/backing-dev.c
> +++ b/mm/backing-dev.c
..
> @@ -46,31 +59,65 @@ static void bdi_debug_init(void)
>  	bdi_debug_root = debugfs_create_dir("bdi", NULL);
>  }
>  
..
> +#ifdef CONFIG_CGROUP_WRITEBACK
> +static void bdi_collect_stats(struct backing_dev_info *bdi,
> +			      struct wb_stats *stats)
> +{
> +	struct bdi_writeback *wb;
> +
> +	/* protect wb from release */
> +	mutex_lock(&bdi->cgwb_release_mutex);
> +	list_for_each_entry(wb, &bdi->wb_list, bdi_node)
> +		collect_wb_stats(stats, wb);
> +	mutex_unlock(&bdi->cgwb_release_mutex);
> +}
> +#else
> +static void bdi_collect_stats(struct backing_dev_info *bdi,
> +			      struct wb_stats *stats)
> +{
> +	collect_wb_stats(stats, &bdi->wb);
> +}
> +#endif
> +

I'm not familiar enough with the cgwb code to say for sure (and I'd
probably wait for more high level feedback before worrying too much
about this), but do we need the ifdef here just to iterate ->wb_list?
>From looking at the code, it appears bdi->wb ends up on the list while
the bdi is registered for both cases, so that distinction seems
unnecessary. WRT to wb release protection, I wonder if this could use a
combination of rcu_read_lock()/list_for_each_safe() and wb_tryget() on
each wb before collecting its stats..? See how bdi_split_work_to_wbs()
works, for example.

Also I see a patch conflict/compile error on patch 2 due to
__wb_calc_thresh() only taking one parameter in my tree. What's the
baseline commit for this series?

Brian

> +static int bdi_debug_stats_show(struct seq_file *m, void *v)
> +{
> +	struct backing_dev_info *bdi = m->private;
> +	unsigned long background_thresh;
> +	unsigned long dirty_thresh;
> +	struct wb_stats stats;
> +	unsigned long tot_bw;
> +
>  	global_dirty_limits(&background_thresh, &dirty_thresh);
> -	wb_thresh = wb_calc_thresh(wb, dirty_thresh);
> +
> +	memset(&stats, 0, sizeof(stats));
> +	stats.dirty_thresh = dirty_thresh;
> +	bdi_collect_stats(bdi, &stats);
> +
> +	tot_bw = atomic_long_read(&bdi->tot_write_bandwidth);
>  
>  	seq_printf(m,
>  		   "BdiWriteback:       %10lu kB\n"
> @@ -87,18 +134,18 @@ static int bdi_debug_stats_show(struct seq_file *m, void *v)
>  		   "b_dirty_time:       %10lu\n"
>  		   "bdi_list:           %10u\n"
>  		   "state:              %10lx\n",
> -		   (unsigned long) K(wb_stat(wb, WB_WRITEBACK)),
> -		   (unsigned long) K(wb_stat(wb, WB_RECLAIMABLE)),
> -		   K(wb_thresh),
> +		   K(stats.nr_writeback),
> +		   K(stats.nr_reclaimable),
> +		   K(stats.wb_thresh),
>  		   K(dirty_thresh),
>  		   K(background_thresh),
> -		   (unsigned long) K(wb_stat(wb, WB_DIRTIED)),
> -		   (unsigned long) K(wb_stat(wb, WB_WRITTEN)),
> -		   (unsigned long) K(wb->write_bandwidth),
> -		   nr_dirty,
> -		   nr_io,
> -		   nr_more_io,
> -		   nr_dirty_time,
> +		   K(stats.nr_dirtied),
> +		   K(stats.nr_written),
> +		   K(tot_bw),
> +		   stats.nr_dirty,
> +		   stats.nr_io,
> +		   stats.nr_more_io,
> +		   stats.nr_dirty_time,
>  		   !list_empty(&bdi->bdi_list), bdi->wb.state);
>  
>  	return 0;
> -- 
> 2.30.0
> 
> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ