lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Wed, 5 Jun 2019 15:05:52 +0800
From:   Xiaoguang Wang <xiaoguang.wang@...ux.alibaba.com>
To:     Theodore Ts'o <tytso@....edu>
Cc:     linux-ext4@...r.kernel.org, adilger.kernel@...ger.ca
Subject: Re: [RFC] jbd2: add new "stats" proc file

hi,

> On Mon, Jun 03, 2019 at 08:42:38PM +0800, Xiaoguang Wang wrote:
>> /proc/fs/jbd2/${device}/info only shows whole average statistical
>> info about jbd2's life cycle, but it can not show jbd2 info in
>> specified time interval and sometimes this capability is very useful
>> for trouble shooting. For example, we can not see how rs_locked and
>> rs_flushing grows in specified time interval, but these two indexes
>> can explain some reasons for app's behaviours.
> 
> We actually had something like this, but we removed it in commit
> bf6993276f7: "jbd2: Use tracepoints for history file".  The idea was
> that you can get the same information using the jbd2_run_tracepoints
> 
> # echo jbd2_run_stats > /sys/kernel/debug/tracing/set_event
> # cat /sys/kernel/debug/tracing/trace_pipe
> 
> ... which will produce output like this:
> 
>        jbd2/vdg-8-293   [000] ...2   122.822487: jbd2_run_stats: dev 254,96 tid 4403 wait 0 request_delay 0 running 4 locked 0 flushing 0 logging 7 handle_count 98 blocks 3 blocks_logged 4
>        jbd2/vdg-8-293   [000] ...2   122.833101: jbd2_run_stats: dev 254,96 tid 4404 wait 0 request_delay 0 running 14 locked 0 flushing 0 logging 4 handle_count 198 blocks 1 blocks_logged 2
>        jbd2/vdg-8-293   [000] ...2   122.839325: jbd2_run_stats: dev 254,96 tid 4405 wait
> 
> With eBPF, we should be able to do something even more user friendly.
Yes, I'm learning it :)
For this patch, it's because we'd like to implement a monitor system based
on web to show jbd2 status's change, then for example if buffered write reports
high latency and jbd2 rs_locked and rs_flushing also report high value, we may
build a connection between buffered write and jbd2.
Previously we planned to make above monitor system parse a jbd2 status file provided
by kernel,this would be simplest. But ok, we can try to use ebpf.

> 
> BTW, if you are looking to try to optimize jbd2, a good thing to do is
> to take a look at jbd2_handle_stats, filtered on ones where the
> interval is larger than some cut-off.  Ideally, the time between a
> handle getting started and stopped should be as small as possible,
> because if a transaction is trying to close, an open handle will get
> in the way of that, and other CPU's will be stuck waiting for handle
> to complete.  This means that pre-reading blocks before starting a
> handle, etc., is a really good idea.  And monitoring jbd2_handle_stats
> is a good way to find potential spots to topimize in ext4.
Thanks for your detailed explanation and suggestions.

Regards,
Xiaoguang Wang
> 
>       	      	      		      	 	  - Ted
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ