linux-ext4 - Re: [RFC PATCH 0/4] make jbd2 debug switch per device

Message-ID: <CADtkEectLRZRUfWEhQtaCgMUJY0Mik=XN5A-seHJxdBNjFMJ-w@mail.gmail.com>
Date:   Mon, 25 Jan 2021 22:07:01 +0800
From:   许春光 <brookxu.cn@...il.com>
To:     Jan Kara <jack@...e.cz>
Cc:     tytso@....edu, adilger.kernel@...ger.ca, jack@...e.com,
        linux-ext4@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 0/4] make jbd2 debug switch per device

Thanks for your reply.

Jan Kara wrote on 2021/1/25 20:41:
> On Fri 22-01-21 14:43:18, Chunguang Xu wrote:
>> On a multi-disk machine, because jbd2 debugging switch is global, this
>> confuses the logs of multiple disks. It is not easy to distinguish the
>> logs of each disk and the amount of generated logs is very large. Or a
>> separate debugging switch for each disk would be better, so that you
>> can easily distinguish the logs of a certain disk.
>>
>> We can enable jbd2 debugging of a device in the following ways:
>> echo X > /proc/fs/jbd2/sdX/jbd2_debug
>>
>> But there is a small disadvantage here. Because the debugging switch is
>> placed in the journal_t object, the log before the object is initialized
>> will be lost. However, usually this will not have much impact on
>> debugging.
>
> OK, I didn't look at the series yet but I'm wondering: How are you using
> jbd2 debugging? I mean obviously it isn't meant for production use but
> rather for debugging JBD2 bugs so I'm kind of wondering in which case too
> many messages matter.
We run stress tests on machines in our test environment and use scripts
to capture journal-related logs for analysis. The machine has 12 disks,
each running different jobs, and our test kernel adds some extra
function-related logging. When we raise the log level, a large share of
the resulting messages have nothing to do with the disk under
observation; they are generated by system agents or coordinated tasks.
This makes the logs difficult to analyze.

> And if the problem is that there's a problem with distinguishing messages
> from multiple filesystems, then it would be perhaps more useful to add
> journal identification to each message similarly as we do it with ext4
> messages (likely by using journal->j_dev) - which is very simple to do
> after your patches 3 and 4.
Our test kernel does this. Because it changes the log format, I was not
sure whether it would break something, so I left that part out of this
series. Even with device information added, when there are many disks
and the log level is high there are still a lot of irrelevant messages,
and filtering them consumes a lot of CPU. A device-level switch makes
all of this simpler.
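
To illustrate the two ideas together (a per-device switch plus device
identification in each message), here is a minimal userspace sketch. It
is not the code from this series: struct journal_sketch, its fields and
journal_dbg() are hypothetical names, and the "JBD2 (dev): " prefix is
only an assumed format. A message is formatted only when its level does
not exceed the threshold of the journal it belongs to.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for journal_t; only what the sketch needs. */
struct journal_sketch {
	const char *devname;	/* device name, e.g. "sdb" (assumed) */
	int debug_level;	/* per-journal debug threshold (assumed) */
};

/*
 * Format a debug message into buf if 'level' is enabled for this journal.
 * Returns 0 when the message is suppressed, otherwise the length the full
 * message needs (as snprintf does), or a negative value on a format error.
 */
static int journal_dbg(const struct journal_sketch *j, int level,
		       char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int prefix, body;

	assert(j != NULL && buf != NULL && len > 0);
	buf[0] = '\0';

	/* Per-device gate: other journals' settings are never consulted. */
	if (level > j->debug_level)
		return 0;

	/* Tag the message with the device so logs can be told apart. */
	prefix = snprintf(buf, len, "JBD2 (%s): ", j->devname);
	if (prefix < 0 || (size_t)prefix >= len)
		return prefix;

	va_start(ap, fmt);
	body = vsnprintf(buf + prefix, len - (size_t)prefix, fmt, ap);
	va_end(ap);

	return body < 0 ? body : prefix + body;
}
```

With one journal at level 0 and another at level 3, a level-1 message is
dropped for the first before any formatting work is done and emitted
with its device prefix for the second, which is the saving compared with
filtering a global log afterwards.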
>
>                                                               Honza
>