Date:   Tue, 13 Sep 2022 20:57:00 +0800
From:   JunChao Sun <sunjunchao2870@...il.com>
To:     "Theodore Ts'o" <tytso@....edu>
Cc:     linux-ext4@...r.kernel.org
Subject: Re: How does newbie find bugs in ext4?

Thanks a lot for your suggestions and patience. This is great
guidance for an ext4 newbie!



On Tue, Sep 13, 2022 at 12:33 AM Theodore Ts'o <tytso@....edu> wrote:
>
> Hi,
>
> So first of all, I would recommend that you learn how to use
> kvm-xfstests.  The reason for this is that kvm-xfstests is very useful
> for testing any changes that you make.  The same test appliance can be
> used for testing file systems for Android and using Google Compute
> Engine VM's (which is one of the best ways to use it).  Please take a
> look at these references:
>
>       https://thunk.org/gce-xfstests
>       https://github.com/tytso/xfstests-bld/blob/master/Documentation/what-is-xfstests.md
>       https://github.com/tytso/xfstests-bld/blob/master/Documentation/kvm-quickstart.md
>       https://github.com/tytso/xfstests-bld/blob/master/Documentation/kvm-xfstests.md
>
> In addition to using this as a quick "playground" where you
> can test patches, this can also be a good way to (for example) test
> syzbot reports.
>
> Another thing you could do is manually backport ext4 patches which
> didn't automatically get applied because the patch required some
> adjustments (or required backporting some additional commits, etc.)
> to fix a particular problem.  So for
> example, you could try running xfstests using the latest 5.10.y or
> 5.15.y stable kernels, since as we fix bugs, we often add tests to
> check for regressions.  For example, if you look at the header of the
> test ext4/058, you'll find:
>
> # Set 256 blocks in a block group, then inject I/O pressure,
> # it will trigger off kernel BUG in ext4_mb_mark_diskspace_used
> #
> # Regression test for commit
> # a08f789d2ab5 ext4: fix bug_on ext4_mb_use_inode_pa
>
> So if you find out that a particular test fails on an LTS kernel
> (e.g., 5.15.y or 5.10.y), but it passes on upstream, it could be that
> a missing commit needs to be backported.  We don't currently have
> anyone doing this on a regular basis for the LTS kernels (I might
> do this once every few months, when I have time), so this could be a
> good way for you to contribute and also learn more about ext4 as you
> go.
>
> Finally, I'll note that although I do run xfstests regularly, and will
> reject patches that cause regressions, there are still some tests
> that fail.  For example, here is my latest test report:
>
> TESTRUNID: ltm-20220912073217
> KERNEL:    kernel 6.0.0-rc4-xfstests #760 SMP PREEMPT_DYNAMIC Mon Sep 12 07:23:13 EDT 2022 x86_64
> CMDLINE:   full --kernel gs://gce-xfstests/kernel.deb
> CPUS:      4
> MEM:       7680
>
> ext4/4k: 515 tests, 27 skipped, 4093 seconds
> ext4/1k: 511 tests, 2 failures, 40 skipped, 5095 seconds
>   Flaky: generic/475: 40% (2/5)   generic/476: 40% (2/5)
> ext4/ext3: 507 tests, 115 skipped, 3514 seconds
> ext4/encrypt: 493 tests, 3 failures, 129 skipped, 2583 seconds
>   Failures: generic/681 generic/682 generic/691
> ext4/nojournal: 510 tests, 4 failures, 94 skipped, 3610 seconds
>   Failures: ext4/301 ext4/304 generic/455
>   Flaky: generic/077: 40% (2/5)
> ext4/ext3conv: 512 tests, 27 skipped, 3650 seconds
> ext4/adv: 512 tests, 3 failures, 34 skipped, 3860 seconds
>   Failures: generic/475 generic/477
>   Flaky: generic/455: 80% (4/5)
> ext4/dioread_nolock: 513 tests, 27 skipped, 4235 seconds
> ext4/data_journal: 511 tests, 2 failures, 87 skipped, 3647 seconds
>   Failures: generic/231 generic/455
> ext4/bigalloc: 489 tests, 2 failures, 34 skipped, 3904 seconds
>   Failures: generic/455 shared/298
> ext4/bigalloc_1k: 488 tests, 2 failures, 51 skipped, 3826 seconds
>   Failures: generic/455 shared/298
> ext4/dax: 502 tests, 127 skipped, 2520 seconds
> Totals: 6135 tests, 792 skipped, 80 failures, 0 errors, 44288s
>
> (This was done by using gce-xfstests, which is a cloud VM variant of
> kvm-xfstests.  The equivalent run would take roughly 12 to 24 hours
> using kvm-xfstests; gce-xfstests runs on multiple VMs, so the wall
> clock time needed is perhaps two to two and a half hours.)
>
> In general, I try very hard to make sure that ext4/4k (ext4 with the
> default 4k block size) is free of failures when running the xfstests
> "auto" group.  However, you'll see that there are other configs where
> there are failures, some of which have been around for a while.
> The challenge is that these are often bugs that more senior
> ext4 developers have tried looking at for, say, an hour or two, and
> then said, "I have higher priority fires to fight".  So these might
> not be the best test failures to ask an ext4 newbie to debug.  That
> being said, if you don't mind a bit (or a lot) of frustration, it
> could be that you might be able to root cause some of these failed tests.
>
> (But starting with testing the LTS kernels might be a better place to
> start.)
>
> Cheers,
>
>                                         - Ted
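
The LTS-versus-upstream comparison described above can be sketched in
Python.  This is only an illustration under stated assumptions: it
assumes both reports use the summary layout quoted in the test report
above (a "config: N tests, ..." line followed by an optional
"Failures:" line), it ignores "Flaky:" lines, and the function names
parse_failures and backport_candidates are hypothetical, not part of
xfstests-bld.

```python
import re
from collections import defaultdict

# Matches a per-config summary line such as
# "ext4/1k: 511 tests, 2 failures, 40 skipped, 5095 seconds".
# Requiring a "/" in the name skips the final "Totals:" line.
CONFIG_RE = re.compile(r"^(\S+/\S+): \d+ tests")


def parse_failures(report: str) -> dict[str, set[str]]:
    """Map each config to the tests named on its 'Failures:' line."""
    failures: dict[str, set[str]] = defaultdict(set)
    config = None
    for raw in report.splitlines():
        # Drop any email quoting ("> ") and surrounding whitespace.
        line = raw.lstrip("> ").strip()
        match = CONFIG_RE.match(line)
        if match:
            config = match.group(1)
        elif line.startswith("Failures:") and config is not None:
            failures[config].update(line.split()[1:])
    return dict(failures)


def backport_candidates(lts_report: str, upstream_report: str) -> dict[str, list[str]]:
    """Tests failing on the LTS kernel but not upstream, per config.

    Assumption: such tests hint at a fix that may need backporting;
    each one still has to be checked by hand.
    """
    lts = parse_failures(lts_report)
    upstream = parse_failures(upstream_report)
    result = {}
    for config, tests in lts.items():
        only_lts = tests - upstream.get(config, set())
        if only_lts:
            result[config] = sorted(only_lts)
    return result
```

Feeding it a report from a 5.10.y or 5.15.y run together with a report
from an upstream run would give a per-config list of tests to look at
first, for example by reading each test's header for the commit it
is a regression test for.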
