Date:   Fri, 2 Jul 2021 10:57:37 +0100
From:   Jon Hunter <jonathanh@...dia.com>
To:     Theodore Ts'o <tytso@....edu>, Zhang Yi <yi.zhang@...wei.com>
CC:     <linux-kernel@...r.kernel.org>, <linux-ext4@...r.kernel.org>,
        linux-tegra <linux-tegra@...r.kernel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [GIT PULL] ext4 updates for v5.14

Hi Ted, Zhang,

On 30/06/2021 21:49, Theodore Ts'o wrote:
> The following changes since commit 614124bea77e452aa6df7a8714e8bc820b489922:
> 
>   Linux 5.13-rc5 (2021-06-06 15:47:27 -0700)
> 
> are available in the Git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git tags/ext4_for_linus
> 
> for you to fetch changes up to 16aa4c9a1fbe763c147a964cdc1f5be8ed98ed13:
> 
>   jbd2: export jbd2_journal_[un]register_shrinker() (2021-06-30 11:05:00 -0400)
> 
> ----------------------------------------------------------------
> In addition to bug fixes and cleanups, there are two new features for
> ext4 in 5.14:
>  - Allow applications to poll on changes to /sys/fs/ext4/*/errors_count
>  - Add the ioctl EXT4_IOC_CHECKPOINT which allows the journal to be
>    checkpointed, truncated and discarded or zero'ed.
> 
> ----------------------------------------------------------------

...

> Zhang Yi (12):
>       ext4: cleanup in-core orphan list if ext4_truncate() failed to get a transaction handle
>       ext4: remove check for zero nr_to_scan in ext4_es_scan()
>       ext4: correct the cache_nr in tracepoint ext4_es_shrink_exit
>       jbd2: remove the out label in __jbd2_journal_remove_checkpoint()
>       jbd2: ensure abort the journal if detect IO error when writing original buffer back
>       jbd2: don't abort the journal when freeing buffers
>       jbd2: remove redundant buffer io error checks
>       jbd2,ext4: add a shrinker to release checkpointed buffers


I have noticed that, with next-20210701, one of our eMMC tests
started failing on all our ARM and ARM64 platforms, and bisect
points to commit 4ba3fcdde7e3 ("jbd2,ext4: add a shrinker to
release checkpointed buffers"). Today I am seeing the same failure
on mainline.

Looking at the kernel logs I see the following crash ...

[   74.430365] Unable to handle kernel paging request at virtual address ffff8001e353a000
[   74.438304] Mem abort info:
[   74.441110]   ESR = 0x96000005
[   74.444226]   EC = 0x25: DABT (current EL), IL = 32 bits
[   74.449548]   SET = 0, FnV = 0
[   74.452595]   EA = 0, S1PTW = 0
[   74.455740]   FSC = 0x05: level 1 translation fault
[   74.460620] Data abort info:
[   74.463504]   ISV = 0, ISS = 0x00000005
[   74.467343]   CM = 0, WnR = 0
[   74.470314] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000081adc000
[   74.477013] [ffff8001e353a000] pgd=10000002771ff803, p4d=10000002771ff803, pud=0000000000000000
[   74.485718] Internal error: Oops: 96000005 [#1] PREEMPT SMP
[   74.491284] Modules linked in: tegra_drm snd_soc_tegra186_dspk cec snd_soc_tegra210_dmic snd_soc_tegra210_admaif snd_soc_tegra_pcm snd_soc_tegra210_i2s drm_kms_helper drm snd_soc_tegra210_ahub tegra210_adma crct10dif_ce snd_hda_codec_hdmi snd_soc_tegra_audio_graph_card snd_soc_audio_graph_card snd_hda_tegra snd_soc_simple_card_utils snd_hda_codec at24 tegra_bpmp_thermal snd_hda_core tegra_aconnect tegra_xudc ina3221 host1x ip_tables x_tables ipv6
[   74.530804] CPU: 0 PID: 936 Comm: umount Tainted: G S                5.13.0-next-20210701-gfb0ca446157a #1
[   74.540446] Hardware name: NVIDIA Jetson TX2 Developer Kit (DT)
[   74.546354] pstate: a0000005 (NzCv daif -PAN -UAO -TCO BTYPE=--)
[   74.552354] pc : percpu_counter_add_batch+0x30/0x118
[   74.557317] lr : __jbd2_journal_remove_checkpoint+0x70/0x170
[   74.562972] sp : ffff800013923b90
[   74.566278] x29: ffff800013923b90 x28: ffff000080ba8d80 x27: 0000000000000000
[   74.573408] x26: 0000000000000001 x25: 0000000000000006 x24: ffff000080ba8d80
[   74.580536] x23: ffff00008965a450 x22: ffff800011ce9000 x21: ffff00008965a380
[   74.587665] x20: ffffffffffffffff x19: ffff00008a9d8000 x18: 0000000000000011
[   74.594792] x17: 0000000000000000 x16: 0000000000000000 x15: 000000000000038d
[   74.601921] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[   74.609048] x11: 0000000000000001 x10: 0000000000000960 x9 : ffff800013923b90
[   74.616175] x8 : ffff000080ba9740 x7 : 0000000000000400 x6 : ffff00008965a0b0
[   74.623304] x5 : ffff00008965a0b0 x4 : ffff8001e353a000 x3 : ffff000080ba8d80
[   74.630430] x2 : 0000000000000020 x1 : 0000000000000000 x0 : ffff00008965a380
[   74.637558] Call trace:
[   74.640000]  percpu_counter_add_batch+0x30/0x118
[   74.644610]  __jbd2_journal_remove_checkpoint+0x70/0x170
[   74.649914]  jbd2_log_do_checkpoint+0xa8/0x398
[   74.654351]  jbd2_journal_destroy+0x100/0x2a8
[   74.658703]  ext4_put_super+0x7c/0x388
[   74.662449]  generic_shutdown_super+0x70/0xf8
[   74.666802]  kill_block_super+0x1c/0x60
[   74.670633]  deactivate_locked_super+0x6c/0x98
[   74.675071]  deactivate_super+0x84/0x90
[   74.678901]  cleanup_mnt+0x8c/0x110
[   74.682385]  __cleanup_mnt+0x10/0x18
[   74.685953]  task_work_run+0x78/0x150
[   74.689612]  do_notify_resume+0x31c/0x498
[   74.693618]  work_pending+0xc/0x328
[   74.697103] Code: 11000484 b9000864 d538d084 f9401001 (b8a46833) 
[   74.703186] ---[ end trace e18485293afc06e4 ]---
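
For anyone reading the "Mem abort info" lines above, here is a minimal sketch of how the ESR value splits into the fields the kernel prints. It assumes the ARMv8-A ESR_EL1 layout (EC in bits [31:26], IL in bit 25, ISS in bits [24:0]; for data aborts, FSC is ISS[5:0] and WnR is ISS bit 6). The helper name decode_esr is hypothetical and is not a kernel function.

```python
def decode_esr(esr: int) -> dict:
    """Split an AArch64 ESR value into the fields shown in the oops.

    Assumes the ARMv8-A ESR_EL1 layout for a data abort.
    """
    iss = esr & 0x1FFFFFF              # ISS: bits [24:0]
    return {
        "EC": (esr >> 26) & 0x3F,      # exception class: bits [31:26]
        "IL": (esr >> 25) & 0x1,       # 1 = 32-bit instruction
        "ISS": iss,
        "FSC": iss & 0x3F,             # fault status code: ISS[5:0]
        "WnR": (iss >> 6) & 0x1,       # 0 = read, 1 = write
    }


# ESR value taken from the log above.
fields = decode_esr(0x96000005)
print({name: hex(value) for name, value in fields.items()})
```

For ESR = 0x96000005 this gives EC = 0x25, FSC = 0x05 and WnR = 0, matching the "DABT (current EL)", "level 1 translation fault" and "WnR = 0" lines in the log, i.e. the fault was a read.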


Is this causing problems for anyone else?

Thanks
Jon

-- 
nvpublic
