Message-ID: <4697de37-a630-402f-a547-cc4b70de9dc3@gmail.com>
Date: Sun, 11 Aug 2024 10:17:50 +0200
From: Anders Blomdell <anders.blomdell@...il.com>
To: Dave Chinner <david@...morbit.com>
Cc: linux-xfs@...r.kernel.org,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
 Chandan Babu R <chandan.babu@...cle.com>, "Darrick J. Wong"
 <djwong@...nel.org>, Christoph Hellwig <hch@....de>
Subject: Re: XFS mount timeout in linux-6.9.11



On 2024-08-11 01:11, Dave Chinner wrote:
> On Sat, Aug 10, 2024 at 10:29:38AM +0200, Anders Blomdell wrote:
>>
>>
>> On 2024-08-10 00:55, Dave Chinner wrote:
>>> On Fri, Aug 09, 2024 at 07:08:41PM +0200, Anders Blomdell wrote:
>>>> With a filesystem that contains a very large amount of hardlinks
>>>> the time to mount the filesystem skyrockets to around 15 minutes
>>>> on 6.9.11-200.fc40.x86_64 as compared to around 1 second on
>>>> 6.8.10-300.fc40.x86_64,
>>>
>>> That sounds like the filesystem is not being cleanly unmounted on
>>> 6.9.11-200.fc40.x86_64 and so is having to run log recovery on the
>>> next mount and so is recovering lots of hardlink operations that
>>> weren't written back at unmount.
>>>
>>> Hence this smells like an unmount or OS shutdown process issue, not
>>> a mount issue. e.g. if something in the shutdown scripts hangs,
>>> systemd may time out the shutdown and power off/reboot the machine
>>> without completing the full shutdown process. The result of this is
>>> the filesystem has to perform recovery on the next mount and so you
>>> see a long mount time because of some other unrelated issue.
>>>
>>> What is the dmesg output for the mount operations? That will tell us
>>> if journal recovery is the difference for certain.  Have you also
>>> checked to see what is happening in the shutdown/unmount process
>>> before the long mount times occur?
>> echo $(uname -r) $(date +%H:%M:%S) > /dev/kmsg
>> mount /dev/vg1/test /test
>> echo $(uname -r) $(date +%H:%M:%S) > /dev/kmsg
>> umount /test
>> echo $(uname -r) $(date +%H:%M:%S) > /dev/kmsg
>> mount /dev/vg1/test /test
>> echo $(uname -r) $(date +%H:%M:%S) > /dev/kmsg
>>
>> [55581.470484] 6.8.0-rc4-00129-g14dd46cf31f4 09:17:20
>> [55581.492733] XFS (dm-7): Mounting V5 Filesystem e2159bbc-18fb-4d4b-a6c5-14c97b8e5380
>> [56048.292804] XFS (dm-7): Ending clean mount
>> [56516.433008] 6.8.0-rc4-00129-g14dd46cf31f4 09:32:55
> 
> So it took ~450s to determine that the mount was clean, then another
> 450s to return to userspace?
Yeah, that aligns with my userspace view that the mount takes 15 minutes.
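(Working from the timestamps: 56048.29 - 55581.49 is roughly 467 s from
"Mounting" to "Ending clean mount", and 56516.43 - 56048.29 is roughly
468 s more until the next userspace timestamp, i.e. about 935 s or some
15.5 minutes in total.)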
> 
>> [56516.434695] XFS (dm-7): Unmounting Filesystem e2159bbc-18fb-4d4b-a6c5-14c97b8e5380
>> [56516.925145] 6.8.0-rc4-00129-g14dd46cf31f4 09:32:56
>> [56517.039873] XFS (dm-7): Mounting V5 Filesystem e2159bbc-18fb-4d4b-a6c5-14c97b8e5380
>> [56986.017144] XFS (dm-7): Ending clean mount
>> [57454.876371] 6.8.0-rc4-00129-g14dd46cf31f4 09:48:34
> 
> Same again.
> 
> Can you post the 'xfs_info /mnt/pt' for that filesystem?
# uname -r ; xfs_info /test
6.8.0-rc4-00128-g8541a7d9da2d
meta-data=/dev/mapper/vg1-test isize=512    agcount=8, agsize=268435455 blks
          =                       sectsz=4096  attr=2, projid32bit=1
          =                       crc=1        finobt=1, sparse=0, rmapbt=0
          =                       reflink=1    bigtime=0 inobtcount=0 nrext64=0
data     =                       bsize=4096   blocks=2147483640, imaxpct=20
          =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=521728, version=2
          =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
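(Sanity check on the geometry: agcount x agsize = 8 x 268435455 =
2147483640 blocks, and 2147483640 blocks x 4096 bytes/block is about
8 TiB, i.e. eight ~1 TiB allocation groups.)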
> 
>> And rebooting to the kernel before the offending commit:
>>
>> [   60.177951] 6.8.0-rc4-00128-g8541a7d9da2d 10:23:00
>> [   61.009283] SGI XFS with ACLs, security attributes, realtime, scrub, quota, no debug enabled
>> [   61.017422] XFS (dm-7): Mounting V5 Filesystem e2159bbc-18fb-4d4b-a6c5-14c97b8e5380
>> [   61.351100] XFS (dm-7): Ending clean mount
>> [   61.366359] 6.8.0-rc4-00128-g8541a7d9da2d 10:23:01
>> [   61.367673] XFS (dm-7): Unmounting Filesystem e2159bbc-18fb-4d4b-a6c5-14c97b8e5380
>> [   61.444552] 6.8.0-rc4-00128-g8541a7d9da2d 10:23:01
>> [   61.459358] XFS (dm-7): Mounting V5 Filesystem e2159bbc-18fb-4d4b-a6c5-14c97b8e5380
>> [   61.513938] XFS (dm-7): Ending clean mount
>> [   61.524056] 6.8.0-rc4-00128-g8541a7d9da2d 10:23:01
> 
> Yeah, that's what I'd expect to see.
> 
> But, hold on, the kernel version you are testing is apparently in
> the middle of 6.8-rc4. This commit wasn't merged until 6.9-rc1 and
> there were no XFS changes merged between 6.8-rc3 and 6.8-rc6.
If I read the logs correctly, these committed xfs changes were based on
6.8-rc4 and not rebased onto 6.8.0 (which is totally OK, but confusing at
times; after doing some kernel bisects I have stopped thinking about it).
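(For anyone decoding the version strings: "6.8.0-rc4-00129-g14dd46cf31f4"
means 129 commits on top of the v6.8-rc4 tag, ending at commit
14dd46cf31f4, which is why the base looks like rc4 even though the
commits only landed in mainline for 6.9-rc1. A quick way to check this
from a mainline checkout; the exact outputs are my educated guess, not
verified here:

git describe 14dd46cf31f4             # ~ v6.8-rc4-129-g14dd46cf31f4, the branch base
git describe --contains 14dd46cf31f4  # first tag that contains it, a 6.9-rc tag
)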

> So as the bisect is walking back in time through the XFS commits,
> the base kernel is also changing. Hence there's a lot more change
> in the kernel being tested by each bisect step than just the XFS
> commits, right?
Yes, it's a binary search through the highly convoluted graph of commits.
It always surprises me how few of these "random" points fail miserably
during a bisect (in this particular case none), so kudos to all Linux
maintainers for keeping things so sane!
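(In hindsight, one way to keep a bisect from wandering across the whole
merge graph is to restrict it to the suspect paths; a minimal sketch,
assuming the regression is known to live under fs/xfs:

git bisect start -- fs/xfs    # only test commits that touch fs/xfs
git bisect bad v6.9.11
git bisect good v6.8.10

With the pathspec, git only offers commits that modified fs/xfs, so the
unrelated base-kernel churn stays out of the search.)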

> 
> This smells like a bisect jumping randomly backwards in time as it
> lands inside merges rather than bisecting the order in which commits
> were merged into the main tree. Can you post the full bisect log?
git bisect start
# status: waiting for both good and bad commits
# bad: [73f3c33036904bada1b9b6476a883b1a966440cc] Linux 6.9.11
git bisect bad 73f3c33036904bada1b9b6476a883b1a966440cc
# status: waiting for good commit(s), bad commit known
# good: [a0c69a570e420e86c7569b8c052913213eef2b45] Linux 6.8.10
git bisect good a0c69a570e420e86c7569b8c052913213eef2b45
# good: [e8f897f4afef0031fe618a8e94127a0934896aba] Linux 6.8
git bisect good e8f897f4afef0031fe618a8e94127a0934896aba
# bad: [fe46a7dd189e25604716c03576d05ac8a5209743] Merge tag 'sound-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
git bisect bad fe46a7dd189e25604716c03576d05ac8a5209743
# good: [9187210eee7d87eea37b45ea93454a88681894a4] Merge tag 'net-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
git bisect good 9187210eee7d87eea37b45ea93454a88681894a4
# good: [119b225f01e4d3ce974cd3b4d982c76a380c796d] Merge tag 'amd-drm-next-6.9-2024-03-08-1' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
git bisect good 119b225f01e4d3ce974cd3b4d982c76a380c796d
# good: [70ef654469b371d0a71bcf967fa3dcbca05d4b25] Merge tag 'efi-next-for-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
git bisect good 70ef654469b371d0a71bcf967fa3dcbca05d4b25
# bad: [8403ce70be339d462892a2b935ae30ee52416f92] Merge tag 'mfd-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
git bisect bad 8403ce70be339d462892a2b935ae30ee52416f92
# bad: [480e035fc4c714fb5536e64ab9db04fedc89e910] Merge tag 'drm-next-2024-03-13' of https://gitlab.freedesktop.org/drm/kernel
git bisect bad 480e035fc4c714fb5536e64ab9db04fedc89e910
# bad: [57982d6c835a71da5c66e6090680de1adf6e736a] xfs: consolidate btree ptr checking
git bisect bad 57982d6c835a71da5c66e6090680de1adf6e736a
# good: [0b8686f19879d896bbe2d3e893f433a08160452d] xfs: separate the marking of sick and checked metadata
git bisect good 0b8686f19879d896bbe2d3e893f433a08160452d
# good: [1a9d26291c68fbb8f8d24f9f694b32223a072745] xfs: store the btree pointer length in struct xfs_btree_ops
git bisect good 1a9d26291c68fbb8f8d24f9f694b32223a072745
# good: [802f91f7b1d535ac975e2d696bf5b5dea82816e7] xfs: fold xfs_bmbt_init_common into xfs_bmbt_init_cursor
git bisect good 802f91f7b1d535ac975e2d696bf5b5dea82816e7
# good: [4bfb028a4c00d0a079a625d7867325efb3c37de2] xfs: remove the btnum argument to xfs_inobt_count_blocks
git bisect good 4bfb028a4c00d0a079a625d7867325efb3c37de2
# bad: [fbeef4e061ab28bf556af4ee2a5a9848dc4616c5] xfs: pass a 'bool is_finobt' to xfs_inobt_insert
git bisect bad fbeef4e061ab28bf556af4ee2a5a9848dc4616c5
# good: [8541a7d9da2dd6e44f401f2363b21749b7413fc9] xfs: split xfs_inobt_insert_sprec
git bisect good 8541a7d9da2dd6e44f401f2363b21749b7413fc9
# bad: [14dd46cf31f4aaffcf26b00de9af39d01ec8d547] xfs: split xfs_inobt_init_cursor
git bisect bad 14dd46cf31f4aaffcf26b00de9af39d01ec8d547
# first bad commit: [14dd46cf31f4aaffcf26b00de9af39d01ec8d547] xfs: split xfs_inobt_init_cursor
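(Note that the two kernels timed in the dmesg output above,
6.8.0-rc4-00128-g8541a7d9da2d and 6.8.0-rc4-00129-g14dd46cf31f4, are
exactly the last good and first bad commits of this log, so both sides
of the boundary were measured directly.)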

/Anders

