linux-kernel - Re: WARNING: CPU: 1 PID: 14735 at fs/dcache.c:365 dentry

Message-ID: <0c844f53-f853-ac02-e6a3-399f9bd0ebe2@gmx.de>
Date:   Sat, 30 Jul 2022 22:21:30 +0200
From:   Helge Deller <deller@....de>
To:     Sam James <sam@...too.org>, Al Viro <viro@...iv.linux.org.uk>
Cc:     Hillf Danton <hdanton@...a.com>,
        John David Anglin <dave.anglin@...l.net>,
        linux-kernel@...r.kernel.org, linux-parisc@...r.kernel.org,
        linux-fsdevel@...r.kernel.org
Subject: Re: WARNING: CPU: 1 PID: 14735 at fs/dcache.c:365
 dentry_free+0x100/0x128

On 7/21/22 05:54, Helge Deller wrote:
> On 7/21/22 01:15, Sam James wrote:
>>> On 20 Jul 2022, at 18:06, Al Viro <viro@...iv.linux.org.uk> wrote:
>>>
>>> On Wed, Jul 20, 2022 at 07:00:32PM +0800, Hillf Danton wrote:
>>>
>>>> To help debug it, de-union d_in_lookup_hash with d_alias and add debug
>>>> info after dentry is killed. If any warning hits, we know where to add
>>>> something like
>>>>
>>>> 	WARN_ON(dentry->d_flags & DCACHE_DENTRY_KILLED);
>>>>
>>>> before hlist_bl_add or hlist_add.
>>
>>> [snip]
>>> I wonder if anyone had seen anything similar outside of parisc...
>
> Me too.
> Of course it could be caused by the platform code, as we have had
> issues with caches, spinlocks and so on.
> On older kernels we have also seen RCU stalls in d_alloc_parallel().
>
>>> I don't know if I have any chance to reproduce it here - the only
>>> parisc box I've got is a 715/100 (assuming the disk is still alive)
>>> and it's 32bit, unlike the reported setups and, er, not fast.
>
> It's fun to boot it, but it will be too slow for actual testing.
>
>>> qemu seems to have some parisc support, but it's 32bit-only at the
>>> moment...
>
> Yes. I think it will be hard to reproduce it in the VM.
>
>> I don't think I've seen this on parisc either, but I don't think
>> I've used tmpfs that heavily. I'll try it in case it's somehow more
>> likely to trigger it.
>
> It happened on the debian buildd server with tmpfs. To rule out tmpfs
> I switched to ext4 (on SATA SSD) and it happened there as well.
> I assume Dave's report is on ext3/ext4 with SCSI disks.
>
>> Helge, were there any particular steps to reproduce this? Or just
>> start doing your normal Debian builds on a tmpfs and it happens
>> soon enough?
>
> Currently it's not easy to reproduce for me either.
> It happens on the debian buildd server (4-way c8000 machine) while building
> the webkit2gtk package. I think it happens at the end when sbuild
> cleans the build directories by deleting all files.
> Maybe there is a filesystem test toolkit which you could try which hammers
> the fs by deleting lots of files in parallel?

I can no longer reproduce the issue.
If it pops up again, I'll follow up here.

Helge
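
The quoted message asks for a test that hammers the filesystem by deleting
lots of files in parallel. A minimal userspace sketch of that idea follows.
It is not the sbuild workload from the report and is not known to trigger
the dentry_free warning. The function names (populate_dir, remove_dir,
parallel_delete_stress), the directory layout and the one-process-per-
directory design are all assumptions made for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: create `nfiles` empty files in `dir`. */
static int populate_dir(const char *dir, int nfiles)
{
	char path[600];

	if (mkdir(dir, 0700) != 0 && errno != EEXIST)
		return -1;
	for (int i = 0; i < nfiles; i++) {
		snprintf(path, sizeof(path), "%s/f%d", dir, i);
		int fd = open(path, O_CREAT | O_WRONLY, 0600);
		if (fd < 0)
			return -1;
		close(fd);
	}
	return 0;
}

/* Hypothetical helper: unlink those files, then remove the directory. */
static int remove_dir(const char *dir, int nfiles)
{
	char path[600];

	for (int i = 0; i < nfiles; i++) {
		snprintf(path, sizeof(path), "%s/f%d", dir, i);
		if (unlink(path) != 0)
			return -1;
	}
	return rmdir(dir);
}

/*
 * Populate `nworkers` subdirectories under `base`, then delete them
 * concurrently, one child process per subdirectory.
 * Returns 0 if every worker succeeded, -1 otherwise.
 */
int parallel_delete_stress(const char *base, int nworkers, int nfiles)
{
	char dir[512];
	int started = 0, failed = 0;

	if (mkdir(base, 0700) != 0 && errno != EEXIST)
		return -1;
	for (int w = 0; w < nworkers; w++) {
		snprintf(dir, sizeof(dir), "%s/w%d", base, w);
		if (populate_dir(dir, nfiles) != 0)
			return -1;
	}

	/* Start all deleters only after everything has been created. */
	for (int w = 0; w < nworkers; w++) {
		pid_t pid = fork();
		if (pid < 0) {
			failed = 1;
			break;
		}
		if (pid == 0) {
			snprintf(dir, sizeof(dir), "%s/w%d", base, w);
			_exit(remove_dir(dir, nfiles) == 0 ? 0 : 1);
		}
		started++;
	}

	/* Reap every child that was started and record any failure. */
	for (int i = 0; i < started; i++) {
		int status;
		if (wait(&status) < 0 || !WIFEXITED(status) ||
		    WEXITSTATUS(status) != 0)
			failed = 1;
	}
	if (rmdir(base) != 0)
		failed = 1;
	return failed ? -1 : 0;
}
```

To use it, call parallel_delete_stress() in a loop from a small main(),
with `base` on the filesystem under test (tmpfs or ext4 in the reports
above) and `nworkers` at least the number of CPUs, then watch dmesg for
the warning.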