Message-ID: <2766007.BEx9A2HvPv@suse>
Date:   Thu, 16 Mar 2023 11:30:21 +0100
From:   "Fabio M. De Francesco" <fmdefrancesco@...il.com>
To:     Jan Kara <jack@...e.cz>,
        Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Al Viro <viro@...iv.linux.org.uk>, linux-kernel@...r.kernel.org,
        linux-fsdevel@...r.kernel.org
Subject: Re: [git pull] vfs.git sysv pile

On Thursday, 16 March 2023 10:00:35 CET Jan Kara wrote:
> On Wed 15-03-23 19:08:57, Fabio M. De Francesco wrote:
> > On Wednesday, 1 March 2023 15:14:16 CET Al Viro wrote:
> > > On Wed, Mar 01, 2023 at 02:00:18PM +0100, Jan Kara wrote:
> > > > On Wed 01-03-23 12:20:56, Fabio M. De Francesco wrote:
> > > > > > On Friday, 24 February 2023 04:26:57 CET Al Viro wrote:
> > > > > > 	Fabio's "switch to kmap_local_page()" patchset (originally 
after
> > > > > > 	the
> > > > > > 
> > > > > > ext2 counterpart, with a lot of cleaning up done to it; as the
> > > > > > matter
> > 
> > of
> > 
> > > > > > fact, ext2 side is in need of similar cleanups - calling 
conventions
> > > > > > there
> > > > > > are bloody awful).
> > 
> > [snip]
> > 
> > > I think I've pushed a demo patchset to vfs.git at some point back in
> > > January... Yep - see #work.ext2 in there; completely untested, though.
> > 
> > The following commits from the VFS tree, #work.ext2 look good to me.
> > 
> > f5b399373756 ("ext2: use offset_in_page() instead of open-coding it as
> > subtraction")
> > c7248e221fb5 ("ext2_get_page(): saner type")
> > 470e54a09898 ("ext2_put_page(): accept any pointer within the page")
> > 15abcc147cf7 ("ext2_{set_link,delete_entry}(): don't bother with
> > page_addr")
> > 16a5ee2027b7 ("ext2_find_entry()/ext2_dotdot(): callers don't need
> > page_addr anymore")
> > 
> > Reviewed-by: Fabio M. De Francesco <fmdefrancesco@...il.com>
> 
> Thanks!
> 
> > I could only read the code but I could not test it in the same QEMU/KVM
> > x86_32 VM where I test all my HIGHMEM related work.
> > 
> > Btrfs as well as all the other filesystems I converted to
> > kmap_local_page() don't make the processes in the VM to crash, whereas
> > the xfstests on ext2 trigger the OOM killer at random tests (only
> > sometimes they exit gracefully).
> > 
> > FYI, I tried to run the tests with 6GB of RAM, booting a kernel with
> > HIGHMEM64GB enabled. I cannot add my "Tested-by" tag.
> 
> Hum, interesting. Reading your previous emails this didn't seem to happen
> before applying this series, did it?
>
I wrote too many messages but was probably not able to explain the facts 
properly. Please let me summarize...

1) When testing ext2 with "./check -g quick" in a QEMU/KVM x86_32 VM with 6GB
of RAM, booting a vanilla 6.3.0-rc1 kernel with HIGHMEM64GB enabled, the OOM
killer kicks in at random tests, both _with_ and _without_ Al's patches.

2) The only case that never triggers the OOM killer is running the tests on
ext2-formatted filesystems on loop disks with the stock openSUSE kernel,
which is 6.2.1-1-pae.

3) The same "./check -g quick" on 6.3.0-rc1 always runs to completion with
other filesystems. I ran xfstests several times on Btrfs and had no
problems.

4) I cannot git-bisect this issue with ext2 because I cannot trust the results
on any particular kernel version. I mean that I cannot mark any specific
version as either "good" or "bad", because a version that looked "good" in
one run may make xfstests crash in the next run.
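
One common way to bisect an intermittent failure like the one in point 4 is
to repeat the test several times per revision and call the revision "bad" as
soon as a single run fails. The following is only a minimal sketch of that
idea, not something that was used in this thread. The names run_once,
classify and bisect_exit_code are hypothetical, and the sketch assumes a
wrapper command that exits non-zero whenever a run is hit by the OOM killer
(the plain exit status of "./check" may not reflect that).

```python
import subprocess


def run_once(cmd):
    """Run the (assumed) test wrapper once; True means it exited with 0."""
    return subprocess.run(cmd, shell=True).returncode == 0


def classify(cmd, runs=5, runner=run_once):
    """Return "bad" on the first failing run, "good" only if all runs pass.

    A "bad" verdict is reliable (the failure was actually observed), while
    a "good" verdict only means that no failure showed up in `runs` tries.
    """
    for _ in range(runs):
        if not runner(cmd):
            return "bad"
    return "good"


def bisect_exit_code(verdict):
    """Map a verdict to the exit status that "git bisect run" expects.

    git bisect run treats 0 as good and 1..127 (except 125) as bad.
    """
    return 0 if verdict == "good" else 1
```

A script driven by "git bisect run" could end with
sys.exit(bisect_exit_code(classify(...))). The number of repetitions is a
trade-off: more runs per revision make a false "good" less likely, at the
cost of a much longer bisection.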

My conclusion is that we probably have some kind of race that makes random
tests crash in random runs on random kernel versions between (at least) SUSE
6.2.1 and current vanilla.

But it may very well be the case that I'm doing something stupid (e.g., with
the QEMU configuration, or setup_disks, or something else I can't imagine)
and that I'm unable to see where I'm making mistakes. After all, I'm still a
newcomer with little experience :-)

Therefore, I'd suggest that someone else try to test ext2 in an x86_32 VM.
However, I'm 99.5% sure that Al's patches are good, from mere inspection of
his code.

I hope that this summary contains everything that may help.

However, I remain available to provide any further information and to
contribute if you ask me for specific tasks.

For my part, I have no idea how to investigate what is happening. Over the
past months I have run the VM hundreds of times on the most disparate
filesystems to test my conversions to kmap_local_page(), and I have never
seen anything like this happen.

Thanks,

Fabio 

> 								Honza
> --
> Jan Kara <jack@...e.com>
> SUSE Labs, CR



