Message-ID: <20230320124725.pe4jqdsp4o47kmdp@quack3>
Date:   Mon, 20 Mar 2023 13:47:25 +0100
From:   Jan Kara <jack@...e.cz>
To:     "Fabio M. De Francesco" <fmdefrancesco@...il.com>
Cc:     Jan Kara <jack@...e.cz>, Al Viro <viro@...iv.linux.org.uk>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [git pull] vfs.git sysv pile

On Mon 20-03-23 12:18:38, Fabio M. De Francesco wrote:
> On Thursday, 16 March 2023 11:30:21 CET Fabio M. De Francesco wrote:
> > On Thursday, 16 March 2023 10:00:35 CET Jan Kara wrote:
> > > On Wed 15-03-23 19:08:57, Fabio M. De Francesco wrote:
> > > > On Wednesday, 1 March 2023 15:14:16 CET Al Viro wrote:
> > > > > On Wed, Mar 01, 2023 at 02:00:18PM +0100, Jan Kara wrote:
> > > > > > On Wed 01-03-23 12:20:56, Fabio M. De Francesco wrote:
> > > > > > > On Friday, 24 February 2023 04:26:57 CET Al Viro wrote:
> > > > > > > > 	Fabio's "switch to kmap_local_page()" patchset (originally after
> > > > > > > > the ext2 counterpart, with a lot of cleaning up done to it; as the
> > > > > > > > matter of fact, ext2 side is in need of similar cleanups - calling
> > > > > > > > conventions there are bloody awful).
> > > > 
> > > > [snip]
> > > > 
> > > > > I think I've pushed a demo patchset to vfs.git at some point back in
> > > > > January... Yep - see #work.ext2 in there; completely untested, though.
> > > > 
> > > > The following commits from the VFS tree, #work.ext2 look good to me.
> > > > 
> > > > f5b399373756 ("ext2: use offset_in_page() instead of open-coding it as subtraction")
> > > > c7248e221fb5 ("ext2_get_page(): saner type")
> > > > 470e54a09898 ("ext2_put_page(): accept any pointer within the page")
> > > > 15abcc147cf7 ("ext2_{set_link,delete_entry}(): don't bother with page_addr")
> > > > 16a5ee2027b7 ("ext2_find_entry()/ext2_dotdot(): callers don't need page_addr anymore")
> > > > 
> > > > Reviewed-by: Fabio M. De Francesco <fmdefrancesco@...il.com>
> > > 
> > > Thanks!
> > > 
> > > > I could only read the code but I could not test it in the same QEMU/KVM
> > > > x86_32 VM where I test all my HIGHMEM related work.
> > > > 
> > > > Btrfs, as well as all the other filesystems I converted to
> > > > kmap_local_page(), doesn't make the processes in the VM crash,
> > > > whereas the xfstests on ext2 trigger the OOM killer at random tests
> > > > (only sometimes do they exit gracefully).
> > > > 
> > > > FYI, I tried to run the tests with 6GB of RAM, booting a kernel with
> > > > HIGHMEM64GB enabled. I cannot add my "Tested-by" tag.
> > > 
> > > Hum, interesting. Reading your previous emails this didn't seem to happen
> > > before applying this series, did it?
> > 
> > I wrote too many messages but was probably not able to explain the facts
> > properly. Please let me summarize...
> > 
> > 1) When testing ext2 with "./check -g quick" in a QEMU/KVM x86_32 VM with
> > 6GB of RAM, booting a vanilla 6.3.0-rc1 kernel with HIGHMEM64GB enabled,
> > the OOM killer kicks in at random tests _with_ and _without_ Al's patches.
> > 
> > 2) The only case that never triggers the OOM killer is running the tests
> > on ext2-formatted filesystems on loop disks with the stock openSUSE kernel,
> > which is 6.2.1-1-pae.
> > 
> > 3) The same "./check -g quick" on 6.3.0-rc1 always runs to completion with
> > other filesystems. I ran xfstests several times on Btrfs and I had no
> > problems.
> > 
> > 4) I cannot git-bisect this issue with ext2 because I cannot trust the
> > results on any particular kernel version. I mean that I cannot mark any
> > specific version as either "good" or "bad", because a version that looked
> > "good" may make xfstests crash at the next run.
> > 
> > My conclusion is that we probably have some kind of race that makes random
> > tests crash at random runs of random kernel versions between (at least) SUSE
> > 6.2.1 and current vanilla.
> > 
> > But it may very well be the case that I'm doing something stupid (e.g., with
> > the QEMU configuration or setup_disks or whatever else I can't imagine) and
> > that I'm unable to see where I make mistakes. After all, I'm still a
> > newcomer with little experience :-)
> > 
> > Therefore, I'd suggest that someone else try to test ext2 in an x86_32 VM.
> > However, I'm 99.5% sure that Al's patches are good by the mere inspection of
> > his code.
> > 
> > I hope that this summary contains everything that may help.
> > 
> > However, I remain available to provide any further information and to
> > contribute if you ask me for specific tasks.
> > 
> > For my part, I have no idea how to investigate what is happening. In recent
> > months I have run the VM hundreds of times on the most disparate filesystems
> > to test my conversions to kmap_local_page() and I have never seen anything
> > like this happen.
> > 
> > Thanks,
> > 
> > Fabio
> > 
> > 
> > Honza
> > 
> > > --
> > > Jan Kara <jack@...e.com>
> > > SUSE Labs, CR
> 
> I can't yet figure out which conditions lead the OOM killer to kill the
> XFCE desktop environment and the xfstests (which I usually run inside that
> desktop session). After all, reserving 6GB of main memory for a QEMU/KVM
> x86_32 VM had always been more than adequate.
> 
> So, I decided to ignore the fact that 6GB is a notable amount of RAM for a
> 32-bit architecture and squeezed some more from the host until I had
> reserved 8GB. I know that this is not what someone able to find out what
> consumes so much main memory would do, but I wanted to get the output from
> the tests, one way or the other... :-(
> 
> OK, I could finally run my tests to completion and had no crashes at all. I 
> ran "./check -g quick" on one "test" + three "scratch" loop devices formatted 
> with "mkfs.ext2 -c". I ran three times _with_ and then three times _without_ 
> Al's following patches cloned from his vfs tree, #work.ext2 branch:
> 
> f5b399373756 ("ext2: use offset_in_page() instead of open-coding it as subtraction")
> c7248e221fb5 ("ext2_get_page(): saner type")
> 470e54a09898 ("ext2_put_page(): accept any pointer within the page")
> 15abcc147cf7 ("ext2_{set_link,delete_entry}(): don't bother with page_addr")
> 16a5ee2027b7 ("ext2_find_entry()/ext2_dotdot(): callers don't need page_addr anymore")
> 
> All the six tests were no longer killed by the Kernel :-)
> 
> I got 144 failures on 597 tests, regardless of the above listed patches.
> 
> My final conclusion is that these patches don't introduce regressions. I see
> several tests that produce memory leaks but, I want to stress it again, the
> failing tests are always the same with and without the patches.
> 
> Therefore, I think that I can now safely add my tag to all five patches listed
> above...
> 
> Tested-by: Fabio M. De Francesco <fmdefrancesco@...il.com>

Thanks for the effort! Al, will you submit these patches or should I just
pull your branch into my tree?

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR
