linux-kernel - Re: [RFC PATCH 1/2] mm, vmscan: account the number of isolated pages per zone


Message-ID: <20170206144221.GE10298@dhcp22.suse.cz>
Date:   Mon, 6 Feb 2017 15:42:22 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     Brian Foster <bfoster@...hat.com>
Cc:     Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
        david@...morbit.com, dchinner@...hat.com, hch@....de,
        mgorman@...e.de, viro@...IV.linux.org.uk, linux-mm@...ck.org,
        hannes@...xchg.org, linux-kernel@...r.kernel.org,
        darrick.wong@...cle.com, linux-xfs@...r.kernel.org
Subject: Re: [RFC PATCH 1/2] mm, vmscan: account the number of isolated pages
 per zone

On Mon 06-02-17 09:35:33, Brian Foster wrote:
> On Mon, Feb 06, 2017 at 03:29:24PM +0900, Tetsuo Handa wrote:
> > Brian Foster wrote:
> > > On Fri, Feb 03, 2017 at 03:50:09PM +0100, Michal Hocko wrote:
> > > > [Let's CC more xfs people]
> > > > 
> > > > On Fri 03-02-17 19:57:39, Tetsuo Handa wrote:
> > > > [...]
> > > > > (1) I got an assertion failure.
> > > > 
> > > > I suspect this is a result of
> > > > http://lkml.kernel.org/r/20170201092706.9966-2-mhocko@kernel.org
> > > > I have no idea what the assert means though.
> > > > 
> > > > > 
> > > > > [  969.626518] Killed process 6262 (oom-write) total-vm:2166856kB, anon-rss:1128732kB, file-rss:4kB, shmem-rss:0kB
> > > > > [  969.958307] oom_reaper: reaped process 6262 (oom-write), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
> > > > > [  972.114644] XFS: Assertion failed: oldlen > newlen, file: fs/xfs/libxfs/xfs_bmap.c, line: 2867
> > > 
> > > Indirect block reservation underrun on delayed allocation extent merge.
> > > These extra blocks are used for the inode bmap btree when a delalloc
> > > extent is converted to physical blocks. We're in a case where we expect
> > > to only ever free excess blocks due to a merge of extents with
> > > independent reservations, but a situation occurs where we actually need
> > > blocks and hence the assert fails. This can occur if an extent is merged
> > > with one that has a reservation less than the expected worst case
> > > reservation for its size (after previous extent splits caused by hole
> > > punches, for example). Therefore, I think the core expectation that
> > > xfs_bmap_add_extent_hole_delay() will always have enough blocks
> > > pre-reserved is invalid.
> > > 
> > > Can you describe the workload that reproduces this? FWIW, I think the
> > > way xfs_bmap_add_extent_hole_delay() currently works is likely broken
> > > and have a couple patches to fix up indlen reservation that I haven't
> > > posted yet. The diff that deals with this particular bit is appended.
> > > Care to give that a try?
> > 
> > The workload is to write to a single file on XFS from 10 processes demonstrated at
> > http://lkml.kernel.org/r/201512052133.IAE00551.LSOQFtMFFVOHOJ@I-love.SAKURA.ne.jp
> > using a "while :; do ./oom-write; done" loop on a VM with 4 CPUs / 2048MB RAM.
> > With this XFS_FILBLKS_MIN() change applied, I no longer hit assertion failures.
> > 
> 
> Thanks for testing. Well, that's an interesting workload. I couldn't
> reproduce on a few quick tries in a similarly configured vm.
> 
> Normally I'd expect to see this kind of thing on a hole punching
> workload or dealing with large, sparse files that make use of
> speculative preallocation (post-eof blocks allocated in anticipation of
> file extending writes). I'm wondering if what is happening here is that
> the appending writes and file closes due to oom kills are generating
> speculative preallocs and prealloc truncates, respectively, and that
> causes prealloc extents at the eof boundary to be split up and then
> re-merged by surviving appending writers.

Can those preallocs be affected by
http://lkml.kernel.org/r/20170201092706.9966-2-mhocko@kernel.org ?

-- 
Michal Hocko
SUSE Labs
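
For readers following the quoted discussion of the indlen underrun, here is a
minimal, self-contained sketch of the arithmetic Brian describes. It is not
the XFS code and not the patch tested above: worst_indlen(), struct
delalloc_ext, merge_indlen() and the 1-in-16 ratio are all hypothetical
stand-ins, and the clamp only mirrors the idea behind the XFS_FILBLKS_MIN()
change mentioned in the thread.

```c
#include <stdint.h>

typedef uint64_t blkcnt;

/* Hypothetical delalloc extent: data length plus its indirect reservation. */
struct delalloc_ext {
	blkcnt len;	/* delayed-allocation data blocks */
	blkcnt indlen;	/* blocks reserved for bmap btree growth */
};

struct merge_result {
	blkcnt oldlen;	/* reservation held by the two extents before merge */
	blkcnt newlen;	/* reservation wanted for the merged extent */
};

/*
 * Assumed worst-case estimate, for illustration only: one indirect block
 * per 16 data blocks (rounded up) plus one. The real estimate depends on
 * the btree geometry.
 */
static blkcnt worst_indlen(blkcnt len)
{
	return (len + 15) / 16 + 1;
}

/*
 * Merge two adjacent delalloc extents. Without clamping, newlen can exceed
 * oldlen when one extent carries less than the worst case for its size,
 * which is the condition the quoted assertion trips on. With clamping, the
 * merged extent never asks for more than was already reserved.
 */
static struct merge_result merge_indlen(struct delalloc_ext a,
					struct delalloc_ext b, int clamp)
{
	struct merge_result r;

	r.oldlen = a.indlen + b.indlen;
	r.newlen = worst_indlen(a.len + b.len);
	if (clamp && r.newlen > r.oldlen)
		r.newlen = r.oldlen;
	return r;
}

/* Nonzero when the merge would need blocks that were never reserved. */
static int indlen_underrun(struct merge_result r)
{
	return r.newlen > r.oldlen;
}
```

With a = {160, 2} (under-reserved, as after a split) and b = {16, 2}, the
unclamped merge wants 12 blocks against 4 held, so it underruns. The clamped
merge keeps the 4 blocks it has and accepts a smaller-than-worst-case
reservation.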