lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210113214436.GH331610@dread.disaster.area>
Date:   Thu, 14 Jan 2021 08:44:36 +1100
From:   Dave Chinner <david@...morbit.com>
To:     Brian Foster <bfoster@...hat.com>
Cc:     Donald Buczek <buczek@...gen.mpg.de>, linux-xfs@...r.kernel.org,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        it+linux-xfs@...gen.mpg.de
Subject: Re: [PATCH] xfs: Wake CIL push waiters more reliably

On Fri, Jan 08, 2021 at 11:56:57AM -0500, Brian Foster wrote:
> On Fri, Jan 08, 2021 at 08:54:44AM +1100, Dave Chinner wrote:
> > e.g. we run the first transaction into the CIL, it steals the sapce
> > needed for the cil checkpoint headers for the transaciton. Then if
> > the space returned by the item formatting is negative (because it is
> > in the AIL and being relogged), the CIL checkpoint now doesn't have
> > the space reserved it needs to run a checkpoint. That transaction is
> > a sync transaction, so it forces the log, and now we push the CIL
> > without sufficient reservation to write out the log headers and the
> > items we just formatted....
> > 
> 
> Hmmm... that seems like an odd scenario because I'd expect the space
> usage delta to reflect what might or might not have already been added
> to the CIL context, not necessarily the AIL. IOW, shouldn't a negative
> delta only occur for items being relogged while still CIL resident
> (regardless of AIL residency)?
>
> From a code standpoint, the way a particular log item delta comes out
> negative is from having a shadow lv size smaller than the ->li_lv size.
> Thus, xlog_cil_insert_format_items() subtracts the currently formatted
> lv size from the delta, formats the current state of the item, and
> xfs_cil_prepare_item() adds the new (presumably smaller) size to the
> delta. We reuse ->li_lv in this scenario so both it and the shadow
> buffer remain, but a CIL push pulls ->li_lv from all log items and
> chains them to the CIL context for writing, so I don't see how we could
> have an item return a negative delta on an empty CIL. Hm?

In theory, yes. But like I said, I've made a bunch of assumptions
that this won't happen, and so without actually auditting the code
I'm not actually sure that it won't. i.e. I need to go check what
happens with items being marked stale, how shadow buffer
reallocation interacts with items that end up being smaller than the
existing buffer, etc. I've paged a lot of this detail out of my
brain, so until I spend the time to go over it all again I'm not
going to be able to answer definitively.

> (I was also wondering whether repeated smaller relogs of an item could
> be a vector for this to go wrong, but it looks like
> xlog_cil_insert_format_items() always uses the formatted size of the
> current buffer...).

That's my nagging doubt about all this - that there's an interaction
of this nature that I haven't considered due to the assumptions I've
made that allows underflow to occur. That would be much worse than
the current situation of hanging on a missing wakeup when the CIL is
full and used_space goes backwards....

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ