lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20241108173006.GA168069@frogsfrogsfrogs>
Date: Fri, 8 Nov 2024 09:30:06 -0800
From: "Darrick J. Wong" <djwong@...nel.org>
To: Zizhi Wo <wozizhi@...wei.com>
Cc: chandan.babu@...cle.com, dchinner@...hat.com, osandov@...com,
	john.g.garry@...cle.com, linux-xfs@...r.kernel.org,
	linux-kernel@...r.kernel.org, yangerkun@...wei.com
Subject: Re: [PATCH 1/2] xfs: Fix missing block calculations in xfs datadev
 fsmap

On Fri, Nov 08, 2024 at 10:29:08AM +0800, Zizhi Wo wrote:
> 
> 
> 在 2024/11/8 7:43, Darrick J. Wong 写道:
> > On Mon, Aug 26, 2024 at 11:10:04AM +0800, Zizhi Wo wrote:
> > > In xfs datadev fsmap query, I noticed a missing block calculation problem:
> > > [root@...ora ~]# xfs_db -r -c "sb 0" -c "p" /dev/vdb
> > > magicnum = 0x58465342
> > > blocksize = 4096
> > > dblocks = 5242880
> > > ......
> > > [root@...ora ~]# xfs_io -c 'fsmap -vvvv' /mnt
> > > ...
> > > 30: 253:16 [31457384..41943031]: free space            3  (104..10485751)    10485648
> > > 
> > > (41943031 + 1) / 8 = 5242879 != 5242880
> > > We missed one block in our fsmap calculation!
> > 
> > Eek.
> > 
> > > The root cause of the problem lies in __xfs_getfsmap_datadev(), where the
> > > calculation of "end_fsb" requires a classification discussion. If "end_fsb"
> > > is calculated based on "eofs", we need to add an extra sentinel node for
> > > subsequent length calculations. Otherwise, one block will be missed. If
> > > "end_fsb" is calculated based on "keys[1]", then there is no need to add an
> > > extra node. Because "keys[1]" itself is unreachable, it cancels out one of
> > > the additions. The diagram below illustrates this:
> > > 
> > > |0|1|2|3|4|5|6|7|8|9|10|11|12|13|14|15|16|-----eofs
> > > |---------------|---------------------|
> > > a       n       b         n+1         c
> > > 
> > > Assume that eofs is 16, the start address of the previous query is block n,
> > > sector 0, and the length is 1, so the "info->next" is at point b, sector 8.
> > > In the last query, suppose the "rm_startblock" calculated based on
> > > "eofs - 1" is the last block n+1 at point b. All we get is the starting
> > > address of the block, not the end. Therefore, an additional sentinel node
> > > needs to be added to move it to point c. After that, subtracting one from
> > > the other will yield the remaining 1.
> > > 
> > > Although we can now calculate the exact last query using "info->end_daddr",
> > > we will still get an incorrect value if the device at this point is not the
> > > boundary device specified by "keys[1]", as "end_daddr" is still the initial
> > > value. Therefore, the eofs situation here needs to be corrected. The issue
> > > is resolved by adding a sentinel node.
> > 
> > Why don't we set end_daddr unconditionally, then?
> > 
> > Hmm, looking at the end_daddr usage in fsmap.c, I think it's wrong.  If
> > end_daddr is set at all, it's set either to the last sector for which
> > the user wants a mapping; or it's set to the last sector for the device.
> > But then look at how we use it:
> > 
> > 	if (info->last...)
> > 		frec->start_daddr = info->end_daddr;
> > 
> > 	...
> > 
> > 	/* "report the gap..."
> > 	if (frec->start_daddr > info->next_daddr) {
> > 		fmr.fmr_length = frec->start_daddr - info->next_daddr;
> > 	}
> > 
> > This is wrong -- we're using start_daddr to compute the distance from
> > the last mapping that we output up to the end of the range that we want.
> > The "end of the range" is modeled with a phony rmap record that starts
> > at the first fsblock after that range.
> > 
> 
> In the current code, we set "rec_daddr = end_daddr" only when
> (info->last && info->end_daddr != NULL), which should ensure that this
> is the last query?

Right.

> Because end_daddr is set to the last device, and
> info->last is set to the last query. Therefore, assigning it to
> start_daddr should not cause issues in the next query?

Right, the code currently sets end_daddr only for the last device, so
there won't be any issues with the next query.

That said, we reset the xfs_getfsmap_info state between each device, so
it's safe to set end_daddr for every device, not just the last time
through that loop.

> Did I misunderstand something? Or is it because the latest code
> constantly updates end_daddr, which is why this issue arises?

The 6.13 metadir/rtgroups patches didn't change when end_daddr gets set,
but my fixpatch *does* make it set end_daddr for all devices.  Will send
a patch + fstests update shortly to demonstrate. :)

> > IOWs, that assignment should have been
> > frec->start_daddr = info->end_daddr + 1.
> > 
> > Granted in August the codebase was less clear about the difference
> > between rec_daddr and rmap->rm_startblock.  For 6.13, hch cleaned all
> > that up -- rec_daddr is now called start_daddr and the fsmap code passes
> > rmap records with space numbers in units of daddrs via a new struct
> > xfs_fsmap_rec.  Unfortunately, that's all buried in the giant pile of
> > pull requests I sent a couple of days ago which hasn't shown up on
> > for-next yet.
> > 
> > https://lore.kernel.org/linux-xfs/173084396955.1871025.18156568347365549855.stgit@frogsfrogsfrogs/
> > 
> > So I think I know how to fix this against the 6.13 codebase, but I'm
> > going to take a slightly different approach than yours...
> > 
> > > Fixes: e89c041338ed ("xfs: implement the GETFSMAP ioctl")
> > > Signed-off-by: Zizhi Wo <wozizhi@...wei.com>
> > > ---
> > >   fs/xfs/xfs_fsmap.c | 19 +++++++++++++++++--
> > >   1 file changed, 17 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c
> > > index 85dbb46452ca..8a2dfe96dae7 100644
> > > --- a/fs/xfs/xfs_fsmap.c
> > > +++ b/fs/xfs/xfs_fsmap.c
> > > @@ -596,12 +596,27 @@ __xfs_getfsmap_datadev(
> > >   	xfs_agnumber_t			end_ag;
> > >   	uint64_t			eofs;
> > >   	int				error = 0;
> > > +	int				sentinel = 0;
> > >   	eofs = XFS_FSB_TO_BB(mp, mp->m_sb.sb_dblocks);
> > >   	if (keys[0].fmr_physical >= eofs)
> > >   		return 0;
> > >   	start_fsb = XFS_DADDR_TO_FSB(mp, keys[0].fmr_physical);
> > > -	end_fsb = XFS_DADDR_TO_FSB(mp, min(eofs - 1, keys[1].fmr_physical));
> > > +	/*
> > > +	 * For the case of eofs, we need to add a sentinel node;
> > > +	 * otherwise, one block will be missed when calculating the length
> > > +	 * in the last query.
> > > +	 * For the case of key[1], there is no need to add a sentinel node
> > > +	 * because it already represents a value that cannot be reached.
> > > +	 * For the case where key[1] after shifting is within the same
> > > +	 * block as the starting address, it is resolved using end_daddr.
> > > +	 */
> > > +	if (keys[1].fmr_physical > eofs - 1) {
> > > +		sentinel = 1;
> > > +		end_fsb = XFS_DADDR_TO_FSB(mp, eofs - 1);
> > > +	} else {
> > > +		end_fsb = XFS_DADDR_TO_FSB(mp, keys[1].fmr_physical);
> > > +	}
> > 
> > ...because running against djwong-wtf, I actually see the same symptoms
> > for the realtime device.  So I think a better solution is to change
> > xfs_getfsmap to set end_daddr always, and then fix the off by one error.
> > 
> 
> Yes, my second patch looks at this rt problem...
> Thank you for your reply

<nod>

--D

> Thanks,
> Zizhi Wo
> 
> 
> > I also don't really like "sentinel" values because they're not
> > intuitive.
> > 
> > I will also go update xfs/273 to check that there are no gaps in the
> > mappings returned, and that they go to where the filesystem thinks is
> > the end of the device.  Thanks for reporting this, sorry I was too busy
> > trying to get metadir/rtgroups done to look at this until now. :(
> > 
> > --D
> > 
> 
> 
> 
> > >   	/*
> > >   	 * Convert the fsmap low/high keys to AG based keys.  Initialize
> > > @@ -649,7 +664,7 @@ __xfs_getfsmap_datadev(
> > >   		info->pag = pag;
> > >   		if (pag->pag_agno == end_ag) {
> > >   			info->high.rm_startblock = XFS_FSB_TO_AGBNO(mp,
> > > -					end_fsb);
> > > +					end_fsb) + sentinel;
> > >   			info->high.rm_offset = XFS_BB_TO_FSBT(mp,
> > >   					keys[1].fmr_offset);
> > >   			error = xfs_fsmap_owner_to_rmap(&info->high, &keys[1]);
> > > -- 
> > > 2.39.2
> > > 
> > > 
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ