linux-kernel - Re: XFS performance degradation during running cp command with big test file

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ZxBafdsU6ioeTBmQ@dread.disaster.area>
Date: Thu, 17 Oct 2024 11:29:49 +1100
From: Dave Chinner <david@...morbit.com>
To: Xiongwei Song <sxwbruce@...il.com>
Cc: cem@...nel.org, djwong@...nel.org, linux-xfs@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: XFS performance degradation during running cp command with big
 test file

On Wed, Oct 16, 2024 at 07:09:29PM +0800, Xiongwei Song wrote:
> Dear Experts,
> 
> We are facing a performance degradation on the XFS partition. We
> was trying to copy a big file(200GB ~ 250GB) from a path to /dev/null,
> when performing cp command to 60s ~ 90s, the reading speed was
> suddenly down. At the beginning, the reading speed was around
> 1080MB/s, 60s later the speed was down to around 350MB/s. This
> problem  is only found with XFS + Thick LUN.

There are so many potential things that this could be caused by.

> The test environment:
> Storage Model: Dell unity XT 380 Think/Thin LUN

How many CPUS, RAM, etc does this have?  What disks and what is the
configuration of the fully provisioned LUN you are testing on?

> Linux Version: 4.12.14

You're running an ancient kernel, so the first thing to do is move
to a much more recent kernel (e.g. 6.11) and see if the same
behaviour occurs. If it does, then please answer all the other
questions I've asked and provide the information from running the
tests on the 6.11 kernel...

> The steps to run test:
> 1) Create a xfs partition with following commands
>    parted -a opt /dev/sdb mklabel gpt mkpart sdb xfs 0% 100%
>    mkfs.xfs /dev/sdbx
>    mount /dev/sdbx /xfs

What is the output of mkfs.xfs?

Did you drop the page cache between the initial file create and
the measured copy?

what is the layout of the file you are copying from (ie. xfs_bmap
-vvp <file> output)?

> It seems the issue only can be triggered with XFS + Thick LUN,
> no matter dd or cp to read the test file. We would like to learn
> if there is something special with XFS in this test situation?
> Is it known?

It smells like the difference in bandwidth between the outside edge
and the inside edge of a spinning disk, and XFS is switching
allocation location of the very big file from the outside to the
inside part way through the file (e.g. because the initial AG the
file is located in is full)...

> Do you have any thoughts or suggestions? Also, do you need vmstat
> or iostat logs or blktrace or any other logs to address this issue?

iostat and vmstat output in 1s increments would be useful.

-Dave.
-- 
Dave Chinner
david@...morbit.com