linux-kernel - Re: XFS performance degradation during running cp command with big test file

Message-ID: <CALy5rjX4xU0UtuQUZxD56LMpX=pseWwE0OSR4J2JH_Ce3bqAVg@mail.gmail.com>
Date: Thu, 17 Oct 2024 10:18:10 +0800
From: Xiongwei Song <sxwbruce@...il.com>
To: Dave Chinner <david@...morbit.com>
Cc: cem@...nel.org, djwong@...nel.org, linux-xfs@...r.kernel.org, 
	linux-kernel@...r.kernel.org
Subject: Re: XFS performance degradation during running cp command with big
 test file

Hi Dave,

Thank you so much for the response.

On Thu, Oct 17, 2024 at 8:29 AM Dave Chinner <david@...morbit.com> wrote:
>
> On Wed, Oct 16, 2024 at 07:09:29PM +0800, Xiongwei Song wrote:
> > Dear Experts,
> >
> > We are seeing a performance degradation on an XFS partition. We
> > were copying a big file (200GB ~ 250GB) to /dev/null. About
> > 60s ~ 90s into the cp command, the read speed dropped suddenly:
> > it started at around 1080MB/s, and 60s later it was down to
> > around 350MB/s. This problem is only seen with XFS + Thick LUN.
>
> There are so many potential things that this could be caused by.
>
> > The test environment:
> > Storage Model: Dell Unity XT 380, Thick/Thin LUN
>
> How many CPUS, RAM, etc does this have?  What disks and what is the
> configuration of the fully provisioned LUN you are testing on?
>
> > Linux Version: 4.12.14
>
> You're running an ancient kernel, so the first thing to do is move
> to a much more recent kernel (e.g. 6.11) and see if the same
> behaviour occurs. If it does, then please answer all the other
> questions I've asked and provide the information from running the
> tests on the 6.11 kernel...
Ok, sure. I will try to upgrade the kernel and run the test again.
However, I don't own the test hardware, and this issue can't be reproduced
on just any machine, so I might not be able to reply very quickly. In the
worst case, I may no longer have access to the hardware. Once I have the
test results, I will get back to you and answer all your questions as soon
as possible.

Thank you again.

Regards,
Bruce

>
> > The steps to run test:
> > 1) Create an XFS partition with the following commands
> >    parted -a opt /dev/sdb mklabel gpt mkpart sdb xfs 0% 100%
> >    mkfs.xfs /dev/sdbx
> >    mount /dev/sdbx /xfs
>
> What is the output of mkfs.xfs?
>
> Did you drop the page cache between the initial file create and
> the measured copy?
>
> What is the layout of the file you are copying from (i.e. xfs_bmap
> -vvp <file> output)?
>
> > It seems the issue can only be triggered with XFS + Thick LUN,
> > whether dd or cp is used to read the test file. We would like to
> > know whether there is something special about XFS in this test
> > situation. Is this a known issue?
>
> It smells like the difference in bandwidth between the outside edge
> and the inside edge of a spinning disk, and XFS is switching
> allocation location of the very big file from the outside to the
> inside part way through the file (e.g. because the initial AG the
> file is located in is full)...
>
> > Do you have any thoughts or suggestions? Also, do you need vmstat
> > or iostat logs or blktrace or any other logs to address this issue?
>
> iostat and vmstat output in 1s increments would be useful.
>
> -Dave.
> --
> Dave Chinner
> david@...morbit.com
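
For reference, the information requested above (mkfs.xfs geometry, the
xfs_bmap -vvp layout, a page cache drop before the measured read, and
iostat/vmstat output in 1s increments) could be gathered in one run along
the lines of the sketch below. This is only an illustration, not a script
from the thread: the test file path /xfs/testfile, the output directory,
the DRY_RUN switch and the run/collect helper names are all assumptions.
It uses xfs_info on the mounted filesystem as a stand-in for the original
mkfs.xfs output. By default it only prints the commands; a real run
(DRY_RUN=0) needs root for the cache drop and the sysstat package for
iostat.

```shell
#!/bin/sh
# Sketch only: collect the diagnostics asked for in this thread.
# Hypothetical defaults (not from the thread): TEST_FILE, OUT_DIR, DRY_RUN.
MNT="${MNT:-/xfs}"                       # mount point used in the test steps
TEST_FILE="${TEST_FILE:-$MNT/testfile}"  # assumed location of the big file
OUT_DIR="${OUT_DIR:-/tmp/xfs-diag}"      # assumed place to store the logs
DRY_RUN="${DRY_RUN:-1}"                  # 1 = print commands, 0 = execute

# Print the command in dry-run mode, otherwise execute it in this shell.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$1"
    else
        eval "$1"
    fi
}

collect() {
    run "mkdir -p '$OUT_DIR'"
    # Filesystem geometry (same information mkfs.xfs prints at creation).
    run "xfs_info '$MNT' > '$OUT_DIR/xfs_info.txt'"
    # Extent layout of the file, including the AG of each extent.
    run "xfs_bmap -vvp '$TEST_FILE' > '$OUT_DIR/xfs_bmap.txt'"
    # Drop the page cache so the measured read really hits the storage.
    run "sync && echo 3 > /proc/sys/vm/drop_caches"
    # Start 1-second samplers in the background for the duration of the copy.
    run "iostat -x -t 1 > '$OUT_DIR/iostat.log' & IOSTAT_PID=\$!"
    run "vmstat -t 1 > '$OUT_DIR/vmstat.log' & VMSTAT_PID=\$!"
    # The measured read.
    run "cp '$TEST_FILE' /dev/null"
    # Stop the samplers.
    run "kill \$IOSTAT_PID \$VMSTAT_PID"
}

collect
```

Comparing the timestamp at which throughput drops in iostat.log with the
file offsets and AG numbers in xfs_bmap.txt would show whether the
slowdown lines up with the file's extents moving to a different AG.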