Message-ID: <CAFUsyfJTuFjVXHMgYi0uggVNW=1WW1uVYa7avVjW5VBb2cmAkQ@mail.gmail.com>
Date: Wed, 17 Nov 2021 16:45:13 -0600
From: Noah Goldstein <goldstein.w.n@...il.com>
To: David Laight <David.Laight@...lab.com>
Cc: "tglx@...utronix.de" <tglx@...utronix.de>,
"mingo@...hat.com" <mingo@...hat.com>,
"bp@...en8.de" <bp@...en8.de>, "x86@...nel.org" <x86@...nel.org>,
"hpa@...or.com" <hpa@...or.com>,
"luto@...nel.org" <luto@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v4] arch/x86: Improve 'rep movs{b|q}' usage in memmove_64.S
On Wed, Nov 17, 2021 at 4:31 PM David Laight <David.Laight@...lab.com> wrote:
>
> From: Noah Goldstein
> > Sent: 17 November 2021 21:03
> >
> > Add check for "short distance movsb" for forwards FSRM usage and
> > entirely remove backwards 'rep movsq'. Both of these usages hit "slow
> > modes" that are an order of magnitude slower than usual.
> >
> > 'rep movsb' has some noticeable VERY slow modes that the current
> > implementation is either 1) not checking for or 2) intentionally
> > using.
>
> How does this relate to the decision that glibc made a few years
> ago to use backwards 'rep movs' for non-overlapping copies?
GLIBC doesn't use backwards `rep movs`. Since the regions are
non-overlapping, it just uses a forward copy. Backwards `rep movs`
comes from setting the direction flag (`std`) and is a very slow byte
copy. For overlapping regions where a backwards copy is necessary,
GLIBC uses a 4x vec copy loop.
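
To illustrate the direction choice, here is a rough C sketch. It is
not the GLIBC or kernel code: `sketch_memmove` is a hypothetical name
and the plain byte loops only stand in for the real forward copy and
the backwards vector loop. The point is that the backwards path is
taken only when dst overlaps the tail of src.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration only; not taken from glibc or the kernel. */
void *sketch_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /*
     * Unsigned wraparound trick: (d - s) >= n holds when dst is below
     * src or when the regions do not overlap at all. In both cases a
     * forward copy never overwrites source bytes before reading them.
     */
    if ((uintptr_t)d - (uintptr_t)s >= n) {
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
    } else {
        /*
         * dst starts inside [src, src + n): copy from the end so the
         * overlapping tail of src is read before it is overwritten.
         * A real implementation would use wide loads/stores here
         * rather than bytes (and not `std; rep movs`).
         */
        for (size_t i = n; i > 0; i--)
            d[i - 1] = s[i - 1];
    }
    return dst;
}
```

For example, with buf = "abcdefgh", sketch_memmove(buf + 2, buf, 4)
takes the backwards path, while sketch_memmove(buf, buf + 2, 4) takes
the forward one.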
>
>
> Did they find a different corner case??
>
> David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
>