Date:   Sat, 12 Sep 2020 09:44:15 -0500
From:   Michael Larabel <Michael@...haelLarabel.com>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     Amir Goldstein <amir73il@...il.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Ted Ts'o <tytso@...gle.com>,
        Andreas Dilger <adilger.kernel@...ger.ca>,
        Ext4 Developers List <linux-ext4@...r.kernel.org>,
        Jan Kara <jack@...e.cz>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: Kernel Benchmarking


On 9/12/20 9:37 AM, Matthew Wilcox wrote:
> On Sat, Sep 12, 2020 at 05:32:11AM -0500, Michael Larabel wrote:
>> On 9/12/20 2:28 AM, Amir Goldstein wrote:
>>> On Sat, Sep 12, 2020 at 1:40 AM Michael Larabel
>>> <Michael@...haellarabel.com> wrote:
>>>> On 9/11/20 5:07 PM, Linus Torvalds wrote:
>>>>> On Fri, Sep 11, 2020 at 9:19 AM Linus Torvalds
>>>>> <torvalds@...ux-foundation.org> wrote:
>>>>>> Ok, it's probably simply that fairness is really bad for performance
>>>>>> here in general, and that special case is just that - a special case,
>>>>>> not the main issue.
>>>>> Ahh. It turns out that I should have looked more at the fault path
>>>>> after all. It was higher up in the profile, but I ignored it because I
>>>>> found that lock-unlock-lock pattern lower down.
>>>>>
>>>>> The main contention point is actually filemap_fault(). Your apache
>>>>> test accesses the 'test.html' file that is mmap'ed into memory, and
>>>>> all the threads hammer on that one single file concurrently and that
>>>>> seems to be the main page lock contention.
>>>>>
>>>>> Which is really sad - the page lock there isn't really all that
>>>>> interesting, and the normal "read()" path doesn't even take it. But
>>>>> faulting the page in does so because the page will have a long-term
>>>>> existence in the page tables, and so there's a worry about racing with
>>>>> truncate.
>>>>>
>>>>> Interesting, but also very annoying.
>>>>>
>>>>> Anyway, I don't have a solution for it, but thought I'd let you know
>>>>> that I'm still looking at this.
>>>>>
>>>>>                    Linus
>>>> I've been running your EXT4 patch on more systems and with some
>>>> additional workloads today. While not the original problem, the patch
>>>> does seem to help a fair amount for the MariaDB database server. This
>>>> wasn't one of the workloads regressing on 5.9 but at least with the
>>>> systems tried so far the patch does make a meaningful improvement to the
>>>> performance. I haven't run into any apparent issues with that patch, so
>>>> I'm continuing to try it out on more systems and other database/server
>>>> workloads.
>>>>
>>> Michael,
>>>
>>> Can you please add a reference to the original problem report and
>>> to the offending commit? This conversation appeared on the list without
>>> this information.
>>>
>>> Are filesystems other than ext4 also affected by this performance
>>> regression?
>>>
>>> Thanks,
>>> Amir.
>> On Linux 5.9 Git, Apache HTTPD, Redis, Nginx, and Hackbench appear to be the
>> main workloads that are running measurably slower than on Linux 5.8 and
>> prior on multiple systems.
>>
>> The issue was bisected to 2a9127fcf2296674d58024f83981f40b128fffea. The
>> Kernel Test Robot was also previously triggered by the commit in question,
>> with mixed Hackbench results. Looking at the perf data, Linus had a hunch
>> that the commit might be interacting badly with the EXT4 locking behavior,
>> so he sent out that patch. The EXT4 patch didn't end up addressing the
>> performance issue with the original workloads in question, though in
>> testing other workloads it seems to benefit MariaDB at least: depending
>> upon the system, there can be slightly better performance.
> Based on this limited amount of information, I would suspect there would
> also be a problem with XFS, and that would be even _more_ sad because
> XFS already excludes a truncate-vs-mmap race with the MMAPLOCK_SHARED in
> __xfs_filemap_fault vs MMAPLOCK_EXCL ... somewhere in the truncate path,
> I'm sure.  It's definitely there for the holepunch.
>
> So maybe XFS should have its own implementation of filemap_fault,
> or we should have a filemap_fault_locked() for filesystems which have
> their own locking that excludes truncate.

Interesting, I'll fire up some cross-filesystem benchmarks with those 
tests today and report back shortly with the difference.

Michael
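
For anyone who wants to see the access pattern Linus describes above (many
threads faulting in pages of one mmap'ed file at the same time), here is a
minimal sketch in Python. It is not the Apache test and not a benchmark. The
names touch_pages and run, the thread count, and the file size are all made up
for illustration. Whether the faults actually go through filemap_fault()
depends on the filesystem backing the temporary file (tmpfs, for instance, has
its own fault handler), so treat this only as a picture of the workload shape.

```python
import mmap
import os
import tempfile
import threading

PAGE = mmap.PAGESIZE


def touch_pages(path, n_pages):
    """Map the file read-only and read one byte from each page.

    Each call creates a fresh mapping, so the first read of every page
    in that mapping has to be faulted in by the kernel.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            total = 0
            for i in range(n_pages):
                total += m[i * PAGE]  # indexing an mmap yields an int
            return total


def run(n_threads=4, n_pages=8):
    """Have several threads touch the same file concurrently (hypothetical helper)."""
    fd, path = tempfile.mkstemp()
    try:
        # Fill the file with 0x01 bytes so each page contributes 1 to the sum.
        with os.fdopen(fd, "wb") as f:
            f.write(b"\x01" * (PAGE * n_pages))

        results = [None] * n_threads

        def worker(idx):
            results[idx] = touch_pages(path, n_pages)

        threads = [threading.Thread(target=worker, args=(i,))
                   for i in range(n_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return results
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(run())
```

To look for page lock contention with something like this, one would run a
much larger thread count under perf on the filesystem of interest; the sketch
itself only checks that every thread saw every page.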
