Date:   Tue, 14 Jan 2020 19:25:14 +0100
From:   Christoph Hellwig <hch@....de>
To:     Waiman Long <longman@...hat.com>
Cc:     Christoph Hellwig <hch@....de>, linux-xfs@...r.kernel.org,
        linux-fsdevel@...r.kernel.org,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        linux-ext4@...r.kernel.org, cluster-devel@...hat.com,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH 02/12] locking/rwsem: Exit early when held by an
 anonymous owner

On Tue, Jan 14, 2020 at 01:17:45PM -0500, Waiman Long wrote:
> The owner field is just a pointer to the task structure with the lower 3
> bits served as flag bits. Setting owner to RWSEM_OWNER_UNKNOWN (-2) will
> stop optimistic spinning. So under what condition did the crash happen?

When running xfstests with all patches in this series except for this
one, IIRC in generic/114.

> Anyway, PeterZ is working on revising the percpu-rwsem implementation to
> more gracefully handle the frozen case. At the end, there will not be a
> need for the RWSEM_OWNER_UNKNOWN magic and it can be removed.

Well, this series relies on that value.  And I think it fundamentally
is the right thing to do for AIO, and potentially other I/O-related
locking where we take a lock to synchronize access to data, then
do I/O and then eventually get an I/O completion from an interrupt.
Even thinking from the PREEMPT_RT context we want to boost the
initial thread as long as we can, then do nothing when it is off
to I/O hardware (except maybe providing good diagnostics that the cause
for the latency is I/O), and then boost the thread that is handling
the completion.  Things like the i_dio_count hack can't provide that.
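
For reference, the owner encoding described in the quoted text (a task
pointer whose lower 3 bits serve as flag bits, with -2 as the "unknown
owner" value) can be sketched in userspace C roughly as below.  This is
only an illustration, not the kernel's code: the flag and helper names
are hypothetical, and the assignment of individual bits (bit 0 for
"reader owned", bits 1 and 2 for "do not spin") is an assumption.  Under
that assumption, -2 has bits 1 and 2 set and bit 0 clear, which is
consistent with the statement that it stops optimistic spinning.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag names; the bit assignment is assumed for illustration. */
#define OWNER_FLAG_READER    ((uintptr_t)1 << 0)
#define OWNER_FLAG_RD_NOSPIN ((uintptr_t)1 << 1)
#define OWNER_FLAG_WR_NOSPIN ((uintptr_t)1 << 2)
#define OWNER_FLAGS_MASK     ((uintptr_t)7)

/* The value quoted in the thread: -2 stored as an owner word. */
#define OWNER_UNKNOWN        ((uintptr_t)-2)

/* Combine an (at least 8-byte aligned) task pointer with flag bits. */
static inline uintptr_t owner_pack(void *task, uintptr_t flags)
{
	return (uintptr_t)task | (flags & OWNER_FLAGS_MASK);
}

/* Strip the flag bits to recover the task pointer. */
static inline void *owner_task(uintptr_t owner)
{
	return (void *)(owner & ~OWNER_FLAGS_MASK);
}

/* Extract only the flag bits. */
static inline uintptr_t owner_flags(uintptr_t owner)
{
	return owner & OWNER_FLAGS_MASK;
}

/* Spinning is allowed only if neither "do not spin" bit is set. */
static inline bool owner_allows_spinning(uintptr_t owner)
{
	return !(owner & (OWNER_FLAG_RD_NOSPIN | OWNER_FLAG_WR_NOSPIN));
}
```

With these definitions owner_allows_spinning(OWNER_UNKNOWN) is false,
while a word packed from an ordinary aligned pointer with no "do not
spin" bits set still allows spinning.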
