Message-ID: <20121025234048.GH29378@dastard>
Date: Fri, 26 Oct 2012 10:40:48 +1100
From: Dave Chinner <david@...morbit.com>
To: Mikulas Patocka <mpatocka@...hat.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
Oleg Nesterov <oleg@...hat.com>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Ingo Molnar <mingo@...e.hu>,
Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
Ananth N Mavinakayanahalli <ananth@...ibm.com>,
Anton Arapov <anton@...hat.com>, linux-kernel@...r.kernel.org,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH 1/2] brw_mutex: big read-write mutex

On Thu, Oct 25, 2012 at 10:09:31AM -0400, Mikulas Patocka wrote:
>
>
> On Wed, 24 Oct 2012, Dave Chinner wrote:
>
> > On Fri, Oct 19, 2012 at 06:54:41PM -0400, Mikulas Patocka wrote:
> > >
> > >
> > > On Fri, 19 Oct 2012, Peter Zijlstra wrote:
> > >
> > > > > Yes, I tried this approach - it involves doing LOCK instruction on read
> > > > > lock, remembering the cpu and doing another LOCK instruction on read
> > > > > unlock (which will hopefully be on the same CPU, so no cacheline bouncing
> > > > > happens in the common case). It was slower than the approach without any
> > > > > > LOCK instructions (43.3 seconds for the implementation with
> > > > > per-cpu LOCKed access, 42.7 seconds for this implementation without atomic
> > > > > instruction; the benchmark involved doing 512-byte direct-io reads and
> > > > > writes on a ramdisk with 8 processes on 8-core machine).
> > > >
> > > > > So why is that a problem? Surely that's already tons better than what
> > > > you've currently got.
> > >
> > > Percpu rw-semaphores do not improve performance at all. I put them there
> > > to avoid performance regression, not to improve performance.
> > >
> > > All Linux kernels have a race condition - when you change block size of a
> > > block device and you read or write the device at the same time, a crash
> > > > may happen. This bug has always been there. Recently, this bug started to
> > > cause major trouble - multiple high profile business sites report crashes
> > > because of this race condition.
> > >
> > > You can fix this race by using a read lock around I/O paths and write lock
> > > > around block size changing, but normal rw semaphores cause cache line
> > > > bouncing when taken for read by multiple processors, and the resulting
> > > > I/O performance degradation is measurable.
> >
> > This doesn't sound like a new problem. Hasn't this global access,
> > single modifier exclusion problem been solved before in the VFS?
> > e.g. mnt_want_write()/mnt_make_readonly()
> >
> > Cheers,
> >
> > Dave.
>
> Yes, mnt_want_write()/mnt_make_readonly() do the same thing as percpu rw
> semaphores. I think you can convert mnt_want_write()/mnt_make_readonly()
> to use percpu rw semaphores and remove the duplicated code.

I think you misunderstood my point. Rather than re-inventing the
wheel, why didn't you just copy something that is already known to
work?
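For readers following along, the pattern under discussion (many
readers that each touch only their own counter, plus one rare writer
that waits for all counters to drain) can be sketched in userspace
C11 roughly as below. This is an illustration only, not the code
behind mnt_want_write()/mnt_make_readonly() and not the brw_mutex
patch. The names big_rw_lock, NSLOTS and the big_* functions are
hypothetical, a "slot" stands in for a CPU, and the sketch assumes a
reader unlocks on the same slot it locked. It also uses atomic
read-modify-write operations, so it corresponds to the LOCKed per-cpu
variant from the quoted benchmark rather than the variant without
atomic instructions.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NSLOTS 8 /* hypothetical stand-in for the number of CPUs */

struct big_rw_lock {
    /* One reader count per slot, each aligned to its own cache line. */
    struct { _Alignas(64) atomic_long readers; } slots[NSLOTS];
    atomic_bool writer; /* true while a writer holds or is taking the lock */
};

static void big_rw_init(struct big_rw_lock *l)
{
    for (unsigned i = 0; i < NSLOTS; i++)
        atomic_init(&l->slots[i].readers, 0);
    atomic_init(&l->writer, false);
}

/* Read side: touches only the caller's slot unless a writer is active. */
static void big_read_lock(struct big_rw_lock *l, unsigned slot)
{
    assert(slot < NSLOTS);
    for (;;) {
        /* Announce ourselves first, then check for a writer. */
        atomic_fetch_add(&l->slots[slot].readers, 1);
        if (!atomic_load(&l->writer))
            return;
        /* A writer is active: back out and wait for it to finish. */
        atomic_fetch_sub(&l->slots[slot].readers, 1);
        while (atomic_load(&l->writer))
            sched_yield();
    }
}

static void big_read_unlock(struct big_rw_lock *l, unsigned slot)
{
    assert(slot < NSLOTS);
    atomic_fetch_sub(&l->slots[slot].readers, 1);
}

/* Write side: slow path, excludes other writers and drains all readers. */
static void big_write_lock(struct big_rw_lock *l)
{
    /* The caller that flips the flag from false to true owns the write side. */
    while (atomic_exchange(&l->writer, true))
        sched_yield();
    /* Flag is set before counters are read, so new readers will back out. */
    for (unsigned i = 0; i < NSLOTS; i++)
        while (atomic_load(&l->slots[i].readers) != 0)
            sched_yield();
}

static void big_write_unlock(struct big_rw_lock *l)
{
    atomic_store(&l->writer, false);
}
```

In terms of this thread, the I/O paths would sit between
big_read_lock() and big_read_unlock(), and the block size change
between big_write_lock() and big_write_unlock(). All operations use
the default sequentially consistent ordering, which is what makes the
"increment, then check the flag" versus "set the flag, then check the
counters" handshake safe in this sketch.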
Cheers,

Dave.
--
Dave Chinner
david@...morbit.com