Message-ID: <CAM_iQpU6+xD+LmVtd=GPVp2zXgmLjT8tstRa7ia0pNgO0AcaPA@mail.gmail.com>
Date: Tue, 16 Apr 2019 10:09:49 -0700
From: Cong Wang <xiyou.wangcong@...il.com>
To: Borislav Petkov <bp@...en8.de>
Cc: LKML <linux-kernel@...r.kernel.org>, linux-edac@...r.kernel.org,
Tony Luck <tony.luck@...el.com>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH 2/2] ras: close the race condition with timer
On Tue, Apr 16, 2019 at 2:58 AM Borislav Petkov <bp@...en8.de> wrote:
>
> On Mon, Apr 15, 2019 at 06:20:01PM -0700, Cong Wang wrote:
> > cec_timer_fn() is a timer callback which reads ce_arr.array[]
> > and updates its decay values. Elements could be added to or
> > removed from this global array in parallel, although the array
> > itself will not grow or shrink. del_lru_elem_unlocked() uses
> > FULL_COUNT() as a key to find a right element to remove,
> > which could be affected by the parallel timer.
> >
> > Fix this by converting the mutex to a spinlock and holding it
> > inside the timer.
> >
> > Fixes: 011d82611172 ("RAS: Add a Corrected Errors Collector")
> > Cc: Tony Luck <tony.luck@...el.com>
> > Cc: Borislav Petkov <bp@...en8.de>
> > Cc: Thomas Gleixner <tglx@...utronix.de>
> > Signed-off-by: Cong Wang <xiyou.wangcong@...il.com>
> > ---
> > drivers/ras/cec.c | 16 +++++++++-------
> > 1 file changed, 9 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/ras/cec.c b/drivers/ras/cec.c
> > index 61332c9aab5a..a82c9d08d47a 100644
> > --- a/drivers/ras/cec.c
> > +++ b/drivers/ras/cec.c
> > @@ -117,7 +117,7 @@ static struct ce_array {
> > };
> > } ce_arr;
> >
> > -static DEFINE_MUTEX(ce_mutex);
> > +static DEFINE_SPINLOCK(ce_lock);
>
> Nah, pls keep the simplicity here by retaining the mutex. Use a

I don't understand why a mutex is simpler than a spinlock here.
They are just locks that require different contexts; I don't see how one
is simpler than the other. Would you mind being more specific?
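
To illustrate what I mean by context: cec_timer_fn() runs in softirq
context, where sleeping is forbidden, so a mutex (which may sleep) is not
an option there, while a spinlock is. A minimal sketch, with made-up
names (my_lock and my_timer_fn are illustrative, not from the patch):

	#include <linux/spinlock.h>
	#include <linux/timer.h>

	static DEFINE_SPINLOCK(my_lock);

	static void my_timer_fn(struct timer_list *t)
	{
		unsigned long flags;

		/* Fine in atomic context: a spinlock never sleeps. */
		spin_lock_irqsave(&my_lock, flags);
		/* ... walk the shared array and decay its counters ... */
		spin_unlock_irqrestore(&my_lock, flags);

		/* mutex_lock() here would be a bug: it may sleep. */
	}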
> workqueue instead to queue the spring cleaning from IRQ context and then
> do it when back in preemptible context.

By workqueue, you must mean delayed work, right?

But the global workqueue is not serialized either; if you really prefer
to move this onto the same workqueue for serialization, we would have to
switch to an ordered workqueue here. It certainly could work, but I don't
see how that is any simpler than just using a spinlock.
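
Roughly, the ordered-workqueue variant would look like this sketch (names
are illustrative, not an actual patch): the timer only queues the work,
and the ordered workqueue serializes the real cleaning in process
context, where taking a mutex is fine again.

	#include <linux/errno.h>
	#include <linux/init.h>
	#include <linux/timer.h>
	#include <linux/workqueue.h>

	static struct workqueue_struct *cec_wq;
	static struct work_struct cec_work;

	static void cec_work_fn(struct work_struct *work)
	{
		/* Process context: taking a mutex / sleeping is fine here. */
		/* ... do the spring cleaning under ce_mutex ... */
	}

	static void cec_timer_fn(struct timer_list *t)
	{
		/* Atomic context: just punt the real work to the workqueue. */
		queue_work(cec_wq, &cec_work);
	}

	static int __init cec_init_wq(void)
	{
		INIT_WORK(&cec_work, cec_work_fn);
		/* Ordered wq: at most one work item runs at a time. */
		cec_wq = alloc_ordered_workqueue("cec_wq", 0);
		return cec_wq ? 0 : -ENOMEM;
	}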
What do you think?
Thanks.