Message-ID: <55C8F533.1090007@colorfullife.com>
Date: Mon, 10 Aug 2015 21:02:11 +0200
From: Manfred Spraul <manfred@...orfullife.com>
To: "Herton R. Krzesinski" <herton@...hat.com>
Cc: linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
Davidlohr Bueso <dave@...olabs.net>,
Rafael Aquini <aquini@...hat.com>,
Joe Perches <joe@...ches.com>,
Aristeu Rozanski <aris@...hat.com>, djeffery@...hat.com
Subject: Re: [PATCH] ipc,sem: fix use after free on IPC_RMID after a task
using same semaphore set exits
Hi Herton,
On 08/10/2015 05:31 PM, Herton R. Krzesinski wrote:
> Well, without the synchronize_rcu() and with the semid list loop fix I was still
> able to get issues, and I thought the problem was related to racing with IPC_RMID
> in freeary() again. This is one scenario I would imagine:
>
>         A                                         B
>
> freeary()
>   list_del(&un->list_id)
>   spin_lock(&un->ulp->lock)
>   un->semid = -1
>   list_del_rcu(&un->list_proc)
>     __list_del_entry(&un->list_proc)
>       __list_del(entry->prev, entry->next)      exit_sem()
>         next->prev = prev;                        ...
>         prev->next = next;                        ...
>         ...                                       un = list_entry_rcu(ulp->list_proc.next...)
>     (&un->list_proc)->prev = LIST_POISON2         if (&un->list_proc == &ulp->list_proc) <true, last un removed by thread A>
>     ...                                           kfree(ulp)
>   spin_unlock(&un->ulp->lock) <---- bug
>
> Now that is a very tight window, but I had problems even when I tried this patch
> first:
>
> (...)
> -		if (&un->list_proc == &ulp->list_proc)
> -			semid = -1;
> -		else
> -			semid = un->semid;
> +		if (&un->list_proc == &ulp->list_proc) {
> +			rcu_read_unlock();
What about:
+			spin_unlock_wait(&ulp->lock);
> +			break;
> +		}
> +		spin_lock(&ulp->lock);
> +		semid = un->semid;
> +		spin_unlock(&ulp->lock);
>
> +		/* exit_sem raced with IPC_RMID, nothing to do */
> 		if (semid == -1) {
> 			rcu_read_unlock();
> -			break;
> +			synchronize_rcu();
> +			continue;
> 		}
> (...)
>
> So even with the bad/unneeded synchronize_rcu() which I had placed there, I could
> still get issues (however, the testing of the patch above was on an older kernel
> than latest upstream, from RHEL 6; I can test without synchronize_rcu() on latest
> upstream, but the affected code is the same). That's when I thought of the
> scenario above. I was able to get this oops:
Adding a sleep() usually helps, too. But it is ugly, so let's try to
understand the race and fix it.
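
To make the spin_unlock_wait() idea concrete, here is a minimal user-space
sketch of the pattern, not the kernel code. Everything in it is an assumption
made for illustration: struct undo_list stands in for struct sem_undo_list,
a counter stands in for the list_proc list, remover() and exiter() stand in
for freeary() and exit_sem(), and spin_unlock_wait() is emulated with a plain
lock followed by unlock on a pthread mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for struct sem_undo_list. */
struct undo_list {
	pthread_mutex_t lock;
	atomic_int nr_entries;	/* models "list_proc is non-empty" */
	bool remover_done;	/* written only while holding lock */
};

/* Emulates spin_unlock_wait(): returns once any current holder has left. */
static void unlock_wait(pthread_mutex_t *lock)
{
	pthread_mutex_lock(lock);
	pthread_mutex_unlock(lock);
}

/* Thread A, modelled on freeary(): unlink the last entry under the lock. */
static void *remover(void *arg)
{
	struct undo_list *ulp = arg;

	pthread_mutex_lock(&ulp->lock);
	atomic_store(&ulp->nr_entries, 0);	/* list now looks empty to B */
	ulp->remover_done = true;		/* remaining work under the lock */
	pthread_mutex_unlock(&ulp->lock);	/* last access to ulp by A */
	return NULL;
}

/* Thread B, modelled on exit_sem(): free ulp once the list is empty. */
static void *exiter(void *arg)
{
	struct undo_list *ulp = arg;
	bool done;

	/* Lockless emptiness check, like the list_proc comparison. */
	while (atomic_load(&ulp->nr_entries) != 0)
		sched_yield();

	/* Without this wait, the free() below could race with A's unlock. */
	unlock_wait(&ulp->lock);

	done = ulp->remover_done;
	pthread_mutex_destroy(&ulp->lock);
	free(ulp);
	return (void *)(intptr_t)done;
}

/* Runs one round; returns 1 if B saw A's critical section completed. */
int run_demo(void)
{
	struct undo_list *ulp = malloc(sizeof(*ulp));
	pthread_t a, b;
	void *res = NULL;
	int rc;

	assert(ulp != NULL);
	pthread_mutex_init(&ulp->lock, NULL);
	atomic_init(&ulp->nr_entries, 1);
	ulp->remover_done = false;

	rc = pthread_create(&b, NULL, exiter, ulp);
	assert(rc == 0);
	rc = pthread_create(&a, NULL, remover, ulp);
	assert(rc == 0);
	(void)rc;

	pthread_join(a, NULL);
	pthread_join(b, &res);
	return (intptr_t)res == 1;
}
```

Calling run_demo() from a small main() and building with -pthread is enough
to try it. The sketch only shows why the exiting side has to wait for the
lock holder before freeing; it does not model RCU or the semid handling.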
Best regards,
Manfred