Message-ID: <20120828175716.GA4595@redhat.com>
Date: Tue, 28 Aug 2012 20:57:16 +0300
From: "Michael S. Tsirkin" <mst@...hat.com>
To: Rafael Aquini <aquini@...hat.com>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
virtualization@...ts.linux-foundation.org,
Rusty Russell <rusty@...tcorp.com.au>,
Rik van Riel <riel@...hat.com>, Mel Gorman <mel@....ul.ie>,
Andi Kleen <andi@...stfloor.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
Minchan Kim <minchan@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Subject: Re: [PATCH v9 3/5] virtio_balloon: introduce migration primitives to
balloon pages
On Tue, Aug 28, 2012 at 02:37:13PM -0300, Rafael Aquini wrote:
> On Tue, Aug 28, 2012 at 06:54:10PM +0300, Michael S. Tsirkin wrote:
> > On Mon, Aug 27, 2012 at 04:47:13PM -0300, Rafael Aquini wrote:
> > > On Sun, Aug 26, 2012 at 10:42:44AM +0300, Michael S. Tsirkin wrote:
> > > >
> > > > Reading two atomics and doing math? Result can even be negative.
> > > > I did not look at use closely but it looks suspicious.
> > > Doc on atomic_read says:
> > > "
> > > The read is atomic in that the return value is guaranteed to be one of the
> > > values initialized or modified with the interface operations if a proper
> > > implicit or explicit memory barrier is used after possible runtime
> > > initialization by any other thread and the value is modified only with the
> > > interface operations.
> > > "
> > >
> > > There's no runtime init by any thread other than the balloon's own at device registration,
> > > and the operations (inc, dec) are made through the proper interface operations
> > > only while protected by the spinlock pages_lock. It does not look suspicious, IMHO.
> >
> > Any use of multiple atomics is suspicious.
> > Please just avoid it if you can. What's wrong with locking?
> >
> > > I'm failing to see how it could become negative in that case, since you cannot
> > > isolate more pages than were previously inflated to the balloon's list.
> >
> > There is no order guarantee. So in
> > A - B you can read B long after both A and B have been incremented.
> > Maybe it is safe in this case but it needs careful documentation
> > to explain how ordering works. Much easier to keep it all simple.
> >
> > >
> > > > It's already the case everywhere except __wait_on_isolated_pages,
> > > > so just fix that, and then we can keep using int instead of atomics.
> > > >
> > > Sorry, I didn't quite get you here. Fix what?
> >
> > It's in the text you removed above. Access values under lock.
> >
>
> So, you prefer this way:
>
> /*
>  * __wait_on_isolated_pages - check if leak_balloon() must wait on isolated
>  *                            pages before proceeding with the page release.
>  * @vb         : pointer to the struct virtio_balloon describing this device.
>  * @leak_target: how many pages we are attempting to release this round.
>  */
> static inline void __wait_on_isolated_pages(struct virtio_balloon *vb,
>                                             size_t leak_target)
> {
>         unsigned int num_pages, isolated_pages;
>         spin_lock(&vb->pages_lock);
>         num_pages = vb->num_pages;
>         isolated_pages = vb->num_isolated_pages;
>         spin_unlock(&vb->pages_lock);
>         /*
>          * If isolated pages are making our leak target bigger than the
>          * total pages that we can release this round. Let's wait for
>          * migration returning enough pages back to balloon's list.
>          */
>         wait_event(vb->config_change,
>                    (!isolated_pages ||
>                     leak_target <= (num_pages - isolated_pages)));
This logic looks strange too - it does not 100% match the comment.
> }
>
> ?
Except that it does not work. You need to do the lock/unlock
dance and retest within wait_event.
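
To spell out why the snapshot version cannot work: wait_event re-evaluates
its condition expression on every wakeup, but locals copied before the call
never change, so the condition keeps seeing the old values. Below is a
minimal userspace sketch of the difference, not kernel code. The struct and
all function names are hypothetical, a pthread mutex stands in for
vb->pages_lock, and the assert encodes the assumption that both counters were
read consistently.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two counters discussed in this thread. */
struct balloon_counts {
    pthread_mutex_t lock;            /* plays the role of vb->pages_lock */
    unsigned int num_pages;
    unsigned int num_isolated_pages;
};

/* The condition from the quoted patch, computed from the values passed in. */
static bool leak_allowed(unsigned int num_pages, unsigned int isolated,
                         size_t leak_target)
{
    /* Assumption: both values come from one consistent read. */
    assert(isolated <= num_pages);
    return !isolated || leak_target <= (size_t)(num_pages - isolated);
}

/* Retest variant: re-read both counters under the lock on every call. */
static bool leak_allowed_locked(struct balloon_counts *b, size_t leak_target)
{
    bool ok;

    pthread_mutex_lock(&b->lock);
    ok = leak_allowed(b->num_pages, b->num_isolated_pages, leak_target);
    pthread_mutex_unlock(&b->lock);
    return ok;
}

/* Simulates migration putting all isolated pages back on the list. */
static void putback_all(struct balloon_counts *b)
{
    pthread_mutex_lock(&b->lock);
    b->num_isolated_pages = 0;
    pthread_mutex_unlock(&b->lock);
}
```

With 10 pages of which 8 are isolated and a leak target of 5, both variants
say "wait". After putback_all(), leak_allowed() called with the old snapshot
still says "wait", while leak_allowed_locked() sees the update.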
> > >
> > > > That's 1K on stack - and can become more if we increase
> > > > VIRTIO_BALLOON_ARRAY_PFNS_MAX. Probably too much - this is the reason
> > > > we use vb->pfns.
> > > >
> > > If we want to use vb->pfns we'll have to make leak_balloon mutually exclusive with
> > > page migration (as it was before), but that will inevitably bring us back to
> > > the discussion on breaking the loop when isolated pages make leak_balloon find
> > > fewer pages than it wants to release at each leak round.
> > >
> >
> > I don't think this is an issue. The issue was busy waiting in that case.
> >
> But, in fact, it is.
> We couldn't drop the mutex that prevents migration from happening, since otherwise
> the migration threads would corrupt our vb->pfns array. So there would be no point
> in waiting for isolated pages to be reinserted on the balloon's list, because the
> migration threads that would accomplish that task are also waiting for us to drop
> the mutex.
>
> You may argue that we could flag virtballoon_migratepage() to give up and return
> before even trying to acquire the mutex, if a leak is ongoing -- deferring work
> to virtballoon_putbackpage(). However, I'm inclined to think that in this case
> the CPU time we spent isolating pages for compaction would simply be wasted and,
> perhaps, no effective compaction would even be achieved.
> And that makes me think it would have been better to stick with the old logic of
> breaking the loop, since leak_balloon() originally also remained busy waiting
> while pursuing its target anyway.
>
> That's the trade here, IMO. If one really wants to wait on potentially isolated
> pages getting back to the list before proceeding, we'll have to burn a little
> more stack space with local variables, unfortunately.
Sorry I do not understand what you are saying here. So find
a different locking strategy.
For example something like:
        wait_event(vb->config_change,
                ({
                        lock
                        if (target <= (num_pages - isolated_pages))
                                leak balloon
                        cond = target <= (num_pages - isolated_pages);
                        unlock;
                        cond;
                })
        )
seems to have no issues?
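
For comparison, here is roughly the same shape in userspace terms: test the
condition and act on it under a single lock hold, and go back to sleep
otherwise. This is only an analogy under stated assumptions. A pthread
condition variable is swapped in for wait_event, and struct balloon,
releasable(), leak_when_possible() and putback_thread() are hypothetical
names, not anything from the driver.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the balloon state. */
struct balloon {
    pthread_mutex_t lock;      /* stands in for vb->pages_lock */
    pthread_cond_t changed;    /* stands in for the vb->config_change waitqueue */
    unsigned int num_pages;
    unsigned int num_isolated;
};

/* Pages that can be released right now; caller must hold b->lock. */
static unsigned int releasable(const struct balloon *b)
{
    return b->num_pages >= b->num_isolated ?
           b->num_pages - b->num_isolated : 0;
}

/*
 * Wait until `target` pages are releasable, then release them.
 * The test and the release happen under the same lock hold, so the
 * counters cannot change between the check and the action.
 */
static void leak_when_possible(struct balloon *b, unsigned int target)
{
    pthread_mutex_lock(&b->lock);
    while (target > releasable(b))
        pthread_cond_wait(&b->changed, &b->lock);  /* drops lock while asleep */
    assert(b->num_pages >= target);
    b->num_pages -= target;
    pthread_mutex_unlock(&b->lock);
}

/* Thread body: simulates migration putting every isolated page back. */
static void *putback_thread(void *arg)
{
    struct balloon *b = arg;

    pthread_mutex_lock(&b->lock);
    b->num_isolated = 0;
    pthread_cond_broadcast(&b->changed);  /* wake the waiter so it retests */
    pthread_mutex_unlock(&b->lock);
    return NULL;
}
```

The while loop retests after every wakeup, which is the part the snapshot
version was missing.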
--
MST