[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <4c9e1932-681f-4854-a04d-af9fbc35fc5e@quicinc.com>
Date: Tue, 3 Oct 2023 20:20:23 +0530
From: Pavan Kondeti <quic_pkondeti@...cinc.com>
To: Brian Geffon <bgeffon@...gle.com>
CC: Pavan Kondeti <quic_pkondeti@...cinc.com>,
"Rafael J. Wysocki" <rafael@...nel.org>,
Pavel Machek <pavel@....cz>, Len Brown <len.brown@...el.com>,
<kernel@...cinc.com>,
"Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
<linux-pm@...r.kernel.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] PM: hibernate: Fix a bug in copying the zero bitmap to
safe pages
On Tue, Oct 03, 2023 at 10:36:25AM -0400, Brian Geffon wrote:
> On Mon, Oct 2, 2023 at 11:05 PM Pavan Kondeti <quic_pkondeti@...cinc.com> wrote:
> >
> > On Mon, Oct 02, 2023 at 02:34:20PM -0400, Brian Geffon wrote:
> > > On Mon, Oct 2, 2023 at 1:56 PM Brian Geffon <bgeffon@...gle.com> wrote:
> > > >
> > > > On Fri, Sep 29, 2023 at 1:31 PM Pavankumar Kondeti
> > > > <quic_pkondeti@...cinc.com> wrote:
> > > > >
> > > > > The following crash is observed 100% of the time during resume from
> > > > > the hibernation on a x86 QEMU system.
> > > > >
> > > > > [ 12.931887] ? __die_body+0x1a/0x60
> > > > > [ 12.932324] ? page_fault_oops+0x156/0x420
> > > > > [ 12.932824] ? search_exception_tables+0x37/0x50
> > > > > [ 12.933389] ? fixup_exception+0x21/0x300
> > > > > [ 12.933889] ? exc_page_fault+0x69/0x150
> > > > > [ 12.934371] ? asm_exc_page_fault+0x26/0x30
> > > > > [ 12.934869] ? get_buffer.constprop.0+0xac/0x100
> > > > > [ 12.935428] snapshot_write_next+0x7c/0x9f0
> > > > > [ 12.935929] ? submit_bio_noacct_nocheck+0x2c2/0x370
> > > > > [ 12.936530] ? submit_bio_noacct+0x44/0x2c0
> > > > > [ 12.937035] ? hib_submit_io+0xa5/0x110
> > > > > [ 12.937501] load_image+0x83/0x1a0
> > > > > [ 12.937919] swsusp_read+0x17f/0x1d0
> > > > > [ 12.938355] ? create_basic_memory_bitmaps+0x1b7/0x240
> > > > > [ 12.938967] load_image_and_restore+0x45/0xc0
> > > > > [ 12.939494] software_resume+0x13c/0x180
> > > > > [ 12.939994] resume_store+0xa3/0x1d0
> > > > >
> > > > > The commit being fixed introduced a bug in copying the zero bitmap
> > > > > to safe pages. A temporary bitmap is allocated in prepare_image()
> > > > > to make a copy of zero bitmap after the unsafe pages are marked.
> > > > > Freeing this temporary bitmap later results in an inconsistent state
> > > > > of unsafe pages. Since free bit is left as is for this temporary bitmap
> > > > > after free, these pages are treated as unsafe pages when they are
> > > > > allocated again. This results in incorrect calculation of the number
> > > > > of pages pre-allocated for the image.
> > > > >
> > > > > nr_pages = (nr_zero_pages + nr_copy_pages) - nr_highmem - allocated_unsafe_pages;
> > > > >
> > > > > The allocate_unsafe_pages is estimated to be higher than the actual
> > > > > which results in running short of pages in safe_pages_list. Hence the
> > > > > crash is observed in get_buffer() due to NULL pointer access of
> > > > > safe_pages_list.
> > > >
> > > > After reading through the code, perhaps I'm missing something, I'm not
> > > > sure that this is really fixing the problem.
> > > >
> > > > It seems like the problem would be that memory_bm_create() results in
> > > > calls to get_image_page() w/ safe_needed = PG_ANY == 0, meaning that
> > > > get_image_page() will not touch allocated_unsafe_pages and instead
> > > > will mark the page as in use by setting it in the forbidden_pages_map
> > > > and the free_pages_map simultaneously. The problem is that the
> > > > free_pages_map was already populated by the call to mark_unsafe_pages,
> > > > meaning that if we allocated a safe page in get_image_page() we just
> > > > set the free bit when it otherwise should not be set.
> > > >
> > > > When the page is later free'd via the call to memory_bm_free(&tmp,
> > > > PG_UNSAFE_KEEP), it results in calls to free_image_page() w/
> > > > clear_page_nosave = PG_UNSAFE_KEEP == 0. This means that we do not
> > > > touch the free_pages_map because we don't call
> > > > swsusp_unset_page_free().
> > > >
> > > > With all that being said it seems like the correct way to deal with
> > > > that would be to do:
> > > > error = memory_bm_create(&tmp, GFP_ATOMIC, PG_SAFE);
> > > > Here we know that the pages were not in the free_pages_map initially.
> > > >
> > > > Followed by freeing it as:
> > > > memory_bm_free(&tmp, PG_UNSAFE_CLEAR);
> > > > And here we know that swsusp_unset_page_free() will be called making
> > > > sure the page is not in the free_pages_map afterwards.
> > > >
> > > > And that should result in an unchanged free_pages_map. Does that make
> > > > sense? Please correct me if I'm misunderstanding something.
> > > >
> >
> > Thanks for your review. Appreciate the detailed summary.
> >
> > >
> > > To restate this another way, if I'm reading it correctly, I think the
> > > outcome is actually nearly the same, the difference is, when
> > > allocating the bitmap before w/ PG_ANY we're setting bits in the
> > > free_page_list which will be unset a few lines later in the call to
> > > mark_unsafe_pages(), and then we won't touch the free_pages_list
> > > during the memory_bm_free() because it's called with PG_UNSAFE_KEEP.
> > >
> >
> > The current patch and your suggestion both gives the same effect like
> > you said. Since it is a temporary buffer to hold the zero bit map page, I
> > did not allocate safe pages. Allocating safe pages means a bit more
> > work. In this case this it is not completely throw away work but
> > re-ordering the call seems to be simple here. Pls let me know if you
> > want to me incorporate your suggestion.
>
> My personal opinion is that PG_SAFE makes a bit more sense, it's not
> really wasted work as any pages which are not safe end up being added
> to the allocated_unsafe_pages pool.
>
Yes, the extra bit of works does not go waste. I will send v2 with your
suggestion. Thanks a lot for your review and detailed comments.
Thanks,
Pavan
Powered by blists - more mailing lists