Message-ID: <0910DD04CBD6DE4193FCF86B9C00BE971C7EC5@BPXM01GP.gisp.nec.co.jp>
Date:	Thu, 28 Nov 2013 07:08:26 +0000
From:	Atsushi Kumagai <kumagai-atsushi@....nes.nec.co.jp>
To:	HATAYAMA Daisuke <d.hatayama@...fujitsu.com>
CC:	"bhe@...hat.com" <bhe@...hat.com>,
	"tom.vaden@...com" <tom.vaden@...com>,
	"kexec@...ts.infradead.org" <kexec@...ts.infradead.org>,
	"ptesarik@...e.cz" <ptesarik@...e.cz>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"lisa.mitchell@...com" <lisa.mitchell@...com>,
	"vgoyal@...hat.com" <vgoyal@...hat.com>,
	"anderson@...hat.com" <anderson@...hat.com>,
	"ebiederm@...ssion.com" <ebiederm@...ssion.com>,
	"jingbai.ma@...com" <jingbai.ma@...com>
Subject: Re: [PATCH 0/3] makedumpfile: hugepage filtering for vmcore dump

On 2013/11/22 16:18:20, kexec <kexec-bounces@...ts.infradead.org> wrote:
> (2013/11/07 9:54), HATAYAMA Daisuke wrote:
> > (2013/11/06 11:21), Atsushi Kumagai wrote:
> >> (2013/11/06 5:27), Vivek Goyal wrote:
> >>> On Tue, Nov 05, 2013 at 09:45:32PM +0800, Jingbai Ma wrote:
> >>>> This patch set intends to exclude unnecessary hugepages from vmcore dump file.
> >>>>
> >>>> This patch requires the kernel patch to export necessary data structures into
> >>>> vmcore: "kexec: export hugepage data structure into vmcoreinfo"
> >>>> http://lists.infradead.org/pipermail/kexec/2013-November/009997.html
> >>>>
> >>>> This patch introduces two new dump levels, 32 and 64, to exclude all unused and
> >>>> active hugepages. The level to exclude all unnecessary pages will be 127 now.
> >>>
> >>> Interesting. Why should hugepages be treated any differently than normal
> >>> pages?
> >>>
> >>> If user asked to filter out free page, then it should be filtered and
> >>> it should not matter whether it is a huge page or not?
> >>
> >> I'm preparing an RFC patch for hugepage filtering based on that policy.
> >>
> >> I attach the prototype version.
> >> It can also filter out THPs, and it is suitable for cyclic processing
> >> because it depends on mem_map, and the mem_map lookup can be divided into
> >> cycles. This is the same idea as page_is_buddy().
> >>
> >> So I think it's better.
> >>
> >
> >> @@ -4506,14 +4583,49 @@ __exclude_unnecessary_pages(unsigned long mem_map,
> >>                && !isAnon(mapping)) {
> >>                if (clear_bit_on_2nd_bitmap_for_kernel(pfn))
> >>                    pfn_cache_private++;
> >> +            /*
> >> +             * NOTE: If THP for cache is introduced, the check for
> >> +             *       compound pages is needed here.
> >> +             */
> >>            }
> >>            /*
> >>             * Exclude the data page of the user process.
> >>             */
> >> -        else if ((info->dump_level & DL_EXCLUDE_USER_DATA)
> >> -            && isAnon(mapping)) {
> >> -            if (clear_bit_on_2nd_bitmap_for_kernel(pfn))
> >> -                pfn_user++;
> >> +        else if (info->dump_level & DL_EXCLUDE_USER_DATA) {
> >> +            /*
> >> +             * Exclude the anonymous pages as user pages.
> >> +             */
> >> +            if (isAnon(mapping)) {
> >> +                if (clear_bit_on_2nd_bitmap_for_kernel(pfn))
> >> +                    pfn_user++;
> >> +
> >> +                /*
> >> +                 * Check the compound page
> >> +                 */
> >> +                if (page_is_hugepage(flags) && compound_order > 0) {
> >> +                    int i, nr_pages = 1 << compound_order;
> >> +
> >> +                    for (i = 1; i < nr_pages; ++i) {
> >> +                        if (clear_bit_on_2nd_bitmap_for_kernel(pfn + i))
> >> +                            pfn_user++;
> >> +                    }
> >> +                    pfn += nr_pages - 2;
> >> +                    mem_map += (nr_pages - 1) * SIZE(page);
> >> +                }
> >> +            }
> >> +            /*
> >> +             * Exclude the hugetlbfs pages as user pages.
> >> +             */
> >> +            else if (hugetlb_dtor == SYMBOL(free_huge_page)) {
> >> +                int i, nr_pages = 1 << compound_order;
> >> +
> >> +                for (i = 0; i < nr_pages; ++i) {
> >> +                    if (clear_bit_on_2nd_bitmap_for_kernel(pfn + i))
> >> +                        pfn_user++;
> >> +                }
> >> +                pfn += nr_pages - 1;
> >> +                mem_map += (nr_pages - 1) * SIZE(page);
> >> +            }
> >>            }
> >>            /*
> >>             * Exclude the hwpoison page.
> >
> > I'm concerned about the case where filtering is not performed on the part of
> > the mem_map entries that does not belong to the current cyclic range.
> >
> > If the maximum value of compound_order is larger than the maximum value of
> > CONFIG_FORCE_MAX_ZONEORDER, which makedumpfile obtains by ARRAY_LENGTH(zone.free_area),
> > it's necessary to align info->bufsize_cyclic with the larger one in
> > check_cyclic_buffer_overrun().
> >
> 
> ping, in case you overlooked this...

Sorry for the delayed response; I'm prioritizing the release of v1.5.5 right now.

Thanks for your advice. check_cyclic_buffer_overrun() should be fixed
as you said. In addition, I'm considering another way to address this case:
carry the number of "overflowed pages" over to the next cycle and
exclude them at the top of __exclude_unnecessary_pages(), like below:

               /*
                * The pages which should be excluded still remain.
                */
               if (remainder >= 1) {
                       int i;
                       unsigned long tmp = 0; /* pages excluded in this pass */
                       for (i = 0; i < remainder; ++i) {
                               if (clear_bit_on_2nd_bitmap_for_kernel(pfn + i)) {
                                       pfn_user++;
                                       tmp++;
                               }
                       }
                       pfn += tmp;
                       remainder -= tmp;
                       mem_map += (tmp - 1) * SIZE(page);
                       continue;
               }

If this approach works well, then aligning info->bufsize_cyclic will be
unnecessary.
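
To make the carry-over idea concrete, here is a minimal standalone sketch
in C. It is not makedumpfile code: bitmap[], order_of[], clear_bit(),
exclude_cycle(), exclude_all() and reset_pages() are hypothetical toy
stand-ins for the 2nd bitmap, mem_map and the cyclic loop. One deliberate
difference from the snippet above, which is a choice of this sketch: it
advances by the number of pages visited, not by the number of bits
actually cleared.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_PAGES 32

/* Toy stand-in for the 2nd bitmap: true means "page will be dumped". */
static bool bitmap[NR_PAGES];
/* Toy stand-in for mem_map: compound order at head pages, -1 elsewhere. */
static int order_of[NR_PAGES];

/* Mark every page as dumpable and as a non-compound page. */
void reset_pages(void)
{
    for (unsigned long i = 0; i < NR_PAGES; i++) {
        bitmap[i] = true;
        order_of[i] = -1;
    }
}

/* Clear one bit; returns true if the bit was set before. */
bool clear_bit(unsigned long pfn)
{
    bool was_set = bitmap[pfn];
    bitmap[pfn] = false;
    return was_set;
}

/*
 * Process one cycle covering pfns [start, end).
 * *remainder is the number of pages of a compound page that spilled over
 * from the previous cycle; it is consumed first and updated on return.
 * Returns the number of bits cleared in this cycle.
 */
unsigned long exclude_cycle(unsigned long start, unsigned long end,
                            unsigned long *remainder)
{
    unsigned long pfn = start, excluded = 0;

    assert(start <= end && end <= NR_PAGES);

    while (pfn < end) {
        unsigned long n, avail, todo, i;

        if (*remainder > 0) {
            n = *remainder;              /* leftover from the last cycle */
        } else if (order_of[pfn] >= 0) {
            n = 1UL << order_of[pfn];    /* head of a compound page */
        } else {
            pfn++;                       /* ordinary page: keep it */
            continue;
        }

        /* Never touch pages beyond this cycle; carry the rest over. */
        avail = end - pfn;
        todo = n < avail ? n : avail;
        for (i = 0; i < todo; i++) {
            if (clear_bit(pfn + i))
                excluded++;
        }
        pfn += todo;
        *remainder = n - todo;
    }
    return excluded;
}

/* Run all cycles of cycle_len pages, carrying the remainder between them. */
unsigned long exclude_all(unsigned long cycle_len)
{
    unsigned long remainder = 0, total = 0, start, end;

    assert(cycle_len > 0);

    for (start = 0; start < NR_PAGES; start += cycle_len) {
        end = start + cycle_len;
        if (end > NR_PAGES)
            end = NR_PAGES;
        total += exclude_cycle(start, end, &remainder);
    }
    return total;
}
```

For example, an order-3 compound page at pfn 8 with a cycle length of 6
pages is split at pfn 12: four pages are excluded in the second cycle and
the remaining four at the top of the third, with no buffer alignment
needed.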


Thanks
Atsushi Kumagai

> -- 
> Thanks.
> HATAYAMA, Daisuke
> 
> 
> _______________________________________________
> kexec mailing list
> kexec@...ts.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
> 