lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 1 Dec 2015 11:31:41 +0900
From:	"kyeongdon.kim" <kyeongdon.kim@....com>
To:	Minchan Kim <minchan@...nel.org>
Cc:	Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org,
	Sergey Senozhatsky <sergey.senozhatsky@...il.com>
Subject: Re: Re: Re: [PATCH v3 2/2] zram: try vmalloc() after kmalloc()

On 2015-12-01 오전 8:18, Minchan Kim wrote:
> Hi Kyeongdon,
> 
> On Mon, Nov 30, 2015 at 07:42:02PM +0900, kyeongdon.kim wrote:
> 
>> > On Fri, Nov 27, 2015 at 01:10:49PM +0900, Sergey Senozhatsky wrote:
>> >> From: Kyeongdon Kim <kyeongdon.kim@....com>
>> >>
>> >> When we're using LZ4 multi compression streams for zram swap,
>> >> we found out page allocation failure message in system running test.
>> >> That was not only once, but a few(2 - 5 times per test).
>> >> Also, some failure cases were continually occurring to try allocation
>> >> order 3.
>> >>
>> >> In order to make parallel compression private data, we should call
>> >> kzalloc() with order 2/3 in runtime(lzo/lz4). But if there is no order
>> >> 2/3 size memory to allocate in that time, page allocation fails.
>> >> This patch makes to use vmalloc() as fallback of kmalloc(), this
>> >> prevents page alloc failure warning.
>> >>
>> >> After using this, we never found warning message in running test, also
>> >> It could reduce process startup latency about 60-120ms in each case.
>> >>
>> >> For reference a call trace :
>> >>
>> >> Binder_1: page allocation failure: order:3, mode:0x10c0d0
>> >> CPU: 0 PID: 424 Comm: Binder_1 Tainted: GW 3.10.49-perf-g991d02b-dirty
>> > #20
>> >> Call trace:
>> >> [<ffffffc0002069c8>] dump_backtrace+0x0/0x270
>> >> [<ffffffc000206c48>] show_stack+0x10/0x1c
>> >> [<ffffffc000cb51c8>] dump_stack+0x1c/0x28
>> >> [<ffffffc0002bbfc8>] warn_alloc_failed+0xfc/0x11c
>> >> [<ffffffc0002bf518>] __alloc_pages_nodemask+0x724/0x7f0
>> >> [<ffffffc0002bf5f8>] __get_free_pages+0x14/0x5c
>> >> [<ffffffc0002ed6a4>] kmalloc_order_trace+0x38/0xd8
>> >> [<ffffffc0005d9738>] zcomp_lz4_create+0x2c/0x38
>> >> [<ffffffc0005d78f0>] zcomp_strm_alloc+0x34/0x78
>> >> [<ffffffc0005d7a58>] zcomp_strm_multi_find+0x124/0x1ec
>> >> [<ffffffc0005d7c14>] zcomp_strm_find+0xc/0x18
>> >> [<ffffffc0005d8fa0>] zram_bvec_rw+0x2fc/0x780
>> >> [<ffffffc0005d9680>] zram_make_request+0x25c/0x2d4
>> >> [<ffffffc00040f8ac>] generic_make_request+0x80/0xbc
>> >> [<ffffffc00040f98c>] submit_bio+0xa4/0x15c
>> >> [<ffffffc0002e8bb0>] __swap_writepage+0x218/0x230
>> >> [<ffffffc0002e8c04>] swap_writepage+0x3c/0x4c
>> >> [<ffffffc0002c7384>] shrink_page_list+0x51c/0x8d0
>> >> [<ffffffc0002c7e88>] shrink_inactive_list+0x3f8/0x60c
>> >> [<ffffffc0002c86c8>] shrink_lruvec+0x33c/0x4cc
>> >> [<ffffffc0002c8894>] shrink_zone+0x3c/0x100
>> >> [<ffffffc0002c8c10>] try_to_free_pages+0x2b8/0x54c
>> >> [<ffffffc0002bf308>] __alloc_pages_nodemask+0x514/0x7f0
>> >> [<ffffffc0002bf5f8>] __get_free_pages+0x14/0x5c
>> >> [<ffffffc0003446cc>] proc_info_read+0x50/0xe4
>> >> [<ffffffc0002f5204>] vfs_read+0xa0/0x12c
>> >> [<ffffffc0002f59c8>] SyS_read+0x44/0x74
>> >> DMA: 3397*4kB (MC) 26*8kB (RC) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB
>> >> 0*512kB 0*1024kB 0*2048kB 0*4096kB = 13796kB
>> >>
>> >> [minchan: change vmalloc gfp and adding comment about gfp]
>> >> [sergey: tweak comments and styles]
>> >> Signed-off-by: Kyeongdon Kim <kyeongdon.kim@....com>
>> >> Signed-off-by: Minchan Kim <minchan@...nel.org>
>> >
>> > Kyeongdon, Could you test this patch on your device?
>> >
>> > Thanks.
>>
>> Sorry to have kept you waiting,
>> Obviously, I couldn't see allocation fail message with this patch.
>> But, there is something to make some delay(not sure yet this is normal).
> 
> You mean new changes makes start-up delay of your application sometime
> still,
> but not frequent like old?
> 
I couldn't see start-up delay during my test after this patch.
But, I checked the return value from alloc function like the below :

static void *zcomp_lz4_create(void)
<snip>
  ret = kzalloc(LZ4_MEM_COMPRESS, GFP_NOIO | __GFP_NORETRY |
			__GFP_NOWARN | __GFP_NOMEMALLOC);
  printk("%s: %d: ret = %p\n",__func__,__LINE__,ret);	//line 32
  if (!ret) {
    ret = __vmalloc(LZ4_MEM_COMPRESS,
		GFP_NOIO | __GFP_NORETRY | __GFP_NOWARN |
		__GFP_NOMEMALLOC | __GFP_ZERO | __GFP_HIGHMEM,
		PAGE_KERNEL);
    printk("%s: %d: ret = %p\n",__func__,__LINE__,ret);	//line 38
  }
  return ret;

log message :
[  352.226014][0] zcomp_lz4_create: 32: ret =           (null)
[  352.226035][0] zcomp_lz4_create: 38: ret =           (null)
[  352.226791][0] zcomp_lz4_create: 32: ret =           (null)
[  352.226809][0] zcomp_lz4_create: 38: ret =           (null)
[  352.230348][0] zcomp_lz4_create: 32: ret =           (null)
[  352.230369][0] zcomp_lz4_create: 38: ret =           (null)
[  352.230460][0] zcomp_lz4_create: 32: ret =           (null)
[  352.230485][0] zcomp_lz4_create: 38: ret =           (null)
[  352.230507][0] zcomp_lz4_create: 32: ret =           (null)
[  352.230520][0] zcomp_lz4_create: 38: ret =           (null)
[  352.230608][0] zcomp_lz4_create: 32: ret =           (null)
[  352.230619][0] zcomp_lz4_create: 38: ret =           (null)
[  352.230888][0] zcomp_lz4_create: 32: ret =           (null)
[  352.230902][0] zcomp_lz4_create: 38: ret =           (null)
[  352.231406][0] zcomp_lz4_create: 32: ret = ffffffc002088000
[  352.234024][0] zcomp_lz4_create: 32: ret =           (null)
[  352.234060][0] zcomp_lz4_create: 38: ret =           (null)
[  352.234359][0] zcomp_lz4_create: 32: ret =           (null)
[  352.234384][0] zcomp_lz4_create: 38: ret =           (null)
[  352.234618][0] zcomp_lz4_create: 32: ret =           (null)
[  352.234639][0] zcomp_lz4_create: 38: ret =           (null)
[  352.234667][0] zcomp_lz4_create: 32: ret =           (null)
[  352.234685][0] zcomp_lz4_create: 38: ret =           (null)
[  352.234738][0] zcomp_lz4_create: 32: ret =           (null)
[  352.234748][0] zcomp_lz4_create: 38: ret =           (null)
[  352.234800][0] zcomp_lz4_create: 32: ret =           (null)
[  352.234816][0] zcomp_lz4_create: 38: ret =           (null)
[  352.234852][0] zcomp_lz4_create: 32: ret =           (null)
[  352.234865][0] zcomp_lz4_create: 38: ret =           (null)
[  352.235136][0] zcomp_lz4_create: 32: ret =           (null)
[  352.235179][0] zcomp_lz4_create: 38: ret = ffffff80016a4000

I thought this pattern from vmalloc is not normal.
>>
>> static struct zcomp_strm *zcomp_strm_alloc(struct zcomp *comp)
>> {
>> <snip>
>>
>> zstrm->private = comp->backend->create();
>> ^ // sometimes, return 'null' continually(2-5times)
> 
> Hmm, I think it is caused by __GFP_NOMEMALLOC.
> Could you test it without the flag?
> 
>>
>> As you know, if there is 'null' return, this function is called again to
>> get a memory in while() loop. I just checked this one with printk().
>>
>> If you guys don't mind, I'll test more with trace log to check time
> delay.
> 
> No problem.
> 
>>
>> However, If this is fully expectable status to you.
>> I think I don't need to do it.
> 
> It's not what I expected. Actually, I thought failure of vmalloc
> in that place should be *really really* rare. I think it's caused by
> __GFP_NOMEMALLOC so I want to see test result without the flag.
> 
> Thanks for the careful test!
> 
You're welcome.

After I removed flag '__GFP_NOMEMALLOC', I couldn't find return 'null'
from vmalloc until now.

log message :
<4>[ 2288.954934][0] KDKIM: zcomp_lz4_create: 24: ret =           (null)
<4>[ 2288.954972][0] KDKIM: zcomp_lz4_create: 30: ret = ffffff800287e000
..<snip>..
<4>[ 2289.092411][0] KDKIM: zcomp_lz4_create: 24: ret =           (null)
<4>[ 2289.092546][0] KDKIM: zcomp_lz4_create: 30: ret = ffffff80028b5000
..<snip>..
<4>[ 2289.135628][0] KDKIM: zcomp_lz4_create: 24: ret =           (null)
<4>[ 2289.135642][0] KDKIM: zcomp_lz4_create: 24: ret =           (null)
<4>[ 2289.135729][0] KDKIM: zcomp_lz4_create: 30: ret = ffffff80028be000
<4>[ 2289.135732][0] KDKIM: zcomp_lz4_create: 30: ret = ffffff80028c7000

Thanks,
Kyeongdon Kim
> 
>>
>> Thanks,
>> Kyeongdon Kim

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ