linux-kernel - Re: [PATCH v7 00/12] Support non-lru page migration

Message-ID: <20160616044710.GP17127@bbox>
Date:	Thu, 16 Jun 2016 13:47:10 +0900
From:	Minchan Kim <minchan@...nel.org>
To:	Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
CC:	Andrew Morton <akpm@...ux-foundation.org>, <linux-mm@...ck.org>,
	<linux-kernel@...r.kernel.org>, Vlastimil Babka <vbabka@...e.cz>,
	<dri-devel@...ts.freedesktop.org>, Hugh Dickins <hughd@...gle.com>,
	John Einar Reitan <john.reitan@...s.arm.com>,
	Jonathan Corbet <corbet@....net>,
	Joonsoo Kim <iamjoonsoo.kim@....com>,
	Konstantin Khlebnikov <koct9i@...il.com>,
	Mel Gorman <mgorman@...e.de>,
	Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
	Rafael Aquini <aquini@...hat.com>,
	Rik van Riel <riel@...hat.com>,
	Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
	<virtualization@...ts.linux-foundation.org>,
	Gioh Kim <gi-oh.kim@...fitbricks.com>,
	Chan Gyun Jeong <chan.jeong@....com>,
	Sangseok Lee <sangseok.lee@....com>,
	Kyeongdon Kim <kyeongdon.kim@....com>,
	Chulmin Kim <cmlaika.kim@...sung.com>
Subject: Re: [PATCH v7 00/12] Support non-lru page migration

On Thu, Jun 16, 2016 at 01:23:43PM +0900, Sergey Senozhatsky wrote:
> On (06/16/16 11:58), Minchan Kim wrote:
> [..]
> > RAX: 2065676162726166 so rax is totally garbage, I think.
> > It means obj_to_head returns garbage because get_first_obj_offset is
> > utter crab because (page_idx / class->pages_per_zspage) was totally
> > wrong.
> > 
> > > 					^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > >     6408:       f0 0f ba 28 00          lock btsl $0x0,(%rax)
> >  
> > <snip>
> > 
> > > > Could you test with [zsmalloc: keep first object offset in struct page]
> > > > in mmotm?
> > > 
> > > sure, I can.  will it help, tho? we have a race condition here I think.
> > 
> > I guess root cause is caused by get_first_obj_offset.
> 
> sounds reasonable.
> 
> > Please test with it.
> 
> 
> this is what I'm getting with the [zsmalloc: keep first object offset in struct page]
> applied:  "count:0 mapcount:-127". which may be not related to zsmalloc at this point.
> 
> kernel: BUG: Bad page state in process khugepaged  pfn:101db8
> kernel: page:ffffea0004076e00 count:0 mapcount:-127 mapping:          (null) index:0x1

Hm, it seems to be a double free.
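
To spell out the reasoning: "mapcount:-127" in the dump is what a page
already sitting in the buddy allocator would show, so freeing it again
trips the check. Below is a minimal userspace sketch of that arithmetic,
not kernel code. It assumes the 4.7-era conventions as I remember them
(the dump prints _mapcount + 1, and a buddy page has _mapcount == -128);
struct fake_page and both helpers are made up for illustration, so please
double-check against your tree.

```c
/*
 * Userspace sketch only. Assumptions, to be verified against the tree:
 *   - the bad-page dump prints page_mapcount(), i.e. _mapcount + 1
 *   - a free page held by the buddy allocator has _mapcount == -128
 *     (PAGE_BUDDY_MAPCOUNT_VALUE in 4.7-era headers)
 * struct fake_page and the helpers below are hypothetical names.
 */
#include <stdbool.h>

#define BUDDY_MAPCOUNT_VALUE (-128)	/* assumed PAGE_BUDDY_MAPCOUNT_VALUE */

struct fake_page {
	int _mapcount;			/* raw counter, -1 when unmapped */
};

/* Value that would appear after "mapcount:" in the dump. */
static int printed_mapcount(const struct fake_page *page)
{
	return page->_mapcount + 1;
}

/* True if a printed mapcount corresponds to a page marked as buddy. */
static bool looks_like_buddy(int printed)
{
	return printed - 1 == BUDDY_MAPCOUNT_VALUE;
}
```

With these assumptions, a page whose _mapcount is -128 prints as -127,
which matches the report, while a normal unmapped page prints 0.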

It doesn't happen if you disable zram? IOW, it seems to be related to
zsmalloc migration?

How easily can you reproduce it? Could you bisect it?

> kernel: flags: 0x8000000000000000()
> kernel: page dumped because: nonzero mapcount
> kernel: Modules linked in: lzo zram zsmalloc mousedev coretemp hwmon crc32c_intel snd_hda_codec_realtek i2c_i801 snd_hda_codec_generic r8169 mii snd_hda_intel snd_hda_codec snd_hda_core acpi_cpufreq snd_pcm snd_timer snd soundcore lpc_ich processor mfd_core sch_fq_codel sd_mod hid_generic usb
> kernel: CPU: 3 PID: 38 Comm: khugepaged Not tainted 4.7.0-rc3-next-20160615-dbg-00005-gfd11984-dirty #491
> kernel:  0000000000000000 ffff8801124c73f8 ffffffff814d69b0 ffffea0004076e00
> kernel:  ffffffff81e658a0 ffff8801124c7420 ffffffff811e9b63 0000000000000000
> kernel:  ffffea0004076e00 ffffffff81e658a0 ffff8801124c7440 ffffffff811e9ca9
> kernel: Call Trace:
> kernel:  [<ffffffff814d69b0>] dump_stack+0x68/0x92
> kernel:  [<ffffffff811e9b63>] bad_page+0x158/0x1a2
> kernel:  [<ffffffff811e9ca9>] free_pages_check_bad+0xfc/0x101
> kernel:  [<ffffffff811ee516>] free_hot_cold_page+0x135/0x5de
> kernel:  [<ffffffff811eea26>] __free_pages+0x67/0x72
> kernel:  [<ffffffff81227c63>] release_freepages+0x13a/0x191
> kernel:  [<ffffffff8122b3c2>] compact_zone+0x845/0x1155
> kernel:  [<ffffffff8122ab7d>] ? compaction_suitable+0x76/0x76
> kernel:  [<ffffffff8122bdb2>] compact_zone_order+0xe0/0x167
> kernel:  [<ffffffff8122bcd2>] ? compact_zone+0x1155/0x1155
> kernel:  [<ffffffff8122ce88>] try_to_compact_pages+0x2f1/0x648
> kernel:  [<ffffffff8122ce88>] ? try_to_compact_pages+0x2f1/0x648
> kernel:  [<ffffffff8122cb97>] ? compaction_zonelist_suitable+0x3a6/0x3a6
> kernel:  [<ffffffff811ef1ea>] ? get_page_from_freelist+0x2c0/0x133c
> kernel:  [<ffffffff811f0350>] __alloc_pages_direct_compact+0xea/0x30d
> kernel:  [<ffffffff811f0266>] ? get_page_from_freelist+0x133c/0x133c
> kernel:  [<ffffffff811ee3b2>] ? drain_all_pages+0x1d6/0x205
> kernel:  [<ffffffff811f21a8>] __alloc_pages_nodemask+0x143d/0x16b6
> kernel:  [<ffffffff8111f405>] ? debug_show_all_locks+0x226/0x226
> kernel:  [<ffffffff811f0d6b>] ? warn_alloc_failed+0x24c/0x24c
> kernel:  [<ffffffff81110ffc>] ? finish_wait+0x1a4/0x1b0
> kernel:  [<ffffffff81122faf>] ? lock_acquire+0xec/0x147
> kernel:  [<ffffffff81d32ed0>] ? _raw_spin_unlock_irqrestore+0x3b/0x5c
> kernel:  [<ffffffff81d32edc>] ? _raw_spin_unlock_irqrestore+0x47/0x5c
> kernel:  [<ffffffff81110ffc>] ? finish_wait+0x1a4/0x1b0
> kernel:  [<ffffffff8128f73a>] khugepaged+0x1d4/0x484f
> kernel:  [<ffffffff8128f566>] ? hugepage_vma_revalidate+0xef/0xef
> kernel:  [<ffffffff810d5bcc>] ? finish_task_switch+0x3de/0x484
> kernel:  [<ffffffff81d32f18>] ? _raw_spin_unlock_irq+0x27/0x45
> kernel:  [<ffffffff8111d13f>] ? trace_hardirqs_on_caller+0x3d2/0x492
> kernel:  [<ffffffff81111487>] ? prepare_to_wait_event+0x3f7/0x3f7
> kernel:  [<ffffffff81d28bf5>] ? __schedule+0xa4d/0xd16
> kernel:  [<ffffffff810cd0de>] kthread+0x252/0x261
> kernel:  [<ffffffff8128f566>] ? hugepage_vma_revalidate+0xef/0xef
> kernel:  [<ffffffff810cce8c>] ? kthread_create_on_node+0x377/0x377
> kernel:  [<ffffffff81d3387f>] ret_from_fork+0x1f/0x40
> kernel:  [<ffffffff810cce8c>] ? kthread_create_on_node+0x377/0x377
> -- Reboot --
> 
> 	-ss