Date:	Fri, 20 May 2011 13:20:15 +0900
From:	Minchan Kim <minchan.kim@...il.com>
To:	Andrew Lutomirski <luto@....edu>
Cc:	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	kamezawa.hiroyu@...fujitsu.com, fengguang.wu@...el.com,
	andi@...stfloor.org, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, mgorman@...e.de, hannes@...xchg.org,
	riel@...hat.com
Subject: Re: Kernel falls apart under light memory pressure (i.e. linking vmlinux)

On Fri, May 20, 2011 at 12:38 PM, Andrew Lutomirski <luto@....edu> wrote:
> On Thu, May 19, 2011 at 11:12 PM, KOSAKI Motohiro
> <kosaki.motohiro@...fujitsu.com> wrote:
>>> Right after that happened, I hit ctrl-c to kill test_mempressure.sh.
>>> The system was OK until I typed sync, and then everything hung.
>>>
>>> I'm really confused.  shrink_inactive_list in
>>> RECLAIM_MODE_LUMPYRECLAIM will call one of the isolate_pages functions
>>> with ISOLATE_BOTH.  The resulting list goes into shrink_page_list,
>>> which does VM_BUG_ON(PageActive(page)).
>>>
>>> How is that supposed to work?
>>
>> Usually clear_active_flags() clears PG_active before
>> shrink_page_list() is called.
>>
>> shrink_inactive_list()
>>    isolate_pages_global()
>>    update_isolated_counts()
>>        clear_active_flags()
>>    shrink_page_list()
>>
>>
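
The ordering in the call chain above can be illustrated with a toy model. This is only a sketch: struct toy_page and both toy_* functions are hypothetical stand-ins, not kernel code, and they model nothing except the single active bit discussed here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only the active bit is modelled. */
struct toy_page {
    bool active;
};

/*
 * Models the role of clear_active_flags() in the chain above: clear the
 * active bit on every isolated page and return how many pages had it set.
 */
static size_t toy_clear_active_flags(struct toy_page *pages, size_t n)
{
    size_t nr_active = 0;

    for (size_t i = 0; i < n; i++) {
        if (pages[i].active) {
            pages[i].active = false;
            nr_active++;
        }
    }
    return nr_active;
}

/*
 * Models the check in shrink_page_list(): count the pages that would trip
 * VM_BUG_ON(PageActive(page)) if the list were shrunk right now.
 */
static size_t toy_count_bug_pages(const struct toy_page *pages, size_t n)
{
    size_t nr_bug = 0;

    for (size_t i = 0; i < n; i++) {
        if (pages[i].active)
            nr_bug++;
    }
    return nr_bug;
}
```

In this model, a list isolated with both active and inactive pages reports offending pages until toy_clear_active_flags() has run, and none afterwards. That is why the ISOLATE_BOTH case is normally safe.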
>
> That makes sense.  And I have CONFIG_COMPACTION=y, so the lumpy mode
> doesn't get set anyway.

Can you still reproduce the problem with CONFIG_COMPACTION disabled?

>
> But the pages I'm seeing have flags=100000000008005D.  If I'm reading
> it right, that means locked,referenced,uptodate,dirty,active.  How
> does a page like that end up in shrink_page_list?  I don't see how a
> page that's !PageLRU can get marked Active.  Nonetheless, I'm hitting
> that VM_BUG_ON.
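
For reference, the reading of the low flag bits can be checked with a small user-space sketch. The bit numbers below assume the usual order at the start of enum pageflags in include/linux/page-flags.h (locked, error, referenced, uptodate, dirty, lru, active). Higher bits depend on the kernel config, so they are left undecoded. decode_low_flags() is a hypothetical helper, not a kernel function.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct flag_name {
    unsigned bit;
    const char *name;
};

/* Assumed bit layout for the first seven page flags (see lead-in). */
static const struct flag_name low_flags[] = {
    { 0, "locked" },   { 1, "error" }, { 2, "referenced" },
    { 3, "uptodate" }, { 4, "dirty" }, { 5, "lru" },
    { 6, "active" },
};

/*
 * Write the comma-separated names of the set low bits into buf and return
 * buf. The output is cut short if buf is too small.
 */
static char *decode_low_flags(uint64_t flags, char *buf, size_t len)
{
    size_t used = 0;

    if (len == 0)
        return buf;
    buf[0] = '\0';

    for (size_t i = 0; i < sizeof(low_flags) / sizeof(low_flags[0]); i++) {
        int n;

        if (!(flags & (UINT64_C(1) << low_flags[i].bit)))
            continue;
        n = snprintf(buf + used, len - used, "%s%s",
                     used ? "," : "", low_flags[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break; /* no room left for this name */
        used += (size_t)n;
    }
    return buf;
}
```

Under that assumed layout, passing 0x100000000008005D yields locked, referenced, uptodate, dirty and active, with the lru bit (bit 5) clear, which matches the reading quoted above.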

Thanks for confirming that my latest patch is not the cause.

>
> Is there a race somewhere?

First of all, let's finish with your first problem, the hang. :)
Then let's start another thread to fix this one.

I think this is a severe problem because 2.6.39 includes my deactivate_pages change
(http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=315601809d124d046abd6c3ffa346d0dbd7aa29d)

It touches page state more than before. (2.6.38.6 doesn't include it, so
my deactivate_pages change is not the cause here.)
And the inorder-putback series that I plan to push for 2.6.40 touches
page state even more.

So I want to resolve your problem as soon as possible.
We haven't seen any other report like this. Could you run git-bisect?
FYI, the big recent changes in mm are compaction and transparent huge pages.
Kame, could you point out anything related to memcg if something comes to mind?

>
> --Andy
>



-- 
Kind regards,
Minchan Kim