Message-ID: <op.u646z3e7asvm2a@kedge>
Date: Tue, 26 Jan 2010 14:41:54 +0100
From: "Roman Jarosz" <kedgedev@...il.com>
To: "KOSAKI Motohiro" <kosaki.motohiro@...fujitsu.com>
Cc: lkml <linux-kernel@...r.kernel.org>,
"A Rojas" <nqn1976list@...il.com>,
"Hugh Dickins" <hugh.dickins@...cali.co.uk>,
"A. Boulan" <arnaud.boulan@...ertysurf.fr>, michael@...nelt.co.at,
jcnengel@...glemail.com, rientjes@...gle.com, earny@...4u.de,
"Jesse Barnes" <jbarnes@...tuousgeek.org>,
"Eric Anholt" <eric@...olt.net>,
"Chris Wilson" <chris@...is-wilson.co.uk>
Subject: Re: OOM-Killer kills too much with 2.6.32.2
On Tue, 26 Jan 2010 12:07:43 +0100, KOSAKI Motohiro
<kosaki.motohiro@...fujitsu.com> wrote:
> (Restore all cc and add Hugh and Chris)
>
>
>> > Hi all,
>> >
>> > Strangely, all machines that reproduce this are x86_64 with Intel i915,
>> > but I don't have any solid evidence.
>> > Can anyone please apply the following debug patch and reproduce this
>> > issue?
>> >
>> > This patch writes some debug messages to /var/log/messages.
>> >
>>
>> Here it is
>>
>> Jan 26 09:34:32 kedge kernel: ->fault OOM shmem_fault 1 1
>> Jan 26 09:34:32 kedge kernel: X invoked oom-killer: gfp_mask=0x0, order=0, oom_adj=0
>> Jan 26 09:34:32 kedge kernel: Pid: 1927, comm: X Not tainted 2.6.33-rc5 #3
>
>
> Thank you very much!
>
> Current status and analysis:
> - OOM is invoked by VM_FAULT_OOM in the page fault path.
> - GEM uses a lot of shmem internally, and i915 uses GEM.
> - VM_FAULT_OOM is generated by shmem.
> - shmem allocates memory using mapping_gfp_mask(inode->i_mapping).
>   If the allocation fails, it can return -ENOMEM, and -ENOMEM generates
>   VM_FAULT_OOM.
> - But GEM has the following code:
>
>
> drm_gem.c drm_gem_object_alloc()
> --------------------
> obj->filp = shmem_file_setup("drm mm object", size,
> VM_NORESERVE);
> (snip)
> /* Basically we want to disable the OOM killer and handle ENOMEM
> * ourselves by sacrificing pages from cached buffers.
> * XXX shmem_file_[gs]et_gfp_mask()
> */
> mapping_set_gfp_mask(obj->filp->f_path.dentry->d_inode->i_mapping,
> GFP_HIGHUSER |
> __GFP_COLD |
> __GFP_FS |
> __GFP_RECLAIMABLE |
> __GFP_NORETRY |
> __GFP_NOWARN |
> __GFP_NOMEMALLOC);
>
>
> This comment is a lie. __GFP_NORETRY causes ENOMEM in shmem, not in GEM
> itself, so GEM can neither handle it nor recover from it. I suspect the
> following commit is wrong.
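
The chain described in the analysis above (a NORETRY allocation fails in shmem, shmem returns -ENOMEM, the fault path turns that into VM_FAULT_OOM) can be sketched as a small user-space model. This is only an illustration, not kernel code: every `model_` / `MODEL_` name and value is hypothetical, and the model assumes, as a simplification, that an allocation without NORETRY eventually succeeds.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for kernel flags and fault codes (values are illustrative). */
#define MODEL_GFP_NORETRY  0x1u
#define MODEL_VM_FAULT_OOM 0x1

/* Model allocator: under memory pressure a NORETRY request fails at once.
 * Assumption: a request without NORETRY keeps trying and eventually succeeds. */
static int model_alloc_page(unsigned int gfp_mask, int memory_tight)
{
    assert(memory_tight == 0 || memory_tight == 1);
    if (memory_tight && (gfp_mask & MODEL_GFP_NORETRY))
        return -ENOMEM;
    return 0;
}

/* Model of the shmem fault handler: it allocates with the mapping's gfp mask
 * and converts -ENOMEM into a fault code. GEM is not involved at this point. */
static int model_shmem_fault(unsigned int mapping_gfp_mask, int memory_tight)
{
    int err = model_alloc_page(mapping_gfp_mask, memory_tight);

    return err == -ENOMEM ? MODEL_VM_FAULT_OOM : 0;
}

/* Model of the page fault path: returns 1 if the OOM killer would be invoked. */
static int model_page_fault(unsigned int mapping_gfp_mask, int memory_tight)
{
    return (model_shmem_fault(mapping_gfp_mask, memory_tight) & MODEL_VM_FAULT_OOM) != 0;
}
```

In this model, `model_page_fault(MODEL_GFP_NORETRY, 1)` reports an OOM, while the same fault with a mask of 0 does not, which is the behaviour the partial revert is meant to test.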
>
> ----------------------------------------------------
> commit 07f73f6912667621276b002e33844ef283d98203
> Author: Chris Wilson <chris@...is-wilson.co.uk>
> Date: Mon Sep 14 16:50:30 2009 +0100
>
> drm/i915: Improve behaviour under memory pressure
>
> Due to the necessity of having to take the struct_mutex, the i915
> shrinker can not free the inactive lists if we fail to allocate memory
> whilst processing a batch buffer, triggering an OOM and an ENOMEM that
> is reported back to userspace. In order to fare better under such
> circumstances we need to manually retry a failed allocation after
> evicting inactive buffers.
>
> To do so involves 3 steps:
> 1. Marking the backing shm pages as NORETRY.
> 2. Updating the get_pages() callers to evict something on failure and
>    then retry.
> 3. Revamping the evict something logic to be smarter about the required
>    buffer size and prefer to use volatile or clean inactive pages.
>
> Signed-off-by: Chris Wilson <chris@...is-wilson.co.uk>
> Signed-off-by: Jesse Barnes <jbarnes@...tuousgeek.org>
> ----------------------------------------------------
>
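
The three steps in the commit message amount to "try once, evict on failure, retry". A minimal user-space sketch of that pattern follows. It is not the i915 implementation: `struct model_pool` and all `model_` functions are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of memory state. */
struct model_pool {
    size_t free_pages;      /* pages available right now */
    size_t inactive_pages;  /* pages held by idle buffers that may be evicted */
};

/* One-shot attempt, like a NORETRY allocation: fail immediately if short. */
static int model_try_alloc(struct model_pool *pool, size_t npages)
{
    if (pool->free_pages < npages)
        return -ENOMEM;
    pool->free_pages -= npages;
    return 0;
}

/* Evict up to `wanted` inactive pages; returns how many were released. */
static size_t model_evict(struct model_pool *pool, size_t wanted)
{
    size_t n = pool->inactive_pages < wanted ? pool->inactive_pages : wanted;

    pool->inactive_pages -= n;
    pool->free_pages += n;
    return n;
}

/* Caller-side retry: on -ENOMEM, evict the shortfall and try once more. */
static int model_get_pages(struct model_pool *pool, size_t npages)
{
    size_t shortfall;
    int ret;

    assert(pool != NULL);
    ret = model_try_alloc(pool, npages);
    if (ret != -ENOMEM)
        return ret;

    /* The first attempt failed, so free_pages < npages and this cannot underflow. */
    shortfall = npages - pool->free_pages;
    if (model_evict(pool, shortfall) < shortfall)
        return -ENOMEM;
    return model_try_alloc(pool, npages);
}
```

The retry only helps callers that go through a wrapper like `model_get_pages()`. According to the analysis in this thread, a page fault on the shmem mapping hits the one-shot failure directly and has no such wrapper around it.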
>
> Unfortunately, it can't be reverted easily.
> So, can you please try the following partial revert patch?
>
>
>
> From a27115f93d4f3ff6538860e69a7b444761cef91b Mon Sep 17 00:00:00 2001
> From: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
> Date: Tue, 26 Jan 2010 19:51:57 +0900
> Subject: [PATCH] Revert NORETRY
>
> ---
> drivers/gpu/drm/drm_gem.c | 13 -------------
> 1 files changed, 0 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index e9dbb48..8bf3770 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -142,19 +142,6 @@ drm_gem_object_alloc(struct drm_device *dev, size_t size)
> if (IS_ERR(obj->filp))
> goto free;
> - /* Basically we want to disable the OOM killer and handle ENOMEM
> - * ourselves by sacrificing pages from cached buffers.
> - * XXX shmem_file_[gs]et_gfp_mask()
> - */
> - mapping_set_gfp_mask(obj->filp->f_path.dentry->d_inode->i_mapping,
> - GFP_HIGHUSER |
> - __GFP_COLD |
> - __GFP_FS |
> - __GFP_RECLAIMABLE |
> - __GFP_NORETRY |
> - __GFP_NOWARN |
> - __GFP_NOMEMALLOC);
> -
> kref_init(&obj->refcount);
> kref_init(&obj->handlecount);
> obj->size = size;

I've applied this patch and I'm testing it right now.
Btw, what will this patch do from a user's (my) point of view?

Regards,
Roman