Message-ID: <71c4ce84-8be7-49e2-90bd-348762b320b4@nvidia.com>
Date: Thu, 4 Feb 2021 18:52:01 -0800
From: John Hubbard <jhubbard@...dia.com>
To: Minchan Kim <minchan@...nel.org>
CC: Andrew Morton <akpm@...ux-foundation.org>,
<gregkh@...uxfoundation.org>, <surenb@...gle.com>,
<joaodias@...gle.com>, LKML <linux-kernel@...r.kernel.org>,
linux-mm <linux-mm@...ck.org>
Subject: Re: [PATCH] mm: cma: support sysfs
On 2/4/21 5:44 PM, Minchan Kim wrote:
> On Thu, Feb 04, 2021 at 04:24:20PM -0800, John Hubbard wrote:
>> On 2/4/21 4:12 PM, Minchan Kim wrote:
>> ...
>>>>> Then, how to know how often CMA API failed?
>>>>
>>>> Why would you even need to know that, *in addition* to knowing specific
>>>> page allocation numbers that failed? Again, there is no real-world motivation
>>>> cited yet, just "this is good data". Need more stories and support here.
>>>
>>> Let me give an example.
>>>
>>> Let's assume we use memory buffer allocation via CMA for enabling
>>> bluetooth on the device.
>>> If the user clicks the bluetooth button on the phone but the memory
>>> allocation from CMA fails, the user will still see the bluetooth button gray.
>>> The user would think the touch was not strong enough and try clicking
>>> again, and fortunately the CMA allocation is successful this time, so
>>> they will see the bluetooth button enabled and can listen to music.
>>>
>>> Here, the product team needs to monitor how often CMA alloc failed, so
>>> if the failure ratio steadily rises above the bar,
>>> it means engineers need to investigate.
>>>
>>> Make sense?
>>>
>>
>> Yes, except that it raises more questions:
>>
>> 1) Isn't this just standard allocation failure? Don't you already have a way
>> to track that?
>>
>> Presumably, having the source code, you can easily deduce that a bluetooth
>> allocation failure goes directly to a CMA allocation failure, right?
Still wondering about this...
>>
>> Anyway, even though the above is still a little murky, I expect you're right
>> that it's good to have *some* indication, somewhere about CMA behavior...
>>
>> Thinking about this some more, I wonder if this is really /proc/vmstat sort
>> of data that we're talking about. It seems to fit right in there, yes?
>
> Thing is, there are multiple CMA instances (cma-A, cma-B, cma-C), and each
> CMA heap has its own specific scenario. /proc/vmstat could be bloated a lot
> as the number of CMA instances increases.
>
Yes, that would not fit in /proc/vmstat... assuming that you really require
knowing--at this point--which CMA heap is involved. And that's worth poking
at. If you get an overall indication in vmstat that CMA is having trouble,
then maybe that's all you need to start digging further.

It's actually easier to monitor one or two simpler items than it is to monitor
a larger number of complicated items. And I get the impression that this is
sort of a top-level, production software indicator.
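
To make the "one or two simpler items" idea concrete, here is a rough sketch of
what a userspace monitor might look like. It assumes two global counters in
/proc/vmstat-style "name value" format; the names cma_alloc_success and
cma_alloc_fail are assumptions for illustration only, not something this thread
establishes, and the sample text is made up.

```python
def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip anything that is not exactly "<name> <non-negative integer>".
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def cma_failure_ratio(counters, ok="cma_alloc_success", fail="cma_alloc_fail"):
    """Return failed / total CMA allocation attempts, or None if no data.

    The default counter names are hypothetical placeholders.
    """
    success = counters.get(ok, 0)
    failed = counters.get(fail, 0)
    total = success + failed
    if total == 0:
        return None
    return failed / total


# Made-up sample; a real monitor would read the text from /proc/vmstat.
sample = "nr_free_pages 12345\ncma_alloc_success 95\ncma_alloc_fail 5\n"
ratio = cma_failure_ratio(parse_vmstat(sample))
```

A product team could compare such a ratio against its own threshold and only
then dig into which CMA heap is responsible.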
thanks,
--
John Hubbard
NVIDIA