Message-ID: <98477af6-b774-48bd-f663-28a7f9f554e3@mellanox.com>
Date: Fri, 30 Mar 2018 22:39:00 +0300
From: Alex Vesker <valex@...lanox.com>
To: David Ahern <dsahern@...il.com>, Andrew Lunn <andrew@...n.ch>
CC: "David S. Miller" <davem@...emloft.net>, <netdev@...r.kernel.org>,
"Tariq Toukan" <tariqt@...lanox.com>,
Jiri Pirko <jiri@...lanox.com>
Subject: Re: [PATCH net-next 0/9] devlink: Add support for region access
On 3/30/2018 7:57 PM, David Ahern wrote:
> On 3/30/18 8:34 AM, Andrew Lunn wrote:
>>>> And it seems to want contiguous pages. How well does that work after
>>>> the system has been running for a while and memory is fragmented?
>>> The allocation can be changed, there is no real need for contiguous pages.
>>> It is important to note that the number of snapshots is limited by the
>>> driver;
>>> this can be based on the dump size or expected frequency of collection.
>>> I also prefer not to pre-allocate this memory.
>> The driver code also asks for a 1MB contiguous chunk of memory! You
>> really should think about this API, how can you avoid double memory
>> allocations. And can kvmalloc be used. But then you get into the
>> problem for DMA'ing the memory from the device...
>>
>> This API also does not scale. 1MB is actually quite small. I'm sure
>> there is firmware running on CPUs with a lot more than 1MB of RAM.
>> How well does this API work with 64MB? Say I wanted to snapshot my
>> GPU? Or the MC/BMC?
>>
> That and the drivers control the number of snapshots. The user should be
> able to control the number of snapshots, and an option to remove all
> snapshots to free up that memory.
There is an option to free up this memory, using a delete command.
The reason I added the option to control the number of snapshots from
the driver side only is that the driver knows the size of the snapshots
and when and why they will be taken.
For example, in our mlx4 driver the snapshots are taken on rare failures,
each snapshot is quite large, and from past analyses the first dump is
usually the important one. This means that 8 is more than enough in my case.
If a user wants more than that, they can always monitor the notifications,
read the snapshot, and delete it once it is backed up; there is no reason
to keep all of this data in the kernel.
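
To make the policy described above concrete, here is a minimal userspace sketch in Python. It is a toy model, not the devlink or mlx4 code: the class `SnapshotRegion`, the helper `backup_and_delete`, and the choice to reject new snapshots once the cap is reached (rather than evict old ones) are all assumptions made for illustration. Only the driver-chosen cap, the delete operation, and the read-then-delete backup flow come from the message.

```python
class SnapshotRegion:
    """Toy model of a region whose snapshot count is capped by the driver."""

    def __init__(self, name, max_snapshots):
        self.name = name
        self.max_snapshots = max_snapshots  # chosen by the driver, not the user
        self.snapshots = {}                 # snapshot id -> captured bytes
        self.next_id = 0

    def take_snapshot(self, data):
        """Store a snapshot; return its id, or None if the cap is reached.

        Assumption: when full, the new dump is dropped so that the earliest
        dumps (usually the important ones) are preserved.
        """
        if len(self.snapshots) >= self.max_snapshots:
            return None
        snap_id = self.next_id
        self.next_id += 1
        self.snapshots[snap_id] = bytes(data)
        return snap_id

    def read_snapshot(self, snap_id):
        """Return the bytes captured for one snapshot."""
        return self.snapshots[snap_id]

    def delete_snapshot(self, snap_id):
        """Release the memory held by one snapshot."""
        del self.snapshots[snap_id]


def backup_and_delete(region, snap_id, backup):
    """What a monitoring user could do on a notification: copy out, then delete."""
    backup[snap_id] = region.read_snapshot(snap_id)
    region.delete_snapshot(snap_id)
```

With a cap of 8, a ninth `take_snapshot` call returns `None` until the user has backed up and deleted an older snapshot, at which point a slot is free again.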