Date:   Wed, 29 Nov 2023 10:03:57 -0500
From:   Donald Dutile <ddutile@...hat.com>
To:     Baoquan He <bhe@...hat.com>, Michal Hocko <mhocko@...e.com>
Cc:     Jiri Bohac <jbohac@...e.cz>, Pingfan Liu <piliu@...hat.com>,
        Tao Liu <ltao@...hat.com>, Vivek Goyal <vgoyal@...hat.com>,
        Dave Young <dyoung@...hat.com>, kexec@...ts.infradead.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/4] kdump: crashkernel reservation from CMA

Baoquan,
hi!

On 11/29/23 3:10 AM, Baoquan He wrote:
> On 11/28/23 at 10:08am, Michal Hocko wrote:
>> On Tue 28-11-23 10:11:31, Baoquan He wrote:
>>> On 11/28/23 at 09:12am, Tao Liu wrote:
>> [...]
>>> Thanks for the effort to bring this up, Jiri.
>>>
>>> I am wondering how you will use this crashkernel=,cma parameter. I mean
>>> the scenario for crashkernel=,cma. I'm asking because I don't know how
>>> SUSE deploys kdump in SUSE distros. In SUSE distros, will the kdump
>>> kernel's drivers be filtered out? If so, it could possibly hit the
>>> in-flight DMA issue, e.g. a NIC has a DMA buffer in the CMA area but is
>>> not reset during kdump bootup because the NIC driver is not loaded to
>>> initialize it. I'm not 100% sure about this; is it possible in theory?
>>
>> NIC drivers do not allocate from movable zones (that includes the CMA
>> zone). In fact, the kernel doesn't use GFP_MOVABLE for non-user requests.
>> RDMA drivers might and do transfer from user-backed memory, but for that
>> purpose they should be pinning memory (have a look at
>> __gup_longterm_locked and its callers), and that will migrate it away
>> from any such zone.
> 
> Adding Don to this thread.
> 
> I am not familiar with RDMA. If we reserve a range of 1G memory as CMA in
> the 1st kernel, RDMA or any other userspace tools could use it. When
> corruption happens for any reason, that 1G CMA memory will be reused as
> available MOVABLE memory by the kdump kernel. If there is no risk at all,
> I mean 100% safe from RDMA, that would be great.
> 
My RDMA days are long behind me... I'm more in the mm space these days, so
this still interests me.
I thought that, in general, userspace memory is not saved or used in kdumps,
so if RDMA is using CMA space for userspace-based I/O (gup), then I would
expect it can be reused by the kexec'd kernel.
So, I'm not sure what 'safe from RDMA' means, but I would expect RDMA queues
to be in-kernel data structures, not userspace structures, and they would be
the more/most important ones to keep for kdump saving.  The actual userspace
data ... same story as any other userspace data.
dma-bufs allocated from CMA, which are (typically) shared with GPUs
(& RDMA in GPU-direct configs), would again be shared userspace memory, not
control/cmd/rsp queues, so I'm not seeing an issue there either.

I would poke the NVIDIA+Mellanox folks for further review in this space,
if my reply leaves you (or others) 'wanting'.

- Don
>>   
>> [...]
>>> crashkernel=,cma requires that no userspace data be dumped. From our
>>> support engineers' feedback, customers never say that they don't need
>>> to dump userspace data. Assume a server with a huge database deployed;
>>> the database has crashed often recently, and the database provider
>>> claims that it's not the database's fault, so the OS needs to prove its
>>> innocence. What will you do?
>>
>> Don't use CMA backed crash memory then? This is an optional feature.
>>   
>>> So this looks like a nice-to-have to me. At least in Fedora/RHEL's
>>> usage, we may only backport this patch and add one sentence to our
>>> user guide saying "there's a crashkernel=,cma added, can be used with
>>> crashkernel= to save memory. Please feel free to try if you like".
>>> Unless SUSE or other distros decide to use it as the default config or
>>> something like that. Please correct me if I missed anything or got
>>> anything wrong.
>>
>> Jiri will know better than me, but for us a proper crash memory
>> configuration has become a real nut. You do not want to reserve too much
>> because it is effectively cutting off the usable memory, and we regularly
>> hit "not enough memory" if we try to be savvy. The tighter you try to
>> configure it, the easier it is for that to fail. Even worse, any in-kernel
>> memory consumer can increase its memory demand and push the overall
>> consumption off the cliff. So this is not an easy-to-maintain solution.
>> CMA-backed crash memory can be much more generous while still being usable.
>> -- 
>> Michal Hocko
>> SUSE Labs
>>
> 
