lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAF+s44TQ2g6VTL4JSubvch5VkW7SSsePp-aBz+kigg563NijJg@mail.gmail.com>
Date:   Fri, 1 Dec 2023 08:54:20 +0800
From:   Pingfan Liu <piliu@...hat.com>
To:     Michal Hocko <mhocko@...e.com>
Cc:     Baoquan He <bhe@...hat.com>, Donald Dutile <ddutile@...hat.com>,
        Jiri Bohac <jbohac@...e.cz>, Tao Liu <ltao@...hat.com>,
        Vivek Goyal <vgoyal@...hat.com>,
        Dave Young <dyoung@...hat.com>, kexec@...ts.infradead.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/4] kdump: crashkernel reservation from CMA

On Thu, Nov 30, 2023 at 9:43 PM Michal Hocko <mhocko@...e.com> wrote:
>
> On Thu 30-11-23 21:33:04, Pingfan Liu wrote:
> > On Thu, Nov 30, 2023 at 9:29 PM Michal Hocko <mhocko@...e.com> wrote:
> > >
> > > On Thu 30-11-23 20:04:59, Baoquan He wrote:
> > > > On 11/30/23 at 11:16am, Michal Hocko wrote:
> > > > > On Thu 30-11-23 11:00:48, Baoquan He wrote:
> > > > > [...]
> > > > > > Now, we are worried if there's risk if the CMA area is retaken into kdump
> > > > > > kernel as system RAM. E.g is it possible that 1st kernel's ongoing RDMA
> > > > > > or DMA will interfere with kdump kernel's normal memory accessing?
> > > > > > Because kdump kernel usually only reset and initialize the needed
> > > > > > device, e.g dump target. Those unneeded devices will be unshutdown and
> > > > > > let go.
> > > > >
> > > > > I do not really want to discount your concerns but I am bit confused why
> > > > > this matters so much. First of all, if there is a buggy RDMA driver
> > > > > which doesn't use the proper pinning API (which would migrate away from
> > > > > the CMA) then what is the worst case? We will get crash kernel corrupted
> > > > > potentially and fail to take a proper kernel crash, right? Is this
> > > > > worrisome? Yes. Is it a real roadblock? I do not think so. The problem
> > > > > seems theoretical to me and it is not CMA usage at fault here IMHO. It
> > > > > is the said theoretical driver that needs fixing anyway.
> > > > >
> > > > > Now, it is really fair to mention that CMA backed crash kernel memory
> > > > > has some limitations
> > > > >     - CMA reservation can only be used by the userspace in the
> > > > >       primary kernel. If the size is overshot this might have
> > > > >       negative impact on kernel allocations
> > > > >     - userspace memory dumping in the crash kernel is fundamentally
> > > > >       incomplete.
> > > >
> > > > I am not sure if we are talking about the same thing. My concern is:
> > > > ====================================================================
> > > > 1) system corrutption happened, crash dumping is prepared, cpu and
> > > > interrupt controllers are shutdown;
> > > > 2) all pci devices are kept alive;
> > > > 3) kdump kernel boot up, initialization is only done on those devices
> > > > which drivers are added into kdump kernel's initrd;
> > > > 4) those on-flight DMA engine could be still working if their kernel
> > > > module is not loaded;
> > > >
> > > > In this case, if the DMA's destination is located in crashkernel=,cma
> > > > region, the DMA writting could continue even when kdump kernel has put
> > > > important kernel data into the area. Is this possible or absolutely not
> > > > possible with DMA, RDMA, or any other stuff which could keep accessing
> > > > that area?
> > >
> > > I do nuderstand your concern. But as already stated if anybody uses
> > > movable memory (CMA including) as a target of {R}DMA then that memory
> > > should be properly pinned. That would mean that the memory will be
> > > migrated to somewhere outside of movable (CMA) memory before the
> > > transfer is configured. So modulo bugs this shouldn't really happen.
> > > Are there {R}DMA drivers that do not pin memory correctly? Possibly. Is
> > > that a road bloack to not using CMA to back crash kernel memory, I do
> > > not think so. Those drivers should be fixed instead.
> > >
> > I think that is our concern. Is there any method to guarantee that
                           ^^^ Sorry, to clarify, I am only speaking for myself.

> > will not happen instead of 'should be' ?
> > Any static analysis during compiling time or dynamic checking method?
>
> I am not aware of any method to detect a driver is going to configure a
> RDMA.
>

If there is a pattern, scripts/coccinelle may give some help. But I am
not sure about that.

> > If this can be resolved, I think this method is promising.
>
> Are you indicating this is a mandatory prerequisite?

IMHO, that should be mandatory. Otherwise for any unexpected kdump
kernel collapses,  it can not shake off its suspicion.

Thanks,

Pingfan

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ