Message-ID: <CAM_iQpVRruWNnMtP2BKfQJrHnA_B+ea84GbO_C2rF6cuJcj5_Q@mail.gmail.com>
Date: Wed, 1 Oct 2025 21:17:02 -0700
From: Cong Wang <xiyou.wangcong@...il.com>
To: Stefan Hajnoczi <stefanha@...hat.com>
Cc: David Hildenbrand <david@...hat.com>, linux-kernel@...r.kernel.org,
pasha.tatashin@...een.com, Cong Wang <cwang@...tikernel.io>,
Andrew Morton <akpm@...ux-foundation.org>, Baoquan He <bhe@...hat.com>,
Alexander Graf <graf@...zon.com>, Mike Rapoport <rppt@...nel.org>, Changyuan Lyu <changyuanl@...gle.com>,
kexec@...ts.infradead.org, linux-mm@...ck.org, multikernel@...ts.linux.dev,
jasowang@...hat.com
Subject: Re: [RFC Patch 0/7] kernel: Introduce multikernel architecture support
On Mon, Sep 29, 2025 at 8:12 AM Stefan Hajnoczi <stefanha@...hat.com> wrote:
>
> On Sat, Sep 27, 2025 at 12:42:23PM -0700, Cong Wang wrote:
> > On Wed, Sep 24, 2025 at 12:03 PM Stefan Hajnoczi <stefanha@...hat.com> wrote:
> > >
> > > Thanks, that gives a nice overview!
> > >
> > > I/O Resource Allocation part will be interesting. Restructuring existing
> > > device drivers to allow spawned kernels to use specific hardware queues
> > > could be a lot of work and very device-specific. I guess a small set of
> > > devices can be supported initially and then it can grow over time.
> >
> > My idea is to leverage existing technologies like XDP, which
> > offers huge benefits here:
> >
> > 1) It is based on shared memory (although it is virtual)
> >
> > 2) Its APIs are user-space APIs, which makes them an even stronger
> > fit for kernel-to-kernel sharing; this possibly avoids re-inventing
> > another protocol.
> >
> > 3) It provides eBPF.
> >
> > 4) The spawned kernel does not require any hardware knowledge,
> > just pure XDP-ringbuffer-based software logic.
> >
> > But it also has limitations:
> >
> > 1) xdp_md is too specific to networking; extending it to storage
> > could be very challenging. But we could introduce an SDP for
> > storage that simply mimics XDP.
> >
> > 2) Regardless, we need a doorbell. An IPI is handy, but I hope
> > we can find an even lighter mechanism, or, more ideally, redirect
> > the hardware queue IRQ to each target CPU.
>
> I see. I was thinking that spawned kernels would talk directly to the
> hardware. Your idea of using a software interface is less invasive but
> has an overhead similar to paravirtualized devices.
When we have sufficient hardware resources, or prefer to use
SR-IOV, the multikernel could indeed access the hardware directly.
Queues are an alternative choice for elasticity.
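
To make the queue idea a bit more concrete, below is a minimal
userspace model of an XDP-style descriptor ring in shared memory,
with a doorbell hook standing in for the IPI (or redirected IRQ)
mentioned above. All names are hypothetical; this is only a sketch
of the direction, not an existing kernel interface:

/* Minimal model of the idea above: an XDP-style single-producer/
 * single-consumer descriptor ring in shared memory, with a
 * "doorbell" hook standing in for an IPI or a redirected IRQ.
 * Hypothetical names only; nothing here is an existing kernel API. */
#include <stdatomic.h>
#include <stdint.h>

#define MK_RING_SIZE 256            /* must be a power of two */

struct mk_desc {                    /* one descriptor, like xdp_desc */
	uint64_t addr;              /* offset into the shared region */
	uint32_t len;
	uint32_t flags;
};

struct mk_ring {
	_Atomic uint32_t producer;  /* written by the producing kernel */
	_Atomic uint32_t consumer;  /* written by the consuming kernel */
	struct mk_desc desc[MK_RING_SIZE];
};

/* Doorbell: in a real setup this would be an IPI to the target CPU,
 * or ideally the hardware queue IRQ redirected there. */
typedef void (*mk_doorbell_fn)(void *cookie);

static int mk_ring_produce(struct mk_ring *r, const struct mk_desc *d,
			   mk_doorbell_fn kick, void *cookie)
{
	uint32_t prod = atomic_load_explicit(&r->producer,
					     memory_order_relaxed);
	uint32_t cons = atomic_load_explicit(&r->consumer,
					     memory_order_acquire);

	if (prod - cons == MK_RING_SIZE)
		return -1;                          /* ring full */

	r->desc[prod & (MK_RING_SIZE - 1)] = *d;
	/* Publish the descriptor before bumping the producer index. */
	atomic_store_explicit(&r->producer, prod + 1,
			      memory_order_release);
	kick(cookie);                               /* ring the doorbell */
	return 0;
}

static int mk_ring_consume(struct mk_ring *r, struct mk_desc *d)
{
	uint32_t cons = atomic_load_explicit(&r->consumer,
					     memory_order_relaxed);
	uint32_t prod = atomic_load_explicit(&r->producer,
					     memory_order_acquire);

	if (cons == prod)
		return -1;                          /* ring empty */

	*d = r->desc[cons & (MK_RING_SIZE - 1)];
	atomic_store_explicit(&r->consumer, cons + 1,
			      memory_order_release);
	return 0;
}

The consuming kernel would wake on the doorbell and drain the ring
with mk_ring_consume(); everything device-specific stays on the side
that owns the hardware queue.
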
>
> A software approach that supports a wider range of devices is
> virtio_vdpa (drivers/vdpa/). The current virtio_vdpa implementation
> assumes that the device is located in the same kernel. A
> kernel-to-kernel bridge would be needed so that the spawned kernel
> forwards the vDPA operations to the other kernel. The other kernel
> provides the virtio-net, virtio-blk, etc device functionality by passing
> requests to a netdev, blkdev, etc.
I think that is the major blocker. From my naive understanding, vDPA
looks more complex than queue-based solutions (including the Soft
Functions provided by mlx), but I will take a deeper look at vDPA.
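
For my own notes, here is roughly how I picture the kernel-to-kernel
bridge you describe: the spawned kernel turns each vDPA operation into
a small message on a shared ring (like the sketch earlier in this
mail) and kicks the owning kernel, which replays it against its real
netdev/blkdev backend. Everything below is hypothetical; I have not
mapped it onto the actual vDPA callbacks yet:

/* Hypothetical message format for forwarding vDPA operations from a
 * spawned kernel to the kernel that owns the device. Nothing here is
 * an existing kernel API; it only illustrates the bridge idea. */
#include <stdint.h>

enum mk_vdpa_op {
	MK_VDPA_GET_CONFIG,     /* read from the device config space */
	MK_VDPA_SET_CONFIG,     /* write to the device config space */
	MK_VDPA_SET_VQ_ADDR,    /* tell the owner where a virtqueue lives */
	MK_VDPA_KICK_VQ,        /* guest-side "doorbell" for a virtqueue */
	MK_VDPA_SET_STATUS,     /* device status byte (driver ok, etc.) */
};

struct mk_vdpa_msg {
	uint32_t op;            /* one of enum mk_vdpa_op */
	uint32_t vq;            /* virtqueue index, when relevant */
	uint64_t offset;        /* config-space offset or queue address */
	uint64_t len;           /* payload length in data[] */
	uint8_t  data[64];      /* small inline payload */
};

/* The spawned kernel's side of the bridge queues such a message on
 * the shared ring and rings the doorbell; the owning kernel dequeues
 * it, performs the operation against its virtio-net/virtio-blk
 * backend, and posts a reply the same way. */
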
>
> There are in-kernel simulator devices for virtio-net and virtio-blk in
> drivers/vdpa/vdpa_sim/ which can be used as a starting point. These
> devices are just for testing and would need to be fleshed out to become
> useful for real workloads.
>
> I have CCed Jason Wang, who maintains vDPA, in case you want to discuss
> it more.
Appreciate it.
Regards,
Cong Wang