linux-kernel - Re: [PATCH v13 00/13] nommu UML

Message-ID: <m2bjl7y6mv.wl-thehajime@gmail.com>
Date: Wed, 12 Nov 2025 17:52:56 +0900
From: Hajime Tazaki <thehajime@...il.com>
To: johannes@...solutions.net
Cc: hch@...radead.org,
	linux-um@...ts.infradead.org,
	ricarkol@...gle.com,
	Liam.Howlett@...cle.com,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v13 00/13] nommu UML



On Tue, 11 Nov 2025 17:01:25 +0900,
Johannes Berg wrote:
> 
> On Mon, 2025-11-10 at 21:18 +0900, Hajime Tazaki wrote:
> > 
> >   What is it for ?
> >   ================
> >   
> >   - Alleviate syscall hook overhead implemented with ptrace(2)
> >   - To exercises nommu code over UML (and over KUnit)
> >   - Less dependency to host facilities
> 
> FWIW, in some way, this order of priorities is exactly why this hasn't
> been going anywhere, and every time I looked at it I got somewhat
> annoyed by what seems to me like choices made to support especially the
> first bullet.

over the past versions, I've been emphasizing that the 2nd bullet (testing)
is the primary use case, as I saw several actual cases from mm folks,

https://lists.infradead.org/pipermail/maple-tree/2024-November/003775.html
https://lore.kernel.org/all/cb1cf0be-871d-4982-9a1b-5fdd54deec8d@lucifer.local/

and I think this is not limited to mm code.

the other 2 bullets are additional benefits, which we observed in a
comment and in our own experience.

https://lore.kernel.org/all/20241122121826.GA26024@lst.de/
[2] https://static.sched.com/hosted_files/ossna2020/ec/kollerr_linux_um_nommu.pdf

but those are not the primary goal, so I'm not pushing this aspect
with use cases.

> I suspect that the first and third bullet are not even really true any
> more, since you moved to seccomp (per our request), yet I think design
> choices influenced by them persist.

this observation is not true; the first bullet still holds even
when using seccomp.  please look at the benchmark result in the patch
[12/13], quoted below.

summary: most of the tests show that um-nommu+seccomp is x4 to x20 faster
than um-mmu+seccomp (and ptrace).

.. csv-table:: lmbench (usec)
  :header: ,native,um,um-mmu(s),um-nommu(s)

  select-10    ,0.5319,36.1214,24.2795,2.9174
  select-100   ,1.6019,34.6049,28.8865,3.8080
  select-1000  ,12.2588,43.6838,48.7438,12.7872
  syscall      ,0.1644,35.0321,53.2119,2.5981
  read         ,0.3055,31.5509,45.8538,2.7068
  write        ,0.2512,31.3609,29.2636,2.6948
  stat         ,1.8894,43.8477,49.6121,3.1908
  open/close   ,3.2973,77.5123,68.9431,6.2575
  fork+sh      ,1110.3000,7359.5000,4618.6667,439.4615
  fork+execve  ,510.8182,2834.0000,2461.1667,139.7848

.. csv-table:: do_getpid bench (nsec)
  :header: ,native,um,um-mmu(s),um-nommu(s)

  getpid , 161 , 34477 , 26242 , 2599
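
As a rough cross-check of the summary above, the ratio between the
um-mmu(s) and um-nommu(s) columns of the lmbench table can be recomputed
with a few lines of Python.  This is only an illustrative sketch: the
names LMBENCH_USEC and speedups() are made up for this example and are
not part of the patchset; the values are copied from the table.

```python
# Latencies in usec, copied from the lmbench table above.
# Each entry is (um-mmu(s), um-nommu(s)); the names are illustrative only.
LMBENCH_USEC = {
    "select-10": (24.2795, 2.9174),
    "select-100": (28.8865, 3.8080),
    "select-1000": (48.7438, 12.7872),
    "syscall": (53.2119, 2.5981),
    "read": (45.8538, 2.7068),
    "write": (29.2636, 2.6948),
    "stat": (49.6121, 3.1908),
    "open/close": (68.9431, 6.2575),
    "fork+sh": (4618.6667, 439.4615),
    "fork+execve": (2461.1667, 139.7848),
}


def speedups(table):
    """Return {test: um-mmu(s) latency divided by um-nommu(s) latency}."""
    return {name: mmu / nommu for name, (mmu, nommu) in table.items()}


if __name__ == "__main__":
    # Print the tests ordered from smallest to largest speedup.
    ratios = speedups(LMBENCH_USEC)
    for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
        print(f"{name:12s} {ratio:5.1f}x")
```

with these numbers the smallest ratio is select-1000 (roughly x3.8) and
the largest is syscall (roughly x20.5), which is where the "x4 to x20"
range comes from.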

the 1st bullet mentioning ptrace(2) is somewhat misleading now.  this
might be rephrased as "a separate process handling userspace" instead of
"ptrace".

# when I started this patchset, the seccomp patch wasn't upstream yet,
  so saying ptrace(2) wasn't that wrong at the time.

> People are definitely interested in the second bullet, mostly for kunit,
> and I'd be willing to support them in that to some extent.

so (again) the 2nd bullet is the primary use case at this stage.

> However, I'm not yet convinced that all of the complexities presented in
> this patchset (such as completely separate seccomp implementation) are
> actually necessary in support of _just_ the second bullet. These seem to
> me like design choices necessary to support the _first_ bullet [1].

a separate seccomp implementation is indeed needed due to the design
choice we made, which is to use a single process to host a (um)
userspace.  I think there is no reason to unify the seccomp part because
the signal handlers and the filter installation do different jobs.

I don't see why you see this as a _complexity_, as functionally the two
seccomp handlers don't interfere with each other.  we have prepared
separate sub-directories for nommu to avoid unnecessary if/else
clauses in .c/.h files.  we haven't seen any functional regressions
since the RFC version (which was based on the 6.12 kernel).

> [1] and then I suppose the third, which I'm reading as "doesn't need
> seccomp or ptrace", but I'm not really quite sure what you meant
> 
> I've thought about what would happen if we stuck to creating a (single)
> separate process on the host to execute userspace, and just used
> CLONE_VM for it. That way, it's still no-MMU with full memory access,
> but there's some implicit isolation between the kernel and userspace
> processes which will likely remove complexities around FP/SSE/AVX
> handling, may completely remove the need for a separate seccomp
> implementation, etc.

this would be doable I think, but we went a different way, as
using separate host processes (with ptrace/seccomp) is slow and adds
complexity through the synchronization between processes, which we
think is not easy to maintain in the future.

this was natural for us (not sure for maintainers): when we add a new
functionality, we consider several options to implement it, and take the
option which is faster, simpler, and less costly to maintain.

the avoidance of separate processes is probably the core of the design
choice we made for nommu UML.  I'm not strongly pushing the benefits
of the 1st/3rd bullets, but I thought describing the characteristics of
what _this_ patchset can do should be useful; thus they are in the
document.

additionally, if the design choice we made introduced any breakage in
existing code, or any maintenance burden, I would understand your concern
about the complexity, but I don't think this is the case.

> It would, on the other hand, make it completely non-viable to achieve
> the first and third bullets, so given your pursuit of those, one some
> level I understand the design right now. I'm yet to be convinced,
> however, that those are even worthy goals for (upstream) UML, what use
> case would that enable that we really need?

the use cases for those are inherited from the original implementation,
[2] above, which runs UML in containers with less host dependency
and with speedups.  but again, this is not the primary goal at this stage.

if you think that the document should not describe the potential
benefits/use cases which are not related to the primary goal of the
functionality, I'd agree to remove those descriptions.

> Especially considering that
> over a longer perspective, NOMMU architectures _are_ on their way out,
> and UML will certainly follow once that happens, it won't be the last
> remaining NOMMU architecture.

I'm aware of this nommu removal discussion, but I also saw opinions
expressed against this direction.  This patchset is still
useful even now.

> So the only value I see in this is for testing over the net couple of
> years, which really doesn't need any sort of significant optimisation or
> less reliance on host facilities.

I agree with the former, but not the latter.

- there is a value with a real use case,
- there are different ways to implement it but this went with the
  one with potential (additional) benefits,
- without breakages to the existing (MMU) uml code.

with that, we're proposing this patchset.

> Where do you see this differently?

thanks for the careful questions.
I hope my answer clarifies your concerns.

I also wish to understand the maintainers' concerns about the
single-process design of nommu for um userspace; the codebase is still
young, so it may have unexpected influence on others.  but this is exactly
the reason why I also added myself to MAINTAINERS, in order to take care
of this patchset even though it is small (1.3k loc).


-- Hajime