linux-kernel - Re: [RFC 0/3] Improve memory statistics for virtio balloon

Message-ID: <ee1ac0fb-daf7-4aea-b07e-f8879b6b860b@redhat.com>
Date: Mon, 15 Apr 2024 17:01:49 +0200
From: David Hildenbrand <david@...hat.com>
To: zhenwei pi <pizhenwei@...edance.com>, linux-kernel@...r.kernel.org,
 linux-mm@...ck.org, virtualization@...ts.linux.dev
Cc: mst@...hat.com, jasowang@...hat.com, xuanzhuo@...ux.alibaba.com,
 akpm@...ux-foundation.org
Subject: Re: [RFC 0/3] Improve memory statistics for virtio balloon

On 15.04.24 10:41, zhenwei pi wrote:
> Hi,
> 
> When the guest runs under critical memory pressure, the guest becomes
> too slow; even sshd enters D state (uninterruptible) on memory
> allocation. We can't log in to this VM to do any troubleshooting.
> 
> The guest kernel log via the virtual TTY (on the host side) provides
> only a few necessary log lines after OOM. More detailed memory
> statistics are required so that we can identify explicit memory events
> and estimate the pressure.
> 
> I'm going to introduce several VM counters for virtio balloon:
> - oom-kill
> - alloc-stall
> - scan-async
> - scan-direct
> - reclaim-async
> - reclaim-direct

IIUC, we're only exposing events that are already getting provided via 
all_vm_events(), correct?

In that case, I don't really see a major issue. Some considerations:

(1) These new events are fairly Linux specific.

PSWPIN and friends are fairly generic, but HTLB is already fairly 
Linux specific. OOM-kills don't really exist on Windows, for 
example. We'll have to be careful to properly describe what the 
semantics are.

(2) How should we handle it if Linux ever stops supporting a certain event 
(e.g., after a major reclaim rework)? I assume we would simply return 
nothing, like we currently do for VIRTIO_BALLOON_S_HTLB_PGALLOC without 
CONFIG_HUGETLB_PAGE.
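
To illustrate what "return nothing" could look like, here is a minimal
user-space sketch, not the actual driver code. All names (demo_stat,
demo_fill_stats, the DEMO_S_* tags and the DEMO_HAVE_HUGETLB switch) are
hypothetical stand-ins; the tag numbers are not the real virtio IDs. The
idea is that an unsupported counter is simply never added to the array,
so the host sees fewer entries rather than a bogus value.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tags for illustration; NOT the real VIRTIO_BALLOON_S_* values. */
enum demo_stat_tag {
	DEMO_S_SWAP_IN      = 0,
	DEMO_S_HTLB_PGALLOC = 1,
	DEMO_S_OOM_KILL     = 2,
};

/* One (tag, value) pair as it would be reported to the host. */
struct demo_stat {
	uint16_t tag;
	uint64_t val;
};

/* Assumed stand-in for a build-time option such as CONFIG_HUGETLB_PAGE. */
#ifndef DEMO_HAVE_HUGETLB
#define DEMO_HAVE_HUGETLB 0
#endif

/* Store one entry at idx and return the next free index. */
static size_t demo_add_stat(struct demo_stat *stats, size_t idx,
			    uint16_t tag, uint64_t val)
{
	stats[idx].tag = tag;
	stats[idx].val = val;
	return idx + 1;
}

/*
 * Fill the array with the counters this build supports and return how
 * many entries were written. The caller must provide room for at least
 * three entries. The htlb_pgalloc argument is ignored when hugetlb
 * support is compiled out.
 */
size_t demo_fill_stats(struct demo_stat *stats, uint64_t swap_in,
		       uint64_t htlb_pgalloc, uint64_t oom_kill)
{
	size_t idx = 0;

	idx = demo_add_stat(stats, idx, DEMO_S_SWAP_IN, swap_in);
#if DEMO_HAVE_HUGETLB
	idx = demo_add_stat(stats, idx, DEMO_S_HTLB_PGALLOC, htlb_pgalloc);
#else
	(void)htlb_pgalloc; /* unsupported here: report nothing for this tag */
#endif
	idx = demo_add_stat(stats, idx, DEMO_S_OOM_KILL, oom_kill);

	return idx;
}
```

With the default DEMO_HAVE_HUGETLB=0, the function writes two entries and
the hugetlb tag is absent. A host-side consumer would therefore have to
treat a missing tag as "not available" instead of assuming zero.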

-- 
Cheers,

David / dhildenb