Message-ID: <c390dc64-280e-6d9f-661a-9a5d77f16cf8@linux.dev>
Date: Thu, 10 Aug 2023 12:41:01 -0700
From: Martin KaFai Lau <martin.lau@...ux.dev>
To: Michal Hocko <mhocko@...e.com>,
Roman Gushchin <roman.gushchin@...ux.dev>
Cc: Chuyi Zhou <zhouchuyi@...edance.com>, hannes@...xchg.org,
ast@...nel.org, daniel@...earbox.net, andrii@...nel.org,
muchun.song@...ux.dev, bpf@...r.kernel.org,
linux-kernel@...r.kernel.org, wuyun.abel@...edance.com,
robin.lu@...edance.com
Subject: Re: [RFC PATCH 1/2] mm, oom: Introduce bpf_select_task

>>>> First, I'm a bit concerned about implicit restrictions we apply to bpf programs
>>>> which will be executed potentially thousands of times under a very heavy memory
>>>> pressure. We will need to make sure that they don't allocate (much) memory, don't
>>>> take any locks which might deadlock with other memory allocations etc.
>>>> It will potentially require hard restrictions on what these programs can and can't
>>>> do and this is something that the bpf community will have to maintain long-term.
>>>
>>> Right, BPF callbacks operating under OOM situations will be really
>>> constrained but this is more or less by definition. Isn't it?
>>
>> What do you mean?
>
> Callbacks cannot depend on any direct or indirect memory allocations.
> Dependencies on any sleeping locks (again directly or indirectly) are not
> allowed, just to name the most important ones.
>
>> In general, the bpf community is trying to make it as generic as possible and
>> keeps adding new features. Bpf programs are not as constrained as they were
>> when it all started.

bpf supports different running contexts. For example, only a non-sleepable bpf
prog is allowed to run in the NIC driver. A sleepable bpf prog is only allowed to
run at some bpf_lsm hooks that are known to be safe for calling blocking
bpf-helpers/kfuncs. The bpf side ensures that a non-sleepable bpf prog cannot
do things that may block.
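
To make the rule above concrete, here is a toy user-space model of it in plain C.
This is only a sketch and not kernel code: every name in it (toy_prog,
toy_helper, toy_prog_may_attach, the two hook kinds and the two sample helpers)
is hypothetical and made up for illustration. It assumes just the two
constraints described above: a sleepable prog may only attach where blocking is
known to be safe, and a non-sleepable prog may not call anything that might
sleep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only. All identifiers below are hypothetical, not kernel APIs. */

/* Two illustrative attach points: one that must never block, one that may. */
enum toy_hook {
	TOY_HOOK_NIC_DRIVER,    /* blocking is never safe here */
	TOY_HOOK_LSM_SLEEPABLE, /* an lsm hook assumed safe for blocking calls */
};

/* A helper/kfunc, tagged with whether calling it may block. */
struct toy_helper {
	const char *name;
	bool might_sleep;
};

/* A program: its declared context plus the helpers it calls. */
struct toy_prog {
	bool sleepable;
	const struct toy_helper *const *calls;
	size_t n_calls;
};

/* Sample helpers for experiments (hypothetical). */
static const struct toy_helper toy_nonblocking_helper = { "toy_nonblocking", false };
static const struct toy_helper toy_blocking_helper = { "toy_blocking", true };

static bool toy_hook_allows_sleepable(enum toy_hook hook)
{
	return hook == TOY_HOOK_LSM_SLEEPABLE;
}

/* Returns true if the model would accept attaching prog at hook. */
static bool toy_prog_may_attach(const struct toy_prog *prog, enum toy_hook hook)
{
	assert(prog != NULL);

	/* A sleepable prog is only accepted where blocking is known to be safe. */
	if (prog->sleepable && !toy_hook_allows_sleepable(hook))
		return false;

	/* A non-sleepable prog must not call anything that may block. */
	if (!prog->sleepable) {
		for (size_t i = 0; i < prog->n_calls; i++) {
			if (prog->calls[i]->might_sleep)
				return false;
		}
	}

	return true;
}
```

In this model, a non-sleepable prog that only calls toy_nonblocking_helper is
accepted at either hook, while the same prog calling toy_blocking_helper is
rejected everywhere, and a sleepable prog is rejected at TOY_HOOK_NIC_DRIVER.
An OOM-time hook would presumably sit on the non-sleepable side of this split.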

fwiw, Dave has recently proposed something for iterating over the task vma
(https://lore.kernel.org/bpf/20230810183513.684836-4-davemarchevsky@fb.com/).
Potentially, a similar iterator can be created for a bpf program to iterate
over cgroups and tasks.