[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAG_fn=XFkNVkT3EmB99SdEBAwkGq3EUdM9xR4rzH_HatrJw8rQ@mail.gmail.com>
Date: Wed, 15 Jan 2025 16:16:57 +0100
From: Alexander Potapenko <glider@...gle.com>
To: Joey Jiao <quic_jiangenj@...cinc.com>
Cc: Marco Elver <elver@...gle.com>, Dmitry Vyukov <dvyukov@...gle.com>,
Andrey Konovalov <andreyknvl@...il.com>, Jonathan Corbet <corbet@....net>,
Andrew Morton <akpm@...ux-foundation.org>, Dennis Zhou <dennis@...nel.org>, Tejun Heo <tj@...nel.org>,
Christoph Lameter <cl@...ux.com>, Catalin Marinas <catalin.marinas@....com>, Will Deacon <will@...nel.org>,
kasan-dev@...glegroups.com, linux-kernel@...r.kernel.org,
workflows@...r.kernel.org, linux-doc@...r.kernel.org, linux-mm@...ck.org,
linux-arm-kernel@...ts.infradead.org, kernel@...cinc.com
Subject: Re: [PATCH 0/7] kcov: Introduce New Unique PC|EDGE|CMP Modes
On Tue, Jan 14, 2025 at 2:00 PM Joey Jiao <quic_jiangenj@...cinc.com> wrote:
>
> On Tue, Jan 14, 2025 at 11:43:08AM +0100, Marco Elver wrote:
> > On Tue, 14 Jan 2025 at 06:35, Jiao, Joey <quic_jiangenj@...cinc.com> wrote:
> > >
> > > Hi,
> > >
> > > This patch series introduces new kcov unique modes:
> > > `KCOV_TRACE_UNIQ_[PC|EDGE|CMP]`, which are used to collect unique PC, EDGE,
> > > CMP information.
> > >
> > > Background
> > > ----------
> > >
> > > In the current kcov implementation, when `__sanitizer_cov_trace_pc` is hit,
> > > the instruction pointer (IP) is stored sequentially in an area. Userspace
> > > programs then read this area to record covered PCs and calculate covered
> > > edges. However, recent syzkaller runs show that many syscalls likely have
> > > `pos > t->kcov_size`, leading to kcov overflow. To address this issue, we
> > > introduce new kcov unique modes.
Hi Joey,
Sorry for not responding earlier, I thought I'd come with a working
proposal, but it is taking a while.
You are right that kcov is prone to overflows, and we might be missing
interesting coverage because of that.
Recently we've been discussing the applicability of
-fsanitize-coverage=trace-pc-guard to this problem, and it is almost
working already.
The idea is as follows:
- -fsanitize-coverage=trace-pc-guard instruments basic blocks with
calls to `__sanitizer_cov_trace_pc_guard(u32 *guard)`, each taking a
unique 32-bit global in the __sancov_guards section;
- these globals are zero-initialized, but upon the first call to
__sanitizer_cov_trace_pc_guard() from each callsite, the corresponding
global will receive a unique consequent number;
- now we have a mapping of PCs into indices, which can we use to
deduplicate the coverage:
-- storing PCs by their index taken from *guard directly in the
user-supplied buffer (which size will not exceed several megabytes in
practice);
-- using a per-task bitmap (at most hundreds of kilobytes) to mark
visited basic blocks, and appending newly encountered PCs to the
user-supplied buffer like it's done now.
I think this approach is more promising than using hashmaps in kcov:
- direct mapping should be way faster than a hashmap (and the overhead
of index allocation is amortized, because they are persistent between
program runs);
- there cannot be collisions;
- no additional complexity from pool allocations, RCU synchronization.
The above approach will naturally break edge coverage, as there will
be no notion of a program trace anymore.
But it is still a question whether edges are helping the fuzzer, and
correctly deduplicating them may not be worth the effort.
If you don't object, I would like to finish prototyping coverage
guards for kcov before proceeding with this review.
Alex
> > > 2. [P 2-3] Introduce `KCOV_TRACE_UNIQ_EDGE` Mode:
> > > - Save `prev_pc` to calculate edges with the current IP.
> > > - Add unique edges to the hashmap.
> > > - Use a lower 12-bit mask to make hash independent of module offsets.
Note that on ARM64 this will be effectively using bits 11:2, so if I
am understanding correctly more than a million coverage callbacks will
be mapped into one of 1024 buckets.
Powered by blists - more mailing lists