Date:   Tue, 26 Oct 2021 10:47:20 -0400
From:   Steven Rostedt <rostedt@...dmis.org>
To:     Huan Xie <xiehuan09@...il.com>
Cc:     Masami Hiramatsu <mhiramat@...nel.org>, mingo@...hat.com,
        chenhuacai@...nel.org, linux-kernel@...r.kernel.org,
        Tom Zanussi <zanussi@...nel.org>
Subject: Re: [RFC PATCH v2] trace: Add trace any kernel object

On Tue, 26 Oct 2021 16:50:46 +0800
Huan Xie <xiehuan09@...il.com> wrote:

> > > +static void submit_trace_object(unsigned long ip, unsigned long parent_ip,
> > > +                              unsigned long object)
> > > +{
> > > +
> > > +     struct trace_buffer *buffer;
> > > +     struct ring_buffer_event *event;
> > > +     struct trace_object_entry *entry;
> > > +     int pc;
> > > +
> > > +     pc = preempt_count();
> > > +     event = trace_event_buffer_lock_reserve(&buffer, &event_trace_file,
> > > +                     TRACE_OBJECT, sizeof(*entry), pc);
> > > +     if (!event)
> > > +             return;
> > > +     entry   = ring_buffer_event_data(event);
> > > +     entry->ip                       = ip;
> > > +     entry->parent_ip                = parent_ip;
> > > +     entry->object                   = object;  
> >
> > So here we are just recording the value we saw at the kprobe (not very
> > interesting).
> >
> > I think we want the content of the object:
> >
> >         long val;
> >
> >         ret = copy_from_kernel_nofault(&val, object, sizeof(val));
> >         if (ret)
> >                 val = 0;  
> 
> This is the one part I don't understand: I don't know why or where
> copy_from_kernel_nofault() should be used.


If we have the address of the symbol, we want to read what's at that
address, right?

> 
> We can only get the struct pt_regs in __kprobe_trace_func(), but it is
> needed in trace_object_trigger(), so the pt_regs has to be saved in a
> struct:
> 
> struct object_trigger_param {
>         struct pt_regs *regs;
>         int param;
> };
> 
> /* Kprobe handler */
> static nokprobe_inline void __kprobe_trace_func(struct trace_kprobe *tk,
>                     struct pt_regs *regs,
>                     struct trace_event_file *trace_file)
> 
> 
> static void trace_object_trigger(struct event_trigger_data *data,
>                    struct trace_buffer *buffer, void *rec,
>                    struct ring_buffer_event *event)


OK, so let me ask this question. What is it that you want to see?

If we have (using your example):

int bio_add_page(struct bio *bio, struct page *page,
				unsigned int len, unsigned int offset)

And we want to trace "bio", right?

Doing:

  echo 'p bio_add_page arg1=$arg1' > kprobe_events

Will assign "arg1" the pointer that was passed in:

  0xffff888102a4b900

That is just the value of a local variable holding the address of some
struct bio.

Your current example just keeps showing that same pointer value, not the
content of the bio. The value will not change until bio_add_page() is
called again, at which point you will be tracing the address of whichever
structure was passed in that time. There's nothing more to learn from this
than from tracing the function itself and printing the address passed in.

Now if I look at struct bio, I see:

struct bio {
	[..]
	atomic_t		__bi_cnt;	/* pin count */
	[..]
};

And let's say I want to monitor that __bi_cnt while functions are being
traced. What would be *really cool*, is to mark that value!

// find the offset of __bi_cnt in struct bio:
$ gdb vmlinux
(gdb) p &((struct bio *)0)->__bi_cnt
$1 = (atomic_t *) 0x64
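
The gdb expression above is the classic offsetof() idiom. As a rough
userspace sketch (struct demo_bio and both helpers are made-up names for
illustration, and the real struct bio layout is different), the same
"base pointer plus member offset" arithmetic looks like this:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bio; the real layout differs. */
struct demo_bio {
	void *bi_next;
	unsigned int bi_opf;
	int bi_cnt;	/* plays the role of __bi_cnt */
};

/* Offset of the counter within the structure, as gdb would report it. */
static size_t demo_cnt_offset(void)
{
	return offsetof(struct demo_bio, bi_cnt);
}

/* Address of the counter for a given object: base pointer plus offset. */
static int *demo_cnt_addr(struct demo_bio *bio)
{
	assert(bio != NULL);
	return (int *)((char *)bio + demo_cnt_offset());
}
```

For any object, demo_cnt_addr(&obj) should be the same address as
&obj.bi_cnt, which is what the trigger would need to compute from arg1.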

 # echo 'objfilter:0x64(arg1) if comm == "cat"' > ./trigger

Which would then read that arg1=0xffff888102a4b900 and offset it by 0x64,
and give me the value at that location:

  *(0xffff888102a4b900 + 0x64)

at every traced function. Then I could watch the __bi_cnt change over time.
But to dereference that memory safely, we need to use
copy_from_kernel_nofault(), because the address "0xffff888102a4b900 + 0x64"
could point to nothing, and a plain dereference would fault and crash the
kernel.

	obj = arg1 + 0x64;
	if (copy_from_kernel_nofault(&val, (void *)obj, sizeof(val)))
		return;		/* the read faulted */

Now val has the content of __bi_cnt and we can print that!
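
To make the control flow concrete, here is a userspace model of that read.
It is only a sketch under assumptions: demo_mem, demo_copy_nofault() and
demo_read_object() are invented names, and the range check merely imitates
the kernel helper's behaviour of returning an error instead of faulting.
It reads an int-sized value because atomic_t wraps an int.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical "mapped" region standing in for valid kernel memory. */
static unsigned char demo_mem[256];

/*
 * Userspace stand-in for copy_from_kernel_nofault(): copy only if the
 * whole source range lies inside demo_mem, otherwise return -EFAULT
 * without touching the address.
 */
static long demo_copy_nofault(void *dst, uintptr_t src, size_t size)
{
	uintptr_t start = (uintptr_t)demo_mem;

	if (src < start || size > sizeof(demo_mem) ||
	    src - start > sizeof(demo_mem) - size)
		return -EFAULT;
	memcpy(dst, (const void *)src, size);
	return 0;
}

/* Read the int at base + offset; 0 on success, -EFAULT if unreadable. */
static long demo_read_object(uintptr_t base, size_t offset, int *val)
{
	uintptr_t obj = base + offset;

	assert(val != NULL);
	return demo_copy_nofault(val, obj, sizeof(*val));
}
```

In the kernel, the trigger would call copy_from_kernel_nofault() directly
and skip recording the event when it returns non-zero.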

-- Steve



> 
> > Then we can see what changed during this time.
