
eBPF dereference of modified ctx ptr


During this week's Hackathon at Grafana Labs, I tried to get Delve's eBPF tracing to work using a remote-controlled agent. To do this, I took its trace.bpf.c eBPF program, which extracts the traced function's parameters and writes them to a ring buffer.

Turns out, just compiling this didn't work. The verifier always refused to load the program with the following error:

dereference of modified ctx ptr r0

I found the article Ebpf: Dereference of Modified Ctx Ptr Disallowed, which told me why the verifier refuses to load the program (memory accesses at arbitrary offsets from the ctx pointer are not allowed) but doesn't offer any fixes. The most likely cause of this kind of issue is the compiler optimizing the output in a way that no longer passes verification.

By removing different parts of the module, I narrowed the issue down to this switch statement:

__always_inline void get_value_from_register(struct pt_regs *ctx, void *dest,
                                             int reg_num) {
  // reg_num follows the DWARF register numbering for x86-64
  // (0 = RAX, 1 = RDX, 2 = RCX, ...), hence the ordering below
  switch (reg_num) {
  case 0: // RAX
    __builtin_memcpy(dest, &ctx->ax, sizeof(ctx->ax));
    break;
  case 1: // RDX
    __builtin_memcpy(dest, &ctx->dx, sizeof(ctx->dx));
    break;
  // ...
  }
}
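
For context: reg_num is only known at run time (the agent decides, per traced function, which register holds each argument), so the compiler can't fold the switch into a single constant-offset load. A hypothetical caller, my sketch rather than Delve's actual code, might look like this:

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

// hypothetical: written by the user-space agent before the uprobe is attached
volatile int reg_num_cfg;

SEC("uprobe/traced_function")
int trace_entry(struct pt_regs *ctx) {
  __u64 val = 0;
  get_value_from_register(ctx, &val, reg_num_cfg);
  // val would then be written to the ring buffer
  return 0;
}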

Whenever this function was included in the program, it refused to load. The ctx pointer isn't actually modified here; it's just accessed at different offsets. So how come this still caused the verifier to fail?

In a message on the iovisor-dev mailing list I found my answer! Yonghong Song explains:

Now I remembered that we had this issue before in bcc. it is a compiler optimization likes this:

if (...)
    *(ctx + 60)
else
    *(ctx + 56)

The compiler translates it to

if (...)
   ptr = ctx + 60
else
   ptr = ctx + 56
*(ptr + 0)

After this transformation, the load no longer happens at a constant offset directly from ctx; it goes through a pointer that was modified first, which is exactly what the verifier rejects. In my case, the optimized code would look like this (in pseudo-C):

reg = (char *)ctx;
switch (reg_num) {
case 0: // RAX
  reg += ax_offset;
  break;
case 1: // RDX
  reg += dx_offset;
  break;
}
// every register field has the same size, so the compiler hoists a
// single copy behind the switch instead of keeping one per branch
__builtin_memcpy(dest, reg, sizeof(unsigned long));
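
At the BPF instruction level, this ends up with a shape like the following (a hand-written sketch to illustrate the problem, not actual compiler output, and the offsets are made up):

r0 = r1               ; r1 holds ctx
r0 += 80              ; one branch: offset of ctx->ax
; ... or, on the other path:
r0 += 96              ; offset of ctx->dx
; both paths then share a single load:
r0 = *(u64 *)(r0 + 0) ; "dereference of modified ctx ptr r0"

The verifier requires every load from ctx to carry its full offset in the load instruction itself; once part of the offset has been moved into earlier pointer arithmetic, the dereference is refused, which is exactly the error above.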

Luckily, Yonghong also offers a solution: sprinkling __asm__ __volatile__("" : : : "memory"); after the offending memory operations. The memory clobber keeps LLVM from moving the loads across it, so each branch retains its own access at a constant offset from ctx instead of sharing one load behind the switch. This avoids the problematic optimization and results in a valid program!

__always_inline void get_value_from_register(struct pt_regs *ctx, void *dest,
                                             int reg_num) {
  switch (reg_num) {
  case 0: // RAX
    __builtin_memcpy(dest, &ctx->ax, sizeof(ctx->ax));
    __asm__ __volatile__("" : : : "memory");
    break;
  case 1: // RDX
    __builtin_memcpy(dest, &ctx->dx, sizeof(ctx->dx));
    __asm__ __volatile__("" : : : "memory");
    break;
  // ...
  }
}
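
Side note: libbpf's bpf_helpers.h also ships a barrier_var() macro built on the same empty-asm trick. In recent versions it is defined roughly as:

/* pins a single variable in a register instead of clobbering all of memory */
#define barrier_var(var) asm volatile("" : "+r"(var))

Whether pinning one variable is enough to stop this particular merge depends on which value you pin, so the full memory clobber above is the conservative choice.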