cleanup(bpf): reduce preemption fragility in overlayfs dedup #558

Merged
Stringy merged 2 commits into main from giles/no-lru-needed
Apr 28, 2026
Conversation

Contributor

@Stringy Stringy commented Apr 27, 2026

Description

The whole chain runs in a single syscall context:
overlayfs -> open (first call to the LSM hook) -> underlying fs -> open (second call to the LSM hook).

However, the task might be preempted between the two LSM calls and rescheduled on a different CPU. The second call would then miss the per-CPU dedup cache, and a duplicate overlayfs event would be emitted.

To reduce this risk, this PR changes the dedup map to a global (not PERCPU) LRU map with 256 entries.
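
For illustration, a global LRU map of this shape could be declared roughly as follows in libbpf-style BPF C. This is a minimal sketch, not the code from this PR: the map name `overlayfs_dedup` and the key and value types are hypothetical, since the description does not show them. It is a declaration fragment that needs the libbpf headers and a BPF toolchain to build.

```c
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/*
 * Hypothetical dedup map. The name and the key/value types are
 * assumptions made for this sketch; only the map kind (global LRU,
 * not per-CPU) and the entry count come from the PR description.
 */
struct {
    /* Values are shared by all CPUs, unlike BPF_MAP_TYPE_LRU_PERCPU_HASH,
     * where every CPU holds its own copy of each value. */
    __uint(type, BPF_MAP_TYPE_LRU_HASH);
    /* With an LRU map, inserting into a full map evicts an old entry
     * instead of failing. */
    __uint(max_entries, 256);
    __type(key, __u64);   /* assumed: some identifier of the in-flight open */
    __type(value, __u8);  /* assumed: presence marker */
} overlayfs_dedup SEC(".maps");
```

With a declaration like this, the entry written during the first hook call is visible to the second call regardless of which CPU it runs on.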

Checklist

  • Investigated and inspected CI test results
  • Updated documentation accordingly

Automated testing

  • Added unit tests
  • Added integration tests
  • Added regression tests

If any of these don't apply, please comment below.

Testing Performed

CI should be enough.

On a per-cpu hash with a single entry, LRU is redundant
@Stringy Stringy requested a review from a team as a code owner April 27, 2026 13:43
Contributor

@Molter73 Molter73 left a comment

What happens if the kernel decides to run a separate program that also attempts to access an overlayfs file between the two calls to file_open? Will the call just fail to add the entry to the map and produce the duplicate event we stopped with the fix? Can this even happen, or do the two calls to file_open happen during the same syscall in a way that prevents the program from being preempted?

I'm just trying to understand if the right move is having a single entry per CPU or if we should have an LRU with multiple entries.

@Stringy
Contributor Author

Stringy commented Apr 27, 2026

I had originally thought that this was safe because the whole chain runs in a single syscall context:
overlayfs -> open (first call to the LSM hook) -> underlying fs -> open (second call to the LSM hook).

Having said that, you might be right that the task could be preempted between those two calls; it would then be rescheduled on a different CPU and miss the cache, and we'd emit the duplicate event. LRU cache + some arbitrarily sensible number of entries? 256?

@Molter73
Contributor

> LRU cache + some arbitrarily sensible number of entries? 256?

Yeah, but it needs to be a regular (non-per-CPU) map; otherwise it will still fail when preempted. Just so we are 100% clear on this.
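
The two failure modes raised in this thread (migration to another CPU, and another task touching overlayfs between the two hook calls) can be illustrated with a small userspace model. This is a toy sketch in plain C, not BPF, and every name in it is hypothetical. It treats the per-CPU state as one private slot per CPU, which is how the thread reasons about it, and does not reproduce the exact semantics of the kernel's map types. In particular, the model overwrites the single slot when a second key arrives, whereas a full non-LRU hash map would reject the insert; the shared cache uses round-robin eviction as a stand-in for real LRU.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_CPUS 4
#define GLOBAL_SLOTS 256

/* Hypothetical dedup keys for two tasks opening overlayfs files. */
#define TASK_A 1u
#define TASK_B 2u

/* Toy per-CPU cache: one private slot per CPU. */
struct percpu_cache {
    bool used[NUM_CPUS];
    uint64_t key[NUM_CPUS];
};

/* Toy shared cache: one table visible from every CPU. */
struct global_cache {
    size_t next;   /* next slot to overwrite (round-robin, not true LRU) */
    size_t count;  /* number of valid slots */
    uint64_t key[GLOBAL_SLOTS];
};

static void percpu_store(struct percpu_cache *c, int cpu, uint64_t k)
{
    c->used[cpu] = true;
    c->key[cpu] = k; /* overwrites whatever the slot held */
}

static bool percpu_seen(const struct percpu_cache *c, int cpu, uint64_t k)
{
    return c->used[cpu] && c->key[cpu] == k;
}

static void global_store(struct global_cache *c, uint64_t k)
{
    c->key[c->next] = k;
    c->next = (c->next + 1) % GLOBAL_SLOTS;
    if (c->count < GLOBAL_SLOTS)
        c->count++;
}

static bool global_seen(const struct global_cache *c, uint64_t k)
{
    for (size_t i = 0; i < c->count; i++)
        if (c->key[i] == k)
            return true;
    return false;
}

/* Returns true if the second hook call recognises the first one. */
bool percpu_dedups(int first_cpu, int second_cpu, bool other_task_ran)
{
    struct percpu_cache c = {0};

    percpu_store(&c, first_cpu, TASK_A);      /* first hook call */
    if (other_task_ran)
        percpu_store(&c, first_cpu, TASK_B);  /* another task, same CPU */
    return percpu_seen(&c, second_cpu, TASK_A); /* second hook call */
}

/* Same scenario against the shared cache; the CPU numbers do not matter. */
bool global_dedups(int first_cpu, int second_cpu, bool other_task_ran)
{
    struct global_cache c = {0};

    (void)first_cpu;
    (void)second_cpu;
    global_store(&c, TASK_A);
    if (other_task_ran)
        global_store(&c, TASK_B);
    return global_seen(&c, TASK_A);
}
```

In this model, `percpu_dedups(0, 1, false)` (migration from CPU 0 to CPU 1) and `percpu_dedups(0, 0, true)` (another task on the same CPU) both return false, meaning a duplicate event would be emitted, while the corresponding `global_dedups` calls return true.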

@Stringy Stringy changed the title from "cleanup(bpf): no need for LRU hash" to "cleanup(bpf): reduce preemption fragility in overlayfs dedup" Apr 28, 2026
@Stringy Stringy requested a review from Molter73 April 28, 2026 09:05
Contributor

@Molter73 Molter73 left a comment

LGTM!

@Stringy Stringy merged commit dc24df9 into main Apr 28, 2026
25 checks passed
@Stringy Stringy deleted the giles/no-lru-needed branch April 28, 2026 09:24
2 participants