Fix race condition when createdump tries to read /proc/pid/mem before prctl(PR_SET_PTRACER, childpid) is set#120000
Conversation
…ad /proc/pid/mem before prctl is set
There was a problem hiding this comment.
Pull Request Overview
This PR fixes a race condition in PROCCreateCrashDump where the child process (createdump) could attempt to read /proc/pid/mem before the parent process grants PTRACE permissions via prctl(PR_SET_PTRACER, childpid). The fix ensures proper synchronization between parent and child processes to prevent intermittent permission failures.
|
Tagging subscribers to this area: @steveisok, @dotnet/dotnet-diag |
|
I will address copilot's suggestions after a team member review if the fix makes sense 😅 |
|
BTW, the same pattern is also present on nativeaot: runtime/src/coreclr/nativeaot/Runtime/unix/PalCreateDump.cpp Lines 230 to 272 in 2ccc38b |
noahfalk
left a comment
There was a problem hiding this comment.
A few minor comments inline and I agree with those AI review comments that we should error check the new read/write calls you added.
Thanks for finding this and making the PR!
Interested in making the fix for NativeAOT in the same PR? |
Co-authored-by: Noah Falk <noahfalk@users.noreply.github.com>
Co-authored-by: Noah Falk <noahfalk@users.noreply.github.com>
|
@hoyosjs - if this looks good to you as well go ahead and merge it |
|
/backport to release/10.0 |
|
Started backporting to release/10.0: https://github.com/dotnet/runtime/actions/runs/17990123963 |
ProcDump-for-Linux is a tool that supports .NET as one of its targets to take dumps. For .NET, it works pretty much as
dotnet-dumptool, triggeringcreatedumputility.In ProcDump's integration pipelines, we've been seeing some intermittent errors taking .NET dumps. Sometimes we see:
We even saw that for the same process, a first dump succeeded, a second one failed with the message above.
After studying the code from createdump's launcher in the CLR, we can see there is a fork: the child execs and becomes
createdump, while the parent process usesprctlsyscall to allowcreatedumpto read its/proc/pid/mem:runtime/src/coreclr/pal/src/thread/process.cpp
Lines 2448 to 2505 in 6ef2af1
We noticed there can be a potential race condition on the code above when the child/
createdumptries to read/proc/pid/membefore the parent grants permissions. I assume it is rare in multi-core systems, but it happens quite frequently in our pipelines that have a limited number of cores.To force the race condition to happen locally, I added a
sleepon the parent branch right before it callsprctl- it always fails with the error message above!This PR proposes a fix to allow the parent to signal the child using pipes when
prctl(PR_SET_PTRACER, childpid)is done so it can proceed.