Add backpressure when rapidly creating new stateful Streamable HTTP sessions without closing them#677
```csharp
// Acquire semaphore when session becomes inactive (reference count goes to 0) to slow
// down session creation until idle sessions are disposed by the background service.
await session._idleSessionSemaphore.WaitAsync();
```
Could this wait forever since there is no cancellationToken?
In theory, no. Since the session is already marked as idle by virtue of the reference count going to zero, the corresponding call to Release() that unblocks this wait should not be inhibited.
For this to wait at all, we must be at least at 110% of MaxIdleSessionCount. Either the idle tracking background service will eventually dispose enough sessions to get the count below 110% and call Release(), or another request will reactivate a session and call Release(). And any sessions waiting for this call to WaitAsync() to complete are already eligible for pruning if they haven't been reactivated.
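The scheme described above can be sketched roughly as follows. This is an illustrative sketch only: the type and member names (`SessionHandle`, `AddReference`, `ReleaseReferenceAsync`, `_holdsIdlePermit`) are hypothetical, not the SDK's actual implementation, and the real code handles more edge cases.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class SessionHandle
{
    // Shared across all sessions; its initial count bounds how many sessions
    // can sit idle (un-pruned) before new idle transitions start blocking.
    private readonly SemaphoreSlim _idleSessionSemaphore;
    private int _referenceCount;
    private bool _holdsIdlePermit;

    public SessionHandle(SemaphoreSlim idleSessionSemaphore)
        => _idleSessionSemaphore = idleSessionSemaphore;

    // Called when a request starts using the session.
    public void AddReference()
    {
        lock (this)
        {
            _referenceCount++;
            if (_holdsIdlePermit)
            {
                // Session reactivated before being pruned: return the permit.
                _holdsIdlePermit = false;
                _idleSessionSemaphore.Release();
            }
        }
    }

    // Called when a request finishes with the session.
    public async Task ReleaseReferenceAsync()
    {
        bool becameIdle;
        lock (this)
        {
            becameIdle = --_referenceCount == 0;
        }

        if (becameIdle)
        {
            // Acquire semaphore when the session becomes inactive to slow
            // down session creation until idle sessions are disposed by the
            // background service (which calls Release() per pruned session).
            await _idleSessionSemaphore.WaitAsync();
            lock (this) { _holdsIdlePermit = true; }
        }
    }
}
```

Because every idle session holds a permit, the semaphore's initial count is what caps the number of un-pruned idle sessions, and each Release() (from pruning or reactivation) unblocks exactly one pending WaitAsync().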
I'm closing this in favor of #701, which improves on this PR by disposing and unrooting extra idle sessions inline before starting new ones. That removes the need to wait for the IdleTrackingBackgroundService to clean up the next time the periodic timer fires, although the timer still exists to handle normal IdleTimeouts when MaxIdleSessionCount is not reached.
Also reduce the default MaxIdleSessionCount to 10,000
I tested the AspNetCoreMcpSample using https://github.com/wg/wrk/ and the following command:
And the following Lua script:
The ~30,000 RPS isn't too bad considering I get only ~5,000 RPS more using the same parameters against the following minimal endpoint:
However, @DavidParks8 made another benchmark that created a new session per request, which quickly overloaded the server. I emulated this benchmark by simply commenting out the `wrk.headers["Mcp-Session-Id"] = args[1]` part of the Lua script, and sure enough, after just over 100,000 requests the server became overwhelmed. I attached a debugger and saw that GC could not keep up with disposing all the McpServers/McpSessions, and the hundreds of thread pool threads started to call DisposeAsync and unreference the pruned idle sessions stalled waiting on GC, leading to thread pool starvation.

With these changes, since we only allow 11,000 new sessions to be created with the new default idle session limit of 10,000, we'll only prune up to 1,000 idle sessions every 5 seconds if the client doesn't gracefully close the session with a DELETE request.
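The pruning side can be sketched as a periodic loop like the one below, assuming a 5-second PeriodicTimer. The `idleCount` and `disposeOldestIdleAsync` delegates are hypothetical stand-ins for the real session-store bookkeeping, and the period is a parameter here purely for clarity; the actual IdleTrackingBackgroundService differs.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static async Task PruneLoopAsync(
    Func<int> idleCount,                  // hypothetical: current idle session count
    Func<ValueTask> disposeOldestIdleAsync, // hypothetical: disposes one idle session
    SemaphoreSlim idleSessionSemaphore,
    int maxIdleSessionCount,
    TimeSpan period,
    CancellationToken stoppingToken)
{
    using var timer = new PeriodicTimer(period);
    try
    {
        while (await timer.WaitForNextTickAsync(stoppingToken))
        {
            // Dispose idle sessions beyond the limit; each Release() unblocks
            // one new session creation stuck in WaitAsync().
            while (idleCount() > maxIdleSessionCount)
            {
                await disposeOldestIdleAsync();
                idleSessionSemaphore.Release();
            }
        }
    }
    catch (OperationCanceledException)
    {
        // Normal shutdown.
    }
}
```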
The new default steady-states at approximately 200 new sessions per second. We can look at improving the maximum new-session rate once we have better object pooling to avoid GC pressure when creating new sessions so rapidly. We could also look into proactively pruning idle sessions when new sessions are waiting rather than relying on the background service.
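The steady-state figure follows from the numbers above: the creation ceiling is 110% of the 10,000 idle session limit, so at most 1,000 excess idle sessions are disposed per 5-second timer tick.

```csharp
using System;

int maxIdleSessionCount = 10_000;
int ceiling = maxIdleSessionCount + maxIdleSessionCount / 10; // 110% => 11,000
int prunedPerTick = ceiling - maxIdleSessionCount;            // 1,000
double tickSeconds = 5;
double steadyStateSessionsPerSecond = prunedPerTick / tickSeconds; // 200
Console.WriteLine(steadyStateSessionsPerSecond);
```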
Here are results showing that the new-session RPS remains stable after this change, even if you try to open more concurrent connections to allow more parallel requests. Memory usage looks stable as well, at about 400 MB in these tests.
These changes do not have any apparent impact on single-session performance, which remains a little over 30k RPS.