Conversation

Contributor

@zedkipp zedkipp commented Dec 20, 2025

Add agent forwarding of boundary audit logs from workspaces to coderd via the agent API, and re-emission of those logs on coderd's stderr.

Log format example:

[API] 2025-12-23 18:31:46.755 [info] coderd.agentrpc: boundary_request owner=.. workspace_name=.. agent_name=.. decision=.. workspace_id=.. http_method=.. http_url=.. event_time=.. request_id=..
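
For illustration only, a minimal sketch of how such a line could be produced with cdr.dev/slog's human-readable sink; the field values and setup below are placeholders, not the code in this PR:

package main

import (
	"context"
	"os"
	"time"

	"cdr.dev/slog"
	"cdr.dev/slog/sloggers/sloghuman"
)

func main() {
	// Human-readable sink on stderr, named to match the "coderd.agentrpc" prefix above.
	logger := slog.Make(sloghuman.Sink(os.Stderr)).Named("coderd.agentrpc")

	logger.Info(context.Background(), "boundary_request",
		slog.F("owner", "alice"),
		slog.F("workspace_name", "dev"),
		slog.F("agent_name", "main"),
		slog.F("decision", "allow"),
		slog.F("workspace_id", "placeholder-id"),
		slog.F("http_method", "GET"),
		slog.F("http_url", "https://example.com"),
		slog.F("event_time", time.Now()),
		slog.F("request_id", "placeholder-id"),
	)
}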

Corresponding boundary PR: coder/boundary#124
RFC: https://www.notion.so/coderhq/Agent-Boundary-Logs-2afd579be59280f29629fc9823ac41ba
#21280

@zedkipp zedkipp force-pushed the zedkipp/boundary-logs branch 3 times, most recently from 334f1ca to ab19300 Compare December 22, 2025 17:55
@zedkipp zedkipp force-pushed the zedkipp/boundary-logs branch 4 times, most recently from ced7c18 to adf8dda Compare December 23, 2025 19:01
@zedkipp zedkipp marked this pull request as ready for review December 23, 2025 21:39
Add receiving of boundary logs over a stream unix socket in the agent,
forwarding of boundary audit logs from the agent to coderd via the agent
API, and re-emission of boundary logs on coderd's stderr.

Log format example:
[API] 2025-12-23 18:31:46.755 [info] coderd.agentrpc: boundary_request owner=.. workspace_name=.. agent_name=.. decision=.. workspace_id=.. http_method=.. http_url=.. event_time=.. request_id=..
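
For context on the framing discussed in the review below, a rough sketch of reading one frame off the stream socket. The header layout assumed here (a 4-byte big-endian word: high byte tag, low 24 bits length) is a guess for illustration, not necessarily the PR's exact wire format:

package codecsketch

import (
	"encoding/binary"
	"fmt"
	"io"
)

const (
	TagV1            byte   = 1
	MaxMessageSizeV1 uint32 = 1 << 15
)

// ReadFrame reads a single framed message and returns its tag and payload.
func ReadFrame(r io.Reader) (byte, []byte, error) {
	var header uint32
	if err := binary.Read(r, binary.BigEndian, &header); err != nil {
		return 0, nil, fmt.Errorf("read frame header: %w", err)
	}
	tag := byte(header >> 24)
	size := header & 0x00FFFFFF
	// A real implementation would also reject unknown tags.
	if tag == TagV1 && size > MaxMessageSizeV1 {
		return 0, nil, fmt.Errorf("frame of %d bytes exceeds max %d", size, MaxMessageSizeV1)
	}
	payload := make([]byte, size)
	if _, err := io.ReadFull(r, payload); err != nil {
		return 0, nil, fmt.Errorf("read frame payload: %w", err)
	}
	return tag, payload, nil
}
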
@zedkipp zedkipp force-pushed the zedkipp/boundary-logs branch from adf8dda to 752536f Compare December 24, 2025 00:23
Contributor

@dannykopping dannykopping left a comment

Before we proceed, we need to extract the message framing into its own package so it can be shared. There's a lot of duplication across this PR and https://github.com/coder/boundary/pull/124, and it will lead to inconsistencies.

We also need a more resilient approach to socket addressing; /tmp is not guaranteed to exist.
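
One possible direction, sketched only as an illustration and not necessarily what the PR settled on: derive the socket directory from XDG_RUNTIME_DIR when available and otherwise create a per-user directory, instead of assuming /tmp exists. The boundarySocketPath helper and socket file name are hypothetical:

package socketsketch

import (
	"os"
	"path/filepath"
)

// boundarySocketPath picks a directory for the boundary log socket without
// relying on /tmp. Note: unix socket paths are limited to roughly 104 bytes
// on some platforms, so a real implementation must keep this path short.
func boundarySocketPath() (string, error) {
	dir := os.Getenv("XDG_RUNTIME_DIR")
	if dir == "" {
		home, err := os.UserHomeDir()
		if err != nil {
			return "", err
		}
		dir = filepath.Join(home, ".coder-agent")
	}
	if err := os.MkdirAll(dir, 0o700); err != nil {
		return "", err
	}
	return filepath.Join(dir, "boundary-logs.sock"), nil
}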

Comment on lines +134 to +139
s.wg.Add(1)
go func() {
	defer s.wg.Done()
	<-ctx.Done()
	_ = conn.Close()
}()
Contributor

Could you not move the conn.Close() into the same defer as cancel() and just run it afterwards?
I'm not seeing why we need the extra goroutine.

Contributor Author

@zedkipp zedkipp Dec 29, 2025

binary.Read() may be blocked waiting on I/O. When the context is cancelled, something needs to interrupt that blocked binary.Read(); that's what the conn.Close() here does.

EDIT: this could also be done with a read deadline, but this approach doesn't require the read loop to spin on deadline expiry just to check for cancellation.
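
To make the pattern concrete, a self-contained sketch of the approach described above; readFrames and the surrounding names are illustrative, not the handler from this PR:

package agentsketch

import (
	"context"
	"encoding/binary"
	"net"
	"sync"
)

func readFrames(ctx context.Context, conn net.Conn, wg *sync.WaitGroup) {
	// Closing the conn is what interrupts a blocked binary.Read when the
	// context is cancelled, without spinning on a read deadline.
	wg.Add(1)
	go func() {
		defer wg.Done()
		<-ctx.Done()
		_ = conn.Close()
	}()

	for {
		var header uint32
		if err := binary.Read(conn, binary.BigEndian, &header); err != nil {
			// Either a real I/O error or the close triggered by cancellation.
			return
		}
		// ... decode and handle the rest of the frame ...
	}
}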

@zedkipp zedkipp requested a review from dannykopping December 30, 2025 01:32
Contributor

@dannykopping dannykopping left a comment

Looking good! A few minor points but we're close now.

@zedkipp zedkipp requested a review from dannykopping December 31, 2025 00:02
Contributor

@dannykopping dannykopping left a comment

Nothing blocking, LGTM 👍

// MaxMessageSizeV1 is the maximum size of the encoded protobuf messages sent
// over the wire for the TagV1 tag. While the wire format allows 24 bits for
// length, TagV1 only uses 15 bits.
MaxMessageSizeV1 uint32 = 1 << 15
Contributor

Non-blocking nit: I think this detail should be hidden behind a Tag type, to make adding new tags easy and to use tag settings expressively.
Right now there are a few shortcomings with this design:

  1. You just have to know that TagV1 and MaxMessageSizeV1 go together
  2. The logic to determine the max size is scattered across both {Read,Write}Frame
  3. Adding future tags will require changes in a few places instead of just one (the definition)

Here's an idea you might consider: https://gist.github.com/dannykopping/2d7d512d4c4c6508003dc6efcf39cee0

Just a quick hack job but the codec package tests pass at least.
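
In the spirit of that suggestion (sketched independently, not copied from the gist): a Tag type that carries its own limit, so the frame reader and writer only ever ask the tag:

package codecsketch

// Tag identifies the wire-format version of a frame and carries its settings.
type Tag uint8

const (
	// TagV1 is the initial framing version.
	TagV1 Tag = 1
)

// MaxMessageSize keeps the per-tag payload limit next to the tag definition
// so ReadFrame and WriteFrame never hard-code it. The boolean is false for
// unknown tags.
func (t Tag) MaxMessageSize() (uint32, bool) {
	switch t {
	case TagV1:
		// The wire format allows 24 bits for length; V1 only uses 15.
		return 1 << 15, true
	default:
		return 0, false
	}
}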

Contributor Author

I'll look into refactoring after this initial PR merges. I debated just using a map keyed by a Tag because all that's really needed is knowing what maximum message size is associated with a particular tag.
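
For comparison, the map alternative mentioned above might look roughly like this (types repeated so the sketch stands alone):

package codecsketch

// Tag repeated here only to keep the sketch self-contained.
type Tag uint8

const TagV1 Tag = 1

// maxMessageSize is the single lookup consulted by both ReadFrame and
// WriteFrame; adding a tag means adding one entry here.
var maxMessageSize = map[Tag]uint32{
	TagV1: 1 << 15,
}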

@zedkipp zedkipp merged commit 0792403 into main Dec 31, 2025
30 checks passed
@zedkipp zedkipp deleted the zedkipp/boundary-logs branch December 31, 2025 23:38
@github-actions github-actions bot locked and limited conversation to collaborators Dec 31, 2025