A popular AI coding tool called GSD (“Get Shit Done”) got rug-pulled this week. The maintainer disappeared, social accounts were deleted, and a crypto token associated with the project was publicly linked to a rug-pull. The community forked the code, but the original npm packages are still out there under the old owner’s control.

The community response was predictable: “be more careful what you install.”

But this isn’t new. Earlier this month, a compromised VS Code extension was live for 18 minutes. That was enough. The payload harvested credentials from everywhere it could reach — npm tokens, AWS keys, SSH keys, vault tokens, GitHub PATs. One employee had it installed. Thousands of internal repos were accessed. Signing keys had to be rotated. Every Enterprise Server customer had to take action.

18 minutes. One extension. One machine.

The extension wasn’t sketchy. It was well known, widely used, and actively maintained. The compromise came through a chain of poisoned dependencies. The malicious payload didn’t need anything exotic: it just called tools already on the machine to grab credentials.

And it’s not just extensions. Earlier this year, Notepad++ was hijacked by what security researchers assessed as a Chinese state-sponsored group — not through a code vulnerability, but by compromising the hosting provider and selectively redirecting update traffic. The software was fine. The distribution was poisoned.

Mitigations exist, but they don’t solve this

Yes, there are org-level controls. You can disable the VS Code marketplace entirely. You can restrict extensions to trusted publishers. You can lock down package registries.

But “trusted publisher” wouldn’t have helped here: the compromised extension was from a trusted publisher. The Notepad++ compromise bypassed the publisher entirely. You can lock everything down until it is secure and also completely useless. If you want your developers to be productive, a balance has to be struck, and that’s where the real nuance lives. How much do you allow to run? Which tools get access to what? Who decides?

The advice we keep giving after these incidents — vet your dependencies, check the maintainer, audit your supply chain — is correct. It’s also completely insufficient at scale.

Because the actual problem isn’t which tools you install. It’s what those tools can do once they’re running.

You might assume VS Code extensions run sandboxed. They don’t. Extensions run in a separate extension host process, but according to VS Code’s own docs, that process has the same permissions as VS Code itself: it can read and write files, make network requests, and run external processes. Compare that to browser extensions, which run in a sandboxed environment with a declared permission model and can’t just read arbitrary files off your disk. Browsers figured this out years ago. Developer tooling hasn’t caught up.

An npm postinstall script runs with the same permissions as you. A locally run MCP server is typically launched as an ordinary process under your user account, so it can touch whatever you can touch.

There’s no boundary between “I added a productivity tool” and “that tool can read every secret on my machine.”
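
To make "no boundary" concrete, here is a minimal sketch of a self-audit you can run yourself. It only reports which well-known credential locations the current process can read; it never opens them. The path list is an assumption for illustration (three paths named in this post plus the default AWS CLI credentials file), and the function name is hypothetical.

```python
import os
from pathlib import Path

# Illustrative, not exhaustive: paths named in this post plus ~/.aws/credentials.
CANDIDATE_SECRET_PATHS = [
    "~/.ssh",
    "~/.npmrc",
    "~/.vault-token",
    "~/.aws/credentials",
]


def readable_secret_paths(home: Path, candidates=CANDIDATE_SECRET_PATHS) -> list[Path]:
    """Return the candidate paths under `home` that this process can read."""
    found = []
    for candidate in candidates:
        # Swap the leading "~/" for the supplied home directory.
        path = home / candidate.removeprefix("~/")
        if path.exists() and os.access(path, os.R_OK):
            found.append(path)
    return found


if __name__ == "__main__":
    for path in readable_secret_paths(Path.home()):
        print(f"readable without any prompt: {path}")
```

Anything this script can see, an extension, a postinstall script, or a local tool server running as you can see too.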

The right question

The question I keep coming back to: why is full trust the default?

Why doesn’t the platform enforce what a tool can actually reach — which files, which network endpoints, which credentials — regardless of whether you trust the maintainer?

Some ecosystems are starting to get this right. Deno ships with --allow-net, --allow-read, --allow-env — you have to explicitly grant each capability. Mobile operating systems figured this out a decade ago: apps declare permissions, the OS enforces them, and you can revoke access at any time. Windows has had this with UWP and AppContainer — sandboxing, declared capabilities, brokered access to resources. The primitives exist. They’re just not being applied to developer tooling yet.

Developer tooling is still stuck in the 2005 model where everything runs with your full permissions and we just hope nobody’s malicious.

What would “boundaries as infrastructure” look like?

A few things that would have changed the outcome here:

Filesystem isolation. The extension doesn’t need access to ~/.vault-token or ~/.npmrc or ~/.ssh/. Why can it read them? Restrict tools to the workspace they’re operating on, plus explicitly declared paths.
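
The decision logic behind that rule is small. A minimal sketch, assuming a hypothetical `is_path_allowed` helper that a platform would call before serving a file request; real enforcement would have to live in the OS or the host process, not in the tool itself.

```python
from pathlib import Path


def is_path_allowed(requested: str, workspace: str, declared: tuple[str, ...] = ()) -> bool:
    """True if `requested` resolves inside the workspace or a declared path."""
    # resolve() collapses ".." and follows symlinks, so a request like
    # "<workspace>/../.ssh/id_rsa" is judged by where it really points.
    target = Path(requested).expanduser().resolve()
    roots = [Path(root).expanduser().resolve() for root in (workspace, *declared)]
    return any(target.is_relative_to(root) for root in roots)
```

Resolving before comparing is the important design choice: a plain string-prefix check would wave through traversal paths and symlinks that point outside the workspace.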

Network policy. A code linting extension has no reason to make outbound HTTP requests to arbitrary endpoints. A language server doesn’t need to phone home. Enforce a default-deny network policy and make tools declare what they need.
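
As a rough sketch of what default-deny means in practice, the check below allows an outbound request only if it uses HTTPS and targets a host the tool declared. The function name and the two declaration sets are hypothetical, and a real policy would be enforced by a proxy or the OS rather than by code the tool could bypass.

```python
from urllib.parse import urlsplit


def is_egress_allowed(url: str, allowed_hosts: frozenset[str]) -> bool:
    """Default-deny: allow only HTTPS requests to exactly the declared hosts."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Exact match only, so "api.github.com.evil.example" does not slip through.
    return parts.scheme == "https" and host in allowed_hosts


# Hypothetical declarations: a linter declares nothing, a GitHub tool declares one host.
LINTER_HOSTS: frozenset[str] = frozenset()
GITHUB_TOOL_HOSTS: frozenset[str] = frozenset({"api.github.com"})
```

With an empty declaration, every request from the linter is refused, which is exactly the behaviour you want from a tool that has no business on the network.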

Credential isolation. Your GitHub PAT, your AWS keys, your SSH keys — these should not be ambient. Tools should request credential access through a mediated API, not just read files off disk.
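
One way to picture a mediated API is a small broker that owns the secrets, checks each request against a per-tool grant, and logs every attempt. This is a sketch under stated assumptions, not an existing library: `CredentialBroker`, its fields, and the tool and credential names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CredentialBroker:
    """Hypothetical broker: holds secrets, checks grants, logs every request."""

    secrets: dict[str, str]       # credential name -> secret value
    grants: dict[str, set[str]]   # tool name -> credential names it may use
    audit_log: list[tuple[str, str, bool]] = field(default_factory=list)

    def request(self, tool: str, credential: str) -> str:
        allowed = credential in self.grants.get(tool, set()) and credential in self.secrets
        # Record the attempt whether or not it succeeds, for later forensics.
        self.audit_log.append((tool, credential, allowed))
        if not allowed:
            raise PermissionError(f"{tool} has no grant for {credential}")
        return self.secrets[credential]
```

A compromised linter asking for AWS keys gets a `PermissionError`, and the denied attempt is sitting in the audit log when someone investigates.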

Per-invocation sandboxing. Not “this tool is sandboxed” as a binary yes/no, but different sandbox shapes for different tools based on what they actually need. A filesystem MCP server gets filesystem access to specific paths. A GitHub MCP server gets network access to api.github.com. A linter gets read access to the workspace and nothing else.
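
The three examples above can be written down as data. The sketch below is illustrative only: `SandboxProfile`, the tool names, and the `/workspace/project` path are hypothetical, and giving the filesystem server write access is my assumption about what "filesystem access" would mean for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SandboxProfile:
    """What one tool is allowed to reach; every field defaults to nothing."""

    read_paths: tuple[str, ...] = ()
    write_paths: tuple[str, ...] = ()
    network_hosts: tuple[str, ...] = ()


WORKSPACE = "/workspace/project"  # hypothetical workspace root

PROFILES = {
    "filesystem-mcp": SandboxProfile(read_paths=(WORKSPACE,), write_paths=(WORKSPACE,)),
    "github-mcp": SandboxProfile(network_hosts=("api.github.com",)),
    "linter": SandboxProfile(read_paths=(WORKSPACE,)),
}

DENY_ALL = SandboxProfile()


def profile_for(tool: str) -> SandboxProfile:
    """Unknown tools get the empty profile rather than full trust."""
    return PROFILES.get(tool, DENY_ALL)
```

The point of the shape is the default: a tool nobody has described gets nothing, which inverts today's model where a tool nobody has described gets everything.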

None of this is novel. It’s how mobile platforms have worked for over a decade. It’s how Deno works today. It’s how containers work. We just haven’t applied it to developer workstations yet, because “developers need full access” has been the unquestioned assumption.

“Be careful” is individual advice. Boundaries are infrastructure.

Individual judgment doesn’t scale. You can’t vet every transitive dependency. You can’t audit every extension update. You can’t “be more careful” when a trusted tool gets compromised three hops deep in the dependency chain and the malicious version is live for 18 minutes.

What scales is enforcing boundaries at the platform level, so that even when a tool is compromised, the blast radius is contained to what that tool was supposed to access in the first place, and every access is logged so there is an audit trail for forensics afterwards.

We have the primitives. Sandboxing, capability-based permissions, network policy enforcement — this isn’t speculative technology. It’s just not being applied where developers live.

That’s the gap. And until we close it, “be careful what you install” will keep being our only answer, and it will keep being insufficient.


I write about AI agents, developer tooling, and platform security. If this resonated, connect with me on LinkedIn — I post about these topics regularly.
