cedws 13 hours ago [-]
Claude Code’s sandboxing is a complete joke. There should be no ‘off switch.’ Sandboxing should not be opt in. It should not have full read access over the file system by default.
I really want more security people to get involved in the LLM space because everyone seems to have just lost their minds.
If you look at this thing through a security lens it’s horrifying, which was a cause of frustration when Anthropic changed their TOS to ban use of alternative clients with a subscription. I don’t want to use that Swiss cheese.
tso 2 hours ago [-]
The Claude sandbox is so antithetical to good security posture that it almost seems intentional[0]. Having both "default read to the entire file system" and "the agent can and _will_ disable the sandbox, without even asking the user[1], in order to complete tasks" would not pass muster in a freshman-level security course.
[0] assuming a human with security training was involved in the design/prompting of the sandbox development.
[1] Claude has well-used mechanisms for asking the user before taking potentially dangerous actions. Why they are not part of the "disable my own SANDBOX" branches of code is confusing.
simlevesque 12 hours ago [-]
The first thing I recommend everyone use is devcontainers [1]. They're very simple to set up and make using LLMs a lot more secure.
Author here. I helped create Falco (CNCF runtime security) and built this (Veto) to fix the path-based identity problem we all shipped a decade ago. The dynamic linker bypass in the "where it breaks" section is the part I'm most interested in discussing. It's a class of evasion that no current eval framework measures. Happy to answer questions about the BPF LSM implementation.
botanicalfriend 4 hours ago [-]
On the dynamic linker bypass specifically, have you looked at fapolicyd [1]? It uses fanotify(7) and the top of the README is:
> The restrictive policy was designed with these goals in mind:
> 1. No bypass of security by executing programs via ld.so.
> 2. Anything requesting execution must be trusted.
One correction on the table: SELinux and AppArmor shouldn't be grouped together under "rename-resistant: No". AppArmor is path-based, but SELinux labels are on the inode, so a rename doesn't change the security context. The copy attack doesn't apply either: a process in sandbox_t creating a file in /tmp gets tmp_t via type transition, and the policy does not grant sandbox_t execute permission on tmp_t.
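To make the path-versus-inode point concrete, here is a minimal sketch in plain Python. It does not touch SELinux or AppArmor at all; it only shows that a rename within one directory changes the path while the (device, inode) pair stays the same, which is why anything attached to the inode survives the rename. The file names and the helper `identity_after_rename` are made up for illustration.

```python
import os
import tempfile


def identity_after_rename():
    """Rename a file inside one directory and compare identities.

    Returns (path_changed, inode_unchanged).
    """
    with tempfile.TemporaryDirectory() as d:
        old = os.path.join(d, "tool")          # hypothetical file name
        new = os.path.join(d, "renamed-tool")  # hypothetical file name
        with open(old, "wb") as f:
            f.write(b"#!/bin/sh\necho hi\n")

        before = os.stat(old)
        os.rename(old, new)  # same directory, so same filesystem
        after = os.stat(new)

        path_changed = old != new
        inode_unchanged = (before.st_dev, before.st_ino) == (after.st_dev, after.st_ino)
        return path_changed, inode_unchanged


if __name__ == "__main__":
    print(identity_after_rename())
```

A rule keyed on the path stops matching after the rename; a label stored on the inode moves with the file. A copy is a different story, since it creates a new inode, which is the case the type transition covers.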
Thanks for your work! Just curious, would it be possible to pad the denylisted binary with arbitrary bytes and circumvent the content hash?
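For anyone wondering what the padding question looks like in practice, here is a minimal sketch under stated assumptions: a hypothetical denylist that is just a set of SHA-256 digests of exact file contents. It says nothing about how Veto actually identifies binaries, and the byte strings are stand-ins, not real executables. It only shows that appending bytes yields a different digest, so an exact-hash denylist on its own would not match the padded copy.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def is_denied(data: bytes, denylist: set) -> bool:
    """Hypothetical check: denied only if the exact content hash is listed."""
    return sha256_hex(data) in denylist


# Stand-in bytes, not a real binary.
original = b"\x7fELF" + b"pretend binary contents"
padded = original + b"\x00" * 16  # same content plus trailing padding

denylist = {sha256_hex(original)}

if __name__ == "__main__":
    print("original denied:", is_denied(original, denylist))
    print("padded denied:  ", is_denied(padded, denylist))
```

Whether a padded file would still execute, and whether a real enforcement engine treats unknown hashes as allowed or denied, are separate questions that this sketch does not answer.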
walterbell 11 hours ago [-]
Security policy usually defaults unknown artifacts to low privileges.
rogerrogerr 12 hours ago [-]
> No jailbreak, no special prompting. The agent just wanted to finish the task.
Good lord, why do people use LLMs to write on this topic? It destroys credibility.
thinkingemote 2 hours ago [-]
People who write about LLMs will use LLMs. That's the norm now. The exceptions are what we should look out for and cheer for.
HN users continue to upvote LLM-written submissions.
The default for me is every LLM submission has little credibility unless proven otherwise. Enshittified.
hn_go_brrrrr 2 hours ago [-]
This doesn't read as AI-written to me, fwiw.
thinkingemote 2 hours ago [-]
Humans just need to adapt their pattern recognition skills. It's a continuous and changing effort. For some, not detecting it is a sign that they need to update their own systems, not that the sign is wrong.
For many it's not worth the effort to even try anymore. Particularly when the content of a submission is about LLMs: why worry?
tomvault 16 hours ago [-]
The adversary can reason now, and our security tools weren't built for that.
Leo di Donato, who helped create Falco, the cloud native runtime security tool, wrote a technical deep dive into how Claude Code bypassed its own denylist and sandbox. The post also introduces Veto, a kernel-level enforcement engine built into the Ona platform.
hilti 14 hours ago [-]
Thank you for this write-up. I am still light-years behind this deep knowledge, but I feel like your post taught me the vocabulary to get started.
[1] https://code.claude.com/docs/en/devcontainer
https://github.com/anthropic-experimental/sandbox-runtime/is...
I ended up making my own sandbox wrapper instead https://GitHub.com/arianvp/landlock-nix
[1] https://github.com/linux-application-whitelisting/fapolicyd