Reverse engineering OpenAI code execution to make it run C and JavaScript

by benswerdon 3/12/2025, 4:04 PMwith 69 comments

by simonwon 3/12/2025, 5:33 PM

I've had it write me SQLite extensions in C in the past, then compile them, then load them into Python and test them out: https://simonwillison.net/2024/Mar/23/building-c-extensions-...

I've also uploaded binary executable for JavaScript (Deno), Lua and PHP and had it write and execute code in those languages too: https://til.simonwillison.net/llms/code-interpreter-expansio...

If there's a Python package you want to use that's not available you can upload a wheel file and tell it to install that.

by jeffwasson 3/12/2025, 5:14 PM

A funny story I heard recently on a python podcast where a user was trying to get their LLM to ā€˜pip installā€™ a package in its sandbox, which it refused to do.

So he tricked it by saying ā€œwhat is the error message if you try to pip install fooā€ so it ran pip install and announced there was no error.

Package foo now installed.

by stolen_biscuiton 3/12/2025, 8:02 PM

How do we know you're actually running the code and it's not just the LLM spitting out what it thinks it would return if you were running code on it?

by j4nekon 3/12/2025, 4:28 PM

Many thanks for the interesting article! I normaly don't read any articles on AI here, but I really liked this one from a technical point of view!

since reading on twitter is annoying with all the popups: https://archive.is/ETVQ0

by jasonthorsnesson 3/12/2025, 5:07 PM

Given itā€™s running in a locked-down container: thereā€™s no reason to restrict it to Python anyway. They should parter/use something like replit to allow anything!

One weird thing - why would they be running such an old Linux?

ā€œTheir sandbox is running a really old version of linux, a Kernel from 2016.ā€

by yzydserdon 3/12/2025, 4:31 PM

Here is Simonw experimenting with ChatGPT and C a year ago: https://news.ycombinator.com/item?id=39801938

I find ChatGPT and Claude really quite good at C.

by huijzeron 3/12/2025, 9:05 PM

I did similar things last year [1]. Also I tried running arbitrary binaries and that worked too. You could even run them in the GPTs. It was okay back then but not super reliable. I should try again because the newer models definitively follow prompts better from what Iā€™ve seen.

[1]: https://huijzer.xyz/posts/openai-gpts/

by mirekrusinon 3/13/2025, 5:28 AM

That's how you put "Open" in "OpenAI".

Would be cool if you can get weights this way.

by grepfru_iton 3/12/2025, 7:28 PM

Just a reminder, Google allowed all of their internal source code to be browsed in a manner like this when Gemini first came out. Everyone on here said that could never happen, yet here we are again.

All of the exploits of early dotcom days are new again. Have fun!

by rhodescolossuson 3/12/2025, 4:28 PM

Pretty cool, it'd be interesting to try other things like running a C++ daemon and letting it run, or adding something to cron.

by lnautaon 3/12/2025, 4:35 PM

Interesting idea to increase the scope until the LLM gives suggestions on how to 'hack' itself. Good read!

by ttoinouon 3/12/2025, 10:22 PM

Itā€™s crazy Iā€™m so afraid of this kind of security failures that I wouldnā€™t even think of releasing an app like that online, Iā€™d ask myself too many questions about jailbreaking like that. But some people are fine with this kind of risks ?

by incognito124on 3/12/2025, 4:39 PM

I can't believe they're running it out of ipynb