-
The simplest way to use local and online AI models. Interact with any AI model at the click of a button.
#Productivity #Writing Tools #AI 15 social mentions
-
Frontier reasoning in the terminal
Inference in Python uses harmony [1] (for the request and response format), which is written in Rust with Python bindings. Another of OpenAI's Rust libraries is tiktoken [2], used for all tokenization and detokenization (a minimal tokenization sketch follows this entry). OpenAI Codex [3] is also written in Rust. It looks like OpenAI is increasingly adopting Rust (at least for inference). [1] https://github.com/openai/harmony [2] https://github.com/openai/tiktoken [3] https://github.com/openai/codex
#AI #Productivity #Developer Tools 14 social mentions
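A minimal sketch of the tokenization layer mentioned above: tiktoken's Rust core is exposed through Python bindings, so round-tripping text is a few lines of ordinary Python. The encoding name "o200k_base" here is an assumption for illustration; check the model's documentation for the encoding it actually uses.
```python
# Minimal sketch: encode/decode with tiktoken (Rust core, Python bindings).
# "o200k_base" is an assumed encoding name; pick the one matching your model.
import tiktoken

enc = tiktoken.get_encoding("o200k_base")

text = "Frontier reasoning in the terminal"
tokens = enc.encode(text)           # tokenization
assert enc.decode(tokens) == text   # detokenization round-trips the text

print(len(tokens), tokens[:5])
```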
-
Discover, download, and run local LLMs
Thanks OpenAI for being open ;) Surprised there are no official MLX versions and only one mention of MLX in this thread. FYI for anyone on a Mac, the easiest way to run these models right now is LM Studio (https://lmstudio.ai/), which is free. You just search for the model; third-party groups like mlx-community or lmstudio-community usually have MLX versions within a day or two of a release. I go for the 8-bit quantizations (4-bit is faster, but quality drops). You can also convert to MLX yourself. Once you have it running in LM Studio, you can chat there in its built-in interface, or you can hit it through an API that defaults to http://127.0.0.1:1234 (see the sketch after this entry). You can run multiple models that hot-swap and load instantly, switch between them, etc. It's surprisingly easy, and a lot of cool niche models are coming out, like this tiny high-quality search model released today as well (and they released an official MLX version): https://huggingface.co/Intelligent-Internet/II-Search-4B Other fun ones are Gemma 3n, which is multi-modal; a larger one that is actually a solid model but takes more memory is the new Qwen3 30B A3B (Coder and Instruct); Pixtral (Mistral's vision model with full-resolution images); etc. Looking forward to playing with this model.
#AI #Productivity #Writing Tools 29 social mentions
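The local server mentioned above speaks an OpenAI-compatible API, so one quick way to reach it is the standard openai Python client pointed at http://127.0.0.1:1234. This is a sketch under assumptions: the model identifier below is hypothetical, so substitute whatever name LM Studio shows for the model you have loaded, and the API key can be any placeholder since the local server doesn't check it.
```python
# Minimal sketch: chat with a model served by LM Studio's local,
# OpenAI-compatible endpoint (defaults to http://127.0.0.1:1234).
from openai import OpenAI

# The key is a placeholder; the local server does not validate it.
client = OpenAI(base_url="http://127.0.0.1:1234/v1", api_key="lm-studio")

resp = client.chat.completions.create(
    # Assumed model name for illustration; use the identifier LM Studio
    # displays for the model currently loaded.
    model="openai/gpt-oss-20b",
    messages=[{"role": "user", "content": "Give me one fun fact about Rust."}],
)
print(resp.choices[0].message.content)
```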