If, like me, you’re on one of the cheaper Anthropic plans, it’s pretty common to hit a daily or weekly quota limit when you’re deep into coding an idea with Claude. If you want to keep going, you can connect Claude Code to a local open source model instead of Anthropic. To monitor how much quota you have left and how quickly you’re burning it, type: /usage

The best open source model changes pretty frequently, but at the time of writing this post, I recommend GLM-4.7-Flash from Z.AI or Qwen 3. If you want or need to save some disk space and GPU memory, try a smaller quantized version, which will load and run quicker at some cost in quality. I’ll save how to find the best open source model for your task and machine constraints for another, more detailed post.
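If you do go the quantized route, the builds are typically distributed as GGUF files on Hugging Face. As a purely hypothetical sketch (the repo and file names below are placeholders for whichever quant you pick; LM Studio can also download models for you in-app, so this is mainly relevant for Method 2):

```bash
# download a 4-bit quantized build with the Hugging Face CLI
huggingface-cli download Qwen/Qwen3-8B-GGUF Qwen3-8B-Q4_K_M.gguf --local-dir models/
```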
Method 1: LM Studio

If you haven’t used LM Studio before, it’s an accessible way to find and run open source LLMs and vision models locally on your machine. Version 0.4.1 introduced support for connecting to Claude Code (CC). See https://lmstudio.ai/blog/claudecode or follow the instructions below:
- Install and run LM Studio
- Find the model search button to install a model (see image above). LM Studio recommends running the model with a context window of more than 25K tokens.
- Open a new terminal session to:
  a. start the server: lms server start --port 1234
  b. configure environment variables to point CC at LM Studio:
     export ANTHROPIC_BASE_URL=http://localhost:1234
     export ANTHROPIC_AUTH_TOKEN=lmstudio
  c. start CC pointing at your server: claude --model openai/gpt-oss-20b (a consolidated, copy-pasteable version follows this list)
- Reduce your expectations about speed and performance!
- To confirm which model you are using, or when you want to switch back, type /model
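Putting steps (a)–(c) together, here’s a copy-pasteable version, plus a quick check that the server is answering before you launch CC. This is a sketch assuming LM Studio 0.4.1+ with the lms CLI on your PATH; the model name is just the example from step (c):

```bash
# 1. start LM Studio's local server
lms server start --port 1234

# 2. sanity check: LM Studio exposes an OpenAI-compatible API,
#    so /v1/models should list your downloaded models
curl http://localhost:1234/v1/models

# 3. point Claude Code at the local server and launch it
export ANTHROPIC_BASE_URL=http://localhost:1234
export ANTHROPIC_AUTH_TOKEN=lmstudio   # placeholder token, per step (b)
claude --model openai/gpt-oss-20b
```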

Method 2: Connecting directly to llama.cpp
LM Studio is built on top of the open source project llama.cpp.
If you prefer not to use LM Studio, you can install and run the project directly and connect Claude Code to it. But honestly, unless you’re fine-tuning a model or have very specific needs, LM Studio is probably going to be the quicker setup.
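For reference, the direct route looks something like this. It’s a rough sketch: the GGUF path and model name are placeholders, the flags assume a recent llama.cpp build, and depending on your version you may need a proxy that translates between Claude Code’s Anthropic-style API and llama.cpp’s OpenAI-style endpoints:

```bash
# serve a local GGUF model with llama.cpp's bundled server
llama-server -m ./models/Qwen3-8B-Q4_K_M.gguf --port 1234 -c 32768

# in another terminal, point Claude Code at it, as in Method 1
export ANTHROPIC_BASE_URL=http://localhost:1234
export ANTHROPIC_AUTH_TOKEN=llamacpp   # placeholder token
claude --model qwen3-8b                # placeholder model name
```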
Conclusion
For the moment, this is a backup solution. Unless you have a monster of a machine, you’re going to notice the time it takes to do things and a drop in code quality, but it works(!), and it’s easy enough to switch between your local OSS model and Claude when your quota limit is back. So it’s a good way to keep coding when you’re stuck or you just want to save some quota. If you try it, let me know how you go and which model works for you.
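If you set the environment variables in your current shell (rather than in a profile file), switching back when your quota resets is just a matter of clearing them. A minimal sketch:

```bash
# point Claude Code back at Anthropic
unset ANTHROPIC_BASE_URL ANTHROPIC_AUTH_TOKEN
claude   # then confirm with /model
```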



