Hosting your own large language models and connecting them to MATLAB with an NVIDIA DGX Spark
I've talked about running local Large Language Models a couple of times on The MATLAB Blog, but I've always had to settle for small models because of the tiny amount of memory on my GPU -- 6GB to be precise! Running much larger, more capable models would have required expensive, server-class GPUs on HPC or cloud instances, and I never had the budget to do it.
Until now!

NVIDIA's DGX Spark is a small desktop machine that doesn't cost the earth. Indeed, several of us at MathWorks have one now although 'mine' (pictured above sporting a MATLAB sticker) is actually shared with a few other people and lives on a desk in Natick, USA while I'm in the UK.
The DGX Spark has 128GB of memory available to the GPU, which means that I can run a MUCH larger language model. So, I installed a 120-billion-parameter model on it: gpt-oss:120b. That's more than an order of magnitude bigger than any local model I'd played with before.
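As an illustration of the server side: the `gpt-oss:120b` tag follows Ollama's model-naming convention, so assuming the Spark serves the model through Ollama (the article itself covers the exact setup), getting it running might look something like this:

```shell
# On the DGX Spark (assumes Ollama is already installed).
ollama pull gpt-oss:120b           # download the model weights

# Serve on all interfaces so other machines on the network can connect
# (by default Ollama listens only on localhost, port 11434).
OLLAMA_HOST=0.0.0.0 ollama serve
```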
The next step was to connect to it from MATLAB running on my laptop.
The result is a *completely private* MATLAB + AI workflow that several of us have been playing with.
In my latest article, I show you how to set everything up: the LLM running on the DGX Spark connected to MATLAB running on my MacBook Pro. https://blogs.mathworks.com/matlab/2026/01/05/running-large-language-models-on-the-nvidia-dgx-spark-and-connecting-to-them-in-matlab/
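To give a flavour of the MATLAB side, here's a minimal sketch assuming the Spark exposes the model through an Ollama server and you have the Large Language Models with MATLAB add-on installed; `spark.example.com` is a placeholder for your DGX Spark's actual address:

```matlab
% Requires the Large Language Models with MATLAB add-on
% (github.com/matlab-deep-learning/llms-with-matlab).
% "spark.example.com" is a placeholder hostname -- use your Spark's address.
chat = ollamaChat("gpt-oss:120b", Endpoint="spark.example.com:11434");

% The prompt goes to the Spark and the response comes straight back --
% nothing ever leaves hardware you control.
response = generate(chat, "Write a MATLAB function that computes a moving average.")
```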