We Have ChatGPT At Home

The recent release of DeepSeek-R1 has posed a challenge to the established players in the LLM scene. While it may not be quite as powerful as the latest OpenAI models, R1 demonstrates visible “chain of thought” reasoning (built on a “mixture of experts” architecture), and can solve at least some puzzles requiring logical thought.

More interestingly for home users skeptical of relying on AI hosted in mainland China and under the control of the CCP, various “distilled” models of DeepSeek-R1 have been released, with parameter sizes ranging from 1.5B to 70B. All of these will run fairly reliably on $1000-class enthusiast PCs — and the smaller models will run nicely on even modest hardware.

Probably the easiest way to get up and running with these models is to download and install Ollama, which can download, load, and run models from a single command prompt. Once Ollama is installed, just type

ollama run deepseek-r1:14b

and Ollama will download, install, and run the 14B parameter distilled model.
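Beyond the interactive prompt, Ollama also serves a local HTTP API (on port 11434 by default), so the same model can be queried from scripts. A minimal sketch, assuming Ollama is installed and the daemon is running; the prompt text here is just a placeholder:

```shell
# Pull the model without starting an interactive session
ollama pull deepseek-r1:14b

# One-off prompt from the command line
ollama run deepseek-r1:14b "Why is the sky blue?"

# Same query via the local HTTP API (non-streaming)
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:14b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```

The API route is handy for benchmarking, since each request starts from a clean context rather than accumulating chat history.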

These “distilled” models are actually “Qwen” (from Alibaba) and “Llama” (from Meta) base models, fine-tuned on reasoning data produced by DeepSeek-R1. The 70B model, in particular, seems quite stable and capable of handling logical problems of moderate complexity. This largest distilled model solved a word problem involving the ages of three people (three equations in three unknowns) most of the time. (It got the answer correct 17 times in a row and then wrong 6 times in a row, a streaky pattern that suggests conversation state was not being cleared properly between attempts.)
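The exact word problem isn’t given here, but problems of this class reduce to a 3×3 linear system. As a stand-in, a hypothetical instance (the names and constraints below are invented for illustration) can be solved exactly with Gaussian elimination:

```python
from fractions import Fraction

def solve3(A, b):
    """Solve a 3x3 linear system A x = b by Gaussian elimination,
    using exact rational arithmetic to avoid float rounding."""
    M = [[Fraction(A[i][j]) for j in range(3)] + [Fraction(b[i])]
         for i in range(3)]
    for col in range(3):
        # Pick a row with a nonzero pivot and swap it into place
        piv = next(r for r in range(col, 3) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        # Eliminate this column from every other row
        for r in range(3):
            if r != col and M[r][col] != 0:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][3] / M[i][i] for i in range(3)]

# Hypothetical instance: Alice is twice Bob's age, Carol is 5 years
# older than Bob, and the three ages sum to 65.
A = [[1, -2, 0],   #  a - 2b     = 0
     [0, -1, 1],   # -b + c      = 5
     [1,  1, 1]]   #  a + b + c  = 65
ages = solve3(A, [0, 5, 65])
print(ages)  # [Fraction(30, 1), Fraction(15, 1), Fraction(20, 1)]
```

A model that reasons correctly should land on (30, 15, 20) every time; streaky wrong answers point to contaminated context rather than inability.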

Compared to OpenAI’s o1 (and probably the new o3-mini), the distilled DeepSeek models seem to struggle with more complex questions. When asked how many gold ingots a Minecraft character can carry (via a disguised question about regular and “magic” boxes), the distilled DeepSeek models had trouble coming up with an optimal plan, whereas o1 produced an optimal or nearly-optimal approach right away.
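The disguised puzzle’s exact numbers aren’t given here, but the underlying arithmetic is simple. Under standard Minecraft assumptions (a 36-slot inventory, 64 ingots per stack, and shulker boxes each holding 27 stacks), the optimal plan is to fill every slot with a full box rather than loose stacks:

```python
# Capacity arithmetic under assumed standard Minecraft parameters
# (not necessarily the disguised puzzle's exact numbers):
SLOTS = 36        # inventory slots
STACK = 64        # ingots per stack
BOX_STACKS = 27   # stacks held by one "magic" (shulker) box

bare = SLOTS * STACK            # naive plan: loose stacks only
per_box = BOX_STACKS * STACK    # ingots inside one full box
boxed = SLOTS * per_box         # optimal plan: a full box in every slot

print(bare)   # 2304
print(boxed)  # 62208
```

The test of reasoning is whether a model notices that the boxes dominate the bare plan by a factor of 27, not whether it multiplies correctly.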

So while we may not have “ChatGPT at home,” it is still impressive to see models running locally that can not only carry on a conversation but can reason. I see “cloud computing” as a necessary evil at best, since ultimately “the cloud” is “somebody else’s computer,” and that somebody else could decide to stop providing the service at any time. From the perspectives of privacy, reliability, and democratization of technology, it’s nice to have the option to self-host.

And more competition is always a good thing. I don’t think it’s a coincidence that o3-mini just rolled out. OpenAI wants to remain in the spotlight.

And maybe this will help “Open”AI decide to release actually-open models.

