DeepSeek launched a free, open-source large language model in late December, claiming it was developed in just two months at a cost of under $6 million.
My understanding is that DeepSeek still used Nvidia, just older models and way more efficiently, which is remarkable. I hope to tinker with the open-source release, at least with a little Twitch chat bot for my streams that I was already planning to build with OpenAI. It'll be even more remarkable if I can run this locally.
However, this is embarrassing for the Western companies working on AI, especially after the $500B Stargate announcement, since it suggests we don't need such high-end infrastructure to achieve the same results.
$500B of "trust me, bro"… to shake down the US taxpayer for subsidies.
Read between the lines folks
It's really not. This is the AI equivalent of the Viet Cong repurposing US bombs that didn't explode when dropped.
Their model is the differentiator here, but they had to figure out something more efficient in order to overcome their hardware shortcomings.
The US companies will soon outpace this by duplicating the model and running it on faster hardware.
Throw more hardware and power at it. Build more power plants so we can use AI.
Are there any guides to running it locally?
I'm using Ollama to run my LLMs. Going to see about using it for my Twitch chat bot too.
https://github.com/ollama/ollama
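For anyone asking about running it locally: once Ollama is installed and a model is pulled, you can talk to it over its local HTTP API. Here's a minimal sketch using only the Python standard library — the model name `deepseek-r1` and the prompt are just my assumptions, and the default endpoint is Ollama's documented `http://localhost:11434/api/generate`.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_payload(model: str, prompt: str) -> dict:
    # Non-streaming request body for Ollama's /api/generate endpoint
    return {"model": model, "prompt": prompt, "stream": False}

def ask(model: str, prompt: str) -> str:
    # Send a prompt to the locally running Ollama server and return its reply
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(model, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    # Model name is an assumption; pull it first with `ollama pull deepseek-r1`
    print(ask("deepseek-r1", "Say hi to my Twitch chat in one sentence."))
```

A chat bot would just call `ask()` for each incoming Twitch message; since everything runs against localhost, nothing leaves your machine.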