How is this Open Source? The official repository https://github.com/deepseek-ai/DeepSeek-R1 contains only images, a PDF file, and links to download the model. I don't see any code. What exactly is Open Source here? And if it is, where do I get the source code?
In deep learning, open source generally doesn't include the actual training or inference code. Rather, it means they publish the model weights and parameters (necessary to run it locally/on your own hardware) and publish academic papers explaining how the model was trained. I'm sure Stallman disagrees, but from the standpoint of deep learning research DeepSeek definitely qualifies as an "open source model".
Calling it Open Source does not make it so. DeepSeek is not Open Source: it only provides model weights and parameters, not source code or training data. I still don't know what's in the model, and we only get "binary" data, not any source code. This is not Libre software.
There is a nice (even if by now already a bit outdated) analysis about the openness of different "open source" generative AI projects in the following article: Liesenfeld, Andreas, and Mark Dingemanse. "Rethinking open source generative AI: open washing and the EU AI Act." The 2024 ACM Conference on Fairness, Accountability, and Transparency. 2024.
So "Open Source" to AI is just releasing a .psd file used to export a jpeg, and you need some other proprietary software like Photoshop in order to use it.
Open source in AI is usually posted to HuggingFace instead of GitHub: https://huggingface.co/deepseek-ai/DeepSeek-R1
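For anyone wondering what is actually downloadable there, here is a rough sketch of pulling the model's config file from that HuggingFace repo with only the Python standard library. It assumes the repo publishes a `config.json` at the usual `resolve/main` path; `hub_file_url` and `fetch_config` are made-up helper names, not part of any official tooling.

```python
import json
import urllib.request

REPO_ID = "deepseek-ai/DeepSeek-R1"  # repo from the link above


def hub_file_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build the direct download URL for one file in a HuggingFace model repo."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"


def fetch_config(repo_id: str = REPO_ID) -> dict:
    """Download and parse config.json (assumed to exist in the repo)."""
    url = hub_file_url(repo_id, "config.json")
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)
```

Calling `fetch_config()` needs network access and returns whatever metadata the repo publishes. The weights themselves are separate, much larger files in the same repo.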
How apt, just yesterday I put together an evidenced summary of the CEO's recent absurd comments. Why is Proton so keen to throw away so much goodwill people had invested in them?!
This is what the CEO posting as u/Proton_Team stated in a response on r/ProtonMail:
Here is our official response, also available on the Mastodon post in the screenshot:
Corporate capture of Dems is real. In 2022, we campaigned extensively in the US for anti-trust legislation.
Two bills were ready, with bipartisan support. Chuck Schumer (who coincidently has two daughters working as big tech lobbyists) refused to bring the bills for a vote.
At a 2024 event covering antitrust remedies, out of all the invited senators, just a single one showed up - JD Vance.
By working on the front lines of many policy issues, we have seen the shift between Dems and Republicans over the past decade first hand.
Dems had a choice between the progressive wing (Bernie Sanders, etc), versus corporate Dems, but in the end money won and constituents lost.
Until corporate Dems are thrown out, the reality is that Republicans remain more likely to tackle Big Tech abuses.
Source: https://archive.ph/quYyb
To call out the important bits:
- He refers to it as the "official response"
- Indicates that JD Vance is on their side just because he attended an event that other invited senators didn't
- Rattles on about "corporate Dems" with incredible bias
- States "Republicans remain more likely to tackle Big Tech abuses" which is immediately refuted by every response
That was posted in the r/ProtonMail sub where the majority of the event took place: https://old.reddit.com/r/ProtonMail/comments/1i1zjgn/so_that_happened/m7ahrlm/
However, be aware that the CEO posting as u/Proton_Team kept editing his comments, so I wouldn't trust the current state of it. Plus the Proton team/subreddit mods deleted a ton of discussion they didn't like. This archive link, captured the day after, might therefore show more, but not all: https://web.archive.org/web/20250116060727/https://old.reddit.com/r/ProtonMail/comments/1i1zjgn/so_that_happened/m7ahrlm/
Some statements were made on Mastodon and subsequently deleted, but they're captured by an archive link: https://web.archive.org/web/20250115165213/https://mastodon.social/@protonprivacy/113833073219145503
I learned about it from an r/privacy thread but true to their reputation the mods there also went on a deletion spree and removed the entire post: https://www.reddit.com/r/privacy/comments/1i210jg/protonmail_supporting_the_party_that_killed/
This archive link might show more but I've not checked: https://web.archive.org/web/20250115193443/https://old.reddit.com/r/privacy/comments/1i210jg/protonmail_supporting_the_party_that_killed/
There's also this lemmy discussion from the day after but by that point the Proton team had fully kicked in their censorship so I don't know how much people were aware of (apologies I don't know how to make a generic lemmy link) https://feddit.uk/post/22741653
OpenAI, Google, and Meta, for example, can push back against most excessive government demands.
Sure they "can" but do they?
Why do that when you can just score a deal with the government to give them whatever information they want for sweet perks like foreign competitors getting banned?
DeepSeek is open source, meaning you can modify code on your own app to create an independent — and more secure — version. This has led some to hope that a more privacy-friendly version of DeepSeek could be developed. However, using DeepSeek in its current form — as it exists today, hosted in China — comes with serious risks for anyone concerned about their most sensitive, private information.
Any model trained or operated on DeepSeek’s servers is still subject to Chinese data laws, meaning that the Chinese government can demand access at any time.
What???? Whoever wrote this sounds like he has 0 understanding of how it works. There is no "more privacy-friendly version" that could be developed, the models are already out and you can run the entire model 100% locally. That's as privacy-friendly as it gets.
"Any model trained or operated on DeepSeek's servers are still subject to Chinese data laws"
Operated, yes. Trained, no. The model is MIT licensed, China has nothing on you when you run it yourself. I expect better from a company whose whole business is on privacy.
To be fair, most people can't actually self-host Deepseek, but there already are other providers offering API access to it.
There are plenty of step-by-step guides to run Deepseek locally. Hell, someone even had it running on a Raspberry Pi. It seems to be much more efficient than other current alternatives.
That's about as openly available to self host as you can get without a 1-button installer.
You can run an imitation of the DeepSeek R1 model, but not the actual one unless you literally buy a dozen of whatever NVIDIA’s top GPU is at the moment.
A server-grade CPU with a lot of RAM and memory bandwidth would work reasonably well, and cost "only" ~$10k rather than $100k+...
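To put rough numbers on the hardware argument, here is a back-of-the-envelope sketch of how much memory the weights alone need at different precisions. The 671B parameter count for the full R1 model is an assumption taken from outside this thread, and the estimate ignores context cache and runtime overhead, so treat it as a lower bound.

```python
def weights_size_gb(n_params: int, bits_per_param: int) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


# Assumption (not from this thread): the full R1 model has about 671B parameters.
FULL_R1_PARAMS = 671_000_000_000
# Nominal size of one of the small distilled variants people run locally.
DISTILL_7B_PARAMS = 7_000_000_000

for label, n_params in (("full R1", FULL_R1_PARAMS), ("7B distill", DISTILL_7B_PARAMS)):
    for bits in (16, 8, 4):
        size = weights_size_gb(n_params, bits)
        print(f"{label:>10} @ {bits:>2}-bit: {size:8.1f} GB")
```

Under that assumption, even a 4-bit copy of the full model needs several hundred GB of memory, which is why the choice is between a stack of GPUs and a big-RAM server, while the small distilled models fit on ordinary hardware.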
Pretty rich coming from Proton, who shoved an LLM into their mail client mere months ago.
wait, what? How did I miss that? I use protonmail, and I didn't see anything about an LLM in the mail client. Nor have I noticed it when I check my mail. Where/how do I find and disable that shit?
Thank you. I've saved the link and will be disabling it next time I log in. Can't fucking escape this AI/LLM bullshit anywhere.
The combination of AI, the crypto wallet and the CEO's pro-MAGA comments (all within six months or so!) is why I quit Proton. They've completely lost the plot. I just want a reliable email service and file storage.
Once all that crap came out, I felt incredibly justified by never having switched to Proton.
It was entirely out of laziness, but still
I'm considering leaving proton too. The two things I really care about are simplelogin and the VPN with port forwarding. As far as I understand it, proton is about the last VPN option you can trust with port forwarding
As far as I understand it, proton is about the last VPN option you can trust with port forwarding
Could you explain this part please? What makes them untrustworthy?
I'm not 100% sure if you mean what do I think makes proton untrustworthy, or what do I think makes other vpns untrustworthy?
If you're referring to proton, some of the statements Andy Yen has made recently are painting proton as less neutral than they claim to be.
I'm also generally aware that a LOT of vpn outfits are just a different company mining your traffic and data, and that there are few "no log" vpns that you can trust.
Despite their recent statements that sour my taste in giving proton money (and the ai bullshit that every goddam company is shoving down our throats), I trust proton when they say no logs. They're regularly audited for it.
I don't trust all these other VPN companies that claim to be no log and have nothing to back them up. Especially when several of them have been caught logging and mining/selling the data they claim to not be logging.
Apologies, I misread your comment and thought you said protonvpn was untrustworthy. I'm not a VPN user so I'm not up to date with the rep of any of them, but I am a proton mail user so I was worried about the technical integrity of one of their products.
It’s simple: bad.
🤣
Well you just made me choke on my laughter. Well done, well done.
To be fair, it's correct, but it's poor writing to skip the self-hosted component. These articles target the company, not the model.
There are many llms you can use offline
Including DeepSeek: https://huggingface.co/deepseek-ai
Deepseek works reasonably well, even CPU-only in Ollama. I ran the 7b and 1.5b models and it wasn't awful. 7b slowed down as the convo went on, but the 1.5b model felt pretty passable while I was playing with it.
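If you want to script against a local setup like that, here is a minimal sketch of sending a prompt to Ollama's local HTTP API from Python. It assumes Ollama is running on its default port and that the small model was pulled under the tag `deepseek-r1:1.5b`; the tag and the helper names `build_request` and `ask` are assumptions, so adjust them to whatever you actually have installed.

```python
import json
import urllib.request

# Ollama's default local endpoint; nothing leaves your machine.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "deepseek-r1:1.5b") -> urllib.request.Request:
    """Build a non-streaming generate request for a locally pulled model."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, model: str = "deepseek-r1:1.5b") -> str:
    """Send the prompt to the local Ollama server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=300) as resp:
        return json.load(resp)["response"]
```

Something like `print(ask("Why is the sky blue?"))` should then work with the server running. On CPU only, expect it to be slow, more so with the 7b model.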
Proton have been too noisy from the very start.