The venture capitalist and new Trump administration member David Sacks, meanwhile, said that there is “substantial evidence” that DeepSeek “distilled the knowledge out of OpenAI’s models.”
“There’s a technique in AI called distillation, which you’re going to hear a lot about, and it’s when one model learns from another model, effectively what happens is that the student model asks the parent model a lot of questions, just like a human would learn, but AIs can do this asking millions of questions, and they can essentially mimic the reasoning process they learn from the parent model and they can kind of suck the knowledge of the parent model,” Sacks told Fox News. “There’s substantial evidence that what DeepSeek did here is they distilled the knowledge out of OpenAI’s models and I don’t think OpenAI is very happy about this.”
This sounds like horse shit to me but I don’t know the technical details well enough to say with confidence.
also “suck the knowledge out of the parent model” what the actual fuck?
Footage of Deepseek slurping the knowledge out of the GPT4
I don’t know enough to say whether this is valid or just crybaby tech bros having a fit on fox news but like… God I hope deepseek is completely stolen like this, and I hope there’s absolutely nothing closedai can do about the fact that there’s a better thief out there on the market. Fuck them so hard and fuck their hypocrisy about stealing data. Maybe we can finally move away from trying to use a double digit percentage of national electric grid capacity to power a fucking glorified magic 8ball
This is much more a TechTakes story than a NotAwfulTech one; let’s keep the discussion over on the other thread:
Noted going forward. Sorry about that! ❤
They don’t have any evidence. They say someone did “hammer their API”, and then they terminated their license (last year), but they don’t know who. China bashing is not going to depend on actual evidence.
All that matters, in the end, is “customer prices” instead of our devoted love for Sam Altman.
Knowledge distillation is training a smaller model to mimic the outputs of a larger model. You don’t need to use the same training set that was used to train the larger model (the whole internet or whatever they used for chatgpt); you can use a transfer set instead.
Here’s a reference: Hinton, Geoffrey. “Distilling the Knowledge in a Neural Network.” arXiv preprint arXiv:1503.02531 (2015). https://arxiv.org/pdf/1503.02531
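For anyone curious what distillation actually looks like mechanically, here’s a minimal sketch of the soft-target loss from that Hinton paper, assuming PyTorch; the temperature, the 0.5 weighting, and the toy tensors are purely illustrative, not anything DeepSeek or OpenAI have published:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soften both output distributions with temperature T, then match them via KL divergence.
    soft_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_soft_student = F.log_softmax(student_logits / T, dim=-1)
    kd = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * (T * T)
    # Ordinary cross-entropy on the hard labels of the transfer set.
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce

# Toy example: a batch of 4 transfer-set items over a 10-class problem.
teacher_logits = torch.randn(4, 10)   # stand-in for the big model's outputs
student_logits = torch.randn(4, 10)   # stand-in for the small model's outputs
labels = torch.randint(0, 10, (4,))
print(distillation_loss(student_logits, teacher_logits, labels))
```

The point being: all the student needs is the teacher’s outputs on some transfer set, not the teacher’s weights or original training data, which is why “hammering the API” is the alleged route.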