So it's okay when big American corps raid the internet ignoring any terms of service or licenses they see in order to train models they rent back to us, but when a foreign entity trains off of Anthropic it's illegal?
Pot, meet kettle!
I don’t think I’m the only one feeling some schadenfreude at this news. I suppose it’s OK when you’re a hot Silicon Valley scale-up to slurp up the rest of the world’s data for free and then hire hotshot lawyers to defend you against all the creatives you ripped off, but when it’s the “evil” Chinese doing the same to you it’s a dastardly “attack”?
I don't think this counts as distillation. Distillation is when you use a teacher model to train a student model, but crucially, you have access to the entire probability distribution over the next token at each step, not just to the tokens that were sampled. That probability distribution tremendously increases the strength of the training signal, so training converges much faster. Claude does not provide these probabilities. So Claude was used for synthetic training data generation, but not really for distillation.
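
To make the difference concrete, here's a rough sketch in plain Python. The function names and the toy logits are mine, made up for illustration, and not taken from any library or from how any lab actually trains. The first loss needs the teacher's full distribution; the second only needs the one token the teacher happened to emit, which is all you get from a text-only API.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw logits into a probability distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def soft_target_loss(teacher_logits, student_logits, temperature=1.0):
    """KL(teacher || student) over the whole vocabulary.

    This is the soft-label setup: every token's teacher
    probability contributes to the training signal.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def hard_target_loss(sampled_token, student_logits):
    """Cross-entropy against a single sampled token.

    This is what training on generated text gives you:
    one token index per position, no probabilities.
    """
    q = softmax(student_logits)
    return -math.log(q[sampled_token])

# Toy 4-token vocabulary; all numbers are invented for illustration.
teacher_logits = [2.0, 1.5, 0.1, -1.0]
student_logits = [0.0, 0.0, 0.0, 0.0]  # untrained, uniform student

kl = soft_target_loss(teacher_logits, student_logits)
ce = hard_target_loss(0, student_logits)  # teacher happened to emit token 0
print(f"soft-target KL: {kl:.4f}")
print(f"hard-target CE: {ce:.4f}")
```

With the soft loss, the student learns in one step that token 1 is nearly as good as token 0 and that token 3 is a poor choice. With the hard loss it only learns "token 0 was picked here" and has to see many samples to recover the rest.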