• Moonrise2473@feddit.it
    14 hours ago

    I feel it’s just a side effect of them trying to block AI companies from stealing large amounts of videos for training models. They see too many downloads from a datacenter IP address and require a user login to continue.

    OpenAI’s Whisper often transcribes mangled words as “please like and subscribe”, which suggests it was trained on videos and their subs. So they’re actively stealing videos and their subs to improve their models and make a profit. I mean the manually created subs by companies like “caption+ by js”, which creators paid hundreds of dollars to make, not the free ones made by Google’s automatic transcriber or Whisper itself.

      • Moonrise2473@feddit.it
        2 hours ago

        Stealing, without the quotation marks. If you copy something and profit off it without crediting, compensating, or asking permission from whoever paid for it, it’s stealing. We can’t downplay it as “but they just downloaded 700k hours of videos and 200k pirated books to train a simple model that they charge users $20 a month for, what’s the issue?”

        If you copy something for personal enjoyment without profiting from it, then it’s not stealing.