

Reportedly, this training of bots is a thing of the past. Google used to do this, making people type in street numbers from Street View or blurred words from book scans. But from what I read, this isn’t really necessary any more. AI and computer vision got better, and what we see these days is just wasted effort: it doesn’t contribute to anything except telling whether you’re able to solve the challenge and how you move your mouse while doing it. I wonder why they still do all the zebra crossings and motorcycles and fire hydrants, though. It looks like a synthetic dataset to me, because the pictures repeat on a regular basis and they’re not that hard… I’d certainly expect fewer repeating pictures, more occlusion, and more weird ones if this was training for something.





There’s another community for this: [email protected]
We mostly discuss news and specific questions there, though; beginner questions are a bit rarer.
I think you already got a lot of good answers here: LMStudio, OpenWebUI, LocalAI…
I’d like to add KoboldCpp. It’s kind of made for gaming/dialogue, but it can do everything. And from my experience it’s very easy to set up and bundles everything into one program.