Twitter’s biased cropping algorithm
A wonderful story is unfolding on Twitter right now. You can post pictures there, and a preview of them appears in the timeline. So Twitter built a special algorithm that selects which part of the image to show in the preview. If the image proportions differ from the preview proportions, the algorithm does not simply take the upper-left pixels of the image; it analyzes the image and shows its “most important” part. For example, text or a person’s face, if there is one in the picture.
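To make the idea concrete, here is a minimal sketch of what “crop around the most important part” could look like. It is not Twitter’s actual code: I am assuming some model has already produced a saliency map (a grid of importance scores, one per pixel), and the function name and the tie-breaking rule are mine, purely for illustration.

```python
def crop_around_salient_point(saliency, crop_w, crop_h):
    """Return (left, top, right, bottom) of a crop_w x crop_h window
    centered on the highest-scoring cell, clamped to the image bounds.

    `saliency` is a hypothetical 2D list of scores (rows of equal length);
    how those scores are produced is outside the scope of this sketch.
    """
    height = len(saliency)
    width = len(saliency[0])
    if crop_w > width or crop_h > height:
        raise ValueError("crop window is larger than the image")

    # Locate the most salient cell (the first one found wins on ties).
    best_x, best_y = 0, 0
    for y, row in enumerate(saliency):
        for x, score in enumerate(row):
            if score > saliency[best_y][best_x]:
                best_x, best_y = x, y

    # Center the window on that cell, then shift it back inside the image.
    left = min(max(best_x - crop_w // 2, 0), width - crop_w)
    top = min(max(best_y - crop_h // 2, 0), height - crop_h)
    return left, top, left + crop_w, top + crop_h
```

Note that even in this toy version something has to break ties: if two faces score exactly the same, the scan order decides who ends up in the preview.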
And that’s where the fun starts. After many experiments, Twitter users have found that, regardless of where people are positioned in the picture, the algorithm prefers to show white people over black people (even cartoon ones) and men over women: the entire set of human biases.
All of Twitter is now buzzing about racism in algorithms: the company should “fix the algorithm.” Of course, it would be great if algorithms did not adopt human prejudices, but this is a complex problem with no simple solution and, most importantly, one that lies outside the realm of objective fact. Even if we assume that Twitter manages to make the algorithm absolutely neutral (which is no easy task in itself), it still has to choose which of two people to show, and there will always be someone who is dissatisfied with the choice or finds it biased.
What amazes me most in this situation is that the whole problem could have been avoided long before it occurred by simply... not cropping the photos in the first place! For users, this feature does not add much value to the product, and the only reason it may exist is to inflate the artificial “interaction with media content” metrics within the company. If people cannot see the whole picture in the preview, they will click on it more often, increasing that juicy “engagement”.
So, instead of adding “background-size: contain;” to the CSS and showing a preview of the entire image, the company invented a problem, solved it with the help of Machine Learning™, and got a disproportionately more complex one, with a scandal on top.
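For comparison, the arithmetic behind “contain” fits in a couple of lines: scale the image by the smaller of the two width and height ratios so that all of it fits inside the preview box, with nothing cropped. The sketch below is just that calculation written out in Python; the function name is mine, and a browser does this for you when it applies the CSS rule.

```python
def contain_fit(image_w, image_h, box_w, box_h):
    """Return the (width, height) at which the whole image fits inside
    the box while keeping its aspect ratio, as with a 'contain' fit."""
    # The smaller ratio is the limiting dimension; using it guarantees
    # that neither side of the scaled image exceeds the box.
    scale = min(box_w / image_w, box_h / image_h)
    return round(image_w * scale), round(image_h * scale)
```

A tall 1000×2000 photo shown in a 500×250 preview comes out at 125×250, with empty space on the sides and no decision about whose face to keep.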
Curiously enough, it is no longer so easy for Twitter to move on and simply remove the algorithm altogether: now that may look like an attempt to “run away from the problem” or an “unwillingness to make the algorithm less biased.” A company representative has already stated that they will conduct more research and invest more time and resources in solving a problem that need never have existed.
I would say this is a great example of why one should be careful when adding complexity to a system, especially where it could be avoided.