SenseTime, the Chinese AI giant best known for facial recognition, has unveiled SenseNova U1, an image model that it says will revolutionise how machines perceive and process visual information. Unlike traditional models that convert images into text before processing, SenseNova U1 handles images directly, which the company says significantly speeds up computation and reduces energy consumption.
Dahua Lin, co-founder and chief scientist of SenseTime, claims the new model's native image processing capabilities will enable robots to better understand complex environments. That capability matters as China experiences a humanoid robot boom, and SenseTime hopes the advance will help it catch up with both domestic and Western competitors.
By releasing U1 as open source on platforms such as Hugging Face and GitHub, SenseTime aims to foster collaboration with international researchers despite geopolitical tensions. Lin asserts that the speed of iteration, rather than ownership of proprietary models, is now the key to AI development. The company has also secured support from 10 Chinese chip designers, signalling its commitment to locally produced technology.
Despite these advances, SenseTime’s progress is still hampered by US sanctions and export controls on advanced chips, a market currently dominated by Western firms such as Nvidia. Lin nonetheless remains optimistic about SenseNova U1's potential in robotics and other applications, believing it could help robots act faster and make fewer mistakes.







