Sanctioned Chinese AI Firm SenseTime Releases Image Model Built for Speed | WIRED

Overview

SenseTime, a Chinese AI company best known for its facial recognition technology, released a new open source model on Tuesday that it claims can both generate and interpret images far faster than top models developed by US competitors. SenseNova U1 could help the company reclaim lost ground after it slipped from its place among the leading players in China’s AI development race.

Details

The model’s secret sauce is its ability to “read” images without translating them to text first, speeding up the process and reducing the amount of computing power required. “The model’s entire reasoning process is no longer limited to text. It can reason with images as well,” Dahua Lin, cofounder and chief scientist at SenseTime, said in an interview with WIRED.

Lin, who is also a professor of information engineering at the Chinese University of Hong Kong, says that models capable of processing images directly will enable robots to better understand the physical world in the future.

Like DeepSeek's latest flagship model, U1 can be powered by Chinese-made chips, SenseTime says. “Several Chinese domestic chipmakers have finished optimizing compatibility with our new model,” Lin says. On release day, 10 Chinese chip designers, including Cambricon and Biren Technology, announced that their hardware supports U1.

That flexibility matters because US export controls restrict Chinese firms from accessing the world's most advanced AI chips, particularly those used for training, which at this point are primarily developed by Western companies like Nvidia. “We will continue to push for training on more different chips,” Lin says. But he also acknowledges that SenseTime “may still need to use the best chips to ensure the speed of our iteration.”

SenseTime released U1 for free on Hugging Face and GitHub, another sign of how Chinese companies are becoming some of the most active contributors to open source AI.

SenseTime was founded in 2014 and became a world leader in computer vision, which is used in applications like facial recognition and autonomous driving. But when ChatGPT and other AI systems powered by natural language processing became the hottest thing in the tech industry, SenseTime began struggling to turn a profit and fell behind newer Chinese startups like DeepSeek and MiniMax.

SenseTime says it hopes that releasing SenseNova-U1 publicly for anyone to use will help it catch up with both domestic and Western AI players. Lin says the company finally made the decision last year to focus on open source because of the helpful feedback it gets from researchers, which enables the company to iterate faster. “In this day and age, being open source or closed source is not the winning factor; the speed of iteration is,” Lin explains.

Going open source also helps SenseTime continue collaborating with international researchers without the interference of geopolitics. The company has been sanctioned repeatedly by the US government in recent years over allegations that its facial recognition technology helped power surveillance systems used to monitor and detain Uyghurs and other minority groups in China’s Xinjiang region. As a result, US firms are restricted from investing in SenseTime and selling certain technologies to it without a license. (SenseTime has denied the allegations.)

A sample image created using SenseNova U1. Generated using AI

In an accompanying technical report, SenseTime claims that SenseNova-U1 generates higher-quality images than all other open source models currently on the market. Its performance is comparable to leading Chinese closed source models like Alibaba’s Qwen and ByteDance’s Seedream, but it still lags behind industry leaders like GPT-Image-2.0, which came out just a week ago.

But the model’s main selling point is its ability to generate images much faster than all of those models. It relies on an innovative technical structure called NEO-Unify that SenseTime previewed earlier this year.

The model’s new architecture, which could improve efficiency and performance, is what sets U1 apart, says Adina Yakefu, an AI researcher at Hugging Face. “This is a more ambitious approach, as it still faces significant practical challenges,” she says. “It’s good that they decided to open source it, so the community can explore and test it more widely.” The model is also small enough to run on PCs and phones, making it potentially useful in many scenarios.

Lin says the technique SenseTime developed will be especially useful in robotics. When a robot tries to process the visual world, it needs to sort through an enormous amount of information. “It has to think, ‘how should I deal with all the clutter in this room? If there is a complicated machine in front of me, which button should I press?’ All of these are forms of information, and they need to be integrated into the model’s internal judgment,” he says. Because it can understand images natively, Lin is hopeful that SenseTime’s technology will help robots act faster and make fewer mistakes in complex environments.

China is in the midst of a humanoid robot boom. While SenseTime doesn’t currently develop its own robots, Lin says it is working closely with ACE Robotics, a startup led by another SenseTime cofounder. It's also developing models that specialize in geospatial understanding, or creating simulations of the real world.

