Tech News

Microsoft brings out a small language model that can look at pictures

Illustration of the Microsoft wordmark on a green background
Illustration: The Verge

Microsoft announced a new version of its small language model, Phi-3, which can look at images and tell you what’s in them.

Phi-3-vision is a multimodal model, meaning it can read both text and images, and is best used on mobile devices. Microsoft says Phi-3-vision, now available in preview, is a 4.2 billion parameter model (the parameter count is a rough measure of how complex a model is) that can do general visual reasoning tasks, such as answering questions about charts or images.

But Phi-3-vision is far smaller than other image-focused AI models like OpenAI’s DALL-E or Stability AI’s Stable Diffusion. Unlike those models, Phi-3-vision doesn’t generate images, but it can understand what’s in an image and analyze it for a…
