Manhattan Tribune

A new AI tool makes it possible to exploit a database of 10 million biological images

by manhattantribune.com
14 February 2024
in Science


Credit: Pixabay/CC0 Public domain

Researchers have developed the largest dataset of biological images ever adapted for machine learning, along with a new vision-based artificial intelligence tool to learn from them.

The results of the new study significantly expand the scope of what scientists can do when they use artificial intelligence to analyze images of plants, animals and fungi to answer new questions, said Samuel Stevens, lead author of the study and a Ph.D. student in computer science and engineering at Ohio State.

“Our model will be useful for tasks spanning the entire tree of life,” Stevens said. “Researchers will be able to carry out studies that would not have been possible before.”

The results are published on the arXiv preprint server.

Stevens and his colleagues first curated and released the world’s largest and most diverse machine learning-ready image dataset, TreeOfLife-10M, which contains more than 10 million images of plants, animals and fungi covering more than 454,000 taxa in the tree of life. In comparison, the previous largest machine learning-ready database contains only 2.7 million images covering 10,000 taxa. The diversity of this data is one of the key enabling features of their algorithm.

They then developed BioCLIP, a new machine learning model presented to researchers in December and designed to learn from the dataset using both visual cues in the images and various types of text associated with the images, such as taxonomic labels and other information.
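As a rough illustration of what text "associated with the images" could look like, the sketch below flattens a taxonomic record into a single text label, from broadest rank to most specific. The rank list, the helper name `taxonomic_caption` and the example record are assumptions made for illustration, not details taken from the study.

```python
# Hypothetical helper: turn a taxonomic record into one text label.
# The rank order below is the standard Linnaean hierarchy; whether the
# study's captions use exactly this form is an assumption.
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus", "species"]


def taxonomic_caption(record):
    """Join the ranks that are present, from broadest to most specific."""
    return " ".join(record[rank] for rank in RANKS if record.get(rank))


# Illustrative record (the gray wolf), not drawn from the dataset itself.
example = {
    "kingdom": "Animalia",
    "phylum": "Chordata",
    "class": "Mammalia",
    "order": "Carnivora",
    "family": "Canidae",
    "genus": "Canis",
    "species": "lupus",
}

print(taxonomic_caption(example))
```

A label of this kind carries the organism's position in the hierarchy, so two species in the same genus share most of their text.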

The researchers tested BioCLIP by seeing how well it could classify images by their place in the tree of life, including a dataset of rare species that it had not seen during training. The results showed that it performed 17-20% better than existing models on this task.
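One common way for models that pair images with text to classify a species they were never trained on is to compare an image embedding with the text embeddings of candidate labels and pick the closest match. The sketch below shows that idea with made-up three-dimensional vectors. The function names, the labels and every number are hypothetical, and a real model would produce the embeddings with learned encoders rather than hand-written lists.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def classify(image_embedding, label_embeddings):
    """Return the label whose embedding is most similar to the image's."""
    return max(
        label_embeddings,
        key=lambda name: cosine(image_embedding, label_embeddings[name]),
    )


# Toy embeddings, invented for illustration only.
labels = {
    "Canis lupus": [0.9, 0.1, 0.0],
    "Vulpes vulpes": [0.1, 0.9, 0.0],
    "Amanita muscaria": [0.0, 0.1, 0.9],
}
image = [0.8, 0.3, 0.1]

print(classify(image, labels))
```

Because the candidate labels are supplied at query time, new species can be added to the list without retraining anything in this sketch.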

The BioCLIP model is publicly available. Its demo, Stevens said, can also accurately identify the species in an arbitrary image of an organism, whether it comes from the Serengeti savannah, your local zoo or your garden.

Traditional computational approaches used to organize large databases of biological images are typically designed for specific tasks and are less able to handle new questions, contexts and datasets, Stevens said.

Additionally, because the model can be broadly applied across the entire tree of life, their AI better supports biologists whose real-world research is broadly focused, rather than those studying specific niches, he added.

What makes this team’s approach so effective, said Yu Su, co-author of the study and assistant professor of computer science and engineering at Ohio State, is their model’s ability to learn fine-grained representations of images, that is, to tell the difference between similar-looking organisms within the same species and a different species that mimics their appearance.

While general computer vision models are useful for comparing common organisms like dogs and wolves, previous studies have found that they cannot account for subtle differences between two species of the same plant genus.

Because of its better understanding of nuances, Su said, the model presented in this paper is also uniquely qualified to make determinations about rare and unseen species.

“BioCLIP covers many more species and taxa than previously publicly available general vision models,” he said. “Even if it has never seen a certain species before, it can come to a reasonable conclusion that, because this organism looks like that one, it is likely to be that.”

As AI continues to advance, the study concludes, machine learning models like this could soon become important tools for unraveling biological mysteries that would otherwise take much longer to understand. And while this first iteration of BioCLIP relied heavily on images and information from citizen science platforms, Stevens said future models could be improved by including more images and data from science labs and museums. As labs are able to collect richer textual descriptions of species that detail their morphological characteristics and other subtle differences between closely related species, these resources will provide a wealth of important information for the AI model.

Additionally, many scientific laboratories have information on the fossils of extinct species, which the team believes will also expand the model’s usefulness.

“Taxonomies are constantly changing as we update names and new species, so one thing we would like to do in the future is take more advantage of existing work on how to incorporate them,” he said. “In AI, when you throw more data at a problem, you get better results. So I think there’s a bigger version that we can continue to train into a bigger, stronger model.”

Other Ohio State co-authors include Jiaman Wu, Matthew J. Thompson, Elizabeth G. Campolongo, Chan Hee Song, David Edward Carlyn, Tanya Berger-Wolf and Wei-Lun Chao. Li Dong of Microsoft Research, Wasila M. Dahdul of the University of California, Irvine, and Charles Stewart of Rensselaer Polytechnic Institute also contributed.

More information:
Samuel Stevens et al, BioCLIP: A Vision Foundation Model for the Tree of Life, arXiv (2023). DOI: 10.48550/arxiv.2311.18803

Journal information:
arXiv

Provided by Ohio State University




© 2023 Manhattan Tribune -By Millennium Press