Nature has been the muse of photographers since the medium began. The changing of seasons, flowers blooming and butterflies migrating are captured in countless camera rolls and photo albums across the world.

These images are key for researchers at the Florida Museum of Natural History, who have used the University of Florida’s HiPerGator supercomputer to transform this global gallery into an expansive database that helps researchers more rapidly and effectively predict and mitigate the effects of climate change.

Ella and Josie Grace in their backyard, demonstrating the iNaturalist app during the Backyard Biodiversity Bonanza, part of the Virtual Earth Day celebrations.

Rob Guralnick, the museum’s curator of biodiversity informatics, and his team focus on phenology, the study of the seasonal timing of natural events. For this project, they have partnered with iNaturalist, an app that helps users identify plants and animals while also contributing to data collection.

“We’ve really never had real-time data available on phenology,” Guralnick said. “But we’re starting to get those data at a density where we can make predictions and forecasts about phenology into the future using these rich data sources.”

Through the app, users have uploaded 100 million plant photographs, including from places where researchers have gaps in knowledge, Guralnick said. By analyzing these images with HiPerGator, the researchers are creating a leading-edge model that extracts and annotates phenology information from millions of images, a task that would take the team countless hours to do manually. They are also building an application to house this trustworthy data and make it interoperable and explorable by anyone.

The image storage alone for this kind of system is complex, which makes HiPerGator integral to the project. The supercomputer automates the downloading and annotation of images and their integration into a knowledge system that can be accessed by researchers, policymakers and the public.
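At a high level, such an automated pipeline can be pictured as a loop that pulls photos from the iNaturalist API, runs a trained model over each image and records the resulting labels. The sketch below is a minimal illustration and not the museum's actual code: the endpoint and query parameters follow the public iNaturalist API, while the taxon ID and the classify_phenology function are placeholders standing in for the real annotator.

```python
# Minimal sketch of a fetch-and-annotate loop (not the museum's pipeline).
# Assumes the public iNaturalist API v1 "observations" endpoint; the taxon ID
# and classify_phenology() are illustrative placeholders.
import requests

API = "https://api.inaturalist.org/v1/observations"

def fetch_plant_photos(taxon_id: int, pages: int = 1):
    """Yield (observation id, photo URL) pairs for research-grade observations."""
    for page in range(1, pages + 1):
        resp = requests.get(API, params={
            "taxon_id": taxon_id,        # assumed taxon ID for the group of interest
            "quality_grade": "research",
            "photos": "true",
            "per_page": 200,
            "page": page,
        }, timeout=30)
        resp.raise_for_status()
        for obs in resp.json()["results"]:
            for photo in obs.get("photos", []):
                yield obs["id"], photo["url"]

def classify_phenology(image_url: str) -> str:
    """Placeholder for a trained model that labels flowering or fruiting stages."""
    return "unlabeled"  # swap in the real phenology annotator here

if __name__ == "__main__":
    annotations = {}
    for obs_id, url in fetch_plant_photos(taxon_id=47126, pages=1):
        # At scale, this step would run on HiPerGator GPUs and feed a shared database.
        annotations[obs_id] = classify_phenology(url)
    print(f"annotated {len(annotations)} observations")
```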

“We have all these amazing photographs from Ukraine, all these amazing photographs from Siberia, areas where I have no idea how phenology works there, and no one’s ever had data, ever for these regions. All of a sudden, they’re available. It’s incredible, right?” he said. “It’s like a kid in a candy store.” 

Analyzing countless museum specimens in moments

Arthur Porto, the museum’s curator of artificial intelligence, said his work at the museum focuses on developing and implementing machine learning and computer vision tools. His lab’s research is focused on creating methods that can query image data to extract and process information quickly. 

These extractions rely on models built with HiPerGator. Porto is working to create a foundational model for biodiversity that will draw on the museum’s digital collections. The museum has more than 30 million images of specimens from natural history museums around the globe as part of Integrated Digitized Biocollections, or iDigBio, a platform similar to iNaturalist but geared toward museum specimen-based research.

Porto has been working on a project called Biocosmos, which would use those images to train a machine-learning model capable of classifying species, connecting image data with text and more.

This work would allow someone to ask a seemingly simple question, such as, “What butterflies have iridescent wings?” and retrieve the answer immediately. Answering that kind of question by hand would normally be nearly impossible: a scientist would have to visit multiple collections and view and record an exorbitant number of specimens.
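A question like that amounts to text-to-image retrieval: embed the query and the specimen photographs in a shared space and rank the images by how well they match the text. The sketch below shows the general idea with an off-the-shelf CLIP model from Hugging Face; it is not the Biocosmos model, and the specimen file paths are placeholders.

```python
# Generic text-to-image retrieval sketch using an off-the-shelf CLIP model
# (not the museum's Biocosmos model). File paths are illustrative placeholders.
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

specimen_paths = ["specimen_001.jpg", "specimen_002.jpg"]  # placeholder images
images = [Image.open(p) for p in specimen_paths]
query = "a butterfly with iridescent wings"

inputs = processor(text=[query], images=images, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image holds the similarity of each specimen photo to the query;
# sorting by score surfaces the most relevant specimens first.
scores = outputs.logits_per_image.squeeze(-1)
ranked = sorted(zip(specimen_paths, scores.tolist()), key=lambda x: -x[1])
for path, score in ranked:
    print(f"{score:.2f}  {path}")
```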

A white-ringed atlas (Epiphora mythimnia) specimen from the McGuire Center for Lepidoptera and Biodiversity.

“Biology has always relied on collecting information from the natural world to answer the questions that we need,” Porto said. “But oftentimes, that kind of collection is the main bottleneck for the kind of research that we do.” 

This work is only possible with a supercomputer like HiPerGator, Porto said. Without it, the project would require clusters of GPUs costing thousands of dollars each, capabilities vastly beyond what a personal computer can offer.

“For anyone who wants to be at the cutting edge, anyone who wants to be able to train these large models and be able to do research at that interface, you need something like a HiPerGator,” he said. “Without it, you’re basically not competitive.”