Revealing Hidden Histories: Data Science Unearths California’s African Legacy

History lecturer Cameron Jones didn’t expect to uncover a lost chapter of California’s past while studying missionaries in the Amazon. But as he pored over mission records, a striking pattern emerged — many of the people mentioned were of African descent.
Intrigued, he turned his focus back to California, where further research led him to the Santa Barbara Mission archives. That discovery launched a groundbreaking effort to trace the lives of African-descended Californians between 1768 and 1850, a narrative that had largely faded from view. Today, an interdisciplinary team at Cal Poly is using advanced data science to illuminate their legacy.
“We’re reconstructing a past that was nearly erased,” Jones said. “It’s about more than just uncovering names — it’s about understanding how these communities formed, thrived and contributed to California’s history.”
Their AfricanCalifornios.org project recently received a $150,000 National Endowment for the Humanities grant to expand its reach. Led by Jones and computer science Professor Foaad Khosmood, the research team is developing digital tools to analyze archival records and create interactive visualizations, making their findings more accessible.
By merging historical research with data science, they are redefining how California’s diverse heritage is explored and understood.
• • •

Khosmood had long been interested in California’s early history. When he learned about Jones’ research, he saw how data science could help reveal relationships within the underrepresented population of African Californios.
“Very few people have studied the African Californios, and no one has approached it with computational tools,” Khosmood said. “I knew we could use visualizations and digital tools to bring this story to a wider audience.”
In 1790, nearly one in five non-native Californians were of African descent, with large communities in Los Angeles and San José.
As Spain sought to strengthen its hold on California, it expanded its military by recruiting beyond native Spaniards, enlisting people of African descent and individuals of mixed heritage. In 1814, records from a mission in San Luis Obispo noted that five of the six soldiers stationed there were of African descent.
African-descended individuals played an essential role in California’s development, and the final Spanish census, conducted in 1821 just before the transition to Mexican control, confirmed their continued presence in the region.
One prominent figure during this time was Pio Pico, the last governor of Mexican California, whose African ancestry has made him a key part of the project’s research.
“Not many realize there were that many Black people in early California,” Jones said.
Jones and Khosmood developed a system to match individuals of African descent across historical documents, which allowed them to construct detailed family trees. The process was complicated by incomplete records and discrepancies, making it difficult to connect the dots.
They relied on the Early California Population Project — a digital database of baptism, marriage and burial records from California’s missions — but these documents lacked one crucial detail: race. To fill this gap, they turned to census data, which included some racial information.

Over several months, Jones and his students scanned the census records into digital spreadsheets, encountering additional challenges with different spellings, accented letters and name variations.
To address this, the team modified an algorithm used to compare text strings. Spanish names, with variations like “S” and “Z,” required further customization, so they developed a list of letter substitutions specific to colonial Spanish to improve accuracy.
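The idea of a string comparison tuned with colonial Spanish letter substitutions can be illustrated with a short sketch. The article does not name the algorithm the team modified, so this example assumes a Levenshtein-style edit distance. The substitution pairs, the 0.2 cost and the function names are hypothetical choices made for illustration, not the project's actual list or code.

```python
import unicodedata

# Hypothetical letter pairs treated as near-equivalent spellings.
# The project's real substitution list is not given in the article.
CHEAP_SUBSTITUTIONS = {
    frozenset("sz"),
    frozenset("bv"),
    frozenset("iy"),
    frozenset("cs"),
    frozenset("gj"),
    frozenset("jx"),
}
CHEAP_COST = 0.2  # assumed cost for swapping letters within a pair


def normalize(name):
    """Lowercase a name and strip accent marks (assumption: accents are ignored)."""
    decomposed = unicodedata.normalize("NFD", name.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def substitution_cost(a, b):
    """Return 0 for identical letters, a small cost for listed pairs, else 1."""
    if a == b:
        return 0.0
    if frozenset((a, b)) in CHEAP_SUBSTITUTIONS:
        return CHEAP_COST
    return 1.0


def name_distance(left, right):
    """Weighted edit distance between two names after normalization."""
    s, t = normalize(left), normalize(right)
    previous = [float(j) for j in range(len(t) + 1)]
    for i, a in enumerate(s, start=1):
        current = [float(i)]
        for j, b in enumerate(t, start=1):
            current.append(min(
                previous[j] + 1.0,                         # delete a
                current[j - 1] + 1.0,                      # insert b
                previous[j - 1] + substitution_cost(a, b)  # substitute
            ))
        previous = current
    return previous[-1]
```

Under these assumptions, name_distance("González", "Gonzales") comes out far smaller than the distance between two unrelated surnames, so a matching threshold can tolerate common spelling variants. Note that stripping accents also collapses "ñ" to "n", which a real system might prefer to handle separately.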
“Data allows us to piece together details that might otherwise remain fragmented, giving us a more complete and nuanced understanding of the past,” Khosmood said.
• • •
A driving force behind the project’s technical advances is Anthony Colin Herrera, a computer science master’s student from Bakersfield, California, who brought both his expertise and a deep connection to his heritage.
“When I learned about the African Californios, I was struck by how little-known their story is,” Colin Herrera said. “My background made this project especially meaningful, and working with real-world data felt like the perfect way to honor that history.”
Fluent in Spanish, Colin Herrera identified “family units” based on shared last names, parent-child relationships and spousal connections. He traced generations and built family trees linking parents and children across multiple datasets.
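One way to picture how linked records become “family units” is to treat each person as a node and each parent-child or spousal link as an edge, then collect the connected groups. The sketch below does this with a simple union-find structure. The record layout (id, father_id, mother_id, spouse_id) and the function name are hypothetical, and surname comparison is left out for brevity, so this is an illustration of the general idea and not the team's actual method.

```python
from collections import defaultdict


def family_units(records):
    """Group person records into units connected by parent or spouse links.

    Each record is a dict with an "id" key and optional "father_id",
    "mother_id" and "spouse_id" keys (a hypothetical layout).
    """
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Follow links to the group representative, shortening the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        # Ignore links that point to people missing from the dataset.
        if a in parent and b in parent:
            parent[find(a)] = find(b)

    for r in records:
        for key in ("father_id", "mother_id", "spouse_id"):
            other = r.get(key)
            if other is not None:
                union(r["id"], other)

    groups = defaultdict(set)
    for person_id in parent:
        groups[find(person_id)].add(person_id)
    # Return units as sorted id lists, ordered by their smallest id.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


records = [
    {"id": 1, "spouse_id": 2},
    {"id": 2},
    {"id": 3, "father_id": 1, "mother_id": 2},
    {"id": 4},
    {"id": 5, "father_id": 99},  # father not present in the records
]
units = family_units(records)
```

Here the two spouses and their child end up in one unit, while the unlinked person and the person whose father is missing each stay on their own. Chaining units across generations in this way is what would let a family tree span multiple datasets.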
The team’s research is publicly accessible through AfricanCalifornios.org, which will soon feature lesson plans for educators at every level, from elementary school through college.
A major milestone came this summer when a group from Cal Poly, including Colin Herrera, presented the project at DH 2024, the annual conference of the Alliance of Digital Humanities Organizations. Their presentation highlighted progress in using data science to reconstruct family histories, sparking interest from scholars eager to explore the project’s potential.

“I was nervous to present in front of experts in the field of digital humanities,” Colin Herrera admitted. “But knowing I was representing Cal Poly, the Computer Science Department and our research team, I had to give it my best.”
For Colin Herrera, stepping onto that stage was just one of many firsts. As the first in his extended family to attend college — and soon to earn his master’s degree — he’s paving the way for others. “I wanted to be the first to go to college, and now, I’m getting my master’s,” he said, noting that some of his family members didn’t finish high school.
This spring, Colin Herrera will defend his thesis after spending the year refining family trees based on project data. The team’s next step is to use natural language processing tools to analyze a scanned book of colonial-era land grants and extract details like people, places and plot sizes.
As the project expands, Jones reflected on the importance of reclaiming these narratives. “We know a lot about the wealthy, powerful white settlers but much less about the people of color who played vital roles in shaping our state’s history,” he said. “California’s past is rich with diversity, far beyond what many realize.”
Call to Action: Engage with History!
We invite educators, students and history enthusiasts to explore the African Californios website as a valuable resource for learning and teaching. Integrating these often-overlooked stories into California’s history fosters a deeper understanding of our cultural heritage. By sharing this resource, you help highlight the contributions of African descendants while inspiring conversations about the diverse chapters that shape our past.