Santa Fe Institute

Scholars seek a lingua franca for linguistics research

August 20, 2015

Over time, English has swirled into dialects so different that speakers from the same country cannot always understand each other. Similarly, linguists – as they have catalogued words, spellings, pronunciations, and meanings – have stylized their individual academic databases to suit the needs of their own research.

In an age of computational linguistics, that can be a problem. Computers offer vastly improved capabilities for finding patterns and connections. But while human brains are good at smoothing over minor inconsistencies, computers tend to be very literal.

And data that can’t be understood can’t be part of the conversation. “Because of the large quantities of data that can be brought to bear on a problem, for many studies occasional data quality issues are not fatal,” explains SFI Professor Tanmoy Bhattacharya, who leads SFI’s linguistics program. But, he says, “the next advance in linguistics will need to understand weak signals or complicated histories deep in the data, and in these situations data issues will be very important. We will need to understand how the data being used are selected, curated, and presented.”

Further, language databases will need to adopt coding conventions that allow them to talk to one another. “We need to develop a lingua franca for all linguistics databases to speak,” he says. “Whatever way databases organize their own data, or speak their own internal dialect, we should be able to translate them all into something universally understandable and answer queries using the same code all others use.”

Bhattacharya, SFI Distinguished Fellow Murray Gell-Mann, and longtime SFI collaborator George Starostin are hosting an invitation-only working group this week at SFI to address this challenge. Conventional and computational linguists will evaluate relevant existing online and offline databases, explore optimal data formats, and discuss – perhaps even establish – the most useful programmed analysis tools for historical linguistics research.

“What is going to come of this is the preparation to enable the next big advance in computational linguistics,” Bhattacharya says.











Related project: The origins, evolution, and diversity of human languages

