LangNet
Exploring language families through the names of numbers from one to ten.
- Date: September 2024
The project began with a note written on a metro ride: what happens if you sort number names alphabetically in different languages? That question led me to compare how languages name the numbers one to ten, measure the distances between those words, and map the relationships that appeared.
Approach
- Processed number names from more than 5,000 languages and compared distance functions.
- Measured word similarity with normalized Damerau-Levenshtein distance and visualized the results with multidimensional scaling (MDS), t-SNE, and graph views.
- Published an interactive Shiny app and long-form technical posts.
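The distance step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code. It assumes the optimal string alignment variant of Damerau-Levenshtein, normalization by the length of the longer word, and a language-to-language distance taken as the mean over aligned number names. The function names and the three-word sample lists are hypothetical and chosen only for the example.

```python
def damerau_levenshtein(a: str, b: str) -> int:
    """Edit distance with insert, delete, substitute, and adjacent transposition.

    This is the optimal string alignment variant (an assumption; the
    project may use a different variant).
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "ab" -> "ba"
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def normalized_distance(a: str, b: str) -> float:
    """Scale the edit distance to [0, 1] by the longer word's length (assumed normalization)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return damerau_levenshtein(a, b) / longest


def language_distance(words_a: list[str], words_b: list[str]) -> float:
    """Mean normalized distance over aligned number names (hypothetical aggregation)."""
    pairs = list(zip(words_a, words_b))
    return sum(normalized_distance(x, y) for x, y in pairs) / len(pairs)


# Toy sample: the names for one, two, three only, to keep the example short.
spanish = ["uno", "dos", "tres"]
italian = ["uno", "due", "tre"]
german = ["eins", "zwei", "drei"]

print(language_distance(spanish, italian))
print(language_distance(spanish, german))
```

On this toy sample Spanish comes out closer to Italian than to German. Computing the same value for every pair of languages gives a distance matrix, which is the kind of input that MDS or t-SNE needs to place languages on a map.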
Impact
- Turned a metro-note question into two long-form analyses and an interactive map of more than 3,800 languages.
- Made it possible to explore language relationships using only ten familiar words.