Overview
In 2030, the world's population is projected to be 8.6 billion, almost 80% of whom will live in Africa and Asia. Latin America's population will continue to grow rapidly, while population growth in Europe and Northern America (today's largest sources of contributors to and readers of Wikimedia projects) will plateau. How can we help Wikimedia projects thrive in a world that is becoming increasingly different from the one we are building for today, in terms of both production and consumption of content?
The Wikimedia movement has identified supporting "the knowledge and communities that have been left out by structures of power and privilege" as a strategic goal. To meet this goal, we need to understand how to serve audiences, groups, and cultures that are underrepresented today in Wikipedia, Wikidata, Commons, and other Wikimedia projects in terms of participation, access, representation, and coverage.
We have begun to advance knowledge equity with a research program to address knowledge gaps. This program aims to deliver citable, peer-reviewed knowledge and new technology in order to generate baseline data on the diversity of the Wikimedia contributor population, understand reader needs across languages, remove barriers to contribution by underrepresented groups, and help contributors identify and expand missing content across languages and topics.
More information can be found in our roadmap.
Resources and links
Research pages
- Characterizing Wikipedia Reader Behavior
- Expanding Wikipedia articles across languages
- Voice and exit in a voluntary work environment
Slides
Videos
- Using Wikipedia categories for research: opportunities, challenges, and solutions
- Visual Enrichment of Collaborative Knowledge Bases
- Recommendation systems and Knowledge Gaps in Wikipedia
Publications
- Tomás Feith, Akhil Arora, Martin Gerlach, Debjit Paul, and Robert West. 2024. Entity Insertion in Multilingual Linked Corpora: The Case of Wikipedia. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP '24).
- Dale Zhou, Shubhankar P. Patankar, David M. Lydon-Staley, Perry Zurn, Martin Gerlach, and Dani S. Bassett. 2024. Architectural styles of curiosity in global Wikipedia mobile app readership. Science Advances, 10, eadn3268. https://doi.org/10.1126/sciadv.adn3268
- Mykola Trokhymovych, Indira Sen, and Martin Gerlach. 2024. An Open Multilingual System for Scoring Readability of Wikipedia. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL '24).
- Paramita Das, Isaac Johnson, Diego Saez-Trumper, and Pablo Aragón. 2024. Language-Agnostic Modeling of Wikipedia Articles for Content Quality Assessment across Languages. Proceedings of the Eighteenth International AAAI Conference on Web and Social Media (ICWSM '24).
- Mo Houtti, Isaac Johnson, Morten Warncke-Wang, and Loren Terveen. 2024. Leveraging Recommender Systems to Reduce Content Gaps on Peer Production Platforms. Proceedings of the Eighteenth International AAAI Conference on Web and Social Media (ICWSM '24).
- Akhil Arora, Robert West, and Martin Gerlach. 2024. Orphan Articles: The Dark Matter of Wikipedia. Proceedings of the Eighteenth International AAAI Conference on Web and Social Media (ICWSM '24).
- Tiziano Piccardi, Martin Gerlach, and Robert West. 2024. Curious Rhythms: Temporal Regularities of Wikipedia Consumption. Proceedings of the Eighteenth International AAAI Conference on Web and Social Media (ICWSM '24).
- Morten Warncke-Wang, Rita Ho, Marshall Miller, and Isaac Johnson. 2023. Increasing Participation in Peer Production Communities with the Newcomer Homepage. Proc. ACM Hum.-Comput. Interact. (CSCW '23). https://doi.org/10.1145/3610071
- Tiziano Piccardi, Martin Gerlach, Akhil Arora, and Robert West. 2023. A Large-Scale Characterization of How Readers Browse Wikipedia. ACM Transactions on the Web. https://doi.org/10.1145/3580318
- Narges Azizifard, Lodewijk Gelauff, Jean-Olivier Gransard-Desmond, Miriam Redi, and Rossano Schifanella. 2022. Wiki Loves Monuments: Crowdsourcing the Collective Image of the Worldwide Built Heritage. J. Comput. Cult. Herit. 16, 1, Article 20 (March 2023). https://doi.org/10.1145/3569092
- Mo Houtti, Isaac Johnson, Joel Cepeda, Soumya Khandelwal, Aviral Bhatnagar, and Loren Terveen. 2022. "We Need a Woman in Music": Exploring Wikipedia's Values on Article Priority. 25th ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW '22). https://doi.org/10.1145/3555156
- Tiziano Piccardi, Martin Gerlach, and Robert West. 2022. Going Down the Rabbit Hole: Characterizing the Long Tail of Wikipedia Reading Sessions. WikiWorkshop 2022: In Companion Proceedings of The Web Conference 2022 (WWW '22).
- Akhil Arora, Martin Gerlach, Tiziano Piccardi, Alberto García-Durán, and Robert West. 2022. Wikipedia Reader Navigation: When Synthetic Data Is Enough. Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining (WSDM '22). https://doi.org/10.1145/3488560.3498496
- Pablo Beytía, Pushkal Agarwal, Miriam Redi, and Vivek K. Singh. 2022. Visual Gender Biases in Wikipedia: A Systematic Evaluation across the Ten Most Spoken Languages. Proceedings of the Sixteenth International AAAI Conference on Web and Social Media (ICWSM '22).
- Daniele Rama, Tiziano Piccardi, Miriam Redi, and Rossano Schifanella. 2022. A Large Scale Study of Reader Interactions with Images on Wikipedia. EPJ Data Science. https://doi.org/10.1140/epjds/s13688-021-00312-8
- Martin Gerlach, Marshall Miller, Rita Ho, Kosta Harlan, and Djellel Difallah. 2021. A Multilingual Entity Linking System for Wikipedia with a Machine-in-the-Loop Approach. 30th ACM International Conference on Information and Knowledge Management (CIKM '21).
- Isaac Johnson, Florian Lemmerich, Diego Sáez-Trumper, Robert West, Markus Strohmaier, and Leila Zia. 2021. Global gender differences in Wikipedia readership. Proceedings of the Fifteenth International AAAI Conference on Web and Social Media (ICWSM '21).
- Miriam Redi, Martin Gerlach, Isaac Johnson, Jonathan Morgan, and Leila Zia. 2021. A Taxonomy of Knowledge Gaps for Wikimedia Projects (Second Draft).
- Oleksii Moskalenko, Denis Parra, and Diego Saez-Trumper. 2020. Scalable Recommendation of Wikipedia Articles to Editors Using Representation Learning. ComplexRec 2020, Workshop on Recommendation in Complex Scenarios at the ACM RecSys Conference on Recommender Systems (RecSys 2020).
- Miriam Redi, Martin Gerlach, Isaac Johnson, Jonathan Morgan, and Leila Zia. 2020. A Taxonomy of Knowledge Gaps for Wikimedia Projects (First Draft).
- Valerio Lorini, Javier Rando, Diego Saez-Trumper, and Carlos Castillo. 2020. Uneven Coverage of Natural Disasters in Wikipedia: The Case of Floods. 17th International Conference on Information Systems for Crisis Response and Management (ISCRAM 2020).
- Kateryna Liubonko and Diego Sáez-Trumper. 2020. Matching Ukrainian Wikipedia Red Links with English Wikipedia's Articles. WikiWorkshop 2020: In Companion Proceedings of the Web Conference 2020 (WWW '20). https://doi.org/10.1145/3366424.3383571
- Ramtin Yazdanian, Leila Zia, Jonathan Morgan, Bahodir Mansurov, and Robert West. 2019. Eliciting New Wikipedia Users' Interests via Automatically Mined Questionnaires: For a Warm Welcome, Not a Cold Start. Proceedings of the Thirteenth International AAAI Conference on Web and Social Media (ICWSM '19).
- Florian Lemmerich, Diego Sáez-Trumper, Robert West, and Leila Zia. 2019. Why the World Reads Wikipedia: Beyond English Speakers. Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining (WSDM '19). https://doi.org/10.1145/3289600.3291021
- Tiziano Piccardi, Michele Catasta, Leila Zia, and Robert West. 2018. Structuring Wikipedia Articles with Section Recommendations. Proceedings of the 41st International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '18).
- Philipp Singer, Florian Lemmerich, Robert West, Leila Zia, Ellery Wulczyn, Markus Strohmaier, and Jure Leskovec. 2017. Why We Read Wikipedia. Proceedings of the 26th International Conference on World Wide Web (WWW '17). https://doi.org/10.1145/3038912.3052716
- Ashwin Paranjape, Robert West, Leila Zia, and Jure Leskovec. 2016. Improving Website Hyperlink Structure Using Server Logs. Proceedings of the Ninth ACM International Conference on Web Search and Data Mining (WSDM '16). https://doi.org/10.1145/2835776.2835832
- Ellery Wulczyn, Robert West, Leila Zia, and Jure Leskovec. 2016. Growing Wikipedia Across Languages via Recommendation. Proceedings of the 25th International Conference on World Wide Web (WWW '16). https://doi.org/10.1145/2872427.2883077
