Research Report N° 12

The twelfth in a series of biannual reports from the Wikimedia Foundation Research team.

August 2025

Foreword

Hi. Welcome to our biannual Research Report. During the six months that this report spans, our team of research scientists, research engineers, survey specialists, analysts, design researchers, and our community officer, alongside our research fellow, formal collaborators, and contractors, led or significantly supported more than 30 projects and initiatives to support the development of new technologies and inform governance decisions in the Wikimedia Movement.

We continued to invest in strengthening the Wikimedia research communities. The team organized Wiki Workshop, the flagship event of the year for the Wikimedia research communities to engage in the research and scientific discourse around the Wikimedia projects; the Monthly Research Showcases, which give the Wikimedia research community a space to showcase their latest work; the Research Fund, through which, in collaboration with the Community Resources team, the team distributed 261,730 USD among 8 accepted research proposals; and the Wikimedia Foundation Research Award of the Year, through which we recognize novel research with high potential for significant impact on the Wikimedia projects. The team also held more than 30 one-on-one public Office Hours to consult with community members and friends of the Wikimedia Movement on a variety of topics and questions. I am thankful to everyone who supported our work to strengthen the Wikimedia research community, especially in this moment where many research institutions and researchers can particularly benefit from the continuity of our services and support.

We made informed decisions about how to approach AI development in support of Wikipedia editors. At the Wikimedia Foundation we strive to take an intentional approach to opportunities and challenges and, for me, our new AI Strategy for Editors is a good example of that intentionality. I am proud that we prioritized not only what we will do, but also why and how we will do it, embracing the trade-offs that we had to make and publicly communicating about our direction. I hope that open knowledge institutions find strength and support through our approach, and I hope that more people and organizations join us in our efforts to nurture the open knowledge ecosystem.

We are further investing in understanding Wikipedia reader needs, building on almost a decade of research on this front (see reports No. 1, No. 4, and No. 9 for some examples). What makes this moment particularly important for revisiting this line of research are the trends in how online information is distributed and regulated. The risk of further intermediation because of AI is present. At the same time, because many high-traffic online platforms and tools are not prioritizing human agency in learning, Wikipedia has a unique opportunity to recognize this moment and experiment to help more readers access neutral point of view encyclopedic content and engage with knowledge directly through the platform, while their privacy is respected. The understanding gained through research, such as the work on discovery needs and motivations, will equip us for further informed research, experimentation, and feature development in support of Wikipedia readers.

We are investing in understanding contributor motivations. The Wikimedia Foundation’s Multigenerational strategy calls on us to invest in fueling volunteer growth, and a key component of that growth is motivating volunteers to contribute to Wikipedia “in perpetuity”. The focus on volunteer motivation is tightly related to incentive design, which spans multiple fields of research including psychology, behavioral economics, computer science, and more. This is a very important and novel area of research: while the state of research on incentive design is relatively mature for for-profit endeavors, it is much less developed for large-scale social good products such as Wikipedia. I look forward to learning from this new line of research and to supporting informed decision making on the product, community governance, and policy fronts through this work.

I now invite you to read the rest of this report to learn more about the Projects, the People on the team, Events, Donors, Collaborations, and Trends we watch.

– Leila Zia, Director and Head of Research at the Wikimedia Foundation

Projects

The large majority of the work of our team is done in the open and reported here. We also provide consultation to other teams at the Wikimedia Foundation (WMF) or external researchers and not all of that work has public outputs. We strive to share what we can and will include these additional projects as they become publicly available in future Research Reports.

Address Knowledge Gaps

We develop models and insights using scientific methods to identify, measure, and bridge Wikimedia’s knowledge gaps.

LLMs for text simplification

We developed a generalizable framework to systematically evaluate models for automatically generating simple summaries, tackling an unsolved problem. Learn more

Metrics to measure Wikipedia’s language gap

We documented our learnings from research to develop metrics for Wikipedia’s language gaps and concluded the project. Learn more

Reader Survey 2024

We concluded the analyses of the large-scale survey of Wikipedia readers across 11 languages and documented our findings. Learn more

Discovery needs of Wikipedia readers

We started research to deepen our understanding of the needs and opportunities for improving discovery for Wikipedia readers. Learn more

Wikipedia reader use cases and user needs

We started research to revisit our understanding of reader motivations and inform what product interventions to prioritize for readers. Learn more

Understanding language switching

We explored different approaches for measuring the prevalence of reader language-switching behavior on Wikipedia. Learn more

Improving topic models

We gathered feedback from focus groups about the current generation of topic classification models and made recommendations for how to improve the quality of the models and the topic taxonomy. Learn more

Exploring Wikipedia App’s reading list data

We explored and demonstrated the utility of user-created reading lists for evaluating models that seek to capture relatedness of Wikipedia articles. Learn more

Deepening our understanding of collaborative translation

We studied the use of translation in collaborative contexts and learned about current gaps and opportunities for event organizers and participants. Learn more

Improve Knowledge Integrity

We develop models and insights using scientific methods to support the technology and policy needs of the Wikimedia projects in the areas of misinformation, disinformation, and content integrity.

We further communicated the learnings from the research with different groups (e.g., see the February Research Showcase) to raise awareness and create spaces for brainstorming about next steps. Learn more

A model for moderation detection

We analyzed the addition and removal of issue templates across 11 major Wikipedia language editions and prototyped methods for measuring patrolling activity across Wikipedia. Learn more

Towards models to better detect unique devices

We continued our support of the Trust and Safety Product team by developing models to improve our ability to separate scraping traffic from human traffic on the Wikimedia projects. Demo

Election analysis

We extended our previous election analyses to explore elections in the EU and India. Learn more

Informing the design of a centralized hub for contributors

We validated that experienced editors saw value in the concept of a centralized contribution hub, and learned from them what would be important to see in such a hub, such as personalization to their most frequent contribution tasks. Learn more

Informing the future of FlaggedRevisions

We studied how editors across different Wikipedias use the FlaggedRevisions extension to understand the potential risks of its ongoing lack of maintenance and to identify takeaways for the Product department from its current use. Learn more

A model for Wikidata vandalism detection

We published and presented a paper in the ACL '25 Industry Track about detecting vandalism on Wikidata. Paper

Guidance and research on NPOV

We supported the work of the newly formed NPOV Working Group through research and by contributing to its research workstream. Learn more

Tone Support

We supported the Machine Learning and Editing teams in adapting a classification model for detecting "peacock" violations on Wikipedia. Learn more

Building the foundations

Some of our work is in support of multiple initiatives or can inform multiple lines of future research, technology developments, and change of governance on the ground. We call this foundational work.

AI Strategy for editors

In collaboration with the Director of Machine Learning at the Wikimedia Foundation, we concluded and published the Wikimedia Foundation AI Strategy for Editors where we share the priority areas where we will use and develop AI in support of the editors’ work. Learn more

Developer Satisfaction Survey

We published the results from the more than 200 respondents to the 2025 Developer Satisfaction Survey. Learn more

Community Safety Survey

We published the longitudinal results of the 2025 Community Safety Survey across six language editions of Wikipedia. Learn more

A whitepaper for Wikipedia research and privacy recommendations

We submitted a panel proposal to AoIR 2025 to further raise awareness about this work within the internet research communities.

Contributor Motivations

We started research on understanding contributor motivations as part of implementing the Wikimedia Foundation’s Multigeneration Strategy, pillar “Fuel volunteer growth”. We published a literature review of the theoretical perspectives on Wikimedia contributor motivations. Learn more

AI Retraining

We started developing a prototype for more robust training variants of ML models to assess the impact of different choices and retraining frequencies on model quality. Task

Updating research pipelines for efficiency

We updated our pipelines to use the new MediaWiki Content History dataset to improve efficiency and reduce maintenance costs. Task

Understanding editor pain points on iOS App

We conducted qualitative research to learn about editor pain points on the Wikipedia iOS App to inform the decisions of the product team. Slides

Strengthening the research communities

Wikimedia projects are created and maintained by a vast network of individual contributors and organizations. We focus part of our efforts on strengthening part of this network: the Wikimedia research community.

Presentations and Keynotes

Through the following presentations and keynotes, we engaged with the research and scientific communities, and supported the Wikimedia communities in advancing their work.

We presented at the Wikimedia Youth Conference 2025 on Participation on Wikimedia Projects, discussing what we know about young readers, editors, and administrators, and future plans for young audience research. Slides

We gave a seminar titled A Tour through Source Reliability in Wikipedia at the Faculty of Engineering of the Free University of Bolzano.

We attended the ICWSM 25 conference to present ongoing work on article maintenance templates as part of a workshop on Computational Approaches to Content Moderation and Platform Governance.

We attended the Wikimedia+Libraries International Convention 2025 and presented a talk titled Wikipedia's role in the Artificial Intelligence ecosystem and the fight against disinformation, highlighting the critical importance of Wikipedia in supporting underrepresented languages on the web and its unique position in promoting reliable, multilingual knowledge in the age of AI. Slides

We attended CHI 2025 where we hosted a Wikimedia Research booth to strengthen the Wikimedia research communities by raising awareness about Wikimedia research in the CHI community.

We attended WWW 2025, which has published much Wikimedia research and is where Wiki Workshop was co-located for several years. This year, the conference was held in Sydney, Australia, which also enabled us to reach a community of researchers with whom we generally have fewer opportunities for co-location.

We gave a talk at Wikiherramientas in support of Wikimedistas de Uruguay, showcasing several research findings from the work of the team to raise awareness among Wikimedia affiliates about the potential for turning research insights into changes on the ground. Video

Office Hour Series

We held over 30 one-on-one office hours to support researchers by answering questions about proposed or ongoing projects, guiding dataset access and analysis, sharing insights on Wikimedia Research team initiatives, exploring potential collaborations, and more. Learn more

Research Showcases

Our Monthly Research Showcases featured diverse presentations on content integrity, editor motivations, gender gaps, reader attention and curiosity, and Wikipedia Admin recruitment, retention, and attrition. Learn more

Wiki Workshop

We held the 12th edition of Wiki Workshop, for the first time over two days. More than 250 participants, including researchers, Wikimedia volunteers, WMF staff, and more, joined this year's edition. We received 62 submissions for the research track and held new programming, including two AMA sessions with Wikipedia administrators.

If you missed the event, worry not! You can review the schedule, check out the accepted extended abstracts, and even watch the recorded sessions.

Research Fund

This year, we made some changes to the Research Fund in order to better support the research community and increase the impact of the funds we give. We revamped the fund to include three proposal types (standard 12‑month research; extended research of up to 24 months with a possible third year; and event and community building), higher funding limits (up to 150K for extended research), a one‑stage review process, and clearer eligibility criteria. We received 61 submissions and funded 8 projects, distributing a total of over 260,000 USD among them.

WikiNLP

We worked on co-organizing the second edition of WikiNLP which will take place as part of ACL 2025. The event will create a space for showcasing Wikimedia's contributions to the NLP community and highlighting approaches to ensure the sustainability of this relationship for years to come.

WMF-RAY

Since 2021 we have recognized novel research that has the potential to have significant impact on the Wikimedia projects or the research space. We worked on award selection and organized the award ceremony. Learn more about the winners and watch the award ceremony if you missed it. Congratulations to the 2025 winners.

The people on the Research team

New members

Debra Kumar joined us in April 2025 as Manager, User Experience Research. Debra is a mixed-methods researcher with a background in computer science and psychology (B.S., Iowa State University) and a Master’s in Human-Computer Interaction (University of Michigan). Prior to joining Wikimedia, Debra worked for 11 years as a User Experience Researcher at Google, where she worked on Gmail and Google Search. In this role, she led researchers and research across countries to understand user needs and inform product direction. On Google Search, this research focused on understanding the role that video plays in information-seeking online and when and why users prefer video or visual content over text-based information sources, as well as understanding young people’s information-seeking behaviors and preferences both on Google Search and off. Within the Wikimedia Research team, she is managing the Design Research team and also contributing to the research community initiatives within the broader Research team.

Collaborations

The Research team's work has been made possible through the contributions of our past and present formal collaborators. During the last six months we did not create new formal collaborations. Instead we focused more heavily on wrapping up existing collaborations, and we initiated internal conversations about which areas of research we want to pitch for new formal collaborations. We expect to gradually start adding formal collaborations in the coming six months.

We also want to take this opportunity to thank Benjamin Mako Hill (University of Washington) for extensively supporting our team's efforts on two major fronts. Mako served as a Research Fund co-chair during the past six months as well as the Award co-chair for the WMF Research Award of the Year (WMF-RAY).

To inquire about becoming a formal collaborator, please tell us more about yourself and your interests.

Events

For other events organized by the Wikimedia community, check the events calendar.

As we start the new fiscal year at the Wikimedia Foundation, we are heavily focusing on responding to the Global Trends in how people receive and contribute to information online, changes in how information is distributed and regulated, and the decline of users with extended rights in the majority of large Wikipedia languages. We invite you to familiarize yourself with these trends and join us in responding to them in support of the Wikimedia Movement and Wikipedia.

Donors

Funds for the Research team are provided by donors who give to the Wikimedia Foundation, and by a grant from the Sloan Foundation. Thank you!

Keep in touch with us

The Wikimedia Foundation's Research team is part of a global network of researchers who advance our understanding of the Wikimedia projects. We invite everyone who wants to stay in touch with this network to join the public wiki-research-l mailing list.

In addition, our team offers one-on-one conversation times. You are welcome to schedule one of our upcoming Public Office Hours.

We look forward to connecting with you.