Technical talks are video presentations created by and for members of the Wikimedia technical community. They cover a wide range of technical concepts and ideas: from how a technology or process works, to how to perform a specific task, to lessons learned in a project. They make it easier to contribute to Wikimedia projects.
Technical Talks took place from 2014 to 2020.
Archive
Writing PHP unit tests for MediaWiki
December 07, 2020 at 20:00 UTC
YouTube video stream: YouTube
Slides: TBA
Speaker: Kosta Harlan
Topic Areas: Technology
Description: This talk covers tooling, tips, and best practices.
Learn more here: phab:phame/post/view/169/changes_and_improvements_to_phpunit_testing_in_mediawiki/
(Modern) Event (Data) Platform
September 23, 2020 at 17:00 UTC
YouTube video stream: YouTube
Slides: Wikimedia Commons
Speaker: Andrew Otto
Topic Areas: Technology
Description: Wikimedia's Event (Data) Platform provides a foundation for building loosely coupled event-driven software systems. This talk goes over why we built Event Platform and gives an overview of its components and how they work.
Notes: During the talk we mentioned Architecture office hours. If you are interested in participating, send an email to architecture@wikimedia.org.
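To make "loosely coupled event-driven" concrete, here is a minimal in-memory sketch of the publish/subscribe idea. It is not the Event Platform itself or any of its components: the `EventBus` class, the `page.edit` stream name, and the event fields are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Event = dict
Handler = Callable[[Event], None]


class EventBus:
    """In-memory stand-in for an event stream broker (hypothetical)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, stream: str, handler: Handler) -> None:
        # Consumers register interest in a stream by name.
        self._subscribers[stream].append(handler)

    def publish(self, stream: str, event: Event) -> None:
        # The producer only names the stream; it does not know the consumers.
        for handler in self._subscribers[stream]:
            handler(event)


bus = EventBus()
search_index: List[str] = []
audit_log: List[str] = []

# Two independent consumers of the same (hypothetical) stream.
bus.subscribe("page.edit", lambda e: search_index.append(e["title"]))
bus.subscribe("page.edit", lambda e: audit_log.append(f'{e["user"]} edited {e["title"]}'))

# One producer emits a single event; both consumers react to it.
bus.publish("page.edit", {"title": "Hyena", "user": "Alice"})
print(search_index, audit_log)
```

The producer and the two consumers share only the stream name and the event shape, so a consumer can be added or removed without touching the producer.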
openZIM/Kiwix ETL toolchain for Wikipedia dumping
August 26, 2020 at 17:00 UTC
YouTube video stream: YouTube
Slides: Wikimedia Commons
Speaker: Emmanuel Engelhart / Kelson
Topic Areas: Technology
Description: Summary of the talk: Enjoying Wikipedia offline wherever, whenever is easy with Kiwix. But behind the scenes, a bunch of tools are needed to make it work. From article selection to dump publishing through scraping, optimisation and packaging: here is a quick overview of how we do it.
Retargeting extensions to work with Parsoid
August 12, 2020 at 17:00 UTC
YouTube video stream: YouTube
Slides: Wikimedia Commons
Speaker: Subramanya Sastry
Topic Areas: Technology
Description: The Parsing team is aiming to replace the core wikitext parser with Parsoid for Wikimedia wikis sometime late next year. Parsoid models and processes wikitext quite differently from the core parser (all that Parsoid guarantees is that the rendering is largely identical, not the specific process of generating the rendering). This means that extensions that extend the behavior of the parser will need to adapt to work with Parsoid in order to provide similar functionality [1]. With that in mind, we have been working to specify more clearly how extensions need to adapt to the Parsoid regime. At a high level, here are the questions we needed to answer:
How do extensions "hook" into Parsoid?
When the registered hook listeners are invoked by Parsoid, how do they process any wikitext they need to process?
How is the extension's output assimilated into the page output?
Broadly, the (highly simplified) answers are as follows:
Extensions now need to think in terms of transformations (convert this to that) instead of events (at this point in the pipeline, call this listener). So, more transformation hooks and fewer parsing-event hooks.
Parsoid provides all registered listeners with a ParsoidExtensionAPI object for interacting with Parsoid, which extensions can use to process wikitext.
The output is treated as a "fully-processed" page/DOM fragment. It is appropriately decorated with additional markup and slotted into place in the page. Extensions need not make any special efforts (aka strip state) to protect it from the parsing pipeline.
In this talk, we go over the draft Parsoid API for extensions [2] and the kind of changes that would need to be made. While in this initial stage we are primarily targeting extensions that are deployed on the Wikimedia wikis, eventually all MediaWiki extensions that use parser hooks or use the "parser API" to process wikitext will need to change. We hope to use this talk to reach out to MediaWiki extension developers and get feedback about the draft API so we can refine it appropriately.
[1] https://phabricator.wikimedia.org/T258838
[2] https://www.mediawiki.org/wiki/Parsoid/Extension_API
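The shift from event hooks to transformation hooks described above can be illustrated with a toy registry. This is a sketch under loud assumptions: it is written in Python rather than PHP, and `ToyParser`, `register_tag`, `render_tag`, and the `shout` tag are hypothetical names that do not correspond to the real Parsoid extension API [2].

```python
import html
from typing import Callable, Dict

# Hypothetical names throughout; this is not the Parsoid extension API.
Transform = Callable[[str], str]


class ToyParser:
    """Toy parser that treats extension output as a finished fragment."""

    def __init__(self) -> None:
        self._tag_handlers: Dict[str, Transform] = {}

    def register_tag(self, name: str, transform: Transform) -> None:
        # An extension registers a transformation: tag body in, fragment out.
        self._tag_handlers[name] = transform

    def render_tag(self, name: str, body: str) -> str:
        fragment = self._tag_handlers[name](body)
        # The fragment is decorated and slotted into the page as-is;
        # it is not passed through any further parsing stages.
        return f'<span data-ext="{name}">{fragment}</span>'


def shout(body: str) -> str:
    # The extension is responsible for producing safe, final markup.
    return f"<b>{html.escape(body.upper())}</b>"


parser = ToyParser()
parser.register_tag("shout", shout)
result = parser.render_tag("shout", "hello & welcome")
print(result)
```

Because the parser never re-processes the returned fragment, the extension has no need for a strip-state style mechanism to shield its output from later pipeline stages.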
Beyond Wikipedia - Knowledge that even a computer can understand
July 22, 2020 at 17:00 UTC
YouTube video stream: YouTube
Slides: Google Slides
Speaker: Zbyszko Papierski
Topic Areas: Technology
Description: Everybody knows what Wikipedia is, right? This magnificent source of knowledge has been helping countless people with their everyday lives for nearly two decades. Whether you want to know how to calculate the circumference of the circle, whether hyenas are pack animals or what really happened to the Ottoman Empire - Wikipedia’s got your back.
Well, unless you happen to be a computer.
One issue with Wikipedia is that knowledge there isn’t very well structured. There are links to other pages, sure - but unless you actually understand the text, you won’t understand what the link actually is. This is, of course, a field day for AI/ML experts - and there are a lot of people already scavenging Wikipedia for any meaningful relations. Fortunately, this is not the only way.
Enter Wikidata - Wikipedia’s younger sister. Wikidata is also a source of knowledge curated and provided by a community of volunteers, but presented in a relational graph format. Structuring the knowledge has huge ramifications - it not only makes it easier to digest by software, but also allows you to infer new knowledge.
There are different ways for developers to interact with Wikidata, but we’ll focus on Wikidata Query Service - a service my team is responsible for. It provides a queryable interface - using an RDF graph language called SPARQL (not to be confused with a hundred other things in IT with “spark” in the name).
Let’s do some discovery!
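As a taste of what querying Wikidata with SPARQL looks like, the sketch below builds a request URL for the public Wikidata Query Service endpoint. The query itself is an illustrative choice not taken from the talk: it asks for five items that are an instance of (P31) house cat (Q146). The helper name `build_query_url` is an assumption made for this example.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Illustrative query: five items that are an instance of (P31) house cat (Q146).
QUERY = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q146 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 5
"""


def build_query_url(sparql: str) -> str:
    """Return a GET URL that asks the endpoint for JSON results."""
    params = {"query": sparql.strip(), "format": "json"}
    return WDQS_ENDPOINT + "?" + urlencode(params)


url = build_query_url(QUERY)
print(url)
```

The resulting URL can be fetched with any HTTP client, for example `urllib.request`. The example stops at building the URL so that it runs without network access.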
API portal and gateway project
June 05, 2020 at 17:00 UTC
YouTube video stream: YouTube
Slides: Wikimedia Commons
Speaker: Evan Prodromou
Topic Areas: API, technology
Description: How does Wikimedia become "the essential infrastructure in the ecosystem of free knowledge"? One way is by making a platform that helps software developers become successful. In this talk, Evan Prodromou, Product Manager for APIs in the Platform Team, discusses the ongoing work to provide a Wikimedia developer platform. With this platform, app creators can include Wikimedia data and content into their software in new and emergent ways. From modernizing our API paradigm, through unified user authorization, documentation, and developer onboarding, the Platform team is working to make a developer experience that rivals those from other major Internet players.
The basics of cryptography using OpenPGP and GnuPG
April 29, 2020 at 17:00 UTC
YouTube video stream: YouTube
Slides: TBA
Speaker: Lars Wirzenius
Topic Areas: Technology, structured data
Description: OpenPGP is the prevalent standard for cryptography for secure software distribution, and GnuPG is its prevalent open source implementation. This talk introduces things at a conceptual level: what cryptography is for, why it is useful, and the basic use of GnuPG by creating cryptographic keys, using digital signatures, and encryption. No previous experience with GnuPG or OpenPGP is needed, but all examples use the Linux command line.
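For orientation, the three activities named above (creating keys, signing, and encrypting) map onto a handful of `gpg` invocations. The sketch below only assembles and prints those command lines and does not run GnuPG. The file names, the recipient address, and the Python helper names are placeholders invented for this example, and the talk's own examples may differ.

```python
import shlex
from typing import List


def generate_key() -> List[str]:
    # Interactive key creation (prompts for algorithm, expiry, and user ID).
    return ["gpg", "--full-generate-key"]


def sign_detached(path: str) -> List[str]:
    # Writes an ASCII-armored detached signature next to the file (path + ".asc").
    return ["gpg", "--armor", "--detach-sign", path]


def verify(signature: str, path: str) -> List[str]:
    # Checks a detached signature against the signed file.
    return ["gpg", "--verify", signature, path]


def encrypt(recipient: str, path: str) -> List[str]:
    # Encrypts the file to the recipient's public key (path + ".gpg").
    return ["gpg", "--encrypt", "--recipient", recipient, path]


def decrypt(path: str) -> List[str]:
    # Decrypts with the matching private key and writes to standard output.
    return ["gpg", "--decrypt", path]


# Placeholder file names and recipient, for illustration only.
commands = [
    generate_key(),
    sign_detached("release.tar.gz"),
    verify("release.tar.gz.asc", "release.tar.gz"),
    encrypt("alice@example.org", "notes.txt"),
    decrypt("notes.txt.gpg"),
]
for argv in commands:
    print(shlex.join(argv))
```

Each printed line can be pasted into a Linux shell where GnuPG is installed.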
Understanding Wikimedia Maps and its challenges
March 25, 2020 at 18:00 UTC
YouTube video stream: YouTube
Slides: Wikimedia Commons
Speaker: Mateus Santos, Software Engineer
Topic Areas: Maps, product, Site reliability
Description: The WMF Product Infrastructure Team has been maintaining the Wikimedia Maps service for the last year and a half with help from SRE. This talk shares the challenges and work of creating a better development environment to enhance productivity, solve technical debt and keep up with platform modernization.
Data and Decision Science at Wikimedia
February 26, 2020 at 18:00 UTC
YouTube video stream: YouTube
Slides: TBA
Speaker: Kate Zimmerman, Head of Product Analytics at Wikimedia
Topic Areas: Technology, data, data visualization
Description:
How do teams at the Foundation use data to inform decisions? Sarah Rodlund talks with Kate Zimmerman, Head of Product Analytics at Wikimedia, about what sorts of data her team uses and how insights from their analysis have shaped product decisions.
Kate Zimmerman holds an MS in Psychology & Behavioral Decision Research from Carnegie Mellon University and has over 15 years of experience in quantitative and experimental methods. Before joining Wikimedia, she built data teams from scratch at ModCloth and SmugMug, evolving their data capabilities from basic reports to strategic analysis, automated dashboards, and advanced modeling.
Structured data on Commons
December 11, 2019 at 18:00 UTC
YouTube video stream: YouTube
Slides: TBA
Speaker: Cormac Parle, Software Engineer
Topic Areas: Technology, structured data
Description:
The talk covers Structured Data on Commons.
November 20, 2019 at 19:00 UTC, 45 Minutes
YouTube video stream: YouTube
Slides: Wikimedia Commons
Speaker: Amir Sarabadani, Software Engineer
Topic Areas: Technology, Wikidata
Description:
Wikidata is a complex and large-scale project. We all know how to use it and how to contribute to it but it's a little bit hard to understand how it actually works, how it scales and what parts are tricky about it. To lots of developers, it's a black box and this is not good. This talk plans to explain internals of Wikidata to other developers and explain future changes to Wikidata on its technical layer.
ResourceLoader tips and tricks
October 23, 2019, 45 Minutes
YouTube video stream: YouTube
Slides: TBA
Speaker: Roan Kattouw, Principal Software Engineer
Topic Areas: Technology
Description:
Did you know that you could require() files in JavaScript? That you could make your own icon modules with 10 lines of code? That there's a new way to export configuration variables to JavaScript?
Learn about new ResourceLoader features introduced this year, and how you can use them to improve your code. We'll start with a quick introduction to ResourceLoader, then dive into some of the advanced features like require(), config var bundling, generated JSON files and icon modules.
How to compare text across multiple languages
September 25, 2019 at 18:00 UTC, 45 Minutes
YouTube video stream: YouTube
Slides: TBA
Speaker: Diego Saez-Trumper, Research Scientist
Topic Areas: Technology, languages
Description:
This talk explains how cross-lingual word embeddings work and how they can be used to measure the semantic distance between words and documents across different languages, as well as showing some use cases in our section and template alignment work.
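The core idea can be sketched in a few lines: once words from different languages are embedded in a shared vector space, semantic closeness can be measured with cosine similarity, and a document can be approximated by the average of its word vectors. The three-dimensional vectors below are made up for illustration and do not come from any trained model. Averaging is one simple way to build a document vector and is not necessarily the method used in the talk.

```python
import math
from typing import Dict, List

Vector = List[float]

# Made-up vectors standing in for aligned cross-lingual embeddings.
EMBEDDINGS: Dict[str, Vector] = {
    "en:cat": [0.9, 0.1, 0.0],
    "es:gato": [0.8, 0.2, 0.1],
    "en:car": [0.1, 0.9, 0.2],
    "es:coche": [0.0, 0.8, 0.3],
}


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def document_vector(words: List[str]) -> Vector:
    """Average the word vectors component by component."""
    vectors = [EMBEDDINGS[w] for w in words]
    return [sum(component) / len(vectors) for component in zip(*vectors)]


same_meaning = cosine_similarity(EMBEDDINGS["en:cat"], EMBEDDINGS["es:gato"])
different_meaning = cosine_similarity(EMBEDDINGS["en:cat"], EMBEDDINGS["es:coche"])

en_doc = document_vector(["en:cat", "en:car"])
es_doc = document_vector(["es:gato", "es:coche"])
doc_similarity = cosine_similarity(en_doc, es_doc)

print(same_meaning > different_meaning, doc_similarity)
```

A semantic distance can then be defined as one minus the cosine similarity, so translations of the same word or section end up close together.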
Documenting Wikimedia technical projects
September 04, 2019 at 18:00 UTC, 45 Minutes
YouTube video stream: YouTube
Slides: TBA
Speaker: Sarah R. Rodlund
Topic Areas: Technology, technical writing, technical documentation, Toolforge, Wikimedia Cloud Services
Description:
This talk discusses what technical writers do and why they are critical members of our technical community. You will learn more about the skills needed to be a technical writer and how to build these skills by participating in Wikimedia and other open source projects.
The talk also covers some ongoing initiatives to improve technical documentation for Wikimedia projects.
A Deployment Pipeline Overview
July 10, 2019 at 16:00 UTC, 45 Minutes
YouTube video stream: YouTube
Slides: Wikimedia Commons
Speaker: Alexandros Kosiaris
Topic Areas: Technology, Deployment, MediaWiki
Description:
The deployment pipeline project has been ongoing for a while, sometimes with more resources poured into it, sometimes fewer, but it's finally in a state that is ready to be used (it's already being used!). This tech talk presents the project to a wider technical audience, discussing the goals of the project, the implementation decisions, and how it's meant to be used and adopted by the deployers of services (and eventually MediaWiki) in the coming months.
Just what is Analytics doing back there?
June 25, 2019 at 18:00 UTC, 45 Minutes
YouTube video stream: YouTube
Slides: Wikimedia Commons
Speaker: Dan Andreescu
Topic Area: Data flow, Analytics Infrastructure
Description:
We take care of twelve systems. Data flows through them to answer the many questions that our community and staff have about our piece of the open knowledge movement. Let's take a look at how these systems fit together to answer questions. Let's also look at an example trick we use to join big data in a distributed world.
Wikimedia and W3C
May 23, 2019 at 15:00 UTC, 45 Minutes
YouTube video stream: YouTube
Slides: Wikimedia Commons
Speaker: Evan Prodromou and Gilles Dubuc
Topic Area: Standards
Description:
The Wikimedia Foundation is now a member of the W3C, as of April. We walk you through how you can join working groups, what to expect of W3C participation, what we hope Wikimedia staff can achieve through W3C and we share our own experiences as W3C members.
April 24, 2019 at 18:00 UTC, 45 Minutes
YouTube video stream: YouTube
Slides: Wikimedia Commons
Speaker: Srishti Sethi, Developer Advocate, Wikimedia Foundation
Topic Area: Developer Advocacy, onboarding new technical contributors
Description:
Wikimedia offers a plethora of opportunities for newcomers to get involved; however, as with many other free software projects, getting involved with the Wikimedia technical community can be a daunting prospect for newcomers. This talk is a gentle introduction to the Wikimedia ecosystem, and gives pointers on how to get involved as a volunteer. We delve into the various ways newcomers can make successful contributions in areas ranging from design to documentation, from programming to testing, and much more.
Ouch, I have an OOUI: using OOUI without pain
March 27, 2019 at 18:00 UTC, 45 Minutes
YouTube video stream: YouTube
Slides: TBA
Speaker: Moriel Schottlender
Topic area: OOUI
Description: OOUI is the interface widget library we use for UI in the Wikimedia projects. The library is meant to allow implementers to create useful interfaces that automatically address the internationalization needs that are unique to the global nature of our projects. Right-to-left support, support for old browsers, accessibility, etc., are things that OOUI does in the background for you. This tech talk presents OOUI’s history and its basic and advanced usage, and demonstrates how to create great interfaces without (much) pain within our wiki ecosystem.
February 27, 2019 at 19:00 UTC, 45 Minutes
YouTube video stream: YouTube
Slides: Wikimedia Commons
Speaker: Subbu Sastry, Principal Software Engineer
Topic area: Parsoid, Wikitext Parsing
Description:
This talk has two parts: The first part provides a bunch of background to make sense of the roadmap presented in part 2. The second part has 3 components: (a) Parsoid history (b) Porting Parsoid to PHP: the whys and wherefores (c) From here to Parsoid as the default.
Parsoid started in 2012 as a project to support Visual Editing and since then has gone on to support a number of products (Flow, Content Translation, Kiwix, and Android app). Given that (a) Parsoid's annotated HTML output enables clients to infer things about wikitext without having to parse wikitext, (b) the PHP parser cannot support Visual Editor and other products, and (c) we cannot continue to have two parsers, it is inevitable that Parsoid will be the default parser for MediaWiki. This has been known since at least 2015 but while we are nearer to that goalpost, we are still not quite there yet.
In this talk, we'll talk about what else needs to be completed, and what the porting of Parsoid to PHP means for this goal.
Older tech talks
You can browse through past tech talk recordings in the Commons category and on the MediaWiki YouTube channel.