6 December 2017

“My Colleague the Robot” – How People and Automated Assistants Work Together at Wikipedia.

Bots take care of many routine tasks at Wikipedia. The automated assistants greatly influence how the community interacts at the online encyclopedia. Wikipedians have learned to respond to the social effects of algorithms – for example by creating their own bot policy.

Wikipedia is a giant laboratory for exploring how algorithms and people can work together (see Part 1 of our series on Wikipedia bots: “Guardians of Global Knowledge”). Wikipedians working on the English-language Wikipedia put together an extensive set of guidelines for using automated bots soon after the site was launched. And while every Wikipedia user can run bots on their own computer to automatically change articles in the encyclopedia, anyone who does so without permission from the Wikipedia community risks having their account blocked.

Wikipedia’s six laws of robotics

The Wikimedia Foundation operates servers which volunteers can use to store their bots. The advantage is that programs can be tested there without doing any harm, and operators don’t have to worry about damaging the infrastructure that runs the software. Bots are given certain privileges depending on the community in question. For example, they can make more changes per minute than a normal user is permitted to carry out, and their input is generally not checked for vandalism as often as that of human authors. Ultimately, each community working on Wikipedia’s numerous language versions and sister projects decides for itself how it wants to engage with bots.

To be granted the above privileges, however, the bots have to be approved by the community. A Bot Approvals Group decides which automated programs may run on the English-language Wikipedia. The group has put together six requirements that a bot must meet to be approved.

These criteria are the result of a long learning process. Wikipedians have long wanted to at least partly automate what is, in effect, the repository of the world’s knowledge. Yet in trying to do so, they repeatedly ran into problems that forced them to reconsider the existing rules.

Small mistakes with a big impact

The first example was the “rambot”. Beginning in October 2002, it used data made available online by the US government to create articles about cities and towns not yet included in Wikipedia. The rudimentary entries created by rambot only contained basic information, such as location and population, which had been inserted into a simple template. On the one hand, the experiment was highly successful: rambot created over 30,000 new articles at a time when the English Wikipedia only had 50,000.

In the study “The Emergence of Algorithmic Governance in Wikipedia,” the authors determined how much the number of bot edits in the German-language Wikipedia has increased in recent years.

On the other hand, the automated effort quickly produced a number of problems. Errors in the data accessed by the bot resulted in more than 2,000 defective articles, causing even more work for the site’s human authors. At times rambot overtaxed the young encyclopedia’s resources. For example, users looking at Recent Changes – an automated log of the latest edits – had a hard time understanding what was happening on the platform, since the list was inundated with entries stemming from the bot’s work.
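The basic mechanism described above, filling a fixed template with records from a data set, can be sketched in a few lines of Python. This is only an illustration, not rambot's actual code: the template wording, the field names and the validation rules are all assumptions made for this example. It shows how a simple check on each record can keep faulty input from turning into a defective article.

```python
from string import Template
from typing import Optional

# Hypothetical stub template; the wording and field names are illustrative only.
STUB = Template(
    "$name is a town in $county, $state. "
    "As of the census, its population was $population."
)

REQUIRED_FIELDS = ("name", "county", "state", "population")


def is_blank(value) -> bool:
    """Treat missing values and empty or whitespace-only strings as blank."""
    return value is None or str(value).strip() == ""


def render_stub(record: dict) -> Optional[str]:
    """Return the article text, or None if the record looks defective."""
    # Assumed rule: every required field must be present and non-empty.
    if any(is_blank(record.get(field)) for field in REQUIRED_FIELDS):
        return None
    # Assumed rule: the population must be a non-negative whole number.
    try:
        population = int(record["population"])
    except (TypeError, ValueError):
        return None
    if population < 0:
        return None
    # Keyword arguments take precedence over the mapping in substitute().
    return STUB.substitute(record, population=f"{population:,}")
```

A record that fails the checks yields `None` and can be set aside for a human to review instead of being published automatically.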

As a result, the English Wikipedia community wanted to stop using bots completely, at least temporarily. Yet given Wikipedia’s rapidly growing significance and complexity, they ultimately decided to deploy bots on the site – albeit only those that met strict requirements.

Minor details become full-blown controversies

Sometimes the conflicts between humans and machines are not technical, but social. This can be seen in an incident described by ethnographer R. Stuart Geiger.

In 2006, a Wikipedia author using the pseudonym “Hagerman” developed a bot to fix what he believed was a simple defect: Wikipedia does not have a traditional discussion forum. Instead, users have to insert their comments at the appropriate spot on discussion pages and use the characters “--~~~~” to sign them. That makes it clear who contributed and when. As Hagerman noticed, however, many entries lacked a signature, with the result that already confusing discussions were even harder to follow.

Hagerman decided to program a bot to insert missing signatures and got approval to do so from the newly formed Bot Approvals Group. Although the program worked as intended and was altered whenever a problem arose, HagermanBot proved highly controversial. Some people saw it as an unseemly infringement of user rights that a bot was basically telling them how to sign their own contributions to the discussion. Hagerman then programmed the bot to offer an opt-out feature, allowing users to deactivate the automatic signature for any comments they made.

In the administrative part of the German-language Wikipedia, a growing share of contributions is made by bots. They are used for diverse purposes, such as adding signatures and sending out newsletters.

Bots have a social effect

Yet the controversy took a more fundamental turn when users began asking if a bot should even be allowed to alter a comment made by a human. In light of Wikipedia’s consensus-oriented decision-making system, additional concessions were made. Consequently, Hagerman developed a function that allowed users to disable HagermanBot for certain user and discussion pages, a function that other bot programmers ultimately adopted as well.
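The behavior described above, adding a missing signature unless the author or the page has opted out, can be sketched in Python. This is a minimal illustration and not HagermanBot's actual implementation: the signature pattern, the function names and the opt-out sets are assumptions made for this example.

```python
import re
from datetime import datetime, timezone

# Assumed heuristic: a comment counts as signed if it contains a user-page link
# followed by a UTC timestamp, roughly what "~~~~" expands to when a page is saved.
SIGNATURE_RE = re.compile(
    r"\[\[User:[^\]|]+(?:\|[^\]]*)?\]\].*?\d{2}:\d{2}, \d{1,2} \w+ \d{4} \(UTC\)"
)


def is_signed(comment: str) -> bool:
    """Return True if the comment already appears to carry a signature."""
    return SIGNATURE_RE.search(comment) is not None


def sign_if_needed(comment, author, page, when,
                   opted_out_users=frozenset(), opted_out_pages=frozenset()):
    """Append an 'unsigned' note unless the comment is signed or opted out."""
    if is_signed(comment):
        return comment
    # Hypothetical opt-out lists: respect both per-user and per-page choices.
    if author in opted_out_users or page in opted_out_pages:
        return comment
    stamp = f"{when:%H:%M}, {when.day} {when:%B %Y} (UTC)"
    return f"{comment} (unsigned comment by [[User:{author}]], {stamp})"
```

Because the opt-out check runs before any text is changed, a user who disables the bot never sees their comments altered. A comment the bot has already annotated is recognized as signed and left alone on the next pass.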

It comes as no surprise to researchers that such conflicts can flare into major controversies. “Wikipedia is a socio-technical system,” says Claudia Müller-Birn, a professor at Freie Universität Berlin who carries out research on human-computer collaboration, speaking with blogs.bertelsmann-stiftung.de/algorithmenethik. That means the site’s technical and social elements cannot simply be separated, but interact with each other on many levels.

Wikipedia’s bots work the way speed bumps do on a city street. Even if the bumps are only there to enforce existing speed limits, they change the nature of the street itself, for example by making it impossible for anyone, even an ambulance, to go faster than the legal limit. Social consensus is thus transformed into an inescapable rule.

Human authors scared off by bots

The use of bots and other software-supported tools has had a lasting influence on the Wikipedia community’s social structure. In one study, for example, Aaron Halfaker and R. Stuart Geiger found that uncompromising efforts to combat vandalism have measurably reduced the number of Wikipedia users.

Based on their data, the researchers maintain that the percentage of good-faith newcomers editing a Wikipedia article for the first time has remained constant over the years. What has increased rapidly, however, is the share of “reverts,” the editing changes that undo the contributions made by newcomers, often within seconds. The result is that many potentially valuable authors have been turned away at Wikipedia’s front door – and possibly discouraged from participating for good.

It is not only bots that scare off newcomers by criticizing and reverting their work. The German-language version of Wikipedia has also been struggling to retain editors, even though automated tools such as ClueBot NG are not allowed to autonomously delete contributions made by humans. Instead, the German community deploys software like Huggle, which accesses the same data used by bots in order to decide whether a new entry is vandalism. Ultimately, however, the decision to delete a contribution is not made by an algorithm, but by a human being – even if this hardly makes a difference to the author in question. As Geiger and Halfaker found, Huggle users reply to only 7% of the queries they receive about reverts they have made.

Having understood the far-reaching implications of using bots, many Wikipedia communities have become more cautious about the processes they use. “As a bot operator you have to take responsibility for your bot,” says Wikipedia author Freddy2001, who also operates a bot. That is why she has included buttons on her bot user page that people can use to disable the tool. In other words, before the automated helper annoys a user it should just stop doing what it was programmed to do.

New openness stops user hemorrhage

Obviously this is not possible for every bot, since tools such as ClueBot NG are more or less indispensable when it comes to fighting vandalism, a constantly growing problem. In view of how important the online encyclopedia has now become, the Wikipedia communities must always strike a balance between making it easier for existing users to do their work and being open to new users.

The Wikimedia Foundation has now launched a number of initiatives to make the project more attractive to newcomers. For instance, a new program, VisualEditor, makes it easier to create articles without first having to learn all of the site’s complex formatting rules. There are also new mentoring programs and tools people can use to say “thank you” for constructive contributions. Although these measures have not returned the Wikipedia community to its former size, they do seem to have stopped the rapid loss of users.

With the launch in 2012 of Wikidata, Wikimedia’s new platform for facts and figures, bots gained another area of application and became an even bigger presence on the site. To learn more, see Part 3 of our series. You can also subscribe to our RSS feed or e-mail newsletter to find out when new posts appear in this blog.
