An interesting story broke this morning. The Times journalist Giles Hattersley wrote, in an article about the inaccuracies of Wikipedia:
My entry features at least two errors, one libellous (unless my mother has been keeping a dark secret, I am not Roy Hattersley’s son).
Interesting because, at the time, Giles Hattersley did not have an article about him on Wikipedia. As the Telegraph’s Shane Richmond pointed out:
Yet I can’t find an entry for Giles Hattersley in Wikipedia. And, as Martin Belam points out, it doesn’t look like there has ever been one.
I can confirm this quite easily – I’m a Wikipedia administrator and so I can examine the entire article history – available here. I can quite authoritatively say no article on Giles Hattersley existed on Wikipedia before 1507 on February 8th – which was after Hattersley wrote his article and Shane Richmond had picked it up. “So far, so what?”, you might ask – journalists have long been used to making erroneous or exaggerated claims about Wikipedia – such as Petronella Wyatt in 2007. But then it all got a bit weird.
At 2048 tonight, Jimmy Wales, the founder of Wikipedia, unilaterally decided to delete the article, even though it does not meet the criteria for speedy deletion. If you want to judge for yourself, a copy of the last revision before deletion is available here – it is fairly neutral, with citations from reliable sources, I hope you’d agree (disclosure – I edited this article once, but it was only a minor edit, adding the word ‘falsely’ to the first sentence of the third paragraph).
While Hattersley’s Times piece was undoubtedly a motivation to create the article, his job as a journalist for a national newspaper and his former editorship of ARENA probably make him notable enough for a Wikipedia article in the first place; Wikipedia’s objective philosophy makes clear that it is the content of an article, not the motivation behind its creation, that counts towards its inclusion. Any article that does not meet the speedy delete criteria should have its deletion first discussed, according to Wikipedia policy. This was not considered here.
So what happened? The reasons given for deletion are not very clear:
I have temporarily deleted this article, and kindly request that no one restore it until we’ve sorted out all the facts. Giano has been blocked for 24 hours by me for incivility related to this entry. Jay and I are already aware of the situation and I am reaching out to the newspaper for further clarification.
There is certainly no question of libel (or else the Telegraph’s lawyers should be worried about Shane Richmond’s article) – and there were appropriate citations given. Reaching out to the newspaper (presumably the Times) is also a strange thing to mention – as the article deals with Hattersley personally and not his employers. Jimmy Wales’s reasons for deleting the article certainly transcend the usual policies towards controversial articles. The involvement of Giano, an editor who has come into frequent conflicts with other editors before, adds another intriguing dimension to further confuse matters.
Finally, this is not the first editing controversy Jimmy Wales has gotten into. It brings to mind past stories of him editing entries against policy, for seemingly personal reasons – even as petty as editing his own date of birth on his Wikipedia article.
No-one comes off well in this. The Wikipedia community doesn’t, for creating a vindictive article about Hattersley (I must confess I joked about doing so but did not carry it through). Jimmy Wales doesn’t, for not going through established Wikipedia policies, in a community whose philosophy and outlook he is meant to not just adhere to but defend. Giles Hattersley comes off pretty badly as well – the most charitable interpretation of his Wikipedia claim one could give is that it was sloppy journalism.
There is a further, ugly implication from all this. Much of journalists’ research is done from the Internet these days – nothing wrong with that, as long as the veracity of sources is kept in mind. But bad journalists and pressing deadlines can mean journalists sometimes cross the boundary from legitimate research to outright plagiarism or falling for hoaxes. Yet the newspapers these journalists write for are considered by Wikipedia as reliable sources – so there easily lies a vicious circle where misinformation can become ‘fact’ by being picked up by a lazy or incompetent writer.
But the answer to this isn’t to use your website to vindictively attack the mainstream media and its employees just to settle a grudge. The answer is a tightening not just of editorial standards but of ethics – the philosophy and approach you take to what you write and say.
Journalists need to learn to appreciate Wikipedia’s strengths and weaknesses, as well as realising how much harder it is for mistakes to go unnoticed in this world today. Wikipedians need to learn to stick to the policies they’ve agreed on and not to use the site as a means of exacting vendettas upon others. If both sides fail to raise their game, trust in both will crumble in the long term.
Update: More in the comments – basically he has blamed it on an errant subeditor, and says that the error existed not in his article but elsewhere on Wikipedia, “one or two years ago”, but he’s not sure where. No-one can find such an error. The lack of basic fact-checking doesn’t exactly do his reputation as a journalist any favours.
Further update: The Times corrected the article (and gave it a less inflammatory headline) on Monday afternoon, although the claim that there was any sort of error on Wikipedia has still not been fully substantiated.
Charles Arthur had a nice post at the weekend entitled: If I had one piece of advice to a journalist starting out now, it would be: learn to code. As any modern journalist is able to Google around for facts, Charles tells any budding journo to set themselves above and beyond the normal set of “IT skills”; being able to get a more powerful grip on data is now becoming part of what a journalist should know:
None of which is saying you shouldn’t be talking to your sources, and questioning what you’re told, and trying to find other means of finding stuff out from people. But nowadays, computers are a sort of primary source too. You’ve got to learn to interrogate them effectively – and quote them meaningfully – too.
It’s great advice – playing with data and getting a feel for how to get the best out of it not only helps you find new things out, but also helps open your mind up to a more healthy appreciation of data. It allows you to explore the possibilities of data as well as its flaws, when it can be trusted and when it should be taken with a pinch of salt. And it’s things like this that contribute towards a sense of joyful skepticism that any self-respecting geek should possess (and you thought it was just about watching every episode of Battlestar Galactica).
I gave up programming as a full-time career more than three years ago but have still kept my hand in programming since, either for fun or to make work quicker and easier. Working in the digital and social media PR sector isn’t just about going to the pub (truth be told, it’s actually about going to very expensive pubs) but also about dealing with vast quantities of information – so you can see how programming can help. Making tasks faster is part of it, but the programming mindset is equally if not more important: it has taught me skills such as looking to optimise and make things quicker, filtering noise from the signal, reusing what you have to save effort in the future, and not being surprised by the unexpected.
So I’m going to say what Charles Arthur said, but bigger. If you work in any information industry, or are thinking about a career in it, learn to code. And by code I don’t mean learn something hardcore like Java or C++, or even learn a full programming language (as you’ll see below). But it does mean getting past the usual abstractions you see – your web browser, Word, Excel – and getting involved at a deeper level: getting to appreciate what the data you’re reading actually is, and realising it’s not just something to look at.
So where and what would I recommend getting started with programming? From my own weary geek’s viewpoint, here’s six ways of getting into it – three of which really aren’t strictly programming at all:
Regular expressions. I cannot begin to think how many times these have bailed me out of an otherwise unrecoverable situation. Regular expressions are ways of finding and replacing text that are much more powerful than the bog standard. For example, you might want to get all the telephone numbers or postcodes out of a document, but they are all different, so a simple search wouldn’t be able to do it and you have to do it by hand (and might miss one out).
A regular expression on the other hand can say “find me any group of eleven digits that begin with a 0, and either match the pattern 0xx xxxx xxxx, or 0xxxx xxxxxx” – and bingo, you have all your phone numbers. Get clever and you can even tell it to not worry about whether it’s a space or a hyphen in between the groups of numbers. Be careful – they can get complicated, so build them up slowly and step by step – and they can do unexpected things, so always back stuff up.
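To make the phone number example concrete, here is a rough sketch in Python of the pattern described above. The function name and the sample text are made up for illustration, and it assumes UK-style numbers of exactly eleven digits in one of the two groupings mentioned, with either a space or a hyphen between the groups.

```python
import re

# Assumption: eleven digits starting with 0, grouped either as
# 0xx xxxx xxxx or 0xxxx xxxxxx, with a space or hyphen between groups.
PHONE_PATTERN = re.compile(
    r"\b(?:0\d{2}[ -]\d{4}[ -]\d{4}|0\d{4}[ -]\d{6})\b"
)

def find_phone_numbers(text):
    """Return every phone-number-like string found in text, in order."""
    return PHONE_PATTERN.findall(text)

# Invented sample text (the numbers are not real contacts).
sample = (
    "Call the news desk on 020 7946 0018 or the picture desk on "
    "01632-960123. The reference 12345 is not a phone number."
)
numbers = find_phone_numbers(sample)
print(numbers)
```

Numbers written with no separators at all, or with brackets around the area code, would slip past this pattern, which is a good illustration of why it pays to build these up step by step.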
CSV. Many people work with Excel spreadsheets and while it is great for tabulating data it isn’t a very portable format. Often you want to copy data in or out of Excel into other applications and it ends up being a horrible mess of numbers separated by spaces and tabs that you have to re-align yourself. CSV (comma-separated values) is the very boring but portable way of getting data in and out of Excel – it just consists of text with no styles, with commas to mark in between each column.
CSV looks like shit but it makes up for it by being extremely portable and lightweight. Combine it with the regular expressions above and you’re able to take the useful data out of a horrible mess, replace everything between it with commas, and import it straight into a spreadsheet. Or vice-versa – extracting numbers out of the spreadsheet and allowing other apps to play with them (like I did with the general election map).
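As a small sketch of that "horrible mess into CSV" trick, the Python below splits badly aligned rows on their gaps and writes them back out as CSV. The sample data, column names and function name are all invented for illustration, and it assumes columns are separated by a tab or by at least two spaces.

```python
import csv
import io
import re

# Invented sample data: columns separated by an uneven mix of spaces
# and tabs, as you might get when pasting out of a spreadsheet.
messy = "North Ward   120\t45.5\nSouth Ward\t 98   37.1\nEast End  46 \t17.4\n"

# A column gap is any whitespace run containing a tab, or two or more
# spaces. A single space is left alone so "North Ward" stays together.
COLUMN_GAP = re.compile(r"\s*\t\s*|\s{2,}")

def messy_to_csv(text, header):
    """Return CSV text with a header row and one row per non-empty line."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    for line in text.splitlines():
        line = line.strip()
        if line:
            writer.writerow(COLUMN_GAP.split(line))
    return out.getvalue()

csv_text = messy_to_csv(messy, ["area", "visits", "share"])
print(csv_text)

# Reading it back shows the data now splits cleanly into columns.
rows = list(csv.reader(io.StringIO(csv_text)))
```

Using the csv module for the writing step, rather than joining with commas by hand, means any value that itself contains a comma gets quoted properly.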
Yahoo! Pipes. I am still waiting for Yahoo to piss this one up against the wall like they have done with Technorati and Flickr. So much of the web already runs on RSS (Really Simple Syndication) – streams of links and articles – that being able to manipulate them like this is a real boon. Yahoo! Pipes takes RSS feeds and allows you to merge them together, filter them, cross-reference them and more. When I was looking for a job last year I used a series of Pipes to pull feeds from various job websites, filter out the kinds of jobs I didn’t like, and then remove the duplicates so I wasn’t wasting my time – all delivered to Google Reader for easy perusal, as and when they came in. The interface is as reasonably usable as you could expect and has led to some really useful apps being created.
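Pipes itself is a drag-and-drop tool, so there is no code to show for it, but the merge, filter and de-duplicate steps in the job-hunting example can be sketched in plain Python. Everything here is hypothetical: the feeds are hand-written lists standing in for parsed RSS items, and the function name, field names and example.com links are made up.

```python
def merge_filter_dedupe(feeds, blocked_words):
    """Merge several feeds, drop unwanted items, and remove duplicate links."""
    blocked = [word.lower() for word in blocked_words]
    seen_links = set()
    merged = []
    for feed in feeds:
        for item in feed:
            title = item["title"].lower()
            # Filter: skip anything whose title contains a blocked word.
            if any(word in title for word in blocked):
                continue
            # De-duplicate: the same job often appears on several sites.
            if item["link"] in seen_links:
                continue
            seen_links.add(item["link"])
            merged.append(item)
    return merged

# Hypothetical feed items; a real version would parse these from RSS.
site_a = [
    {"title": "Python developer", "link": "http://example.com/jobs/1"},
    {"title": "Telesales executive", "link": "http://example.com/jobs/2"},
]
site_b = [
    {"title": "Python developer", "link": "http://example.com/jobs/1"},
    {"title": "Data analyst", "link": "http://example.com/jobs/3"},
]

jobs = merge_filter_dedupe([site_a, site_b], ["telesales"])
for job in jobs:
    print(job["title"], job["link"])
```

This treats two items as duplicates only when their links match exactly, which is the simplest rule; the same job posted under different URLs would still appear twice.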
Python. The biggie and the one I use the most. Python’s strengths lie in its simplicity – it’s human-friendly and runs on pretty much anything. It also has a sensible structure and organisation, which teaches you to code well and clearly. Finally, the vast libraries available mean you can play with pretty much any data format; one such library is BeautifulSoup, which allows you to extract data from webpages easily. Python’s one drawback is its relatively poor documentation & tutorials, with some honourable exceptions such as Mark Pilgrim, so do hunt around and don’t let the technicalese put you off.
Finally, PHP gets an honourable mention – an easy enough language to learn, and widely used, but with so many evolutions and a complicated past the language is a mess, and it teaches several bad programming habits.
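For a flavour of what pulling data out of a webpage looks like, here is a minimal sketch that collects every link from a scrap of HTML. To keep it self-contained it uses the html.parser module from Python's standard library rather than BeautifulSoup, so treat it as a stand-in and not as how BeautifulSoup itself works; the class name, function name and sample HTML are invented.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def extract_links(html):
    """Return the link targets found in an HTML string, in page order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links

# Invented sample page.
page = """
<html><body>
  <p>Read <a href="http://example.com/one">this</a> and
  <a class="ext" href="http://example.com/two">that</a>.</p>
  <a name="anchor-only">no link here</a>
</body></html>
"""
links = extract_links(page)
print(links)
```

A dedicated scraping library earns its keep on messier, real-world pages, but the idea is the same: walk the tags and keep the bits you care about.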
I wouldn’t recommend doing all six at once, or even ever, nor would I set expectations too high. In some respects, it’s not even about the code or the results you get – it’s as much about the philosophy and understanding it brings with it: that data is not a static thing but ours to play with, making us able to create wonderful new things or change society for the better.
Last year, Jason Kottke charted the rise of the single-serving site – “web sites comprised of a single page with a dedicated domain name and do only one thing”, as he puts it. They range from the facetious (IsItChristmas.com), to the possibly useful (IsLostARepeat.com), to the downright marvellous (ItsNotLup.us). They’re almost the anti-Web 2.0 – uninteractive, dry-looking – devoted to a single psychotic purpose, neatly spelt out in the URL.
Since then, another kind of site with a single purpose has sprung up – they feature content – typically photos – and just that. They’re updated, but unlike blogs focused on single subjects, there’s no commentary, no snarking, no linky enthusiasm. It’s just content shovelled up for you straight away – and a lot of them use Tumblr. Brokers With Hands on their Faces, Garfield Minus Garfield and Fuck Yeah Sharks are three of my favourites, while Bale Yeah! and White People Trying to Look Serious get honourable mentions. The less commentary the better – let the content do the talking. Dude Totally Punched A Horse slips on this last one – it should just be the videos. Ditto for Arrested Development Stills – just leave them be!
Tumblr is ideal for this kind of streaming – having managed a super-secret Tumblr myself these past few months (I’ll reveal all sooner or later), the interface is a cinch, and as comments are not enabled by default it saves the messiness of moderating or dealing with the content. It’s halfway between single-serving and full-on blogging, leaving you free to obsess about whatever you obsess about. It’s light, fun and often hilarious. Long may it continue.
Note: Mostly written while watching the film version of V for Vendetta over Christmas with a hangover, spoilers galore for both it and the book within, so proceed at your own risk.
Of the many things wrong with the Wachowski Brothers’ flawed adaptation of V for Vendetta, the omission of the computer Fate is by far the biggest. Fate is the computer that runs the society in V’s alternate future; it hooks into the surveillance systems used throughout British society and makes all the decisions. As the novel progresses, the high chancellor Adam Susan, supposedly the fascist dictator in charge of society, turns out to be in thrall to Fate’s machinations, believing it to be a goddess; with it his truly wretched and lonely character is revealed.
From Fate and her omniscience and omnipotence, all the best complexities of the characters come – for example, the curious hidden nature of Lewis Prothero, the “Voice of Fate”, a sociopathic concentration camp commandant with a nevertheless seductively charismatic voice (and a natty line in girls’ dolls). In the book he is the human voice of the computer, broadcasting sonorously to the nation, but in the film, robbed of his duality he gets turned into a shitty cross between Richard Littlejohn and Bill O’Reilly, ranting away incoherently on national television every night.
Despite being set in Britain, the Wachowskis’ adaptation is very Americocentric (as demonstrated by the recharacterisation of Prothero); it details a narrative based on opposition to the neoconservative agenda in America and the resulting foreign policy; the film is peppered with references to the Iraq war, Islamophobia and homophobia, and the bioterror plot within is a little reminiscent of the 9/11 conspiracy theories. The film is very much a product of the early 2000s – and with the crushing defeat of the neoconservatives in the US mid-term and presidential elections, it already seems a little dated.
With this in mind, the more I think about it, the better allegory for our times comes not from the post-war on terror ostentatious authoritarianism but from the Fate plotline, of a more insidious system of control. Successive governments have become increasingly in thrall to mass surveillance, but it has especially been the case with the present one – whether it be CCTV cameras, the national identity register, DNA databases (even if you’re innocent), mass-snooping of emails and phone calls, or even outright hacking of your computer without a warrant.
And thrall is the right word to use here, as decisions are made not on evidence of their efficacy but on an ideology that more is more: the more data the government has, the more able it is to govern. Focusing on the quantity rather than the relevance of data has various unfortunate consequences. We risk garbage in, garbage out: supposedly reliable databases turn out to be heavily flawed. It leads to greater risk of security breaches, whether they be accidental or malicious. And most importantly it leads to a system of governance where everybody is treated as a datapoint – and thus governments manipulate people just like they would like to manipulate datapoints. The end result is a dehumanised and rather bleak polity, with every facet of public service characterised by targets, performances and star ratings, human beings reduced to automata in a fabulous number-crunching system.
There are another twenty blog posts I could write on this theme, but I won’t for now. But do check Adam Curtis’ The Trap as a primer on it from a philosophical/psychological point of view, and The Tiger That Isn’t by Michael Blastland and Andrew Dilnot for a mathematical examination; no equivalent from a sociotechnical or economic angle exists, as far as I know.
Anyway, back to V for Vendetta, and Alan Moore. The comic was set against the spectre of nuclear war (from which the putative fascist Britain would rise), with a hint of a warning about where the right-wing agenda of the early Thatcher government could take us. And through this system, the monstrous system of Fate is created, and we are beholden to it. The odd thing is that we’re being taken towards the end without going through the intermediate stages – which is a relief in some ways (eating dead rats out of radioactive rusty hubcaps is never a good thing) but also oddly chilling. The pessimistic conclusion is that supreme control and omniscience is the goal of anyone in power and with the technology at hand it’s an inevitability. The optimistic conclusion is that despite the steady encroachment, it’s never too late to turn it back, if only we have the will. What’s it going to be? Hopefully it is not a matter of Fate.
Back in 2003, before I even started blogging, I created the Daily Mail Headline Generator, and within a couple of weeks a friend suggested some extra things to put in it. “Good idea”, I said, “I’ll update the code when I have the time”.
And while I’m on the subject of the Daily Mail – and I’d normally refrain from even telling you it has a website, let alone linking to it, the hatemongering bogroll that it is, but something in the latest column by homo-obsessed walking shitbag Richard Littlejohn slipped in unnoticed by the Mail’s irony detectors, it seems:
Apparently, my column is a constant reminder of why they did the right thing in emigrating to New Zealand.