There is no HTTP code for censorship

9 December 2008

The biggest story in UK Internet circles the past couple of days has been the censorship of Wikipedia by the UK’s Internet Watch Foundation (IWF). For those of you not entirely sure what it’s all about – after a tipoff, the IWF blacklisted the Wikipedia article (NSFW) on the Scorpions’ 1976 album “Virgin Killer”. The album’s cover features a picture of a naked prepubescent girl (with a fake lens crack obscuring her genitalia); not only was it freely available all over the Internet (including on Amazon), but it has never been classified as child pornography, nor has anyone been prosecuted for it.

The IWF’s blacklist is used by many major ISPs without question, and so they added wikipedia.org to a list of sites routed through their transparent proxy servers, normally used to deal with traffic aimed at child pornography sites in Russia and other poorly-regulated areas of the world. The transparent proxy works in a roundabout way. With my provider, O2, rather than blocking off the entire site, it scanned all requests to wikipedia.org; any that weren’t to http://en.wikipedia.org/wiki/Virgin_Killer it OK’ed, but for the offending page it would produce a fake 404 message.

This was ridiculously dumb, as it did not block the image directly, but only the container webpage. Because the filter was over-precise, in practice the image was still accessible through a variety of other methods, though you had to hack around a bit. Not only was it dumb but also oppressive – by blocking the text and discussion around the image rather than the image itself, they censored all discussion of its legality, the controversy around it, and accounts of the band’s reaction to it and why it was eventually pulled and replaced. Simply, this is censorship of the most despotically stupid kind.
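To illustrate why matching a single page address is so easy to sidestep, here is a minimal Python sketch. It assumes the filter does nothing more than compare each request with one exact URL string; how O2's proxy really matched requests is not public, so `proxy_status` and the 200 stand-in are hypothetical.

```python
# Assumption: the filter compares each requested URL with one exact string.
BLOCKED_URL = "http://en.wikipedia.org/wiki/Virgin_Killer"

def proxy_status(url: str) -> int:
    """Status code a naive exact-match filter would return for a request."""
    if url == BLOCKED_URL:
        return 404  # fake "not found", even though the page exists
    return 200  # passed through (200 stands in for whatever Wikipedia returns)

# The same article, requested through MediaWiki's index.php form instead.
alternative = "http://en.wikipedia.org/w/index.php?title=Virgin_Killer"

for url in (BLOCKED_URL, alternative):
    print(proxy_status(url), url)
```

Under that assumption the second request passes straight through, because the filter only knows about one spelling of the address.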

The transparent proxy also had an unfortunate technical consequence – it limited 95% of UK access to wikipedia.org to just a few IPs. With anonymous edits this meant it was impossible to tell who was a vandal and who was not, meaning well-meaning anonymous editors were summarily blocked, and prevented from creating accounts that could allow them to edit via a login. This had a terrible impact on the many contributions UK-based contributors make to Wikipedia (disclosure: I am a Wikipedia administrator, but am a volunteer only and hold no post with the Wikimedia Foundation), and it is something that usually affects countries with strict Internet controls such as Singapore or Qatar.
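A rough Python sketch of the side effect, assuming a site that can only tell anonymous editors apart by source address. The user names and the proxy address (taken from a range reserved for documentation) are made up for illustration.

```python
# Hypothetical proxy address from the 192.0.2.0/24 documentation range.
PROXY_IP = "192.0.2.10"

# (who actually made the edit, address the site sees) - illustrative data only.
edits = [
    ("vandal", PROXY_IP),
    ("good_editor_1", PROXY_IP),
    ("good_editor_2", PROXY_IP),
]

def blocked_users(edits, blocked_ips):
    """Everyone whose visible address is on the block list."""
    return [user for user, ip in edits if ip in blocked_ips]

# Blocking the vandal's visible address blocks everybody behind the proxy.
print(blocked_users(edits, {PROXY_IP}))
```

Once everyone shares the proxy's address, blocking the one vandal necessarily blocks the well-meaning editors too.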

Thankfully, after a considerable backlash, this evening the block was lifted, and from about 1930 this evening I’ve been able to access the page (as can much of the UK by now, no doubt). In the meantime, no doubt, thousands of people wondering what the fuss is about have looked at and downloaded the “Virgin Killer” artwork, thereby ruining the IWF’s original intention.

The IWF is a curious beast. It’s nominally an independent charity, but in practice acts with the blessing of the Home Office as an unaccountable pseudo-government censor for the Internet, against “potentially illegal online content”. This ranges from child pornography to incitement to racial hatred, though they generally concentrate on the former. Given its enormous impact, the office is surprisingly tiny. From The Guardian:

Normally the IWF, which is paid for by the EU and through a levy on the internet industry, works quietly away in its Cambridge offices. A team of four police-trained “analysts” plough through 35,000 URLs sent to them each year that are under suspicion of being obscene.

That works out to an average of 700 per week, or 140 per working day, or 35 per working day per analyst – giving each an average workload for a seven-hour day of 5 URLs per hour. Typically about one-third of the URLs are deemed illegal.
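For anyone who wants to check the sums, here is the same calculation as a short Python sketch. The 50 working weeks per year is an assumption made here to reproduce the 700-per-week figure; the other numbers are the ones quoted above.

```python
urls_per_year = 35_000
analysts = 4
working_weeks = 50   # assumption: 35,000 / 50 gives the 700 per week
days_per_week = 5
hours_per_day = 7

per_week = urls_per_year / working_weeks        # URLs per week
per_day = per_week / days_per_week              # URLs per working day
per_analyst_day = per_day / analysts            # URLs per analyst per day
per_hour = per_analyst_day / hours_per_day      # URLs per analyst per hour
minutes_per_url = 60 / per_hour                 # average review time

print(per_week, per_day, per_analyst_day, per_hour, minutes_per_url)
```

The last figure is the average time available per URL, in minutes.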

So that’s four people (whose qualifications are not fully disclosed, nor how they were selected) who are responsible for what 95% of the UK online population cannot see, after a typical review of just 12 minutes. Their decisions are implemented through BT’s Cleanfeed system, used with almost blind obedience by ISPs. The blacklist cannot be legitimately seen or reviewed by anyone outside of the IWF (although ironically, Cleanfeed’s architecture may open a backdoor to the blacklist) and site administrators are not notified, so unless you find your own site there by accident (as in the Wikipedia case) then you’ll never know. And the appeals process is conducted via the IWF, without recourse to an independent authority or means of oversight. The Wikipedia appeal, incidentally, was the first to be brought in the IWF’s history.

The IWF/Cleanfeed system of judge, jury & executioner is obviously broken, both technologically and socially. And in an ideal world a system such as the IWF would not exist, but it’s clear that it’s either that or full-on government regulation. Which to be honest, probably wouldn’t look that different from the current system anyway. So in the ugly world we live in, how can we make the current sociotechnical system of censorship less broken and as minimal as possible?

For starters, we have no way of knowing whether what we see is actually being censored. The blank “404 error page” for blocked sites breaks HTTP status code conventions – although there is no HTTP code for censorship. Some sites such as Google will be good enough to mention if their search results have been censored thanks to the IWF or copyright takedowns under the DMCA, in co-operation with the Chilling Effects project. So why was there such reluctance to provide a notice to the effect of “We have been advised by the IWF that this page contains illegal material and has been blocked.”? Was it because telling people that what they’re reading is subject to censorship would have looked bad? Surely, however, that’s not as much of a PR disaster as this?
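As a sketch of what an honest block could look like, the Python below contrasts the fake 404 with a response that admits the block and carries the notice suggested above. Since there is no dedicated code, 403 Forbidden is used as the closest existing fit; that choice, and the `block_response` helper itself, are assumptions for illustration rather than anything Cleanfeed does.

```python
from http import HTTPStatus

NOTICE = ("We have been advised by the IWF that this page contains "
          "illegal material and has been blocked.")

def block_response(honest: bool) -> tuple[int, str, str]:
    """Return (status code, reason phrase, body) for a blocked request."""
    if honest:
        # Assumption: 403 Forbidden is the nearest existing status code.
        status = HTTPStatus.FORBIDDEN
        return status.value, status.phrase, NOTICE
    # What actually happened: pretend the page does not exist.
    status = HTTPStatus.NOT_FOUND
    return status.value, status.phrase, ""

print(block_response(honest=False))
print(block_response(honest=True))
```

The honest variant costs the ISP nothing technically; the only difference is that the reader learns the page exists and who asked for it to be blocked.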

As well as this, Cleanfeed broke Wikipedia by munging IPs and forcing people through a few proxies – not only damaging Wikipedia but any other site that uses source IP checks as part of its efforts to prevent session or identity hijacking. It also failed to filter alternative URLs, or to combat one of Wikipedia’s most pervasive legacies – the free licensing which means articles can also be reproduced on mirrors such as wapedia or Answers.com. In short, it was horribly ineffective while breaking a lot of conventions and testing the goodwill of one of the largest and most open-minded online projects around.

From the more social side, the IWF’s approach is broken as well. The blacklist is secret, for at least one good reason – it prevents paedophiles from easily finding a whole new list of sites to bookmark. But this secrecy creates problems of accountability. It’s a tricky problem – I for one see little gain in opening the list to all, but there’s nothing stopping them publicly releasing one-way hashes of the URLs, say, so researchers and webmasters can check to see if any given site is on it.
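The hashing idea can be sketched in Python as follows. This assumes SHA-256 over the URL exactly as written; the IWF publishes no such list, so `published_digests` and the example.org addresses are purely hypothetical.

```python
import hashlib

def url_digest(url: str) -> str:
    """One-way SHA-256 digest of a URL (hex string of its UTF-8 bytes)."""
    return hashlib.sha256(url.strip().encode("utf-8")).hexdigest()

# Hypothetical published list: digests only, never the URLs themselves.
published_digests = {url_digest("http://example.org/blocked-page")}

def is_listed(url: str) -> bool:
    """Check a URL you already know against the published digests."""
    return url_digest(url) in published_digests

print(is_listed("http://example.org/blocked-page"))
print(is_listed("http://example.org/some-other-page"))
```

A webmaster can test their own pages this way without the list revealing any new addresses. Two caveats: a real scheme would need an agreed canonical form for URLs so that trivially different spellings hash alike, and anyone can still test guessed URLs against the list, so it is not perfectly secret.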

The blacklist should definitely be open to an oversight committee independent of the IWF – specialist police officers, civil servants, lawyers specialising in cases such as this – and these same people, rather than the IWF, should handle appeals to be taken off the list. An independent board of technologists, meanwhile, should be tasked with overseeing implementation, testing its robustness, and making sure that blocking and site redirection are handled according to established RFCs, and not by fake 404 messages.

As for the people who make the decisions, four people in an office with some police training doesn’t sound like enough given the impact of the censorship they are inflicting. A review is needed of how many staff there are, what level of training they get, and whether they need to be supplemented by, say, senior police officers. A clearer mandate on the material they should be looking for and blocking is also required – “potentially illegal” is too fuzzy, and the censorship of text surrounding images is utterly misguided.

This fuzzy remit should be especially borne in mind given their other work on “incitement to racial hatred”, and from January 2009, what has been termed “extreme” pornography. If the IWF are going to be this clueless about content on Wikipedia then I have no confidence in their ability to deal with these new (and untested) laws when they come into place.

Finally, a note: the IWF are not evil. Child pornography isn’t just illegal but morally wrong, and their intentions are noble, even if we know what kind of road those intentions can pave. Preventing the spread of child pornography online, particularly when the perpetrators and distributors are beyond usual remits, is neither an easy nor a thankful task. But letting them act unilaterally can lead to damaging consequences, as we’ve just seen. Good intentions must be backed up with independent cynical controls, or they are no use at all.

Further reading (updated 10/12): The Open Rights Group has a good post summarising with some questions of their own and there’s a couple of good posts over at Septicisle – which also points out the IWF is the one that has pushed for the “Girls Aloud” case to be prosecuted, the first obscenity case covering fictional text in over two decades, which opens another can of worms.


8 Responses

[...] the final word on the subject, according to Lib Diggers anyway, goes to Chris Applegate on his blog Qwghlm (honestly, what kind of a fool names their blog after an obscure, unpronounceable reference to [...]

If you think the new extreme pornography bill is bad, just consider what their responses are. Not “it’s connected to organised crime” from Kenny MacAskill (http://www.scotland.gov.uk/News/Releases/2008/01/25142201), or “protect the women/children/prostitutes/dead chickens/woolly mammoths” from the feminists (http://www.swapcampaign.co.uk/index.html), no no no, nothing so logical!

Quote:
Thank you for your various e mails about the proposals to create a criminal offence for the possession of extreme pornographic material.

Scottish Ministers are of the opinion that material of this type is so abhorrent that it should not be tolerated. There are concerns about the harm to those involved in the making of this type of pornography. You should note that the selling, distributing and importing of this type of extreme, obscene material is already illegal under the provisions of section 51 of the Civic Government (Scotland) Act 1982. What the new offence will do is criminalise the possession of such material.

It appears from your e mails that you are aware that a joint consultation between the Scottish Government and the Home Office was carried out in 2005. You may not be aware that the majority of the responses to that consultation were in favour of the law being strengthened in that area.

I hope this response explains where the draft provisions came from and why they will be included in the Bill.

End Quote

Yes, that explains everything, thank you.

James H

In response to the quote in #1, I found this simply because I searched the net for said “obscure, unpronounceable reference to [sci-fi]”.
Glad I did too!

[...] with plans to age-certify the web like is currently done with films and DVDs. Coming in wake of the IWF’s horribly misguided attempt to block Wikipedia, this is another hamfisted approach to regulating the Internet as if it were old media that solves [...]

[...] Eden On 08/06/2012 · 13 Comments · In politics To quote Chris Applegate: “There is no HTTP code for censorship.” But perhaps there should be. My ISP have recently been ordered to censor The Pirate Bay. They [...]

[...] which in turn references an even older discussion at http://www.qwghlm.co.uk/2008/12/09/there-is-no-http-code-for-censorship/ [...]

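[...] This is not a new problem: Tim Bray specifically thanked a developer who had recently pointed out that there is no HTTP code dedicated to censorship. As early as 2008 someone had already pointed out this problem, but only now has Google stepped forward to try to solve it for web developers and Internet users. [...]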

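[...] 牛博山寨 | Fahrenheit 451: the flames of Internet censorship. 401: unauthorised, the page cannot be accessed. 403: access to the page is forbidden. 404: the file cannot be found. For the vast majority of Internet users these error codes, 404 especially, are hard to avoid; and just as most users in China have grown so used to 404 that they are numb to it, a new error code, 451, may be about to make its entrance. Recently Tim Bray, co-inventor of Open Text and XML and a senior Android developer, submitted a draft for a new HTTP status code to the IETF (Internet Engineering Task Force). The proposal aims to indicate that some content is inaccessible for policy or legal reasons, thereby alerting visitors that what they are trying to reach may be subject to censorship by the state apparatus. This is not a new problem: Tim Bray specifically thanked a developer who had recently pointed out that there is no HTTP code dedicated to censorship. As early as 2008 someone had already pointed out this problem, but only now has Google stepped forward to try to solve it for web developers and Internet users. From the problem first being raised to a solution finally being proposed, a full four years have passed. So should we be glad that, after all this time, people’s calls about Internet censorship have finally been answered? Or disheartened that most people have ignored the issue for so long? Facebook noted in its prospectus that four countries in the world still block it; Iran refuses to let Twitter into its market; and Google has recently rolled out a string of measures, including highlighting sensitive search terms, Gmail warnings about state surveillance, and the submission of the new 451 code. Serious Internet censorship exists in the United States too: the list of sensitive words published by the US Department of Homeland Security even includes “Blizzard”. This was supposed to be a free and open world, and the spirit of the Internet was supposed to be freedom, equality and openness, so why do we now face heavier surveillance and censorship than ever before? Since the Patriot Act was enacted in 2001, the state apparatus has tightened its control over individuals step by step, and has done so under the banner of justice and authority. Every message on Facebook and every post on Twitter is monitored by security agencies at all times; from SOPA to CISPA, surveillance and censorship are eroding the spirit of the Internet step by step. What exactly is the point of Internet censorship? Carrying out all kinds of censorship and surveillance in order to prevent crime, fight terrorism and safeguard national security? The logic is that, to prevent crimes that might happen, everyone is censored and monitored, so the public become potential suspects. Doesn’t that mean everyone is a criminal? The aim is to protect the public’s rights, yet the means and the outcome are the violation of their interests. A further problem: what happens when the censors make mistakes? In 2011, during an operation against child pornography, the US authorities wrongly blocked more than 84,000 websites because of an oversight by a DNS provider. The same kind of accident was repeated in Denmark this year, when more than 8,000 websites, including Google and Facebook, became inaccessible. If even Google and Facebook cannot avoid becoming accidental casualties of a censorship regime, what about the rights of personal websites, blogs and users? 1984 is long past, but in fact Big Brother is still everywhere, and his methods are more covert; the flames of Fahrenheit 451 have not yet burned out, yet citizens’ privacy and rights have already been wantonly trampled and destroyed by the censorship machine. As Bradbury wrote in “Fahrenheit 451”: “Don’t ask for guarantees. And don’t look to be saved in any one thing, person, machine, or library. Do your own bit of saving, and if you drown, at least die knowing you were headed for shore.” And so it is. [...]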