The Dark Web: Guidance for journalists

The latest Ashley Madison leak brought a lot of coverage of “the dark web”. Because a link to a torrent was being shared via a Tor page (well, nearly – actually most people were passing around the Tor2Web link), journalists were falling over themselves to highlight the connection to the “dark web”, that murky and shady part of the internet that probably adds another few percent to your click-through ratios.

So many outlets and journalists – even big outfits like BBC News and The Guardian – got their terminology terribly wrong on this stuff, so I thought I’d slap together some guidance, being somewhat au fait with the technology involved. Journalists are actually most of the reason why these sorts of tools exist in the first place, in fact – if that surprises you, read on…

The Dark, Deep Internet

What the hell is “the dark web” anyway? Why is it different from the “deep web”? Why, for that matter, does it differ from the “web”?

First up, to clarify: “the dark web” and “darknets” are practically the same thing, and the terms are used interchangeably.

So: The Deep Web and The Web are technically the same. People often refer to the deep web when they are referring to websites (that is, sites on the internet) that are hard to find with normal search engines because they are not linked to in public. Tools like Google depend on being able to follow a chain of links to find a website – if there are no links that Google can see, the site is not going to get into the Google index, and so will not be searchable. These sites are still on the internet, though, and anyone who is given the link can put that in a perfectly normal browser and reach that site.
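To make the link-following idea concrete, here is a minimal sketch of a crawler working over a made-up set of pages. The site names and the `LINKS` table are hypothetical, and real search engines are far more elaborate, but the principle is the same: a page that nothing links to is never discovered, even though it is perfectly reachable if you know the address.

```python
from collections import deque

# Hypothetical toy "web": each page maps to the pages it links to.
LINKS = {
    "home.example": ["news.example", "blog.example"],
    "news.example": ["blog.example"],
    "blog.example": ["home.example"],
    "unlisted.example": ["home.example"],  # links out, but nothing links in
}

def crawl(seed, links):
    """Return every page reachable by following links from the seed."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

indexed = crawl("home.example", LINKS)
# Pages that exist but were never found by the crawl: the "deep web".
deep_web = set(LINKS) - indexed
print(sorted(deep_web))
```

In this toy example the unlisted page ends up in `deep_web`: it is on the same network as everything else, just not in the index.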

The Dark Web, however, refers to a different technical domain. Dark web or “darknet” sites are only reachable using a tool that encrypts and re-routes your traffic, providing a degree of anonymity. These tools we typically call “anonymity networks”, or “overlay networks”, as they run on top of the internet’s infrastructure. You need to be a part of this network to be able to reach content in the “dark web”. The dark web refers to lots of different tools – Tor is the most widely known, but isn’t all about the dark web, as we’ll learn shortly. I2P and Freenet are two other well-known examples of overlay networks. It’s worth noting that these networks don’t interoperate – the Tor darknet can’t talk to the I2P darknet, as they use radically different technical approaches to achieve similar results.

The Onion Router, Clearnet and Darknet

Map from the Oxford Internet Institute showing Tor usage across the world

Tor (The Onion Router) is a peer to peer, distributed anonymization network that uses strong cryptography and many layers of indirection to route traffic anonymously and securely around the world. Most people using Tor are using it as a proxy for “clearnet” sites; others use it to access hidden services. It’s by far the most popular darknet.
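The “layers” idea can be sketched in a few lines. This is a toy illustration only: the XOR-based cipher, the key names and the three-relay path are assumptions made for the example, not Tor's actual cryptography or key exchange. The client wraps the request in one layer per relay, and each relay removes only its own layer before passing the rest on.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Stretch a key into `length` pseudo-random bytes (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def xor_layer(data: bytes, key: bytes) -> bytes:
    """Apply or remove one layer; XOR is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

# Hypothetical per-relay keys the client has agreed with each relay.
relay_keys = [b"guard-key", b"middle-key", b"exit-key"]

def wrap(message: bytes, keys) -> bytes:
    """Client side: add the exit layer first, the guard layer last."""
    for key in reversed(keys):
        message = xor_layer(message, key)
    return message

def route(onion: bytes, keys):
    """Each relay, in path order, strips its own layer and forwards the rest."""
    seen_by_relay = []
    for key in keys:
        onion = xor_layer(onion, key)
        seen_by_relay.append(onion)
    return seen_by_relay

request = b"GET / HTTP/1.1"
views = route(wrap(request, relay_keys), relay_keys)
```

Only the last entry in `views` is the readable request; the first two relays see scrambled bytes. Because XOR does not care about ordering, this toy does not enforce that layers come off in sequence the way a real cipher would, so treat it as a picture of the idea rather than a working design.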

From a darknet perspective, clearnet is the real internet, the world wide web we all know and love. The name refers to the fact that information on the clearnet is sent “in the clear”, without any encryption built into the network protocols (unlike darknets, where encryption is built into the underlying network).

Tor is a technical tool, and is used primarily as a network proxy. To use Tor, you install a client, which connects to the network. This same client can optionally relay traffic from other clients, expanding the network. As of this post there are about 6500 relays in the Tor network, and 3000 bridges – these bridges are not publicly listed, making it hard for hostile governments to block them, and so allowing users in hostile jurisdictions to connect to the network.

The Tor project also provides the Tor Browser Bundle, which is a modified version of Firefox ESR (Extended Support Release) that contains a Tor client and is configured to prevent many de-anonymization attacks that focus on exploiting the client (for instance, forcing non-Tor connections to occur to a site under the attacker’s control using plugins like Flash or WebRTC, allowing correlation between Tor and clearnet traffic to identify users). This is the recommended way to use Tor for browsing if you’re not using TAILS.

TAILS is a project related to Tor that provides a “live system” – a complete operating system that can be started and run from a USB stick. TAILS stands for The Amnesiac Incognito Live System – as the name suggests, it remembers nothing, and does all it can to hide you and your activity. This is by far the most robust tool if you’re aiming to protect your activity online, and is used widely by journalists across the world, as it’s easy to take with you and hide – even in very hostile environments.

Hiding from the censors

On the internet it’s reasonably easy to find out where a website is hosted and who’s responsible for it, and from there it’s easy for law enforcement to shut it down by contacting the hosts with the right paperwork. It’s also normally quite easy from that point to find out who was running a website and go after them, though there are plenty of zero-knowledge hosts out there who will accept payment in cash or Bitcoin, ask no questions and so on.

There’s another facet to this – if you’re a government trying to block websites, it’s very easy to look at traffic and spot traffic destined for somewhere you don’t like, and either block it or modify the contents (or simply observe it). This is common practice in countries like Iran, China, Syria and Israel, and is quite a lot of the reason why Tor exists. The adoption of this filtering technology by countries like the UK, ostensibly to prevent piracy, limit hate speech or “radical/extremist views”, or to protect children, is driving Tor adoption in the west, too.

Hidden services (Tor is the most commonly cited example, but other networks support similar functionality) effectively take the approach used to hide the origin of traffic destined for the clearnet and apply it to hide both the origin and the destination of traffic between a user and a hidden service. Unless the hidden service itself offers a clue as to its owners or location, users of that service can’t identify where that hidden service is operated from. Likewise, the operators of the hidden service can’t see where their users come from. Traffic between the two ends meets in the middle at a randomly picked rendezvous point, which also has no knowledge of what’s being transferred or where it’s come from or going to.
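A rough way to picture the “meet in the middle” arrangement is to write out the path and ask what each hop can see. The relay names and hop counts below are hypothetical and chosen purely for illustration; the point is only that each hop sees its immediate neighbours and nothing more.

```python
def neighbours(path):
    """Map each intermediate hop to the (previous, next) hops it can see."""
    return {path[i]: (path[i - 1], path[i + 1]) for i in range(1, len(path) - 1)}

# Hypothetical relay names; in reality the Tor software picks the relays.
client_side = ["client", "relay-a", "relay-b", "rendezvous"]
service_side = ["service", "relay-x", "relay-y", "relay-z", "rendezvous"]

# Full path: the client's circuit to the rendezvous point, followed by
# the service's circuit walked in reverse (dropping the shared rendezvous).
full_path = client_side + service_side[-2::-1]
view = neighbours(full_path)

# Which hops, if any, are directly adjacent to both the client and the service?
knows_both = [hop for hop, seen in view.items()
              if {"client", "service"} <= set(seen)]
print(view["rendezvous"], knows_both)
```

In this sketch the rendezvous point only ever sees two other relays, and `knows_both` comes out empty: no single hop sits next to both ends of the conversation.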

This allows for the provision of services within the darknet entirely, removing the need for the clearnet. This has many advantages – mainly, if your Tor exit node for a session happens to be in Russia, you’re likely to see Russian censorship as your traffic leaves Tor and enters the clearnet. If your traffic never reaches the clearnet, government censorship is unable to view and censor that traffic. It’s also very hard for governments monitoring darknets to reach out and shut down sites that are hosted in their jurisdiction – because they don’t know which sites are in their jurisdiction.

Increasingly, legitimate sites have started to offer hidden service mirrors or proxies, allowing Tor users to browse their content without leaving the network. Facebook, ironically, was one of the first major sites to offer this, targeting users in jurisdictions where network tampering is common. The popular search engine DuckDuckGo is another example.

Designed for criminals, or just coincidentally useful?

Of course, there are some criminal users of these networks – just as there are criminal users of the internet, and criminal users of the postal service, and criminal users of road networks. But was Tor made for criminal purposes?

Short answer, no. The long answer is still no – Tor was originally developed by the United States Naval Research Laboratory, and development has been subsequently funded by a multitude of sources, mostly related to human rights and civil liberty movements, including the US State Department’s human rights arm. Broadcasters increasingly fund Tor’s development as they try to find new ways to reach markets traditionally covered by border-spanning shortwave broadcasts. You can read up on Tor’s sponsors here.

The point is, Tor and other networks like I2P and Freenet were never designed with criminals in mind, but rather with strong anonymity and privacy in mind. These properties are technical, and define how the tool is designed and developed. These properties are vital for the primary users of these tools, and are intrinsically all-or-nothing.

This is an important point, and one that crops up again and again, both in discussions of Tor and in debates about things like government interception of encryption, or “banning” encryption unless it’s possible for the government to subvert it “in extremis”, as has been called for numerous times by the UK government, to give one example.

On a technical level, and a very fundamental one at that, one cannot make a tool that is simultaneously resistant to government censorship and traffic manipulation/interception and also permits lawful intercept by law enforcement authorities, because these networks span borders, and one person’s lawful intercept is another person’s repressive government. There is a lot of technical literature out there on why this is an exceptionally hard problem and practically infeasible, so I won’t go into detail on this. However, key escrow (the widely accepted “best” approach – though still highly problematic) has been attempted in the past by the NSA with the Clipper chip – and it failed spectacularly.

The Clipper chip die, implementing the short-lived SKIPJACK cipher and key escrow functionality, allowing in theory only the US Government to intercept and decrypt traffic. Within 3 years it had been comprehensively broken and abandoned.

These properties of anonymity and security also make the services attractive to certain types of criminals, of course, but in recent reports such as this one from RAND on DRL (US State Dept) funded Tor development, the general conclusion is that Tor doesn’t help criminals that much, because there are better tools out there for criminal use than Tor:

There is little reported evidence that the Internet freedom tools funded by DRL [ie: Tor] assist illicit activities in a material way, vis-à-vis tools that predated or were developed without DRL funding…

… given the wealth and diversity of other privacy, security, and social media tools and technologies, there exist numerous alternatives that would likely be more suitable for criminal activity, either because of reduced surveillance and law enforcement capabilities, fewer restrictions on their availability, or because they are custom built by criminals to suit their own needs – RAND Corporation report

Law enforcement efforts to shut down darknet sites like Silk Road (and its many imitators – there are by some estimates now several hundred sites like it that sprang up in the aftermath of its shutdown) tend to focus on technical vulnerabilities in the hidden service itself – effectively breaking into the service and forcing it to provide more information that can be used to identify it. Historically, however, most darknet site takedowns have been social engineering victories – where the people running a site are attacked, rather than the site itself.


I hope the above is useful for journalists and others trying to get a basic understanding of these tools beyond using scary terms like “the dark web” in reports without really knowing what that means. If you want to find out more then the links below are a good starting point.