Deep Web, deep insights - why you need a librarian

March 8, 2017
Clare Brown

It seems timely to revisit this blog post, which discusses the importance of library and information professionals in uncovering the deep web for the benefit of end-users.

Deep Web, Deep Insights

The original webinar was headed up by John DiGilio, Senior Director of Research & Intelligence at LibSource. He was joined by his colleague Katherine Henderson, a LibSource Research Analyst.

Quite aptly, John used the analogy of an iceberg to describe the different types of web - surface, deep and dark. It’s estimated that 80% of all web content actually sits below the surface, meaning that what we see through traditional engines such as Google and Bing is, quite literally, just the tip of the iceberg.

LibSource Deep Web Deep Insights Webinar

As a society, we’re throwing around the term ‘information overload’ like it’s going out of fashion, but this statistic really puts that into perspective. As John points out, when people talk about ‘information overload’ they haven’t even experienced a quarter of what’s really available on the web. In fact, the issue of overload is far more severe than one may initially realise.

How’s that for a reality check? 

What is the deep web?

When we talk about the surface web, we are referring to what one would naturally find from a surface-level search, i.e. through our most popular search engines. This makes up around 20% of the content out there.

“There is more to the internet than meets the eye, with its three distinct layers of depth. The Surface Web, occupying 10% of the internet, contains those websites with visible contents resulting from search engine indexing. These searchable, publicly available pages can be accessed from a standard web browser and connect to other pages using hyperlinks. However, information is being overlooked that was never intended to be hidden. This information, invisible to regular search engines, requires persistence and specialized search tools to locate. Beyond the Surface Web exist the Deep Web and the Dark Web.” Library application of Deep Web and Dark Web technologies (May 2020)

The deep web, on the other hand, goes a step further to cover the parts of the web whose content isn’t indexed by the Googles and Bings of the world. This may be because the content sits behind a paywall (such as video on demand services like Netflix), or because it is online banking, web mail or one of many other private use cases. John cited the computer scientist Mike Bergman as the man who coined the term back in the year 2000, and we have seen its usage and relevance grow ever since.
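To make the idea of "not indexed" a little more concrete, here is a minimal sketch in Python. It assumes a toy, entirely hypothetical website (the page names and the `LINKS` table are invented for illustration) and a very simplified crawler that, like a search engine's, can only discover pages by following hyperlinks. It is not how any real search engine is implemented, but it shows why pages that sit behind a search form or a login never make it into the index.

```python
from collections import deque

# Hypothetical toy site: each page maps to the hyperlinks it contains.
LINKS = {
    "/home": ["/about", "/catalogue"],
    "/about": ["/home"],
    "/catalogue": [],                # offers only a search form, no links to records
    "/catalogue/record?id=1": [],    # returned only when someone submits the form
    "/catalogue/record?id=2": [],
    "/account/statement": [],        # sits behind a login
}


def crawl(start):
    """Return the set of pages reachable from `start` by following hyperlinks."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for link in LINKS.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


surface = crawl("/home")       # what a link-following crawler can index
deep = set(LINKS) - surface    # pages that exist but are never discovered

print("Indexed (surface):", sorted(surface))
print("Never reached (deep):", sorted(deep))
```

In this sketch the catalogue records and the account statement exist and may be perfectly legitimate to view, but nothing links to them, so the crawler never finds them. Reaching them takes someone who knows the form is there and what to ask it.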

I found it fascinating to note that no single search engine indexes more than approximately 16% of the surface web. Yet your typical student - or indeed, I might add, your typical professional conducting their own research - is unlikely to use more than one search engine when looking for information and resources. According to BrightPlanet, the deep web is 5000 times larger and contains 1000-2000 times better quality information than the surface web. What’s more, John cited 95% of deep web information as being publicly available. So, why aren’t we making better use of it?

Of course, some information is inaccessible due to paywalls and other barriers but, for the most part, benefiting from the deep web simply requires specialist searches. You can use the same standard web browser as usual, be it Firefox, Chrome, Internet Explorer or otherwise. What changes is the focus on leveraging specialist knowledge: you may need to head to a particular deep web search engine, run a search behind a paywall or know exactly which resource to start with.

Why does the deep web matter?

Clearly, leveraging the deep web requires significantly more effort than your standard surface search. Plus, there’s even more information overload to contend with. What’s the point?

John answered this beautifully:

“Ultimately, the question is the answer” John DiGilio (2017)

Everyone is using the same set of standard search engines to search through the same set of information. Whether they be your child studying for a school research project or an attorney fact checking for a case, they are all doing one thing - sticking to the status quo.

“The status quo, or the average, is not what we strive for in this industry. We cannot afford to.” John DiGilio (2017)

In a knowledge-based industry, sticking to the status quo could be a very expensive mistake. Here are some reasons from John as to why:

  • Competitive, fast-paced markets - in this industry, we are always striving to achieve that leading edge. Each day we hear more about competitive intelligence, and to get the market edge you need to go beyond the status quo.
  • Digital knowledge-driven economy - if you’re only searching on the surface level, you’re only providing the same information that everybody else has access to. You need to go beyond this to add value.
  • Customer expectations - especially in the professions, your customers are hiring you because your knowledge and skillset are superior to theirs in your particular field. Your customers can conduct a surface search themselves; they expect you to go above and beyond what they are able to do.
  • Demonstrable value - quite simply, the deep web brings value that is far beyond what the typical web searcher is capable of. It is your role, as a Librarian, to go beyond ‘typical’.

How do I access the deep web?

Just a simple glance at the challenges awaiting deep web explorers makes the need for an expert guide clearer than ever.

  1. Firstly, there is such a vast amount of information in the deep web that it can be incredibly easy to get lost. We know how distracting information overload can be in the surface web; in the deep web it becomes four times harder. Going off on a tangent and getting stuck there is more of a risk than ever before.
  2. Then there are the issues of reliability and trust, as with all content. As John asked, how do we know who is making these materials available online? Do they have a specific bias? Is the content even authorised to be shared publicly?
  3. Thirdly, searcher limitations are a key consideration. The average user won’t even know how to begin to access the deep web, let alone navigate their way through it. Many crucial insights will be hidden behind paywalls, and searches will often appear to hit a dead end. At times, deep web content may even require you to meet certain legal requirements before you can start using and sharing information from particular databases.

This is where we need information experts

We’ve written before about the power of Librarians on this blog, and the need for them is more prominent now than ever before. The information industry is continuously growing and evolving, and so too are the needs and requirements of the people working within it. There is real recognition and advantage to be won by those who rise to its challenges.

Librarians are information experts who are able to master search whilst also leveraging technology to their advantage. As John rightly points out, not only do you need to know how to find material and how to use it, but you also need to let people know what it is you do, how you do it and that you are there to serve as a resource.

In fact, John went on to hypothesise that we are at a tipping point of actual information overload and do-it-yourself research effectiveness. He suggested that:

“[We] need the guidance of people trained in library and information science to help make sense of it all, to facilitate skilled information retrieval and to collaborate with information requesters for evaluation and further report.” John DiGilio (2017)

In 2014, the Australian Library and Information Association found there to be a return of $5.43 for every dollar invested in Librarians and library services. With our ever-growing wealth of information to sort through, and new technologies waiting to be utilised at our fingertips, this figure may well become significantly higher.
