
Vol. 77, No. 9, September 2004
Searching Smarter:
Finding Legal Resources on the Invisible Web
A great deal of information available on the Internet is found only 
on the "invisible Web," and is not searchable using a general search 
engine such as Google. Invisible Web content is considered dynamic 
because it exists as pieces of information within a database until you 
pull it together. Learn what strategies you can use to efficiently 
locate this content.
 
by Bonnie Shucha
By now, most attorneys have discovered that the Internet can be a 
powerful tool for legal research. Increasingly, Web search engines like 
Google have moved up in the ranks of computer-assisted legal research 
tools alongside more expensive resources such as LexisNexis and Westlaw. 
Even some judges are using the Web to check facts and statistics 
presented by attorneys and are reporting their findings in written 
opinions.1
Bonnie Shucha is the 
reference and electronic services librarian at the U.W. Law Library. She 
is past president of the Law Librarians Association of Wisconsin. She 
may be contacted at bjshucha@wisc.edu.
 
As the Web makes its way into the courtroom, it is important that 
legal practitioners know how to search it effectively. Unfortunately, 
this is not always the case. A new study reveals that while 
professionals are spending increasingly large amounts of time on 
computer-based searching, most are dissatisfied with their search 
experience.2
It is estimated that most searchers locate only 0.03 percent - or 1 
in 3,000 - of the Web pages available to them.3 Although such results may be due, in part, to a 
poorly constructed search, a large portion of the blame also falls on 
the search engine itself. Even the most experienced searcher, using the 
largest search engines, can access only about 16 percent of all Web 
content. Why? Because 84 percent of the information available on the 
Internet is found only on the "invisible Web," also known as the "deep 
Web," and is not searchable using a general search engine such as 
Google.4
By recognizing how the invisible Web differs from other Web content, 
you will understand how to alter your search strategies to find this 
information in a time-efficient manner. This article investigates the 
nature of the invisible Web and offers strategies for locating invisible 
Web content.
What is the Invisible Web?
To understand the concept of the "invisible Web," it may be helpful 
to first explore the nature of the "visible Web." A visible Web page is 
one that exists in "static" or unchanging form until its creator alters 
it. In this way, it is similar to a document that you might create in a 
word processor. Both types physically exist as files on a computer: the 
word processed document might be saved as a .doc or .wpd file on your 
hard drive whereas a visible Web page might be stored as a .htm or .html 
file on a Web server.
These static Web pages are considered visible because standard search 
engines are able to index them; that is, to catalog them so that they 
can be displayed as search results. 
Most search engines index new documents in one of two ways: 1) by using 
automated "spiders" or "crawlers" to follow links from other documents 
that are already indexed; or 2) when a webmaster registers a Web page. 
Because these methods of indexing documents are well established and 
relatively inexpensive, most search engines draw primarily upon visible 
Web content.
As you now know, there is another type of Web content known as the 
"invisible Web." Most invisible Web content is considered "dynamic" 
because it consists of bits of information that are stored in a database 
and pulled together on-the-fly into a Web page at your request.5 Invisible Web pages don't actually exist until you 
submit a query to the database containing the information and the 
matching information is drawn together into a Web page. Usually, an 
invisible Web search is conducted via a specialized search interface, or 
search box, provided by the database creator.
This concept is somewhat similar to the mail merge feature in most 
word processors. In a mail merge, content is drawn from an outside data 
source, such as an Excel file, and inserted into a new, customized 
document. Like an invisible Web page, the mail merged document did not 
previously "physically" exist as a stored file on a computer. Rather, 
both types are created at the point of need.
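For readers who want to see the idea in miniature, the following is a minimal sketch of how a dynamic page might be assembled at the point of need. It is an illustration only: the table name, the sample rows, and the function names are made up for this example and do not describe any real statutes database.

```python
import html
import sqlite3

# Made-up sample rows standing in for a database of statutes (not real law).
SAMPLE_ROWS = [
    ("1.01", "Livestock branding"),
    ("1.02", "Livestock markets"),
    ("2.05", "Rules of the road"),
]


def build_database():
    """Create an in-memory database holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE statutes (section TEXT, title TEXT)")
    conn.executemany("INSERT INTO statutes VALUES (?, ?)", SAMPLE_ROWS)
    return conn


def render_results_page(conn, keyword):
    """Pull matching rows together into a Web page that did not exist before."""
    cursor = conn.execute(
        "SELECT section, title FROM statutes WHERE title LIKE ? ORDER BY section",
        (f"%{keyword}%",),
    )
    items = "".join(
        f"<li>{html.escape(section)}: {html.escape(title)}</li>"
        for section, title in cursor
    )
    heading = html.escape(keyword)
    return f"<html><body><h1>Results for {heading}</h1><ul>{items}</ul></body></html>"


if __name__ == "__main__":
    print(render_results_page(build_database(), "livestock"))
```

No file containing the results page is stored anywhere. The page is built only when a keyword arrives, which is why a spider that looks for stored files has nothing to find.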
Because invisible Web content is dynamic, or "physically" nonexistent, 
most conventional search engines are unable to retrieve it. Traditional 
methods of indexing that are based on following links from other 
documents or webmaster registration are inadequate because they rely on 
the existence of a static file. Once most search engine spiders hit a 
database's search form, they are forced to stop because user input is 
required. Conventional search engines are simply not capable of 
automatically generating that input in the form of a search. As one 
author notes, "It's not that the information is really hidden or 
invisible. It's there, freely available and waiting to be found. The 
problem is that general search engines are built in such a way that they 
just can't go into [a] database and search the information contained 
[there]."6
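The behavior described above can be sketched with a toy crawler. This is a simplified illustration under stated assumptions, not how any particular search engine works: the three-page "web" is invented, and the crawler simply follows links and records where it meets a search form that it cannot fill in.

```python
from html.parser import HTMLParser


class LinkAndFormFinder(HTMLParser):
    """Collect the links on a page and count its search forms."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.forms = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag == "form":
            self.forms += 1


def crawl(pages, start):
    """Follow links through an in-memory 'web' (a dict of address -> HTML).

    Returns the pages that were indexed and the pages where the crawler
    had to stop because a form required user input.
    """
    indexed, stopped_at_form = [], []
    queue, seen = [start], {start}
    while queue:
        url = queue.pop(0)
        finder = LinkAndFormFinder()
        finder.feed(pages[url])
        indexed.append(url)
        if finder.forms:
            # The crawler cannot invent a query, so the database stays unseen.
            stopped_at_form.append(url)
        for link in finder.links:
            if link in pages and link not in seen:
                seen.add(link)
                queue.append(link)
    return indexed, stopped_at_form


# Invented three-page site: two static pages and one search form.
SAMPLE_WEB = {
    "/home": '<a href="/about">About</a> <a href="/search">Search</a>',
    "/about": "<p>A static page.</p>",
    "/search": '<form action="/results"><input name="q"></form>',
}

if __name__ == "__main__":
    print(crawl(SAMPLE_WEB, "/home"))
```

The search page itself is indexed because it is a static file reached by a link. The results behind its form never are, because no link leads to them.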
Besides dynamic Web pages, other types of content also are considered 
"invisible." Very recently created static Web pages are effectively 
invisible because search engines' spiders have not yet had a chance to 
index them. It is estimated that it takes three to four months before a 
static Web page is indexed by a search engine.7 Password protected information also may be 
considered part of the invisible Web because search engines are unable 
to access and index this content without the proper authorization. Such 
content might include information within subscription databases such as 
LexisNexis and Westlaw or confidential business databases.
What Type of Content Is (and Is Not) Freely Available on the Invisible Web?
Fortunately, the vast majority of invisible Web content, 95 percent, 
is publicly accessible, free information. Studies reveal that the 
quality of documents found on the invisible Web often exceeds that of 
documents that are accessible via conventional search engines.8 Invisible Web content includes a vast number of legal and 
governmental documents such as case law, statutes, bills, regulations, 
patents, briefs, census data, government reports, treaties, and much 
more. A great deal of business and corporate data also is available on 
the invisible Web, including SEC filings, stock quotes, company 
profiles, and so on. More general types of information also can be 
found on the invisible Web, such as address and phone number 
directories, flight schedules, dictionary definitions, maps, and more.
Searchers should be aware that much of the law-related information 
that is freely available on the invisible (and visible) Web is material 
that is in the public domain. Despite assertions by some novice 
researchers that "everything is free on the Web," there are certain 
types of content that are unlikely to be found at no cost on the Web. 
These include books and articles that are published for profit, public 
domain documents that have editorial enhancements, or other 
authoritative materials that are considered to be someone's intellectual 
property.9
Tips for Finding Invisible Web Content
The first step in locating any type of information is considering 
where an authoritative source of that information might be found. It may 
be a print source, the Web (visible or invisible), a subscription 
database, a phone call, and so on. If it is available from more than one 
source, you will need to consider what will be the quickest, most 
cost-effective way to obtain it.
If you determine that the information you need might be available on 
the invisible Web, how do you find it? Fortunately, there is nothing 
magical about finding content on the invisible Web. It's simply a matter 
of knowing where to look. Consider that:
- a great deal of excellent legal and business information is freely 
available on the Internet;
- unfortunately, much of it is contained within databases and is, 
therefore, invisible or inaccessible to most conventional search 
engines;
- the most effective way to access this information is using the 
database's own search interface, or search box;
- fortunately, the search box is usually found on a static, visible 
Web page that is accessible using a conventional search 
engine.
The following scenario may help illustrate: You have been asked to 
locate Wisconsin statutes concerning livestock. You don't have a copy of 
the Wisconsin Statutes in print, but you think that they might be 
available on the Internet. You go to Google and do a search containing 
the keywords "livestock statutes Wisconsin." You find some interesting 
information about your topic and references to the statutes, but not the 
statutes themselves.
Take a moment to reconsider the search. If you were doing the 
research using print sources, you would first locate a copy of the 
Wisconsin Statutes, then search the index for your keyword, "livestock." 
The same strategy applies when doing research on the Web. Because 
providers of legal and business information often publish their 
collections within "invisible" databases, it is more effective to first 
limit your search to find the provider's "visible" search interface 
page. Once there, you could query the database using your specific 
keywords. As one author notes, "often the key to the answer is not 
locating the answer itself as the first step, but locating the right 
database in which to search for it."10
Back at Google, you decide to try your search again, this time using 
just the keywords "Wisconsin Statutes." The very first item in your 
search results is the freely available Revisor of Statutes Bureau's 
(RSB) Wisconsin Statutes database. In the RSB's search box, you proceed 
with your specific search for "livestock" and successfully locate the 
specific statutes that you need.
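The second step of this scenario, typing a keyword into the database's own search box, amounts to sending a query to the database. The sketch below shows roughly what that might look like behind the scenes, and applies the rule of thumb from endnote 5 for telling static and dynamic addresses apart. The Web addresses and the parameter name "q" are hypothetical placeholders, not the RSB's actual interface.

```python
from urllib.parse import urlencode, urlsplit

# Hypothetical addresses used only for illustration (not real sites).
VISIBLE_SEARCH_PAGE = "https://statutes.example/index.html"
QUERY_ADDRESS = "https://statutes.example/search"


def build_query_url(base, keyword):
    """Return the address a search box might request for a keyword."""
    return f"{base}?{urlencode({'q': keyword})}"


def looks_dynamic(url):
    """Rule of thumb from endnote 5: a question mark signals a database query."""
    return bool(urlsplit(url).query)


def looks_static(url):
    """Rule of thumb from endnote 5: static pages often end in .htm or .html."""
    parts = urlsplit(url)
    return not parts.query and parts.path.lower().endswith((".htm", ".html"))


if __name__ == "__main__":
    query_url = build_query_url(QUERY_ADDRESS, "livestock")
    print(query_url, looks_dynamic(query_url))
    print(VISIBLE_SEARCH_PAGE, looks_static(VISIBLE_SEARCH_PAGE))
```

The visible page that holds the search box is the kind of address a conventional search engine can index. The address produced by the query is the kind it usually cannot.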
Chances are that you have already used invisible Web content without 
realizing it. Maybe a librarian directed you to the RSB's Wisconsin 
Statutes page or perhaps you saw a link on WisBar. With a better 
understanding of why some Web content is considered "invisible" and a 
knowledge of the strategies used to locate it, you will be able to 
search smarter and get maximum value from the time you spend 
researching on the Web.
Endnotes
1. Michael Pena, Google's Domain Even Takes in Law Offices, East Bay 
Bus. Times, May 7, 2004; Declan McCullagh, Search Engines Take the 
Stand, CNET News.com, May 13, 2004.
2. Delphi Research Asks: Does Search Contribute to Productivity?, 
DelphiWeb.com Newsflash, May 5, 2004.
3. Michael K. Bergman, The Deep Web: Surfacing Hidden Value, 7 J. Elec. 
Pub. 1 (August 2001).
4. Id.
5. One way to recognize if a Web page is static or dynamic is to look at 
its URL, or Web address. The addresses of static pages often end in the 
extension .htm or .html. The addresses of dynamic pages usually include 
a question mark, indicating that the resulting Web page is based on a 
database query.
6. Diana Botluk, Mining Deeper Into the Invisible Web, LLRX.com, Nov. 
15, 2000.
7. See Bergman, supra note 3.
8. Id.
9. There are a few exceptions to this rule. Several news sources, such 
as the New York Times, offer full-text content on their Web sites. 
Usually, however, this includes current content only and the reader is 
subjected to numerous advertisements. Additionally, some authors and 
publishers have chosen to offer their content free on the Web for 
several reasons: to facilitate the distribution of scholarly 
information, to market themselves, or to put forth their own point of 
view. Be wary of the last of these.
10. See Botluk, supra note 6.
Wisconsin Lawyer