11/1/2004 - Archive
November / December 2004 Vol. 22 / No. 6
Google for Genealogy
Google™ has won the search engine wars, for the time being. Its complete
text search functionality for its database of over 4 billion webpages
has made Google extremely popular. Last year, the search engine processed
more than 112 million search requests per day. This year, Google accounts
for forty percent of all Internet searches performed in the United States.
It has an even larger slice of the search engine pie in other countries, performing
sixty-five percent of all searches in the United Kingdom and eighty percent
of all searches in Germany.
Even with more than 4 billion webpages cataloged, Google only indexes
a portion of the Internet. It isn't a perfect tool, but it's the best
tool for the job right now. Perhaps one reason is that Google's catalog
of websites is kept very current. Its bots do an excellent job of searching
throughout the Internet for new and updated webpages every day.
For genealogists, using Google is a must. Not only does the search engine
provide search capabilities for current webpages, it also provides historical
copies of old webpages that have changed or been removed from the Web.
When you view the results of a Google search, you will see a link labeled
"Cached" under the individual search results. Following this
link will take you to a copy of that webpage stored at Google. This version
shows the website as it appeared when Google last visited it. Given
the impermanence of information on the Internet, these cached copies can
be a great help in finding genealogical information that has otherwise
vanished from the Web.
A few tips and tricks for using Google effectively in your genealogical
research are noted in this article. You should visit Google Help Central
at <www.google.com/help/index.html> and educate yourself on using
Google to its full potential. Of particular value are the Basics of Search
and the Advanced Search Tips.
The Google Toolbar
Before the specifics of using the Google website are discussed, a digression
on using the Google Toolbar is in order. The Google Toolbar is a free
download from Google available at <http://toolbar.google.com>. It
allows you to add the search functionality of the Google website to your
Microsoft Internet Explorer browser. By downloading and installing the
toolbar, you are able to do Google searches of the Internet directly from
your browser without having to first visit the Google website. A search
box appears on your browser's toolbar in which you can directly type your
search parameters.
The Google Toolbar has some additional features that make it a handy companion
to Google searches. By customizing the Options on the Toolbar, you can
maintain a drop-down list of prior searches you have performed. This is
great for helping you keep track of what you've already searched for.
In addition, the Highlight button will automatically highlight the exact
individual words in your search parameters anywhere they appear in your
search results. This makes finding the most relevant results much easier.
When you visit a specific website, the Search Site button on the Google
Toolbar allows you to search specifically within that particular website,
based on its domain name. This feature is excellent for further drill-down
searching once you have located a likely website and want to search more
deeply on that site alone.
While not specific to genealogy, another very useful feature of the Google
Toolbar is its Popup Blocker. This will close down the popup advertisements
that appear when you visit some websites. Be aware that some websites
generate new windows for reasons other than advertisements (such as database
search results). These new non-advertisement windows may also sometimes
be blocked in error by the Popup Blocker. The Google Toolbar allows you
to individually unblock popups on specific websites by visiting that site
and clicking the Popup Blocker button.
The Google website can be customized to fit your needs. Information on
how to customize the Google website can be found at <www.google.com/help/customize.html>.
The customizations themselves are made at <www.google.com/preferences>.
Customization allows you to specify the language you want to have your
searches returned in. The Safe Search Filtering blocks pornographic search
results from being returned. Perhaps the most useful customization is
for changing the number of results displayed per page. The default number
of search results per page is ten, but the maximum allowed is one hundred.
Choose the setting that best balances the number of results on each page
against the time it takes Google to render the results page for you.
Google is not case-sensitive regarding the parameters of your search.
A search on the uppercase "SMITH" will generate the same 40
million results as a search on the lowercase "smith." But the
Boolean operators used to qualify search parameters (AND,
OR, and NOT) must always be uppercase. Boolean operators
are used to broaden or narrow a search by specifying how the keywords
in the search parameters must relate to one another.
Google automatically defaults to the Boolean operator AND when you use
multiple words in search parameters. Thus, "Smith genealogy"
is exactly the same search as "Smith AND genealogy." Google
will allow up to about ten individual words for search parameters. If
you use more than ten, the remaining words are ignored. Use distinct words
for search parameters whenever possible. When searching for a common name
such as "John Smith," adding a location and time to the search
parameter such as "John Smith Moonshine Holler Missouri 1902"
will produce a more effective result. In Google, the Boolean operators
may be represented by mathematical symbols as well as by conjunctions.
Thus + is the same as AND, | represents OR, and - represents NOT.
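The word limit mentioned above is easy to check before you submit a long search. The following is a minimal sketch, assuming the limit is exactly ten words and that every whitespace-separated word counts toward it; the article only says "about ten," and the function name `split_at_word_limit` is hypothetical.

```python
def split_at_word_limit(query, limit=10):
    """Split a query into the words kept and the words past the limit.

    The default limit of ten comes from the article's "about ten";
    how Google actually counts operators or quoted phrases is not
    modeled here.
    """
    words = query.split()
    return words[:limit], words[limit:]


query = ("John Smith Moonshine Holler Missouri 1902 genealogy "
         "family history cemetery records census")
kept, ignored = split_at_word_limit(query)
print("searched:", " ".join(kept))
print("ignored: ", " ".join(ignored))
```

If the ignored list is not empty, drop the least distinctive words rather than letting the end of the search be cut off.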
The NOT Boolean operator is particularly useful if your genealogy includes
a famous surname. If you have Jefferson ancestry, Google may return a
great deal of information regarding the author of the Declaration of Independence.
In that case, you could search "Jefferson genealogy NOT Thomas"
to exclude information on the famous redhead. If you have a less famous
Thomas Jefferson in your family tree, you could avoid results including
the third U.S. president by searching for "Thomas Jefferson genealogy
NOT Virginia." Using a location parameter is a good way to focus
your search.
The OR operator allows you to waffle on genealogical searches when you
are not sure about your information. If you have a John Schmidt who may
have also gone by the first name Johannes, you can search Google using
"John OR Johannes AND Schmidt" to cover both possibilities.
As with non-Internet genealogical research, all possible combinations
need to be researched, including nicknames and abbreviations.
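The Boolean operators described above can be combined mechanically. This is a small sketch, not a definitive tool: the function `build_query` and its parameter names are hypothetical, and it only assembles the text you would type into the search box, relying on the implied AND between words and the uppercase OR and NOT as the article describes them.

```python
def build_query(all_words=(), any_words=(), without_words=()):
    """Assemble a search string from three groups of words.

    any_words     -> joined with uppercase OR
    all_words     -> listed plainly (AND is implied between words)
    without_words -> each prefixed with uppercase NOT
    """
    parts = []
    if any_words:
        parts.append(" OR ".join(any_words))
    parts.extend(all_words)
    parts.extend(f"NOT {word}" for word in without_words)
    return " ".join(parts)


print(build_query(all_words=["Schmidt"], any_words=["John", "Johannes"]))
print(build_query(all_words=["Jefferson", "genealogy"], without_words=["Thomas"]))
```

Building searches this way makes it easier to run the same pattern for every spelling, nickname, and abbreviation of a name.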
To keep a phrase together in the search parameters, surround the phrase
with quotation marks. For example, if you had an ancestor who reputedly
survived the Great Molasses Flood of 1919 and you wanted to learn more
about the event, you would get more relevant results by enclosing "great
molasses flood" in quotation marks than by letting Google search for the
words individually, located somewhere on the same page but not necessarily
together.
The Google Advanced Search page at <www.google.com/advanced_search>
provides a form that can be used to invoke the Boolean operators without
having to type them in the search parameters yourself. "Find results
with all the words" corresponds to using the AND operator. "Exact
phrase" is the equivalent of using quotation marks to keep words
together in a search. "At least one of the words" is OR and
"without the words" is NOT.
Of particular use for genealogists are the Date, Occurrences, and Domain
advanced search features. The Date field allows you to search only for
the webpages Google has found to be updated in the past three months,
six months, or one year. If you are being consistent with your genealogical
searches over time, this feature can be very handy for repeating your
standard searches on a periodic basis but limiting the search to only
those websites that have been updated.
The Occurrences field allows you to specify searches anywhere in any page,
in only the title of the pages, in only the text, in only the URL, or
in only the links on the pages. Specifying these types of searches can
be useful if you already know a title of a webpage and are trying to find
it again or if you know of a particular webpage that references some genealogical
information in its text but does not include it in its title or URL.
The Domain field is great for searching genealogy sites with large amounts
of content. If you remember that the FamilySearch site has a German word
list full of German genealogical words but can't seem to find it at FamilySearch,
use the Domain field to search for "German Word List" at <www.familysearch.org>
only. Notice that the search box on the results page for this advanced
search shows "German Word List" site:www.familysearch.org. This shows
that the syntax for the Domain Only search is "site:www.familysearch.org."
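The Domain Only syntax can also be assembled ahead of time and turned into a link you can bookmark. The sketch below quotes the phrase, appends the "site:" restriction from the example above, and encodes the result with Python's standard library. The function name `site_search_url` is hypothetical, and the address `https://www.google.com/search?q=...` is an assumption about how Google accepts a search in a URL; it is not taken from the article.

```python
from urllib.parse import urlencode


def site_search_url(phrase, domain):
    """Return the typed query and a search URL restricted to one domain."""
    query = f'"{phrase}" site:{domain}'
    # Assumed endpoint: the search text is passed in the "q" parameter.
    url = "https://www.google.com/search?" + urlencode({"q": query})
    return query, url


query, url = site_search_url("German Word List", "www.familysearch.org")
print(query)
print(url)
```

The first line printed is what you would type into the search box; the second is the same search as a link.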
Other advanced search syntax you can type directly into the search box
includes "daterange:" for Date, and "intitle:", "allinurl:",
and "intext:" for Occurrences. Whether you use the Advanced
Search page or type in the syntax yourself is up to you. Either way, if
you remember visiting a website whose URL included the phrase "smith
genealogy" but you can't remember the exact URL, you can search Google
with the syntax "allinurl: smith genealogy" to find it again.
In the search results returned by Google, you will notice that there are
Sponsored Links in the upper right-hand corner of the results page. These
are paid advertisements placed there based on one or more of the keywords
you searched on. Be aware that these sponsored links may or may not be
relevant to your search.
As with your offline research, you should be keeping a research log of
what you have searched for through Google. If you keep your word processor
open at the same time you use your browser to search with Google, you
can easily copy and paste your search syntax from every search and thus
keep an exact record of past searches. A research log with a listing of
past searches helps you resubmit identical searches three months or six
months later when webpages have changed, new webpages have appeared, and
Google's index is updated with this new information. Remember to use the
Advanced Search Date field or the "daterange:" syntax to avoid
getting results you have already seen.
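If you would rather not copy and paste into a word processor, the research log can be kept as a simple spreadsheet-style file. This is one possible sketch, assuming a CSV file with three columns (date, query, notes); the column layout and the function names `log_search` and `past_queries` are illustrative choices, not something the article prescribes.

```python
import csv
from datetime import date
from pathlib import Path


def log_search(log_path, query, notes="", when=None):
    """Append one search to a CSV research log.

    A header row is written first if the file does not exist yet.
    """
    path = Path(log_path)
    is_new = not path.exists()
    with path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        if is_new:
            writer.writerow(["date", "query", "notes"])
        writer.writerow([(when or date.today()).isoformat(), query, notes])


def past_queries(log_path):
    """Return every logged query, oldest first, ready to be resubmitted."""
    with Path(log_path).open(newline="", encoding="utf-8") as handle:
        return [row["query"] for row in csv.DictReader(handle)]
```

Calling `log_search("research_log.csv", "Jefferson genealogy NOT Thomas")` after each search records the exact syntax with the current date, and `past_queries("research_log.csv")` lists the searches to repeat a few months later.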
Always try alternate word choices in your searches. Use not only genealogy
but also "family history" and "family tree." Different
webmasters will title similar webpages differently so try to out-think
them. Misspellings are also common on the Internet so don't forget geneology
for genealogy and cemetary for cemetery. Abbreviations need to be taken
into consideration in your searches as well. Remember CO for county and
Reg. or Reg't for regiment. Finally, initials and nicknames in place of
full names are typical so try searching on F.X. for Francis Xavier and
Tom for Thomas.
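Trying every alternate wording by hand is tedious, so it can help to generate the list of searches first. The sketch below pairs each form of a name with each topic word, including the common misspelling noted above. The function name `query_variants` is hypothetical, and the surname Smith is only a stand-in.

```python
from itertools import product


def query_variants(names, topic_words):
    """Pair every name form with every topic word to list the searches to try."""
    return [f"{name} {topic}" for name, topic in product(names, topic_words)]


names = ['"Francis Xavier Smith"', '"F.X. Smith"']
topics = ["genealogy", "geneology", '"family history"', '"family tree"']

for query in query_variants(names, topics):
    print(query)
```

Two name forms and four topic words give eight searches, each of which can be pasted into Google and recorded in your research log.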
Google is an extremely powerful search engine, and you'll need to educate
yourself in its various features to use it most effectively in your family
research. Just as you had to learn how to use the microfilm readers when
you first visited your local library or Family History Center, you must
familiarize yourself with the ins and outs of Google's capabilities. By
just typing in a name and hoping for the best, you are not letting Google
do the heavy lifting for you.
Mark Howells Googles at firstname.lastname@example.org.
Access the table of contents for this issue of Ancestry Magazine