Ancestry Magazine
11/1/2004 - Archive
November / December 2004 Vol. 22 / No. 6
Google for Genealogy
Google™ has won the search engine wars, for the time being. Its complete
text search functionality for its database of over 4 billion webpages has
made Google extremely popular. Last year, the search engine processed more
than 112 million search requests per day. This year, Google accounts for forty
percent of all Internet searches performed in the United States. It has an
even larger slice of the search engine pie in other countries—performing
sixty-five percent of all searches in the United Kingdom and eighty percent
of all searches in Germany.
Even with more than 4 billion webpages cataloged, Google only indexes a portion
of the Internet. It isn't a perfect tool, but it's the best tool for the job
right now. Perhaps one reason is that Google keeps its catalog of websites
very current. Its bots do an excellent job of searching throughout the Internet
for new and updated webpages every day.
For genealogists, using Google is a must. Not only does the search engine
provide search capabilities for current webpages, it also provides historical
copies of old webpages that have changed or been removed from the Web.
When you view the results of a Google search, you will see a link labeled
"Cached" under the individual search results. Following this link
will take you to a copy of that webpage stored at Google. This version shows
the website as it appeared when Google last visited it. Given the impermanence
of information on the Internet, these cached copies can be a great help in
finding genealogical information that has otherwise vanished from the Web.
A few tips and tricks for using Google effectively in your genealogical research
are noted in this article. You should visit Google Help Central at <www.google.com/help/index.html>
and educate yourself on using Google to its full potential. Of particular
value are the Basics of Search and the Advanced Search Tips.
The Google Toolbar
Before the specifics of using the Google website are discussed, a digression
on using the Google Toolbar is in order. The Google Toolbar is a free download
from Google available at <http://toolbar.google.com>. It allows you
to add the search functionality of the Google website to your Microsoft Internet
Explorer browser. By downloading and installing the toolbar, you are able
to do Google searches of the Internet directly from your browser without having
to first visit the Google website. A search box appears on your browser's
toolbar in which you can directly type your Google searches.
The Google Toolbar has some additional features that make it a handy companion
to Google searches. By customizing the Options on the Toolbar, you can maintain
a drop-down list of prior searches you have performed. This is great for helping
you keep track of what you've already searched for. In addition, the Highlight
button will automatically highlight the exact individual words in your search
parameters anywhere they appear in your search results. This makes finding
the most relevant results much easier.
When you visit a specific website, the Search Site button on the Google Toolbar
allows you to search specifically within that particular website, based on
its domain name. This feature is excellent for further drill-down searching
once you have located a likely website and want to search more deeply on that
site alone.
While not specific to genealogy, another very useful feature of the Google
Toolbar is its Popup Blocker. This will close down the popup advertisements
that appear when you visit some websites. Be aware that some websites generate
new windows for reasons other than advertisements (such as database search
results). These new non-advertisement windows may also sometimes be blocked
in error by the Popup Blocker. The Google Toolbar allows you to individually
unblock popups on specific websites by visiting that site and clicking the
Popup Blocker button.
Customizing Google
The Google website can be customized to fit your needs. Information on how
to customize the Google website can be found at <www.google.com/help/customize.html>.
The customizations themselves are made at <www.google.com/preferences>.
Customization allows you to specify the language you want to have your searches
returned in. The Safe Search Filtering blocks pornographic search results
from being returned. Perhaps the most useful customization is for changing
the number of results displayed per page. The default number of search results
per page is ten, but the maximum allowed is one hundred. Set your number of
results per page to the number that best balances the number of results per
page versus the time it takes Google to render the results page for you.
Boolean Operators
Google is not case-sensitive regarding the parameters of your search. A search
on the uppercase "SMITH" will generate the same 40 million results
as a search on the lowercase "smith." But the Boolean operators
used to qualify search parameters—AND, OR, and NOT—must
always be uppercase. Boolean operators are used to broaden or narrow a search
by specifying how the keywords in the search parameters must relate to one
another.
Google automatically defaults to the Boolean operator AND when you use multiple
words in search parameters. Thus, "Smith genealogy" is exactly the
same search as "Smith AND genealogy." Google will allow up to about
ten individual words for search parameters. If you use more than ten, the
remaining words are ignored. Use distinct words for search parameters whenever
possible. When searching for a common name such as "John Smith,"
adding a location and time to the search parameter such as "John Smith
Moonshine Holler Missouri 1902" will produce a more effective result.
In Google, the Boolean operators may be represented by mathematical symbols
as well as by conjunctions. Thus + is the same as AND, | represents OR, and
- (a minus sign) means NOT.
The NOT Boolean operator is particularly useful if your genealogy includes
a famous surname. If you have Jefferson ancestry, Google may return a great
deal of information regarding the author of the Declaration of Independence.
To avoid this, you could search "Jefferson genealogy NOT Thomas"
to screen out information on the famous redhead. If you have a less famous Thomas
Jefferson in your family tree, you could avoid results including the third
U.S. president by searching for "Thomas Jefferson genealogy NOT Virginia."
Using a location parameter is a good way to focus a search.
The OR operator allows you to waffle on genealogical searches when you are
not sure about your information. If you have a John Schmidt who may have also
gone by the first name Johannes, you can search Google using "John OR
Johannes AND Schmidt" to cover both possibilities. As with non-Internet
genealogical research, all possible combinations need to be researched, including
nicknames and abbreviations.
To keep a phrase together in the search parameters, surround the phrase with
quotation marks. For example, if you have an ancestor who reputedly survived
the Great Molasses Flood of 1919 and you want to learn more about the event,
you will get more relevant results by enclosing "great molasses flood"
in quotation marks than by letting Google search for the words individually,
located somewhere on the same page but not necessarily together.
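If you like to prepare your searches before typing them in, the rules in this section can be sketched as a small Python helper. This is only an illustration: the function name build_query and its arguments are made up for this example, and it writes NOT as a minus sign directly in front of the unwanted word.

```python
def build_query(all_words=(), exact_phrases=(), any_of=(), exclude=()):
    """Assemble a search string from the pieces described in this section.

    All names here are hypothetical; this is a sketch, not a Google tool.
    """
    parts = list(all_words)                      # plain words are ANDed by default
    parts += [f'"{p}"' for p in exact_phrases]   # quotation marks keep a phrase together
    if any_of:
        parts.append(" OR ".join(any_of))        # OR must be uppercase
    parts += [f"-{w}" for w in exclude]          # minus sign stands in for NOT
    return " ".join(parts)


print(build_query(all_words=["Jefferson", "genealogy"], exclude=["Thomas"]))
print(build_query(all_words=["Schmidt"], any_of=["John", "Johannes"]))
print(build_query(exact_phrases=["great molasses flood"], all_words=["1919"]))
```

Each printed line can be pasted into the Google search box as is.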
Advanced Search
The Google Advanced Search page at <www.google.com/advanced_search> provides
a form that can be used to invoke the Boolean operators without having to
type them in the search parameters yourself. "Find results with all the
words" corresponds to using the AND operator. "Exact phrase"
is the equivalent of using quotation marks to keep words together in a search.
"At least one of the words" is OR and "without the words"
is NOT.
Of particular use for genealogists are the Date, Occurrences, and Domain advanced
search features. The Date field allows you to search only for the webpages
Google has found to be updated in the past three months, six months, or one
year. If you are being consistent with your genealogical searches over time,
this feature can be very handy for repeating your standard searches on a periodic
basis but limiting the search to only those websites that have been updated.
The Occurrences field allows you to specify searches anywhere in any page,
in only the title of the pages, in only the text, in only the URL, or in only
the links on the pages. Specifying these types of searches can be useful if
you already know a title of a webpage and are trying to find it again or if
you know of a particular webpage that references some genealogical information
in its text but does not include it in its title or URL.
The Domain field is great for searching genealogy sites with large amounts
of content. If you remember that the FamilySearch site has a German word list
full of German genealogical words but can't seem to find it at FamilySearch,
use the Domain field to search for "German Word List" at <www.familysearch.org>
only. Notice that the search box on the results page for this advanced search
shows "German Word List" site:www.familysearch.org. This shows that the syntax
for the Domain Only search is "site:www.familysearch.org."
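For readers who keep bookmarks of their standard searches, the Domain Only syntax can be turned into a link with a few lines of Python. Treat this as a sketch under two assumptions: the function name site_search_url is invented for the example, and the search page is assumed to accept the query text in a "q" parameter.

```python
from urllib.parse import urlencode


def site_search_url(phrase, domain):
    """Return a search URL for an exact phrase limited to one domain (hypothetical helper)."""
    query = f'"{phrase}" site:{domain}'
    # Assumption: the query text is passed in a parameter named "q".
    return "https://www.google.com/search?" + urlencode({"q": query})


print(site_search_url("German Word List", "www.familysearch.org"))
```

The urlencode call takes care of the quotation marks, spaces, and colon, which cannot appear unescaped in a web address.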
Other advanced search syntax you can type directly into the search box includes
"daterange:" for Date, and "intitle:", "allinurl:",
and "intext:" for Occurrences. Whether you use the Advanced Search
page or type in the syntax yourself is up to you. Either way, if you remember
visiting a website whose URL included the phrase "smith genealogy"
but you can't remember the exact URL, you can search Google with the syntax
"allinurl: smith genealogy" to find it again.
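One wrinkle with typing "daterange:" yourself is that, as the operator has generally been described, it expects Julian day numbers rather than calendar dates. Assuming that is the case, the conversion can be sketched in Python as follows. The function names are invented for this example.

```python
from datetime import date, timedelta

# Offset between Python's date ordinal and the Julian day number:
# date(2000, 1, 1).toordinal() is 730120, and that day's Julian day number is 2451545.
JDN_OFFSET = 1721425


def julian_day(d):
    """Julian day number for a calendar date."""
    return d.toordinal() + JDN_OFFSET


def daterange_term(start, end):
    """Build a daterange: term (assumed form: daterange:START-END in Julian day numbers)."""
    return f"daterange:{julian_day(start)}-{julian_day(end)}"


def last_n_days_query(query, days, today=None):
    """Append a daterange: term covering the last `days` days to a query."""
    today = today or date.today()
    return f"{query} {daterange_term(today - timedelta(days=days), today)}"


print(last_n_days_query("Smith genealogy", 90, today=date(2004, 11, 1)))
```

If the syntax does not behave as assumed, the Date field on the Advanced Search page covers the same ground without any arithmetic.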
Other Considerations
In the search results returned by Google, you will notice that there are Sponsored
Links in the upper right-hand corner of the results page. These are paid advertisements
placed there based on one or more of the keywords you searched on. Be aware
that these sponsored links may or may not be relevant to your search.
As with your offline research, you should be keeping a research log of what
you have searched for through Google. If you keep your word processor open
at the same time you use your browser to search with Google, you can easily
copy and paste your search syntax from every search and thus keep an exact
record of past searches. A research log with a listing of past searches helps
you resubmit identical searches three months or six months later when webpages
have changed, new webpages have appeared, and Google's index is updated with
this new information. Remember to use the Advanced Search Date field or the
"daterange:" syntax to avoid getting results you have already seen.
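If you would rather not copy and paste into a word processor, a research log can also be kept as a simple spreadsheet-friendly file. The following Python sketch shows one possible layout. The file name, the column names, and the two function names are all assumptions made for this example.

```python
import csv
import tempfile
from datetime import date
from pathlib import Path


def log_search(log_path, query, notes="", when=None):
    """Append one search to a CSV research log, writing a header row for a new file."""
    path = Path(log_path)
    is_new = not path.exists()
    with path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(["date", "query", "notes"])
        writer.writerow([(when or date.today()).isoformat(), query, notes])


def past_queries(log_path):
    """Return the distinct queries already logged, in the order first searched."""
    with Path(log_path).open(newline="", encoding="utf-8") as f:
        return list(dict.fromkeys(row["query"] for row in csv.DictReader(f)))


# Demonstration in a throwaway folder; a real log would live with your research files.
with tempfile.TemporaryDirectory() as folder:
    log = Path(folder) / "research_log.csv"
    log_search(log, '"John Smith" "Moonshine Holler" Missouri 1902', "nothing new")
    log_search(log, "John OR Johannes Schmidt genealogy", "two new pages")
    for q in past_queries(log):
        print(q)
```

Three or six months later, the list returned by past_queries gives you the exact searches to resubmit.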
Always try alternate word choices in your searches. Use not only genealogy
but also "family history" and "family tree." Different
webmasters will title similar webpages differently so try to out-think them.
Misspellings are also common on the Internet so don't forget geneology for
genealogy and cemetary for cemetery. Abbreviations need to be taken into consideration
in your searches as well. Remember CO for county and Reg. or Reg't for regiment.
Finally, initials and nicknames in place of full names are typical so try
searching on F.X. for Francis Xavier and Tom for Thomas.
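Trying every combination by hand is tedious, so here is a minimal Python sketch that lists the combinations for you. The function names and the sample word lists are illustrative only. Substitute your own names, spellings, and abbreviations.

```python
from itertools import product


def quote_if_phrase(text):
    """Wrap multi-word text in quotation marks so it is searched as a phrase."""
    return f'"{text}"' if " " in text else text


def variant_queries(name_variants, topic_variants):
    """Pair every name variant with every topic variant."""
    return [
        f"{quote_if_phrase(name)} {quote_if_phrase(topic)}"
        for name, topic in product(name_variants, topic_variants)
    ]


names = ["Thomas Jefferson", "Tom Jefferson"]
topics = ["genealogy", "geneology", "family history", "family tree"]
for q in variant_queries(names, topics):
    print(q)
```

Two name variants and four topic variants yield eight searches, each of which belongs in your research log.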
Google is an extremely powerful search engine, and you'll need to educate
yourself in its various features to use it most effectively in your family
research. Just as you had to learn how to use the microfilm readers when you
first visited your local library or Family History Center, you must familiarize
yourself with the ins and outs of Google's capabilities. By just typing in
a name and hoping for the best, you are not letting Google do the heavy lifting
for you.
Mark Howells Googles at markhow@oz.net.