How Google Works – Are You A Googler?

January 05, 2005

By Matt Jacks

Google is one of those words that seem to “spring up” from nowhere, and it’s now known everywhere. Google was founded in 1998 by Larry Page and Russian-born Sergey Brin, both students at Stanford University at the time. It has mushroomed so rapidly that it stands proudly today as the largest, best known and most often used of the available search engines.

Without expensive advertising, Google began its march to the top spot, by innovating, and focusing on producing exactly what was wanted by the users; the maximum relevancy of results, with the minimum waiting time.

Effectiveness soon became its watchword, along with users’ trust that the results pages are purely down to the ranking system, and that no one can buy their way to a higher rung on the ladder.

A Time to Dance

What about the way in which the Google search engine actually works? Exactly what goes into the mathematical formulae, known as algorithms, that govern the results is a closely guarded secret. It is said that there are a hundred or so different factors that influence how well, or how badly, a web site performs in the monthly ritual known as the Google Dance. This is that worrying time for webmasters when the billions of pages and documents in the Google index are updated and refreshed.

During the Google Dance, new URLs (Uniform Resource Locators - the “address” of a web page) will make their appearance and old ones will be removed, perhaps to reappear later, while others shift around in their ranking positions in the Google index, which decides how high up on the results pages they appear for particular search queries.

Nothing is necessarily fixed for more than a month; all is fluid movement to keep webmasters on their toes!

The Index is King, but a Spider is the Kingmaker

The index positioning is all important with search engines, because this is how they work. They don’t immediately start a web sweep for the information related to the search query (whatever is typed into the search box) every time a user comes along - this would be impossible.

Instead, automated robot programs called spiders (sometimes crawlers) are near constantly “crawling” the web, browsing from site to site via the hyperlinks, in much the same way as the browser (probably Microsoft Internet Explorer) that you are using to view this site now.

These spiders, or in Google’s case the Googlebots, then deliver all that they find (and accept) to an indexer program, which sorts and ranks this data into a usable database of billions of URLs. It is this database that is searched, in less than a second, to present the user with relevant site options on the search results pages.
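To make the idea of an index a little more concrete, here is a minimal sketch in Python of an “inverted index”, the general kind of structure that lets a search engine look words up instead of sweeping the web for every query. It is only an illustration: the function names and the two sample pages are invented for this example, and Google’s real indexer is far more elaborate and not public.

```python
from collections import defaultdict

def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical pages standing in for what a spider might deliver.
pages = {
    "http://example.com/a": "NASA moon landing history",
    "http://example.com/b": "moon phases calendar",
}
index = build_index(pages)
```

The crawling is done once, ahead of time; answering a query is then just a matter of looking up each word and keeping the pages that appear under all of them.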

Sounds simple in principle, but it is a very complicated business indeed.

Where does that page rank on the PageRank?

PageRank is the name of one of the most important evaluations in Google’s algorithm. It measures something called “link popularity”: how many “inbound links” a site has, with links from important pages counting for more. Inbound links are hyperlinks from other sites which lead to the one in question, as opposed to “outbound links”, which are the reverse.

And this way of obtaining a higher ranking position on Google is widely known about, so webmasters try to get as many inbound links as possible, sometimes by fair means and sometimes by foul.
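As a rough illustration of link popularity, the Python sketch below simply counts how many other sites link to each site. This is a deliberate simplification and not the PageRank formula itself, which also weighs how important each linking page is; the site names and the function name are made up for the example.

```python
from collections import Counter

def inbound_link_counts(links):
    """Count, for each site, how many distinct other sites link to it."""
    counts = Counter()
    for source, targets in links.items():
        for target in set(targets):   # ignore repeated links from one site
            if target != source:      # a site linking to itself does not count
                counts[target] += 1
    return counts

# Hypothetical link graph: each site maps to the sites it links out to.
links = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
}
counts = inbound_link_counts(links)
```

Here c.example comes out on top because two different sites point to it, which is exactly the kind of signal that webmasters try to influence.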

Old Macdonald had a Link Farm ee-i-ee-i-o

One of the ways this may be attempted is to post a link on a so-called “link farm”. These are sites with practically nothing on them apart from hundreds or thousands of links to other sites. However, this tactic is regarded by Google and most other search engines as highly objectionable, because it inflates link popularity artificially, and therefore sites that are caught using link farms will be penalized (have their ranking lowered).

Such things are one type of “spamdexing” (derived from spamming and indexing) that the search engines’ spiders are constantly on the lookout for, and are getting better at detecting all the time.

The Invisible Keyword

Hidden or invisible text is another attempt at unfairly boosting a search engine ranking; this is where sites have keywords all over them that are precisely the same color as a background, so web users cannot see them but they can still be read by spiders.

Again, those spiders are now wise to such trickery, and trying it out is considered highly risky.

Be Careful with that Optimization

There are many different ways of link hoarding or otherwise spamdexing, and none of them are welcomed by the search engines. For Google and the others to stay in business, their results must be relevant and must point accurately to quality sites. So any means of gaining an undeserved higher place in the results is not only unfair to the honest websites, but also damaging to the reputation of the search engines themselves.

Hence they are determined to punish offenders, which may even involve the removal of the guilty sites from the index, in severe cases.

SEO (search engine optimization) is a service that is now increasingly popular, as webmasters try to use the “expertise” of others to gain their sites a higher place on the results pages. But not all of these providers are above resorting to spamdexing for their clients, and so they should be chosen with great care.

Ways and Means

There are perfectly fair ways to optimize, or improve, a site, of course, and search engines like Google have rules which should be obeyed.

Here are some do’s and don’ts which may be of help:

Note: The following is only general advice, and cannot be taken as instruction on improving a ranking position.

  • Do include plenty of keywords (words relevant to a site which are likely to be part of a search query) in the text.
  • Don’t overdo this by flooding however, or using hidden text.
  • Do design things with visitors in mind and not particularly to attract spiders.
  • Don’t make use of shadow domains and other cloaking techniques (where the search engine spider sees different content from that which a user will see). This is most often regarded as the worst form of deceit and is usually severely punished.
  • Do be careful who you link to, because if they are spammers, it may affect you too.
  • Don’t use automated submission or query software, as Google and most other search engines regard this as a breach of their submission rules.
  • Do make use of a site map.
  • Don’t use too much text within images for important words, as spiders will not recognize them.
  • Do get rid of dead links.
  • Don’t try any form of spamdexing; you’ll probably be found out and penalized.
  • Do have static links leading to all pages.
  • Don’t use irrelevant words.
  • Do use titles that are accurate and representative of the pages they are on.
  • Don’t use auto-redirects as spiders will ignore them.
  • Do carefully read the submission rules to make sure you’re not breaking any accidentally, and research how to be spider friendly whilst still being true.

Remember that the best way to attract the spiders is simply to create a quality site, and you can always submit it yourself (for free) to Google if you don’t have too many inbound links to start off with.

But that will not necessarily guarantee inclusion in the index. It’s just like waving a flag to attract the spider’s attention, though it can’t do any harm (unless you break the rules).

Another good trick is to use a text only browser to look over your site with, to see that everything is clear and easily navigable; such things are appreciated by the Googlebot.

Help them to help you, in other words, and don’t try to trick them as it probably will do more harm than good.

Searching on Google

Of course most will approach Google not as a worried webmaster but as an eager searcher, looking for just about anything that can be imagined.

And everyone knows how to do a search; just type in a query and that’s it! But there are ways to help make things easier here as well.

One reason why Google is the most popular search engine is that it’s the biggest, but that is not the only reason for its success. Another essential factor to be weighed into the equation is its effectiveness.

Following a Stem

All the keywords in the query need to be matched within a document (apart from very common words, which are removed from the search terms), and this helps to bring in a greater sense of relevancy. If a common word is needed, though, a plus sign (+) can be placed before that word to keep it in the query.

The closeness or adjacency of those words is also part of it, with pages where the keywords sit closer together being promoted up the search results. What is called “stemming” is part of it as well. This is an intelligent system that will also include closely related variants of the words in the search terms, so that “churches” is included if you enter “church”, for example.
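The handling of common words, the plus sign and stemming can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the tiny list of common words, the function names, and above all the stemming rule, which just trims a plural ending and is nowhere near as clever as whatever Google actually uses.

```python
# A tiny, assumed list of common words; real lists are much longer.
STOP_WORDS = {"a", "an", "and", "in", "of", "the", "to"}

def naive_stem(word):
    """Trim a plural 'es' or 's' ending; a crude stand-in for real stemming."""
    if word.endswith("es") and len(word) > 4:
        return word[:-2]
    if word.endswith("s") and len(word) > 3:
        return word[:-1]
    return word

def normalize_query(query):
    """Drop common words, keep +forced words, and stem everything else."""
    terms = []
    for raw in query.lower().split():
        if raw.startswith("+") and len(raw) > 1:
            terms.append(raw[1:])          # forced inclusion: keep as typed
        elif raw not in STOP_WORDS:
            terms.append(naive_stem(raw))
    return terms
```

With this sketch, a query for “churches” and a query for “church” both reduce to the same term, and a common word such as “the” survives only when it is typed as +the.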

Another advantage of Google is that a morsel of the matched text, taken from the site, shows up in the results with your keywords in bold, so that users can judge for themselves whether it is in the right context and is what they required.

This can save time by preventing a visit to a site that is not quite what the user was after, and if they like what they see, then the “similar pages” option might help to sharpen the focus further.

Sometimes sites will not be available of course, either because of the bane of a search engine’s life, a dead link, or because of temporary technical problems on the web page. In the latter eventuality Google’s “cache” link can help by supplying the page as the spider last saw it, so that it can be read at once and the live page revisited later when the problem has been solved.

Bring on the Operators

So with these and other clever nuances Google can find what you want, but help still may be needed to narrow things down quicker.

You can choose the “search within results” option of course, but there are many other ways of tinkering with the search terms, using “advanced operators.” Most of these are available on the “advanced search” page, which is simple to use, with all the differing features clearly marked. But there is another way of doing this if you wish: adding the various operators to the query text in the regular search box.

The word “and” which is one of the Boolean operators used on other search engines, is not needed on Google, because all of the keywords are included anyway, but let’s have a quick look at some of the special operators which can be used on Google.

The terms allintitle: and intitle: (with the colon) can be used thus:

A search query of intitle:NASA moon (with no space between intitle: and the first word) will show results where the term NASA (or nasa - no capitalization is required in a Google search) is in the title of a web page and the term moon is anywhere on the page (perhaps in the title, but possibly elsewhere).

Whereas search terms of allintitle: NASA moon will only produce results where both NASA (or nasa) and the word moon are in the title.

Another way of doing the allintitle: search is to go to the “advanced search” page and alter “occurrences” according to what you want.

The title is that text which appears on the top bar of a browser window whenever a web page is viewed, as a title for the content of that page obviously. And as can be appreciated, use of these operators can significantly change what comes up on the results pages.
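The difference between the two title operators can be shown with a small Python sketch. It assumes a page is just a title plus some body text and that matching is on whole words, ignoring capitalization; the function names are invented here, and the query is passed in without the operator prefix.

```python
def matches_intitle(title, body, query):
    """intitle: the first word must be in the title, the rest anywhere."""
    words = query.lower().split()
    if not words:
        return False
    title_words = set(title.lower().split())
    all_words = title_words | set(body.lower().split())
    first, rest = words[0], words[1:]
    return first in title_words and all(w in all_words for w in rest)

def matches_allintitle(title, query):
    """allintitle: every word of the query must be in the title."""
    words = query.lower().split()
    if not words:
        return False
    title_words = set(title.lower().split())
    return all(w in title_words for w in words)
```

A page titled “NASA Home” whose text mentions the moon would satisfy intitle:NASA moon in this sketch, but not allintitle: NASA moon, because moon is missing from the title.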

The terms allinurl: and inurl: can be similarly utilized to adapt the search query to given requirements.

Options, options

If the site is already known, but not the page, then a search, for example, of robots site:www.NASA.gov (with no space after the colon) will bring results of pages about robots but only in the NASA site. This can also be brought about by use of the “domain” option in the “advanced search” function.

Links to a given site can be found as well, by either keying in the name to the link box in the “advanced search” or by use of the regular search box with a query of link:www.NASA.gov for example.

Many other features can help as well. Phrase searches can be made by enclosing the entire query in quotation marks, and the minus (-) sign can be used for keeping things out of the results. If you want to know about fish, for example, but not salmon or trout, then key in fish -salmon -trout and this will exclude those particular fish species.

Alternatively a search of fish +salmon +trout will tell Google that the given types of fish must be included in all the top results but that other fish can also be present as well.
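The plus and minus signs can be pictured as sorting the query into words a page must contain and words it must not. The Python sketch below does just that for a single page of text; it is a simplified model with invented function names, not a description of how Google processes queries internally.

```python
def parse_query(query):
    """Split a query into required words and excluded (-word) words."""
    required, excluded = [], []
    for token in query.lower().split():
        if token.startswith("-") and len(token) > 1:
            excluded.append(token[1:])
        else:
            term = token.lstrip("+")   # +word is treated as a required word
            if term:
                required.append(term)
    return required, excluded

def page_matches(text, query):
    """True if the text has every required word and no excluded word."""
    words = set(text.lower().split())
    required, excluded = parse_query(query)
    return (all(w in words for w in required)
            and not any(w in words for w in excluded))
```

So a page about cod would pass the query fish -salmon -trout as long as it mentions fish, while any page mentioning salmon would be filtered out.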

The term info: can be used to see what Google knows about a site or page. The link: operator mentioned above shows links to a particular site, and this too can be accessed from the “advanced search” page.

The “similar pages” and “cached” links that come up with the results can also be requested in the original query by using the operators related: and cache: respectively. And the related: feature can be accessed in a third way as well, yep, you guessed it: on the “advanced search” page.

More Stuff

The operator define: will bring up various definitions of a word as described by various pages in the Google index, and you can even check the value of your corporate shares with stocks: (as long as you know the ticker symbol for the company involved).

You can specify languages, dates, file types and other things as well in your use of Google, and switch on SafeSearch (from the “preferences” page) to keep out the porn, which is especially useful when running an image search.

Or why not check out mail order catalogs with Google Catalogs, or view products with Froogle?

Travel and flight information, street maps and phone numbers (US only), spell checkers, news services, calculators and weights and measures converters, not to mention parcel tracking, all this and more can be easily accessed via the Google search box.

Download the Google toolbar, for quick access and shortcuts to all search features on Google, or the Google Deskbar to run a search without first launching a browser.

And with Google Buttons added onto your browser, you can even highlight any text on a page you’re visiting to run a search, with no need to key it in.

Google it!

And we still haven’t mentioned everything that Google offers as the worldwide need for searching continues ever onwards.

But whatever it is that users are seeking, if they take a little time to learn exactly how Google works, the chances are good that they'll find exactly what they were looking for.


About The Author

Matt Jacks is a successful freelance writer providing tips and advice on a multitude of Internet-related topics including a look at the Google search engine and how it works.  His numerous articles offer valuable insight and moneysaving tips on interesting topics.
 
