Log Hunting: Learning More About Your Visitors through the Referrer Field

Every server that hosts websites generates a visitor log. An entry is made in it every time a visitor requests a file from the server. The date and time of the request are captured, along with the IP address of the visitor. These logs are a goldmine of information about the visitors to your site, but half of all website owners ignore them. Let's hope you're on the right side of that statistic.

Looking at a Log

Here's an example of a typical log entry in the extended log format (often called the combined log format).

211.32.64.10 - - [09/Dec/2000:06:04:53 -0500] "GET /directory/index.htm HTTP/1.1" 200 17645 "http://search.yahoo.com/bin/search?p=search+engine+positioning" "Mozilla/4.0 (compatible; MSIE 4.01; Windows NT)"

Or, broken out into its individual fields:

1) 211.32.64.10
2) -
3) - or USERNAME
4) [09/Dec/2000:06:04:53 -0500]
5) "GET /directory/index.htm HTTP/1.1"
6) 200
7) 17645 or -
8) "http://search.yahoo.com/bin/search?p=search+engine+positioning"
9) "Mozilla/4.0 (compatible; MSIE 4.01; Windows NT)"

Here's what each section means.

No. 1 is the IP address of the visiting computer.
No. 2 is the remote logname, which is derived from identd services. In this case the feature has not been turned on because of its impact on performance, so the place is filled with a dash.
No. 3 would indicate the user name of the individual if the section accessed was password protected.
No. 4 is the date and time stamp of the request.
No. 5 is the request itself: the action taken (in this case "GET"), the path of the file requested, and the protocol version.
No. 6 is the server's return code. In this case "200" is a successful request.
No. 7 is the number of bytes sent to the visitor. If a dash appears and the server return code is 304, this means the visitor pulled the file from their local cache.
No. 8 is the referrer field, something we'll be talking about a lot more. It shows the site the visitor came from, in this case Yahoo, and the keywords searched for.
No. 9 is the user agent and operating system of the visitor.
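
To make the breakdown concrete, here is a minimal Python sketch that splits a log line into the nine fields described above. The pattern and the function name parse_entry are our own illustration, not part of any log analysis product, and real logs may contain variations this pattern does not handle.

```python
import re

# One named group per field, in the order the fields appear in the entry.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<logname>\S+) (?P<user>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_entry(line):
    """Return a dict of the nine fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

SAMPLE = ('211.32.64.10 - - [09/Dec/2000:06:04:53 -0500] '
          '"GET /directory/index.htm HTTP/1.1" 200 17645 '
          '"http://search.yahoo.com/bin/search?p=search+engine+positioning" '
          '"Mozilla/4.0 (compatible; MSIE 4.01; Windows NT)"')

entry = parse_entry(SAMPLE)
for name, value in entry.items():
    print(f"{name}: {value}")
```

Every value comes back as text, so the status code and byte count would need converting before you do any arithmetic with them.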

According to a recent SEO poll conducted by Iconocast, only 53% of website owners analyze the traffic they receive from search engines and portals, 49% track which search terms bring visitors, and 42% keep track of their competitors' rankings. In other words, roughly half of all website owners pay little or no attention to the most important piece of marketing intelligence they can get about their website and its visibility.

Hits Vs. Visits

First of all, let's once again address a common misconception that can severely distort our perception of a site's performance. Hits and visits are not the same thing! Please don't use the word "hits" when examining site traffic. In most cases, it's a meaningless measure. Here is a quick rundown of what each term means.

Hits - When you visit a page on a site, there are a number of files that go to make up that page. Each graphic file, as well as the actual HTML file, has to be downloaded to create the page. Every file downloaded generates a hit. Therefore, a single page that has 18 graphic files will generate 19 hits. This is the reason that hits are an unreliable indicator of traffic.

Visits - This is an actual person coming to see your site. They may look at one page, or several. The entire session counts as one visit.

Page Views - This measures the number of pages seen by visitors.

Unique Visits - These are visits from an individual visitor as determined by an IP address, domain name and persistent cookies.
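
The difference between these measures is easy to see in a rough Python sketch. It rests on assumptions of our own: a page is any path ending in .htm, .html or a slash, a visit ends after 30 minutes of silence from an IP address, and unique visitors are approximated by IP address alone (no domain names or cookies). Commercial tools apply their own, more refined rules.

```python
from datetime import datetime, timedelta

# Assumed rules for this sketch only.
PAGE_SUFFIXES = (".htm", ".html", "/")
SESSION_GAP = timedelta(minutes=30)

def summarize(requests):
    """requests: list of (ip, timestamp, path) tuples in time order."""
    hits = len(requests)  # every file requested counts as a hit
    page_views = sum(1 for _, _, path in requests
                     if path.lower().endswith(PAGE_SUFFIXES))
    visits = 0
    last_seen = {}  # most recent request time for each IP address
    for ip, stamp, _ in requests:
        when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
        previous = last_seen.get(ip)
        if previous is None or when - previous > SESSION_GAP:
            visits += 1  # first request, or a return after a long gap
        last_seen[ip] = when
    return {"hits": hits, "page_views": page_views,
            "visits": visits, "unique_visitors": len(last_seen)}

# One visitor loads one page that contains 18 graphic files.
requests = [("211.32.64.10", "09/Dec/2000:06:04:53 -0500", "/directory/index.htm")]
requests += [("211.32.64.10", "09/Dec/2000:06:04:54 -0500", f"/images/pic{n}.gif")
             for n in range(18)]

summary = summarize(requests)
print(summary)
```

For the single page with 18 graphics, this reports 19 hits but only one page view and one visit, which is exactly why hits overstate traffic.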

Turning On Your Logs

The first step in making sure you're taking full advantage of your logs is to determine if your hosting company is capturing all the information you require. Many hosting companies use stat analysis packages to provide basic visitor information. Often, this information does not include referrer details. Ideally, you want access to the raw log files so you can run them through a log analysis tool such as WebTrends to prepare your reports. When requesting the logs, ask for the extended log format and make sure referrer information is captured.

If your hosting company will not give you access to the logs, or won't provide the logs in an extended format, consider switching hosts. This information can make a huge difference to the effectiveness of your online marketing campaign. There are plenty of hosts that have no problem at all in providing you access to your logs. One word of caution, though. If you have a busy site, log files can become quite large. It's reasonable for a hosting company to only offer you access to the latest logs, otherwise their servers will become clogged with huge log files. Make sure you download your logs regularly and keep your server clean.

Analyzing Your Logs

Once you have access to the logs, you'll need a tool that can take the raw data and pull out the reports you need. We've already mentioned WebTrends. Other log analysis tools include NetTracker, FlashStats, the freeware tools Analog and Webalizer, and subscription-based services like SuperStats and HitBox. Personally, we use WebTrends and quite like it.

Using the Referrer Field

The information gathered through the referrer field is valuable for two reasons.

First of all, it shows you just which search engines and search terms are pulling the most traffic for you. This is important in evaluating where to spend your time improving your rankings. It will also help show which search terms are the ones being used by your visitors. A look at the most popular search terms section of a report almost always generates a surprise or two.
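
As a rough illustration of how a report arrives at its list of search terms, the Python sketch below pulls the search phrase out of a referrer URL. The parameter name "p" comes from the Yahoo example earlier in this article; "q" and "query" are guesses at what other engines use, and any site that happens to use one of those names would be counted as a search engine by this simple test.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs

# Assumed names of the query parameter that carries the search phrase.
SEARCH_PARAMS = ("p", "q", "query")

def search_phrase(referrer):
    """Return (host, phrase) if the referrer looks like a search, else None."""
    parts = urlparse(referrer)
    params = parse_qs(parts.query)  # also turns "+" back into spaces
    for name in SEARCH_PARAMS:
        if name in params:
            return parts.netloc.lower(), params[name][0].lower()
    return None

def top_phrases(referrers):
    """Count how often each (search engine, phrase) pair appears."""
    found = (search_phrase(r) for r in referrers)
    return Counter(item for item in found if item is not None)

referrers = [
    "http://search.yahoo.com/bin/search?p=search+engine+positioning",
    "http://search.yahoo.com/bin/search?p=Search+Engine+Positioning",
    "-",  # no referrer recorded
]
counts = top_phrases(referrers)
for (host, phrase), count in counts.most_common():
    print(count, host, phrase)
```

Phrases are lowercased before counting so that the same search typed with different capitalization is tallied once.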

Secondly, it will show other sites that link to yours that you weren't previously aware of. With link popularity becoming an increasingly important factor in determining relevancy in search engines, you need to know every site that currently links to you. Once you determine the URLs, make sure these linking pages are submitted to all the search engines.
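
Finding those linking pages can be sketched in Python as well. The function name linking_pages and the host names used here are hypothetical; substitute your own domain and whatever list of search engines you want to leave out of the results.

```python
from collections import Counter
from urllib.parse import urlparse

def linking_pages(referrers, own_host, search_hosts=()):
    """Count referring pages that are neither your own site nor a search engine."""
    pages = Counter()
    for referrer in referrers:
        parts = urlparse(referrer)
        host = parts.netloc.lower()
        # Skip blank referrers ("-"), internal navigation and search engines.
        if not host or host == own_host or host in search_hosts:
            continue
        # Drop the query string so each linking page is counted once.
        pages[f"{parts.scheme}://{host}{parts.path}"] += 1
    return pages

referrers = [
    "http://search.yahoo.com/bin/search?p=search+engine+positioning",
    "http://www.example.org/resources/links.htm",
    "http://www.example.org/resources/links.htm",
    "http://www.example.com/directory/index.htm",
    "-",
]
links = linking_pages(referrers,
                      own_host="www.example.com",
                      search_hosts={"search.yahoo.com"})
for page, count in links.most_common():
    print(count, page)
```

The pages this turns up are the ones to check and submit to the search engines.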

Log Building

We use visitor logs as an integral part of our search engine visibility program. Unfortunately, a relatively small percentage of our clients have access to their raw logs in the extended format. Without this information, we have no idea how effective the SEO campaign is. We can determine rankings and visibility, but we can't determine if that visibility is being translated into visitors. And that's really the whole point, isn't it?

Gord Hotchkiss
President and CEO
Enquiro Full Service Search Engine Marketing
-------------------------------------------------------------------------------
Copyright 2005 - Enquiro Search Solutions.
This article can be reproduced in its entirety, if the author credit is retained and there is a prominent source link to www.enquiro.com.
Visit our technical and news site www.searchengineposition.com.