Web Site Performance Analysis
Introduction
Search engines are one of the primary ways Internet users find web sites, which is why a web site with a good search engine listing may see a dramatic increase in traffic. A search engine query often turns up hundreds or thousands of matching web pages, but in most cases only the ten most "relevant" matches are displayed first.
When someone queries a search engine for a keyword related to your site's products or services, does your page appear in the top 10 matches? Anyone who runs a web site naturally wants to see it listed among the "top ten" results. If you are listed, but not within the first two or three pages of results, you lose, no matter how many engines you submitted your site to.
There are two obstacles to solving this problem. First, you’ll have to know the various techniques that will help you move into the top 10. Then you’ll have to monitor your progress, a crucial step that normally consumes a lot of your precious time.
Product Overview
WeSSAT
Where does WeSSAT come into the picture? A web developer who wants to know the ranking of a page for different sets of keywords has to go to each search engine, submit each keyword, and manually find the position of the web site by traversing a number of result pages. Just imagine the amount of time involved if the page is on the tenth page of results. The same process then has to be repeated for all the other search engines.
This is exactly where WeSSAT comes into the picture. WeSSAT (Web Site Situation Analysis Tool) automates the process of finding the ranking of your page in the search engines. You can also find the ranking of your competitors. What if you don’t know who your competitors are? WeSSAT also provides options for finding your unknown competitors.
Features
Multiple Keyword Searches
WeSSAT supports multiple keyword searches. The user may input a number of keywords, and WeSSAT searches for all of them simultaneously in the selected search engines.
Multiple Search Engines
WeSSAT supports simultaneous searching in nine popular search engines, namely Altavista, Excite, HotBot, Infoseek, Linkstar, Lycos, PlanetSearch, WebCrawler and Yahoo. By default, all the search engines are included in the search process. The user may select any subset of them to search.
Multiple Domain Search
WeSSAT can search for several of your domains simultaneously. These are the domains whose positions in the respective search engines are to be found. The user may type in either the complete URL or just the domain whose ranking has to be found.
Competitors Domain Search
WeSSAT also finds the position of competitor domains in the same way as for the user's own domains.
Number of Hits
The user can specify the number of hits to be retrieved, between 10 and 200. WeSSAT looks for the domains only within the number of hits specified by the user.
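As an illustration of how a domain's position could be found within a hit limit, here is a minimal Python sketch. It is not WeSSAT's actual code: the function names are hypothetical, and it assumes that a complete URL entered by the user is matched by its domain and that a leading "www." can be ignored.

```python
from urllib.parse import urlparse

def normalize_domain(value):
    """Return the lowercase host of a URL or bare domain, without 'www.'."""
    text = value.strip().lower()
    if "//" not in text:
        text = "//" + text  # lets urlparse treat a bare domain as a host
    host = urlparse(text).hostname or ""
    return host[4:] if host.startswith("www.") else host

def find_position(result_urls, target, max_hits=10):
    """Return the 1-based rank of target in the first max_hits results, or None."""
    wanted = normalize_domain(target)
    for rank, url in enumerate(result_urls[:max_hits], start=1):
        host = normalize_domain(url)
        # Assumption: a subdomain of the target counts as a match.
        if host == wanted or host.endswith("." + wanted):
            return rank
    return None
```

Calling find_position with max_hits set to the user's hit count returns None when the domain does not appear within that many results.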
Number of Threads
The user can select the number of threads, which in turn determines how many searches WeSSAT performs simultaneously. By default, WeSSAT uses 6 threads; the allowed range is 1 to 15. If the user has multiple keywords to look for, setting the number of threads to the maximum will speed up the search process.
Identifying Hot Competitors
WeSSAT also lists the hot competitors for the keywords you have submitted when the “Find unknown competitors” option is checked. WeSSAT uses a comprehensive ranking scheme and lists the top competitors according to the weightage given to each search engine. The user may assign a weightage to each search engine and select the number of competitors to be listed. The user may also list domains to be excluded from the competitors (like .com, .gov, .net etc.).
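This paper does not spell out the ranking formula, so the following Python sketch only shows one plausible way to combine result positions with per-engine weightage. The scoring rule (weight divided by position), the numeric weights and the function name are all assumptions made for illustration, not WeSSAT's actual scheme.

```python
from collections import defaultdict
from urllib.parse import urlparse

def top_competitors(results_by_engine, weights, count=5, excluded_suffixes=()):
    """Rank domains by a weighted, position-based score (illustrative formula)."""
    scores = defaultdict(float)
    for engine, urls in results_by_engine.items():
        weight = weights.get(engine, 1.0)
        for position, url in enumerate(urls, start=1):
            host = (urlparse(url).hostname or "").lower()
            # Skip unparsable URLs and domains the user asked to exclude.
            if not host or host.endswith(tuple(excluded_suffixes)):
                continue
            # Assumed scoring: a hit at position p contributes weight / p.
            scores[host] += weight / position
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [host for host, _ in ranked[:count]]
```

With this rule, a domain that appears near the top of a heavily weighted engine outranks one that appears only in lower positions or in lightly weighted engines.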
Saving Search Queries
WeSSAT provides options to save your search queries so that the same query may be used for future searches. Details such as keywords, domains and competitors’ domains are stored in the search query.
Report
The search result is a report generated in simple HTML format. It provides all information about the search and can be viewed in the default browser.
Architecture
The basic architecture of WeSSAT consists of ViewManager, AppManager, Analyser and WebQueryEngine modules.
ViewManager
The basic function of this module is to handle the user input with appropriate UI. It then passes the user input to AppManager.
AppManager
The main module is the AppManager, which initializes and coordinates the activities of all the other modules. Depending upon the input provided by the ViewManager, it initializes the WebQueryEngine component for searching. The result retrieved from WebQueryEngine is passed on to the Analyser component for analysis. The analyzed result is then retrieved from the Analyser and output.
WebQueryEngine
This module queries a search engine for the URLs at each result position. It encapsulates all functionality required for connecting to a search engine, downloading the page and retrieving the results.
Analyser
This component analyzes the result obtained from WebQueryEngine.

Proposed Technique
WeSSAT has been developed using MFC classes. The UI has been developed using the CPropertySheet and CPropertyPage classes. For searching, a connection has to be established to the server, that is, the search engine. This is done using WinInet classes like CInternetSession, CHttpConnection etc. As connecting to a search engine and downloading a page may take time, multithreading has been used. Each thread takes care of connecting to the search engine and downloading the page.
The basic algorithm for searching is as follows:
- The user inputs all values for the search (keywords, search engines, domains, hits, threads etc.), which are passed on to AppManager.
- The AppManager then schedules the searches, creates search items, initializes WebQueryEngine with the search items and starts the threads.
- The WebQueryEngine takes care of posting to the particular search engine, downloading the page, scanning through the page and retrieving the result URLs.
- The results are passed on by AppManager to AnalyserReporter for analysis.
- The AnalyserReporter analyzes the results and passes them to the AppManager, which schedules further searches depending upon the results retrieved.
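The fan-out of searches over worker threads can be sketched in Python as below. This is only an analogy to the MFC/WinInet design described above: run_searches and fake_fetch are hypothetical names, the fetch function is a stand-in that performs no network access, and the sketch runs one fixed batch of searches and does not schedule follow-up searches from the results.

```python
from concurrent.futures import ThreadPoolExecutor

def run_searches(keywords, engines, fetch_results, num_threads=6):
    """Query every (keyword, engine) pair on a thread pool and collect result URLs."""
    jobs = [(keyword, engine) for keyword in keywords for engine in engines]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # pool.map preserves the order of jobs, so results line up with them.
        pages = pool.map(lambda job: fetch_results(*job), jobs)
        return dict(zip(jobs, pages))

def fake_fetch(keyword, engine):
    # Hypothetical stand-in for WebQueryEngine: no network access is performed.
    return [f"http://{engine.lower()}.example/{keyword}/{n}" for n in range(1, 4)]
```

For example, run_searches(["ejb"], ["Altavista", "Yahoo"], fake_fetch) returns a dictionary keyed by (keyword, engine) pairs, which an analysis step could then scan for the user's domains.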
Algorithms
Scheduling And Coordinating The Search
AppManager is the main module, which coordinates the searches and retrieves the results. AppManager first creates search items, each of which consists of the keyword, search engine, page number and depth. These search items are placed in a queue. The AppManager then schedules and initializes the threads.
- Initialize the SearchItems queue to empty.
- When to add: if remaining_items < (numberOfThreads+1)/2, add numberOfThreads items to the queue using AddItem.
- Repeat
- Allot an item to a waiting thread and request the thread to begin searching.
- When a thread returns information, first remove the items in the SearchItems queue that are no longer valid.
- If remaining_items < (numberOfThreads+1)/2, add numberOfThreads items to the queue.
- Until there are no more items to be searched.
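The refill rule in the steps above can be illustrated with a small single-threaded Python simulation. It is a sketch under stated assumptions rather than WeSSAT's code: the search callable stands in for a worker thread, the pending iterator stands in for AddItem, and the still_valid callback is a hypothetical way to decide which queued items are no longer needed.

```python
from collections import deque

def needs_refill(remaining_items, number_of_threads):
    """Refill rule: remaining_items < (numberOfThreads + 1) / 2."""
    return remaining_items < (number_of_threads + 1) / 2

def schedule(pending, number_of_threads, search, still_valid):
    """Process search items while keeping the working queue topped up.

    pending     -- iterator over candidate search items (assumed source)
    search      -- callable(item) -> result; stands in for a worker thread
    still_valid -- callable(item, results) -> bool; False for obsolete items
    """
    queue, results = deque(), []

    def refill():
        # Add up to number_of_threads valid items to the queue.
        for _ in range(number_of_threads):
            item = next(pending, None)
            if item is None:
                break
            if still_valid(item, results):
                queue.append(item)

    refill()
    while queue:
        item = queue.popleft()
        results.append((item, search(item)))
        # Drop queued items that the new result has made unnecessary.
        kept = [queued for queued in queue if still_valid(queued, results)]
        queue.clear()
        queue.extend(kept)
        if needs_refill(len(queue), number_of_threads):
            refill()
    return results
```

With the default of 6 threads, the queue is topped up whenever fewer than 3.5 items remain, that is, when 3 or fewer are left.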
AddItem
This algorithm is used to add items to the SearchItems queue:
- First Criterion: Select the page having the lowest depth.
- Second Criterion: Among those, select the keyword phrase closest to the start of the keywordphrase_searchEngine_status stack.
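The two criteria amount to a lexicographic ordering, which the following Python sketch illustrates. The SearchItem fields follow the description of search items given earlier; the pick_next function and the use of a plain list of keywords in place of the keywordphrase_searchEngine_status stack are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SearchItem:
    keyword: str
    engine: str
    pageno: int
    depth: int

def pick_next(candidates, keyword_order):
    """Pick the candidate with the lowest depth; break ties by keyword order."""
    # keyword_order is a hypothetical stand-in for the status stack:
    # keywords earlier in the list are preferred.
    rank = {keyword: index for index, keyword in enumerate(keyword_order)}
    return min(candidates, key=lambda item: (item.depth, rank[item.keyword]))
```

Sorting on the pair (depth, keyword rank) applies the first criterion first and falls back to the second only when depths are equal.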
Querying Search Engine
WebQueryEngine does most of the search-engine-specific work, so it has to take into account the major differences between the search engines. The three major areas in which the search engines differ are:
- The URL: For each search engine, the posting URL has to be created depending upon the keyword and the page to be retrieved. For all search engines except Yahoo, this can be done easily by passing the keyword and the page or hit number as parameters. In the case of Yahoo there is a major difference: its URL has parameters for the number of categories and sites, which depend upon the search result. So for Yahoo, the next-page URL is scanned from the downloaded page and stored.
- Posting: Posting to a particular search engine can use either the GET method or the POST method. All the search engines supported by WeSSAT except Linkstar are queried with GET. For Linkstar, a POST is done with default form values.
- Scanning: The way URLs are scanned for in the downloaded page differs considerably between search engines. The scanning and parsing for URLs is done according to the page layout of each search engine.
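One way to keep the URL and posting differences in a single place is a per-engine table, sketched here in Python. The engine names, hosts and parameter names are placeholders invented for the example, not the real query formats of any search engine, and the sketch only builds requests without sending them. Per-engine scanning and the Yahoo next-page case are not covered.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical engine table: hosts and parameter names are placeholders,
# not the real query formats of any search engine.
ENGINES = {
    "EngineA": {"url": "http://engine-a.example/search", "method": "GET",
                "query_param": "q", "offset_param": "start"},
    "EngineB": {"url": "http://engine-b.example/find", "method": "POST",
                "query_param": "keywords", "offset_param": "first"},
}

def build_request(engine, keyword, page, hits_per_page=10):
    """Build (but do not send) the request for one result page of one engine."""
    spec = ENGINES[engine]
    params = urlencode({spec["query_param"]: keyword,
                        spec["offset_param"]: (page - 1) * hits_per_page})
    if spec["method"] == "POST":
        # Form values travel in the request body for POST engines.
        return Request(spec["url"], data=params.encode("ascii"))
    # GET engines carry the keyword and offset in the query string.
    return Request(spec["url"] + "?" + params)
```

Adding an engine then means adding one table entry plus its page scanner, rather than changing the request-building code.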
All these differences are handled by WebQueryEngine.
Analyzing and Ranking
Analyser analyzes the results to find the position of the domain, which is done by scanning through the URLs returned. It also analyzes the results to find the hot competitors and rank them. The ranking scheme is as follows.
Algorithm For Ranking
Report Generation
After analysis, the results are passed on to the AppManager, which generates a simple HTML report. The report contains details of the position of the domains found in the search engines. It also lists the top competitors for the keywords.
Conclusion
WeSSAT is a step towards automating the process of finding the position of a domain in search engines. There are many possible extensions to WeSSAT that would increase its performance as well as its usability. The possible extensions are as follows:
- Store past analysis data and compare/track rankings of web pages over a period of time.
- Add new search engines and remove existing ones.
- Ability to modify the tool if a search engine modifies the format in which it accepts data.
Appendix
Web Site Situation Analysis Tool (ver 1.0)
Report generated on 06 August 1999
Search phrase : Enterprise Javabeans Container for transaction management

Key: implies URL wasn't found within the specified number of results
E : implies Error connecting to or retrieving results from the search engine
Y : implies domain was found in the search engine
Note: To arrive at the top competitors, WeSSAT uses a comprehensive ranking scheme that takes into account the position of each returned result in a search engine and the weightage attached to that search engine.
Search Engine Weightage
Altavista: Very High
Yahoo: Very High
Excite: High
HotBot: High
Lycos: High
Infoseek: High
PlanetSearch: Medium
WebCrawler: Medium
Linkstar: Medium
Excluded domain: .net,.gov
