The Dhundo search engine is a program for searching documents stored on a computer or in a computer network such as the World Wide Web. Web search engines have their origin in information retrieval systems. They create a keyword index of the document base so that keyword queries can be answered with a hit list ordered by relevance. After a search term is entered, a search engine returns a list of references to potentially relevant documents, usually shown with a title and a short summary of each document.
The essential components or tasks of a search engine are:
a) Creation and maintenance of an index (data structure with information about documents)
b) Processing of queries (finding and ordering of results)
c) Presentation of the results in as useful a form as possible
In general, the data is collected automatically: on the web by web crawlers, and on a single computer by regularly reading all files in user-specified directories of the local file system.
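The local case can be illustrated with a minimal sketch. It assumes plain-text files and a hypothetical helper named collect_documents; neither the name nor the file-extension filter comes from the original description, and a real system would call such a routine on a schedule to pick up changes.

```python
import os


def collect_documents(directories, extensions=(".txt",)):
    """Read every matching file under the given directories.

    Returns a dict that maps each file path to its text content.
    The name and the extension filter are illustrative assumptions.
    """
    suffixes = tuple(extensions)
    documents = {}
    for root_dir in directories:
        # os.walk visits the directory and all of its subdirectories.
        for dirpath, _dirnames, filenames in os.walk(root_dir):
            for name in sorted(filenames):
                if not name.lower().endswith(suffixes):
                    continue
                path = os.path.join(dirpath, name)
                try:
                    with open(path, encoding="utf-8", errors="replace") as handle:
                        documents[path] = handle.read()
                except OSError:
                    # Unreadable files are skipped instead of aborting the scan.
                    continue
    return documents
```

Calling collect_documents with a list of directories returns the raw text that a later indexing step could process.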
Characteristics of search engines
Search engines can be categorized according to a number of characteristics. The following features are largely independent. When designing a search engine, one can therefore choose an option from each feature group without affecting the choice made for the other features.
Type of Data
Different search engines can search different types of data, including document types such as text, image, sound, and video. The presentation of results is adapted to the type of data. When searching for text documents, a piece of text containing the keywords is usually displayed. Image search engines display thumbnails of the matching images. A large proportion of all searches on the Internet concern data about people and their activities. A people search engine finds publicly available information about names and persons and displays it as a list of links. More specialized types of search engines include, for example, job search engines, industry search engines, and product search engines.
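For text results, the displayed piece of text can be produced by cutting a window around the first occurrence of the keyword. The following is a rough sketch only; the function name make_snippet and the width parameter are assumptions made for illustration, and real search engines use more elaborate excerpt selection.

```python
import re


def make_snippet(text, keyword, width=30):
    """Return a short excerpt of text around the first keyword match."""
    match = re.search(re.escape(keyword), text, re.IGNORECASE)
    if match is None:
        # Keyword absent: fall back to the beginning of the document.
        return text[: 2 * width].strip()
    start = max(0, match.start() - width)
    end = min(len(text), match.end() + width)
    # Mark the places where the excerpt was cut off.
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + text[start:end].strip() + suffix
```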
Another feature for categorization is the source from which the data collected by the search engine originates. In most cases, the name of the search engine type already describes the source.
Web search engines collect documents from the World Wide Web. Vertical search engines consider a selected area of the World Wide Web and capture only web documents on a specific topic such as football, health, or law.
This section describes differences in how the operation of a search engine is implemented. The most important group today is index-based search engines. These read the relevant documents and create an index, a data structure that is used in subsequent queries. The disadvantage is the costly maintenance and storage of the index; the advantage is a faster search process. The most common form of this structure is an inverted index.
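An inverted index maps each term to the set of documents that contain it, so a query looks up terms instead of scanning every document. The sketch below is a minimal in-memory version; the names tokenize, build_inverted_index, and lookup are hypothetical, and the simple word-splitting rule is an assumption, not a description of any particular engine.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase word tokens (a deliberately simple rule)."""
    return re.findall(r"\w+", text.lower())


def build_inverted_index(documents):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def lookup(index, term):
    """Return the sorted ids of all documents containing the term."""
    return sorted(index.get(term.lower(), set()))
```

Building the index costs time and memory up front, which is the maintenance burden mentioned above; in return, each lookup becomes a single dictionary access.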
Interpretation of the input
A user's query is interpreted before the actual search and converted into a form that the internally used search algorithm can understand. This keeps the query syntax as simple as possible while still allowing complex queries. Many search engines support the logical combination of different search words using Boolean operators. This makes it possible to find websites that contain certain terms but not others.
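As an illustration of such interpretation, the following sketch turns a query string into an internal form and evaluates it. The query syntax is an assumption made for this example: uppercase AND, OR, and NOT act as operators, terms without an operator between them are combined with AND, and OR separates alternative clauses. The function names are likewise hypothetical.

```python
import re


def parse_query(query):
    """Convert a query string into a list of (required, excluded) term sets."""
    clauses = []
    for part in re.split(r"\s+OR\s+", query.strip()):
        required, excluded = set(), set()
        negate = False
        for token in part.split():
            if token == "AND":
                continue  # AND is implicit between terms
            if token == "NOT":
                negate = True  # the next term must be absent
                continue
            (excluded if negate else required).add(token.lower())
            negate = False
        if required or excluded:
            clauses.append((required, excluded))
    return clauses


def matches(text, clauses):
    """True if the text satisfies at least one clause."""
    words = set(re.findall(r"\w+", text.lower()))
    return any(
        required <= words and not (excluded & words)
        for required, excluded in clauses
    )


def search(documents, query):
    """Return the sorted ids of all documents that match the query."""
    clauses = parse_query(query)
    return sorted(
        doc_id for doc_id, text in documents.items() if matches(text, clauses)
    )
```

With this syntax, a query such as "football NOT law" selects documents that mention football but not law.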