Search engines

  • What is a search engine and how does it work

    A search engine looks up and fetches resources that match the input of a search. Resources that can be collected from the internet include webpages, documents, images and videos. Search engines depend heavily on databases that hold a large collection of pages a user may be looking for. To build such a database of webpages, a piece of software called a web crawler (or spider) is used. A web crawler starts from the webpages it already knows about, follows the links on them, and fetches the pages those links point to, repeating the process for every new page it finds.

    When keywords are entered into the search engine, it looks in its database for webpages or links that are associated with those keywords.

    Site owners try to move their webpage up the results list in several ways. Historically, a common technique was to include as many relevant keywords as possible in a meta tag within the page's HTML code, on the basis that more keywords meant a higher position in the results. Modern search engines such as Google give the keywords meta tag little or no weight, so this alone is no longer enough. Meta tags are not the only factor in the positioning of a webpage: a descriptive title in the HTML code also helps, as does how regularly the webpage is updated and how relevant the domain name is to the subject of the webpage.
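    The crawl-then-look-up process described above can be sketched in a few lines of Python. This is only a minimal illustration, not how any real search engine is built. The PAGES dictionary is a made-up stand-in for the web, and the names LinkParser, crawl and search are hypothetical. Only the standard library (html.parser, collections, re) is used.

```python
from collections import deque
from html.parser import HTMLParser
import re

# Hypothetical stand-in for the web: URL -> HTML source.
PAGES = {
    "a.html": '<title>Cats</title><p>cats and dogs</p><a href="b.html">b</a>',
    "b.html": '<title>Dogs</title><p>dogs only</p>'
              '<a href="a.html">a</a><a href="c.html">c</a>',
    "c.html": '<title>Birds</title><p>birds and cats</p>',
}


class LinkParser(HTMLParser):
    """Collects link targets and the words found in one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        # Record the href of every <a> tag so the crawler can follow it.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Split text content into lowercase keywords.
        self.words.extend(re.findall(r"[a-z0-9]+", data.lower()))


def crawl(start):
    """Breadth-first crawl from `start`; returns keyword -> set of URLs."""
    index = {}
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        html = PAGES.get(url)
        if html is None:
            continue  # link points to a page that does not exist
        parser = LinkParser()
        parser.feed(html)
        parser.close()
        for word in parser.words:
            index.setdefault(word, set()).add(url)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


def search(index, query):
    """Returns the URLs that contain every keyword in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


index = crawl("a.html")
print(sorted(search(index, "cats")))
print(sorted(search(index, "cats dogs")))
```

    The crawler only ever reaches pages that are linked from a page it has already seen, which is why inbound links matter for being found at all. The sketch returns matching pages in no particular order. Deciding the order is the job of a ranking method such as PageRank.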

    PageRank algorithm

    The PageRank algorithm was introduced by Larry Page, together with Sergey Brin, the founders of Google. It is used by Google to rank the results from its search engine. PageRank is a way of measuring the importance of webpages. For a webpage to have a high PageRank, it needs many other webpages with inbound links pointing to it: the more inbound links a page has, the higher its PageRank will be. However, the number of inbound links is not the whole story, because the importance of the pages providing those links also matters. Some inbound links therefore carry more weight than others. The weight of an inbound link depends on how high the PageRank of the linking page is and on how many outbound links that linking page has.
    The PageRank of a page can be calculated with the following formula:
    PR(A) = (1-d) + d (PR(T1)/C(T1) + … + PR(Tn)/C(Tn))
    where
    PR(A) is the PageRank of page A.
    T1...Tn are the pages that link to page A.
    C(Tn) is the total number of outbound links on page Tn, including its link to page A.
    PR(Tn)/C(Tn) is the share of the vote that page A gets from page Tn. The vote fractions from pages T1...Tn are added together and the sum is multiplied by d.
    d is the damping factor, which limits how much influence the vote fractions PR(Tn)/C(Tn) have on the result. It is normally set to 0.85.
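
    Because the PageRank of page A depends on the PageRank of the pages linking to it, which may in turn depend on page A, the formula is usually applied repeatedly until the values settle. The Python sketch below does this for a made-up three-page site. The function name pagerank, the starting value of 1.0 for every page, the fixed count of 100 passes and the example graph are all assumptions made for illustration.

```python
def pagerank(links, d=0.85, iterations=100):
    """Apply PR(A) = (1-d) + d * sum(PR(T)/C(T)) repeatedly.

    `links` maps each page to the list of pages it links to.
    """
    # Gather every page that appears as a source or as a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    # Assumed starting point: every page begins with a rank of 1.0.
    pr = {page: 1.0 for page in pages}

    for _ in range(iterations):
        new_pr = {}
        for page in pages:
            # Sum the vote fractions PR(T)/C(T) from each page T linking here.
            votes = sum(
                pr[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_pr[page] = (1 - d) + d * votes
        pr = new_pr
    return pr


# Hypothetical site: A links to B and C, B links to C, C links back to A.
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
}

ranks = pagerank(links)
for page, rank in sorted(ranks.items()):
    print(page, round(rank, 3))
```

    In this example page C ends up with the highest rank because it receives votes from both A and B, while B ranks lowest because its only inbound link is one of the two outbound links on A and so carries only half of A's vote.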