GigaBlast

March 19th, 2008

Filed under: Internet - Unggul_USA @ 9:40 am


The image below is a screenshot of the page that shows which users are currently online on this website. Something interesting turned up when I opened it today: a new entry called GigaBot. So what is GigaBot? Gigabot is the spider/web crawler of the Gigablast search engine.

[Image: GigaBlast]
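
For readers wondering how a who's-online page can tell a crawler from a human visitor, here is a minimal sketch. It is not the code this site actually runs: the `KNOWN_BOTS` table, the `classify_visitor` function, and the sample User-Agent strings are all hypothetical, and it assumes that Gigabot identifies itself with the word "Gigabot" in its User-Agent header.

```python
# Hypothetical marker table; a real who's-online plugin may use a different list.
# Assumption: each crawler puts its own name somewhere in the User-Agent header.
KNOWN_BOTS = {
    "gigabot": "GigaBot",
    "googlebot": "Googlebot",
}


def classify_visitor(user_agent: str) -> str:
    """Return a bot label if the User-Agent contains a known marker, else 'Guest'."""
    ua = user_agent.lower()
    for marker, label in KNOWN_BOTS.items():
        if marker in ua:
            return label
    return "Guest"


if __name__ == "__main__":
    # Illustrative values only, not captured from this site's logs.
    samples = [
        "Gigabot/3.0 (example value)",
        "Mozilla/5.0 (Windows NT 6.0) Firefox/2.0",
    ]
    for ua in samples:
        print(classify_visitor(ua), "<-", ua)
```

Matching on a substring is easy to fool, since any client can send any User-Agent it likes, so this is good enough for a visitor list but not for access control.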

The first version of Gigablast was launched in 2002, and it quickly became known as a very efficient search engine. Over the past two years its search functionality has been revamped, redesigned, and re-architected, and it was relaunched in 2008 with the goal of giving users a high-quality, highly relevant search engine. Here is a screenshot:

[Image: GigaBlastSS]

The innovations in GigaBlast include:

  • Hyper Scalable - Scales to 200 billion full pages and 100,000 servers.
  • Efficient - Uses very few computers to support a huge index and a large number of queries per second.
  • Simple Interface - Get the search results via an XML feed.
  • Web Directory - Allows searching of all the sites in a particular topic, not just the pages. All directory pages can be returned through the XML feed, too.
  • Gigabot - Gigablast’s fast and feature-rich spider is highly configurable.
  • Document Injection - Bypass the spider and inject your document directly into Gigablast using simple HTTP POSTs or GETs.
  • Real-Time - URLs are indexed in real-time. Link analysis is done on the fly.
  • Intelligent Update - Determines the update cycle of each document and tries to spider it at that frequency.
  • Duplicate Detection - Spider can detect and discard duplicate web pages.
  • Dual Mode - Uses idle cycles to spider and index documents, but will quickly yield resources to handle incoming queries.
  • Maintainable - Comprehensive web-based GUI controls make it easy to administer.
  • Spam Protection - Features a large array of anti-spam tools and algorithms used to keep spam out of the index.
  • Document Cache - Has a cache to hold user-viewable copies of the pages it spiders and indexes. Obeys nocache meta tags as specified.
  • Historic Cache - Spider can hold documents in the index long after their URLs start returning 404.
  • HTTPS support - Can spider and serve HTTPS pages.
  • robots.txt - The Gigablast spiders support the robots.txt standard as well as certain related meta tags.
  • Multiple Formats - Indexes PDF, Microsoft Word, PowerPoint, Excel, and PostScript documents. Supports user-definable filters.
  • Dynamic Summaries - Search result summaries are generated so that they contain the query terms.
  • Term Highlighting - Performs query term highlighting on the view of cached pages and on the dynamic summaries.
  • Robust Query Syntax - Features many different field searches, + and - operators.
  • Family Filter - Removes pages with undesirable content from the search results.
  • Sort by date - Sorts search results by date, very fast and with high accuracy.
  • Query Refinement - Search within a set of search results.
  • Advanced Search - Allows users to perform power searches quickly and easily.
  • Site Clustering - Can optionally cluster away results from the same web site, so the list of search results is not dominated by any one site.
  • Fuzzy Deduping - Automatically removes search results that are X% similar to a result above them, where X is adjustable.
  • Query Weighting - Custom weight the query terms exactly how you want.
  • Super Recall - Returns extra results which only have some of the query terms.
  • Spell Checking - Performs spell checking based on a dictionary that is constructed from the index. Add your own words and phrases, too.
  • Huge Document Support - Index documents that are hundreds of megabytes in size.
  • Huge Result Sets - Receive hundreds of thousands of results per page.
  • Related Topics - Dynamically generated on a per query basis. (aka GigaBits)
  • Reference Pages - Generates sets of expert web sites which contain lists of links relevant to the query.
  • Custom Topic Search - Constrain searches to a list of up to 500 sites.
  • Default AND Capable - Can easily limit search results to only pages that have all the query terms.
  • Boolean Queries - Supports complex nested boolean queries using AND, OR and NOT operators.
  • Turing Test - Uses simple Turing test to prevent real-time addurl abuse.
  • Redundancy - If one server goes down then its twins take over for it.
  • Error Correction - Corrupted data is automatically detected and patched from a mirror host.
  • Load Balancing - Gigablast intelligently distributes load evenly among all hosts in the network.
  • Collections - Allows the administrator to partition the index into many sub-indexes.
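
To make the "Fuzzy Deduping" item in the list above a bit more concrete, here is a rough sketch of the idea. The feature list does not say how Gigablast measures similarity, so this example swaps in Python's standard `difflib.SequenceMatcher` as a stand-in. The function names, the sample texts, and the 90% threshold are all hypothetical.

```python
from difflib import SequenceMatcher


def similarity_percent(a: str, b: str) -> float:
    """Similarity of two strings as a percentage (0 to 100).

    Assumption: difflib's ratio is only a stand-in for whatever
    measure Gigablast really uses.
    """
    return 100.0 * SequenceMatcher(None, a, b).ratio()


def fuzzy_dedupe(results: list[str], threshold: float = 90.0) -> list[str]:
    """Keep a result only if it is less than `threshold` percent similar
    to every result already kept above it. Ranking order is preserved."""
    kept: list[str] = []
    for text in results:
        if all(similarity_percent(text, earlier) < threshold for earlier in kept):
            kept.append(text)
    return kept


if __name__ == "__main__":
    ranked = [
        "Gigablast is a search engine.",
        "Gigablast is a search engine!",  # near-duplicate of the first result
        "Gigabot is a web crawler.",
    ]
    for text in fuzzy_dedupe(ranked, threshold=90.0):
        print(text)
```

Lowering the threshold removes more aggressively and raising it keeps more near-duplicates, which is the adjustable "X" from the feature list. Each result is compared against every kept result, so this naive version is only practical for one page of results at a time.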
