GigaBlast
March 19th, 2008
The image below is a screenshot of the page that shows which users are currently online on this website. Something interesting turned up when I opened it today: a new entry, GigaBot. So what is GigaBot? Gigabot is the spider/web crawler of the Gigablast search engine.
The first version of Gigablast was launched in 2002, and it quickly became known as a very efficient search engine. Over the past two years, Gigablast's search functionality has been revamped, redesigned, and re-architected, and it was “re-launched” in 2008 with the goal of giving users a high-quality, highly relevant search engine. Here is a screenshot:

The innovations in Gigablast include:
- Hyper Scalable - Scales to 200 billion full pages and 100,000 servers.
- Efficient - Uses very few computers to support a huge index and a large number of queries per second.
- Simple Interface - Get the search results via an XML feed.
- Web Directory - Allows searching of all the sites in a particular topic, not just the pages. All directory pages can be returned through the XML feed, too.
- Gigabot - Gigablast’s fast and feature-rich spider is highly configurable.
- Document Injection - Bypass the spider and inject your document directly into Gigablast using simple HTTP POSTs or GETs.
- Real-Time - URLs are indexed in real-time. Link analysis is done on the fly.
- Intelligent Update - Determines the update cycle of each document and tries to spider it at that frequency.
- Duplicate Detection - Spider can detect and discard duplicate web pages.
- Dual Mode - Uses idle cycles to spider and index documents, but will quickly yield resources to handle incoming queries.
- Maintainable - Comprehensive web-based GUI controls make it easy to administer.
- Spam Protection - Features a large array of anti-spam tools and algorithms used to keep spam out of the index.
- Document Cache - Has a cache to hold user-viewable copies of the pages it spiders and indexes. Obeys nocache meta tags as specified.
- Historic Cache - Spider can hold documents in the index long after they are 404.
- HTTPS support - Can spider and serve HTTPS pages.
- robots.txt - The Gigablast spiders support the robots.txt standard as well as certain related meta tags.
- Multiple Formats - Indexes PDF, Microsoft Word, PowerPoint, Excel, and PostScript documents. Supports user-definable filters.
- Dynamic Summaries - Search result summaries are generated so that they contain the query terms.
- Term Highlighting - Performs query term highlighting on the view of cached pages and on the dynamic summaries.
- Robust Query Syntax - Features many different field searches, + and - operators.
- Family Filter - Removes pages with undesirable content from the search results.
- Sort by date - Sorts search results by date, very fast and with high accuracy.
- Query Refinement - Search within a set of search results.
- Advanced Search - Allows users to perform power searches quickly and easily.
- Site Clustering - Can optionally cluster away results from the same web site, so the list of search results is not dominated by any one site.
- Fuzzy Deduping - Automatically removes search results that are X% similar to an above result, where X is adjustable.
- Query Weighting - Custom weight the query terms exactly how you want.
- Super Recall - Returns extra results which only have some of the query terms.
- Spell Checking - Performs spell checking based on a dictionary that is constructed from the index. Add your own words and phrases, too.
- Huge Document Support - Index documents that are hundreds of megabytes in size.
- Huge Result Sets - Receive hundreds of thousands of results per page.
- Related Topics - Dynamically generated on a per query basis. (aka GigaBits)
- Reference Pages - Generates sets of expert web sites which contain lists of links relevant to the query.
- Custom Topic Search - Constrain searches to a list of up to 500 sites.
- Default AND Capable - Can easily limit search results to only pages that have all the query terms.
- Boolean Queries - Supports complex nested boolean queries using AND, OR and NOT operators.
- Turing Test - Uses simple Turing test to prevent real-time addurl abuse.
- Redundancy - If one server goes down then its twins take over for it.
- Error Correction - Corrupted data is automatically detected and patched from a mirror host.
- Load Balancing - Gigablast intelligently distributes load evenly among all hosts in the network.
- Collections - Allows the administrator to partition the index into many sub indexes.
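Since Gigabot is listed above as supporting the robots.txt standard, a site owner who spots it among the online users can control what it crawls. The following is only a minimal sketch of how such rules are evaluated, using Python's standard `urllib.robotparser`. The robots.txt content, the `example.com` URLs, and the `Gigabot` user-agent token are assumptions made for illustration; check your own server logs for the exact user-agent string the crawler sends.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. The "Gigabot" user-agent token is an
# assumption for this sketch, not taken from Gigablast's documentation.
ROBOTS_TXT = """\
User-agent: Gigabot
Disallow: /private/

User-agent: *
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text directly, without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Gigabot may fetch everything except paths under /private/.
gigabot_public = parser.can_fetch("Gigabot", "http://example.com/index.html")
gigabot_private = parser.can_fetch("Gigabot", "http://example.com/private/page.html")

# Any other crawler falls through to the "*" entry, which blocks the whole site.
other_public = parser.can_fetch("OtherBot", "http://example.com/index.html")

print(gigabot_public, gigabot_private, other_public)
```

A crawler that honors the standard would skip `/private/` under these rules, while the catch-all entry keeps other well-behaved bots away entirely.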