Apache Lucene – PHP Implementation VS Java version

10 May

I was excited to learn that Lucene has been included in the Zend Framework, but a bit of Googling turned up reports of serious performance issues with the PHP implementation. The Java implementation is faster, and I would consider using it instead.

By the way, the Xapian project offers far better performance, as shown by my Recoll personal search system, which is built on Xapian. I have indexed more than 7 GB of data, consisting of large PDF files, IMAP mailboxes, text files, DOC files, etc.

The search results are quick, considering the size of the data indexed and the resources on my machine.

Good work, Xapian team!


Cuil – The Elegant Search Engine

15 Aug

I stumbled upon Cuil while browsing for some high-tech stuff. Believe me, it made me forget what I was looking for. An excellent user interface combined with very intelligent search semantics is very impressive indeed. I was amazed at the relevance of the results. Although it shows fewer results at once than other contemporary search engines, it makes up for that in two ways. First, the search results are extremely relevant. Second, the results page carries a tab for each related suggestion, and the user can simply click a tab to get the results for it. Cuil is the brainchild of Anna Patterson and Tom Costello. It has been designed to be the largest search engine on the web.

DBSight – The Ultimate Faceted Database Search

02 Aug


“Information Retrieval” as a science has attained a certain level of maturity, and we are seeing a lot of companies offering nice products that make use of the most advanced techniques. I recently came across one such product that is worth mentioning here. I was able to set it up on my PC in less than 3 minutes. It comes with a very easy-to-use web interface. It uses JDBC to retrieve data from virtually any database. You write an SQL query to define the searchable records. DBSight then retrieves the records from the database, creates the search index, and can repeat the process automatically on a schedule you specify. The figure below shows the possibilities for search results retrieval once an index has been created.

DBSight Overview
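
To make the idea of "an SQL query defines the searchable records" a bit more concrete, here is a minimal sketch in Python. It is not DBSight's code or API. It only illustrates the general technique under some assumptions of mine: Python's built-in sqlite3 module stands in for a JDBC connection, the `books` table and its columns are made up, and the helper names `build_index` and `search` are hypothetical. Scheduling the re-index is left out.

```python
import re
import sqlite3
from collections import Counter, defaultdict


def build_index(conn, query):
    """Run `query` and build an inverted index over the returned rows.

    Assumed row layout (my convention, not DBSight's): the first column is
    the record id, the second is a facet value, the rest is searchable text.
    """
    index = defaultdict(set)   # term -> set of record ids
    facets = {}                # record id -> facet value
    for rec_id, facet, *texts in conn.execute(query):
        facets[rec_id] = facet
        for text in texts:
            for term in re.findall(r"\w+", (text or "").lower()):
                index[term].add(rec_id)
    return index, facets


def search(index, facets, terms):
    """Return ids matching all terms, plus the number of hits per facet."""
    words = re.findall(r"\w+", terms.lower())
    if not words:
        return [], Counter()
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits), Counter(facets[i] for i in hits)


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books (id INTEGER PRIMARY KEY, category TEXT, "
    "title TEXT, summary TEXT)"
)
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?, ?)",
    [
        (1, "search", "Lucene in Action", "Full-text search with Java"),
        (2, "search", "Xapian Basics", "Probabilistic search library"),
        (3, "database", "SQL Primer", "Search your tables with queries"),
    ],
)

# The SQL query decides which records and columns become searchable.
index, facets = build_index(
    conn, "SELECT id, category, title, summary FROM books"
)
ids, facet_counts = search(index, facets, "search")
print(ids, dict(facet_counts))
```

The facet counts are what a faceted interface would show next to the results, so the user can narrow a search down by category. A real product would also handle ranking, stemming and incremental updates, none of which this sketch attempts.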

