No, it’s not Spider-Man’s latest web-slinging tool but something that’s more real world. Like the World Wide Web.
The Invisible Web refers to the part of the WWW that’s not indexed by the search engines. Most of us think that search powerhouses like Google and Bing are like the Great Oracle…they see everything. Unfortunately, they can’t, because they aren’t divine at all; they are just web spiders that index pages by following one hyperlink after another.
But there are some places where a spider cannot enter. Take library databases which need a password for access. Or pages that belong to the private networks of organizations. Web pages generated dynamically in response to a query are also often left unindexed by search engine spiders.
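The link-following behavior described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any real search engine is built: the page names, the LINKS graph, and the LOGIN_REQUIRED set are all made up for illustration.

```python
from collections import deque

# Hypothetical toy site: each page maps to the hyperlinks it contains.
LINKS = {
    "/home": ["/about", "/catalog", "/members"],
    "/about": ["/home"],
    "/catalog": ["/home"],        # its results appear only after a form query
    "/members": [],               # password protected
    "/members/journal": [],       # password protected, linked from nowhere public
    "/catalog?q=deep+web": [],    # generated on demand, linked from nowhere
}
LOGIN_REQUIRED = {"/members", "/members/journal"}


def crawl(start):
    """Breadth-first crawl that only follows hyperlinks, like a simple spider."""
    indexed = []
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if page in LOGIN_REQUIRED:
            continue  # the spider has no password, so it cannot read the page
        indexed.append(page)
        for link in LINKS.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return indexed


indexed = crawl("/home")
invisible = sorted(set(LINKS) - set(indexed))
print("Indexed:", indexed)
print("Invisible:", invisible)
```

In this sketch the spider indexes only the three public pages. The password-protected pages and the query-generated page stay out of the index, which is the Invisible Web in miniature.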
Search engine technology has progressed by leaps and bounds. Today, we have real-time search and the capability to index Flash-based and PDF content. Even then, there remain large swathes of the web which a general search engine cannot penetrate. The terms Deep Net, Deep Web and Invisible Web linger on.
To get a more precise idea of this ‘Dark Continent’ that lies beyond the reach of web search engines, read what Wikipedia has to say about the Deep Web. The figures are attention grabbers – the size of the open web is put at 167 terabytes, while the Invisible Web is estimated at 91,000 terabytes, roughly 545 times as much. Check this out – the Library of Congress, in 1997, was estimated to hold close to 3,000 terabytes!
How do we get to this mother lode of information?
That’s what this post is all about. Let’s get to know a few resources which will be our deep diving vessel for the Invisible Web. Some of these are invisible web search engines with specifically indexed information.

Infomine has been built by a pool of libraries in the United States. Among them are the University of California, Wake Forest University, California State University, and the University of Detroit. Infomine ‘mines’ information from databases, electronic journals, electronic books, bulletin boards, mailing lists, online library card catalogs, articles, directories of researchers, and many other resources.
You can search by subject category and further tweak your search using the search options. Infomine is not only a standalone search engine for the Deep Web but also a staging point for a lot of other reference information. Check out its Other Search Tools and General Reference links at the bottom.

The WWW Virtual Library is considered to be the oldest catalog on the web and was started by Tim Berners-Lee, the creator of the web. So, isn’t it strange that it finds a place in the list of Invisible Web resources? Maybe, but the WWW Virtual Library lists quite a lot of relevant resources on quite a lot of subjects. You can go vertically into the categories or use the search bar. The screenshot shows the alphabetical arrangement of subjects covered at the site.

Intute is UK-centric, but it has some of the most esteemed universities of the region providing the resources for study and research. You can browse by subject or do a keyword search for academic topics ranging from agriculture to veterinary medicine. The online service has subject specialists who review and index other websites that cater to the topics for study and research.
Intute also provides over 60 free online tutorials for learning effective internet research skills. The tutorials are step-by-step guides arranged around specific subjects.

Complete Planet calls itself the ‘front door to the Deep Web’. This free and well designed directory resource makes it easy to access the mass of dynamic databases that are cloaked from a general purpose search. The databases indexed by Complete Planet number around 70,000 and range from Agriculture to Weather. Also thrown in are databases like Food & Drink and Military.
For a really effective Deep Web search, try out the Advanced Search options where, among other things, you can set a date range.

Infoplease is an information portal with a host of features. Using the site, you can tap into a good number of encyclopedias, almanacs, an atlas, and biographies. Infoplease also has a few nice offshoots like Factmonster.com for kids and Biosearch, a search engine just for biographies.

DeepPeep aims to enter the Invisible Web through forms that query databases and web services for information. Typed queries open up dynamic but short-lived results which cannot be indexed by normal search engines. By indexing databases, DeepPeep hopes to track 45,000 forms across 7 domains.
The domains covered by DeepPeep (Beta) are Auto, Airfare, Biology, Book, Hotel, Job, and Rental. Being a beta service, there are occasional glitches as some results don’t load in the browser.
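To see why form-based content is hard to index, here is a minimal Python sketch of the general idea of probing a search form with sample values. It is not DeepPeep’s actual method or API; the HOTELS data and the submit_form and probe_form helpers are hypothetical stand-ins for a real form and the database behind it.

```python
# Hypothetical hotel database hidden behind a search form.
HOTELS = [
    {"name": "Harbor Inn", "city": "Boston"},
    {"name": "Lakeview Lodge", "city": "Chicago"},
    {"name": "Old Town Suites", "city": "Boston"},
]


def submit_form(city):
    """Stand-in for a form submission: results exist only as a reply to the query."""
    return [h["name"] for h in HOTELS if h["city"].lower() == city.lower()]


def probe_form(sample_values):
    """Submit each sample value and record which records surface for it."""
    index = {}
    for value in sample_values:
        results = submit_form(value)
        if results:  # keep only queries that returned something
            index[value] = results
    return index


index = probe_form(["Boston", "Chicago", "Denver"])
print(index)
```

A link-following spider would never see these hotel records, because no hyperlink points at them. Only a tool that knows about the form and tries queries against it can bring them to the surface.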

IncyWincy is an Invisible Web search engine and it behaves as a meta-search engine by tapping into other search engines and filtering the results. It searches the web, directory, forms, and images. With a free registration, you can track search results with alerts.
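As a rough illustration of what a meta-search engine does, the Python sketch below merges ranked result lists from two engines and removes duplicates. It is an assumption-laden toy, not IncyWincy’s implementation: the engine names, the example URLs, and the reciprocal-rank scoring in merge_results are all chosen just for this example.

```python
from collections import defaultdict


def merge_results(result_lists):
    """Merge ranked lists: each URL earns 1/rank from every engine that returns it."""
    scores = defaultdict(float)
    for results in result_lists:
        for rank, url in enumerate(results, start=1):
            scores[url] += 1.0 / rank
    # Highest combined score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical results from two underlying search engines.
engine_a = ["a.example/deep", "b.example/web", "c.example/forms"]
engine_b = ["b.example/web", "d.example/data", "a.example/deep"]

merged = merge_results([engine_a, engine_b])
print(merged)
```

Pages that several engines agree on rise to the top, and each URL appears only once in the merged list.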

DeepWebTech gives you five search engines (and browser plugins) for specific topics. The search engines cover science, medicine, and business. Using these topic-specific search engines, you can query the underlying databases in the Deep Web.

Scirus has a pure scientific focus. It is a far reaching research engine that can scour journals, scientists’ homepages, courseware, pre-print server material, patents and institutional intranets.

TechXtra concentrates on engineering, mathematics and computing. It gives you industry news, job announcements, technical reports, technical data, full text eprints, teaching and learning resources along with articles and relevant website information.
Just like general web search, searching the Invisible Web is also about looking for the needle in the haystack. Only here, the haystack is much bigger. The Invisible Web is definitely not for the casual searcher. It is deep but not dark, because if you know what you are searching for, enlightenment is a few keywords away.
Do you venture into the Invisible Web? Which is your preferred search tool?
Image credit: MarcelGermain
In: web resources
24 Jan 2010
Your groggy mornings aren’t over yet. A few months ago we wrote about WakeMate, a Y Combinator-funded startup that makes a small gadget designed to help you sleep better. Last time we talked to them, the WakeMate team had a planned ship date of January 25. Unfortunately, that’s not going to happen. Yesterday, the WakeMate team sent an email to the thousands of customers who had pre-ordered the device (with a $5 down payment) to inform them that they wouldn’t be getting their WakeMates on schedule. Now the first batch of orders will ship “as early as next month”. But when I asked if the WakeMate team had an idea when the majority of customers would be receiving their devices, they said they were reluctant to give an estimate since they want to avoid disappointing people again. In other words, it may be a while.
The WakeMate device, which costs $50, consists of a small wristband that you wear during the night. It tracks your movements throughout the night, which you can analyze from your computer, and can also work in tandem with your phone alarm to wake you up in the lightest phase of sleep (which is supposed to help eliminate grogginess). There are competitors in this space, like the Zeo Sleep Coach, but WakeMate is about $200 cheaper.
Dru Wynings has posted the email in its entirety. The reason for the delay? WakeMate says it wanted to make more improvements:
We’ve experienced numerous breakthroughs over these short months. We’ve improved our hardware by leaps and bounds, making it sleeker, smarter and more efficient by taking advantage of the latest technological developments. We’ve also researched new and better algorithms to power our sleep analytics software which will further increase the accuracy and usefulness of your WakeMate. Unfortunately, these improvements have taken time. While this means we will not be shipping the first batch of WakeMates on the 25th as planned, when we do you’ll be getting a much better product!
WakeMate’s letter stirred up quite a bit of unrest among users who had pre-ordered, in part because WakeMate offered access to premium analytics features as compensation for the delay. That would be all well and good, but users who had already ordered were never told that they would have to pay for premium subscriptions to unlock the full potential of the device.
WakeMate says that the premium features referred to are actually in addition to the features they had previously announced, so customers aren’t dealing with a bait and switch. But in light of the confusion caused, today they’re sending users a follow-up email to announce that all pre-order customers will have access to all premium software features free of charge.
Here’s the second email, which will be going out this afternoon, in its entirety:
Sorry guys!
In our previous email we warned about a delay in shipping WakeMates to our pre-order customers. To compensate customers for the delay we proposed to give them some future premium features for free. Unfortunately we weren’t clear enough about this, and some customers thought it was an attempt to charge them more.
So let’s try this again:
1. The current delay is due entirely to changes in our hardware and software, and manufacturing issues caused by the volume of orders we’ve received; it is in no way related to the development of premium features.
2. Regardless of future premium add-ons, customers who have pre-ordered WakeMates will never have to pay to use online analytics, nor will any customer receive an inferior product for not doing so. We apologize for giving the impression that anyone might.
3. We hope we will be able to start shipping the first units next month, but since we’re still learning the ropes of large-volume manufacturing, we don’t want to make any promises we’re not sure we can keep. We appreciate your patience and we’ll keep you updated.
4. To make up for the frustration and confusion, we’re going to give all pre-order customers all future premium software features free of charge.
If you have any further questions, please feel free to email preorder@wakemate.com directly.
Sincere apologies for the confusion,
The WakeMate Team
Delays are nothing new when it comes to startups that are building hardware. Fitbit, a startup that makes an exercise-tracking gadget, took a year to launch after its debut at TechCrunch50, and new customers still face lengthy waits to receive their devices.
Image by HilaryAQ.



