Web Pages and their Complexity

Blogged under General by Administrator on Wednesday 30 July 2008 at 12:27 pm

HTML’s forgiving nature is responsible for dramatic increases in the expressive power of web pages and in the overall usability of the Internet at large. As the saying goes, though, there’s no such thing as a free lunch. The flexibility of the HTML standard encourages a loose relationship between the HTML tag structure (DOM) of a web page and its visualization (rendering) on the screen. This makes it harder for search engines to find relevant matches between user queries and web pages.

Today’s web pages can be described as a hodgepodge of many (and quite often unrelated) elements: articles, comments, posts, ads, banners, tables of contents, hyperlinks, etc. Printed newspapers (which still constitute a serious challenge for OCR systems) are dwarfed in complexity by web pages. Of course, the HTML DOM is of some help, but its hierarchical structure is no match for the lateral links between web page elements and their dynamic rendering.

Let’s consider the following page: http://richlabonte.net/exonews/xtra/whales_dying.htm

web page example

This page is one of the top-10 Google Search results for the query whales warning of tsunami. The page consists of 9 separate articles discussing different subjects, including an article about tsunami warning systems (#3) and an article on preservation of whales (#1), but not what was asked – the rather unusual topic of whales warning of tsunamis. As can be seen from this example, the relevancy of a page to a query is not necessarily well defined.

Various means have been employed to deal with web page complexity. The most widely accepted one is to use the distance between keyword occurrences as a measure of relevancy; this is often referred to as a proximity measure. This approach works extremely well for the classic Information Retrieval problem of matching keywords to texts. However, on a web page, close proximity of two words does not necessarily mean that they are related. They can be very close on a page while belonging to completely different contexts. Think, for instance, of a newspaper layout where two unrelated topics appear on different sides of the line separating two columns. The problem is exacerbated by the dynamic rendering of web pages, where ads and links to outside pages are inserted inside an article.
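To make the proximity measure concrete, here is a minimal sketch in Python. It assumes the simplest possible definition (the smallest gap, counted in words, between any occurrence of two keywords); the function names and the sample text are made up for illustration and are not taken from any particular search engine.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def min_distance(tokens, word_a, word_b):
    """Smallest gap, in tokens, between any occurrence of word_a and word_b.

    Returns None when either word is missing from the token list.
    """
    best = None
    last_a = last_b = None
    for i, tok in enumerate(tokens):
        if tok == word_a:
            last_a = i
            if last_b is not None and (best is None or i - last_b < best):
                best = i - last_b
        if tok == word_b:
            last_b = i
            if last_a is not None and (best is None or i - last_a < best):
                best = i - last_a
    return best

# Two unrelated newspaper columns, flattened into one run of text:
# the left column ends with "whales", the right one starts with "Tsunami".
flattened = "Rescuers saved the whales. Tsunami warning issued for the coast."
gap = min_distance(tokenize(flattened), "whales", "tsunami")
```

Here the two keywords end up adjacent, so the measure reports the best possible proximity even though the two sentences belong to different stories.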

All this leads to a reinterpretation of the concept of a relevant page. It is no longer a relevant page but rather a relevant context that we are after. Once we can split a page into contexts, the methods that apply to pure text, such as NLP, semantic analysis, and proximity measures, become much more reliable.
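The difference between a relevant page and a relevant context can be sketched as follows. This Python example assumes the page has already been split into contexts (how to do that split is not shown here) and uses a deliberately crude relevance test, namely that every query term must appear. All names and sample texts are hypothetical.

```python
import re

def tokens(text):
    """Lowercase the text and return its set of distinct words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def page_matches(contexts, query):
    """Page-level test: every query term appears somewhere on the page."""
    return tokens(query) <= tokens(" ".join(contexts))

def relevant_contexts(contexts, query):
    """Context-level test: indices of contexts that contain every query term."""
    terms = tokens(query)
    return [i for i, ctx in enumerate(contexts) if terms <= tokens(ctx)]

# A page made of two unrelated articles, already split into contexts.
contexts = [
    "Volunteers work to protect whales stranded near the coast.",
    "A new tsunami warning system will be installed next year.",
]
query = "whales tsunami warning"

matches_page = page_matches(contexts, query)
matching_contexts = relevant_contexts(contexts, query)
```

The page as a whole contains all three query terms, yet no single context does, which is the situation described for the whales and tsunami example.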

Glendor Search is built on this principle. We developed a patent-pending technology to analyze arbitrary web pages and divide them into independent contexts. On average, the analysis and indexing of a page requires less than one second. This translates into the ability to crawl and index the entire surface web (~20 billion pages) once a quarter using 1,000 standard PCs.
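As a back-of-the-envelope check of these figures, the following Python sketch computes how much time each PC can spend per page. It assumes a 91-day quarter, an even split of pages across machines, and one page processed at a time per machine; none of these assumptions come from the post itself.

```python
def seconds_per_page_budget(pages, machines, days):
    """Time each machine may spend per page to finish within `days`,
    assuming every machine handles one page at a time."""
    pages_per_machine = pages / machines
    return days * 24 * 60 * 60 / pages_per_machine

budget = seconds_per_page_budget(pages=20_000_000_000, machines=1_000, days=91)
print(f"{budget:.2f} seconds per page")
```

Under these assumptions each PC has roughly 0.39 seconds per page. Put differently, if a page took a full second, each PC would need to work on about 2.5 pages at once to keep up.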

Welcome to Glendor Search

Blogged under General by Administrator on Wednesday 11 June 2008 at 4:29 pm

We are throwing our hat into the ring and launching our own web search service. What makes us unique? We cut to the chase and let the users instantly see query-relevant information and quickly decide which of the top ten search results will be most interesting.

Curious? Register for a Private Beta at www.glendor.com. We are excited to know what you think.

3, 2, 1 LAUNCH

Recent Coverage

Blogged under General by Jeff Clavier on Wednesday 7 September 2005 at 4:48 am

Glenbrook Networks got some interesting coverage recently:

I look forward to when Glenbrook or Google will help us find information from these previously unavailable sources. It will mean billions more pages of relevant information available to the world.

 

Trawling the Deep Web

Blogged under General by Jeff Clavier on Sunday 21 August 2005 at 2:04 pm

The majority of web pages one can access through search engines were collected by crawling the so-called Static or Surface Web. It is a smaller portion of the Internet reportedly containing between 8 and 20 billion pages (Google vs. Yahoo index sizes). Though this number is already very large, the total number of pages available on the Web is estimated at 500 billion. The larger part of the Internet is often referred to as the Deep Web, Dynamic Web, or Invisible Web. All these names reflect some of the features of this gigantic source of information: stored deep down in databases, rendered through DHTML, not accessible to standard crawlers. Pages in the Deep Web often do not have a standard URL and cannot be addressed in a standard fashion. In many cases, they do not even exist until a user asks a question by filling in the fields of a form and a response (page) is generated. Typical examples of deep web applications are airline reservation systems, online dictionaries, etc.

It is supposedly quite easy for a human to navigate through the Deep Web. One just needs to fill in a form by choosing one of several options, like destinations and dates on a travel site, or by entering a word to search for a meaning or a translation. It is much more difficult for a machine to do so automatically and generically. Because the Deep Web contains a lot of factual information, it can be seen metaphorically as an ocean with a lot of fish. That is why we call the system that navigates the Deep Web a trawler.

There are two major problems with navigating the Deep Web automatically. First, the trawler needs to understand what questions to ask through the aforementioned forms, and ask them exhaustively. Second, the trawler cannot easily navigate from one page to another, since pages do not have set URLs or might not even exist. That’s why the trawler needs to remember where it came from and return to the surface (like a whale) before “diving” again to ask the next question.
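Both problems can be illustrated with a small Python sketch. `ToySite` is a hypothetical in-memory stand-in for a site with a search form (it is not a real API, and the form fields and listings are invented); it refuses a second query unless the caller has gone back to the form page first. The `trawl` function then asks every combination of form options, resurfacing before each dive.

```python
from itertools import product

class ToySite:
    """Hypothetical stand-in for a site whose results exist only after a form is submitted."""

    def __init__(self, listings):
        self.listings = listings
        self.at_form = False

    def open_form(self):
        """Return to the form page, which does have a stable URL."""
        self.at_form = True

    def submit(self, **fields):
        """Submit the form; the generated result page has no URL of its own."""
        if not self.at_form:
            raise RuntimeError("not on the form page; return to the surface first")
        self.at_form = False
        return self.listings.get(tuple(sorted(fields.items())), [])

def trawl(site, options):
    """Ask every combination of form options, resurfacing before each dive."""
    names = sorted(options)
    results = {}
    for values in product(*(options[name] for name in names)):
        site.open_form()                          # return to the surface
        query = dict(zip(names, values))
        results[values] = site.submit(**query)    # dive
    return results

site = ToySite({(("city", "Palo Alto"), ("type", "contract")): ["Kernel engineer"]})
options = {"city": ["Palo Alto", "San Jose"], "type": ["full-time", "contract"]}
results = trawl(site, options)
```

With two options per field the trawler asks four questions. On a real site the number of combinations grows quickly, which is one reason why knowing which questions are worth asking matters.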

If the number of sites is relatively small, say a few thousand, each set of forms could be described manually through a templating system. The major limitations of this approach are poor scalability and a lack of resilience to changes in page formats.

There is a third problem that is related to the size of the Deep Web. It is so big that one needs to focus on a particular subset (vertical) to have a chance to trawl it with some level of success, especially if high precision is an important factor. Since the task of determining what questions to ask includes understanding of semantics and context, the focus on a vertical comes in handy.

Glenbrook’s approach to building a trawler is based on mimicking the behavior of a (human) user. It is a useful approach since the “doors” opening the Deep Web were built with a human in mind and reflect the standards (no matter how loose) that humans use to navigate the Web.

The Trawler consists of five layers:

  1. Discoverer - locates prospective target home pages in the Surface Web
  2. Scout - navigates the Surface Web part of a web site and finds the “doors” - DHTML pages that contain forms leading to the Deep Web part of the web site
  3. Locksmith - fills in the forms with various requests and collects responses
  4. Assessor - analyses responses and decides whether to use this door as a candidate to query the Deep Web part of the site or to move elsewhere
  5. Harvester - collects all relevant pages from the Surface and Deep Web parts of the web site
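The way the five layers hand work to each other can be sketched in Python. This is only a toy under strong assumptions: the "web" is a small in-memory dictionary, and every URL, field, and function body below is a hypothetical stand-in rather than Glenbrook's actual implementation.

```python
from itertools import product

# Toy "web": every URL, field, and answer here is invented for illustration.
WEB = {
    "acme.example/": {"text": "Acme careers", "links": ["acme.example/jobs"]},
    "acme.example/jobs": {
        "text": "Search jobs",
        "links": [],
        "form": {"city": ["Palo Alto", "San Jose"]},
        "answers": {("Palo Alto",): ["Kernel engineer"], ("San Jose",): []},
    },
    "blog.example/": {"text": "Cat pictures", "links": []},
}

def discoverer(web, keyword):
    """Layer 1: pick prospective home pages from the surface web."""
    return [url for url, page in web.items()
            if url.endswith("/") and keyword in page["text"].lower()]

def scout(web, home):
    """Layer 2: follow surface links from the home page and return the "doors"."""
    seen, stack, doors = set(), [home], []
    while stack:
        url = stack.pop()
        if url in seen or url not in web:
            continue
        seen.add(url)
        if "form" in web[url]:
            doors.append(url)
        stack.extend(web[url]["links"])
    return doors

def locksmith(web, door, limit=None):
    """Layer 3: fill in the form with option combinations and collect responses."""
    page = web[door]
    names = sorted(page["form"])
    combos = list(product(*(page["form"][name] for name in names)))[:limit]
    return {combo: page["answers"].get(combo, []) for combo in combos}

def assessor(responses):
    """Layer 4: keep the door only if at least one probe returned something."""
    return any(responses.values())

def harvester(web, door):
    """Layer 5: query the door exhaustively and collect everything it returns."""
    records = []
    for rows in locksmith(web, door).values():
        records.extend(rows)
    return records

def trawl(web, keyword):
    """Run the five layers in order."""
    records = []
    for home in discoverer(web, keyword):
        for door in scout(web, home):
            if assessor(locksmith(web, door, limit=2)):   # a few probes first
                records.extend(harvester(web, door))
    return records

jobs = trawl(WEB, "careers")
```

In this toy the Assessor is a one-line check; in practice deciding whether a door is worth querying is presumably where much of the difficulty lies.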

After all potentially relevant pages are harvested the Extractor takes over. The Extractor is a hybrid system that applies Pattern Recognition, Natural Language Processing and other AI techniques to extract facts, combine them and populate a database that is used to provide factual answers to search queries.

The Extractor will be the subject of another post.

Cross-posted from Software Only


Glenbrook Networks in the San Jose Mercury News

Blogged under General by Jeff Clavier on Tuesday 16 August 2005 at 8:07 am

SiliconBeat’s Michael Bazeley featured Glenbrook Networks co-founders Julia and Edward Komissarchik, and the Glendor showcase, in a great piece about “Deep Web” search and information extraction. Michael summarized it quite well:

Komissarchik and her father, Edward Komissarchik, say they have figured out how to analyze the forms on Web pages and understand the type of information the sites are looking for. Then, Glenbrook’s Web crawlers use artificial intelligence to walk themselves through sometimes complex Web forms, answering questions, such as the location of their desired job, in the same way a human would.

Julia Komissarchik likens the process to cracking a safe.

“The way to think of it is, you case the joint,” she said. “The scout goes through the form and tries a few options to see what the results will be. Then you have a mastermind or safecracker who gets all this information from the scout and devises a method to open the forms.”

Finally, she said, the “harvesters” spring into action to gather up all the information.

Just to clarify: the “safe” analogy does not imply that the company is breaking passwords and accessing private information. It refers to getting a machine to access, in a generic way, information stored behind interactive forms.

We announced the launch of the Glendor showcase a couple of months ago. It features the first (and still, I guess, only) mashup involving job listings positioned on Google Maps.

A longer post about the concept of “web trawling” implemented by the company is on its way.

Thanks to all of you who have emailed us since this morning. We are grateful for the reports of issues with different browser/OS combinations; sorry, we are not hiring at this time; and yes, we can build large-scale custom search and data aggregation solutions. And we are delighted that you like this showcase.

Update: Gary Price, who was also quoted by Michael, posted an analysis on Search Engine Watch that I wanted to briefly comment on. First, Glenbrook’s technology does not (and cannot) extract information directly from corporate databases; it goes through the public, manual interface that companies have set up to access that data. The innovation lies in a suite of algorithms that automatically figure out the parameters to be used to extract that data, without requiring any templating of the sites to be targeted.

On server load, queries are made in a sensible way to avoid overloading servers based on response times, etc. And data can be refreshed daily, and maybe multiple times a day if the dataset is small enough. But extracting and caching data that change too frequently would not be appropriate.
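One simple way to pace queries based on response times is sketched below in Python. The policy (wait a multiple of the last response time, within fixed bounds) and all the constants are assumptions made for illustration; they are not a description of how Glenbrook's crawler actually throttles itself.

```python
import time

def next_delay(response_time, factor=10.0, min_delay=1.0, max_delay=60.0):
    """Wait longer when the server answers slowly.

    The factor and the bounds (in seconds) are arbitrary illustrative values.
    """
    return max(min_delay, min(max_delay, factor * response_time))

def polite_fetch_all(fetch, queries, sleep=time.sleep, clock=time.monotonic):
    """Run the queries one by one, pausing after each according to next_delay.

    `fetch` is any callable that performs one query; `sleep` and `clock`
    can be swapped out, for example in tests.
    """
    results = []
    for query in queries:
        start = clock()
        results.append(fetch(query))
        sleep(next_delay(clock() - start))
    return results
```

With these values, a server that answers in 50 ms is queried at most once per second, while one that takes 2 seconds gets a 20-second pause before the next query.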

On usability and searchability of the data, this is actually where the aggregation of structured data delivers its value: being able to search by position or by location across a wide range of sources (in this case, job listings across companies).

Delighted to show you the technology at your convenience, Gary…

Cross-posted from Software Only.

Investments flowing into job search engines

Blogged under General by Jeff Clavier on Tuesday 9 August 2005 at 8:18 pm

Congratulations to SimplyHired for raising a $3M Series B from a great group of angel investors last Thursday, and to Indeed for following suit on Monday, scoring $5M from Union Square Ventures, the NY Times Company and Allen & Company. Fred Wilson shared interesting insights about the deal on his blog.

The consolidation in the jobs vertical search begins: Jobster acquires Workzoo

Blogged under General by Jeff Clavier on Tuesday 12 July 2005 at 4:45 am

I was reading Charlene Li’s excellent account of the launch of HotJobs crawling capability when I spotted that Jobster is buying WorkZoo. According to Charlene:

I spoke with Jobster CEO Jason Goldberg on Monday, and he described their vision of how WorkZoo will allow users to expand their search beyond their network of jobs on Jobster proper and see “every” job. WorkZoo has its cut out for them – in previous testing, they lagged significantly in their parsing ability compared to Indeed.com and Simply Hired. But this combination of Jobster and WorkZoo makes sense as a combined service – it’s also is similar to the partnership that currently exists between professional social networking service LinkedIn and SimplyHired.

The consolidation has already begun. Interesting.

Yahoo HotJobs is also a jobs search engine

Blogged under General by Jeff Clavier on Tuesday 12 July 2005 at 3:27 am

John Battelle said it best in “A Good Idea, Indeed. You’re Simply Hired”: Yahoo HotJobs is entering the job search arena.

“Yahoo seems to be taking a cue from Indeed and Simply Hired. Ouch. (Thanks, Richard)”

Joel Cheesman actually posted on the topic before John, and there is an interesting discussion in the comments of his post.

Let’s see what Monster.com’s and CareerBuilder’s next moves are in this new segment.

Update: SiliconBeat added their take on the news

Bay Area zip codes

Blogged under General by Jeff Clavier on Wednesday 6 July 2005 at 1:42 am

Francois Gossieaux over at Emergence Marketing very rightly pointed out that our readers, and showcase testers, might not be familiar with our zip codes. Apologies for that.

I should mention that leaving the “Location” field empty uses San Carlos as the reference point for searches (it is sort of in the center of Silicon Valley). And here are a few Bay Area zip codes: 94301 for Palo Alto, 94111 for San Francisco and 95113 for San Jose.

Glendor.com is a mashup

Blogged under General by Jeff Clavier on Tuesday 5 July 2005 at 7:04 pm

Om Malik pointed this morning to a few applications using Google Maps to geolocate “stuff”, stuff being wireless-enabled cafes, wireless hot-spots in cities, and the Craigslist meets Google Maps site, now famous for having started the whole movement.

Michael Bazeley then pointed to Redfin, which combines satellite maps and MLS homes data for the Seattle area.

The O’Reilly Radar also referred to the Google Maps + Yahoo Traffic mashup that was taken down, and then brought back up.

So Glendor.com is a mashup as well then!

Finally, I found Google Maps Mania in our referrer logs:  An unofficial Google Maps blog tracking the websites, ideas and tools being influenced by Google Maps.

Mapping job listings

Blogged under General by Jeff Clavier on Tuesday 5 July 2005 at 2:57 am

Glendor Showcase

And this was developed before the Google Maps API was released, which means that we might not have used all the capabilities now available.
Also make sure to zoom in on the map to display the different companies with less overlap.

A few search examples

Blogged under General by Jeff Clavier on Tuesday 5 July 2005 at 2:37 am

The following searches will give you an idea of what can be accessed on Glendor.com:

  • Development jobs available 25 miles around Palo Alto, CA:  search map rss
  • Software jobs listed on company websites that include the keywords (kernel, networking, file system): search map rss
  • Contract or temporary admin jobs published in the last 7 days, within 10 miles of San Francisco, CA: search map rss

Don’t be surprised if some jobs are outside of the Bay Area: we are restricting the sources to companies having operations, or their headquarters, in the Bay Area, but the jobs themselves might be anywhere in the US, or actually abroad.

Also, the precision of the mapping is at the level of the city since only rarely is the actual address of the company mentioned in the job listing. That’s why multiple jobs may overlap on one city, and clicking on one character does not display all jobs available for that city in the “bubble”.

A word about this blog

Blogged under General by Jeff Clavier on Monday 4 July 2005 at 1:48 am

Besides keeping you up to date on the developments of Glenbrook Networks, and the Glendor showcase, this blog will also talk about vertical search in general, and some of the technology issues that we had to solve when building our vertical search and information extraction platform.
Please subscribe to the RSS feed.

Welcome to the Glendor Showcase

Blogged under General by Jeff Clavier on Monday 4 July 2005 at 1:10 am

Glendor.com is the showcase of Glenbrook Networks, the search and information extraction platform provider.

We have chosen jobs as a vertical for this showcase because extracting listings from company web sites exercises all aspects of our technology to produce quality, structured results: surface and dynamic web crawling, layout recognition, natural language processing, and more.
We have also integrated a few additional features, like mapping job listings onto Google Maps and the ability to subscribe to search results via RSS feeds and to syndicate searches on blogs or other web sites.

The showcase provides job listings extracted from a few hundred Bay Area company web sites, and one large job board. Using it is pretty straightforward, but check out the Help section for typical queries.
