GISP Google Image Search Processor

Motivation

When executing a Google search, one is presented with a statistic that reports the number of results that were found and the time it took to find them. For example:

This statistic is frustrating because it shows that there is a whole universe of results out there but only the smallest fraction of results is presented for examination.

This frustration has motivated me to develop an Excel/VBA process that attempts to expand the collection of search results.

The process is built on two main ideas:

  1. Formulate a “Composite Topic Search”.
  2. Make use of the Google Image Search Engine.

A “Composite Topic Search” is a search that starts with a collection of search phrases exploring various aspects of the topic of interest. The individual search results are collected and organized into a composite report.

Making the individual searches using the Google Image Search Engine returns an interesting collection of results.

I have joined these two ideas together into the Google Image Search Processor (GISP).

How to start GISP – the Topic Search Workbook

A GISP run begins with an Excel Workbook containing a collection of search phrases that capture the essence of the desired Topic. A preliminary set of search strings can be extracted from Google.

For example, consider the following topic:

how to exchange worksheet data with VBA arrays

After the search string is entered into the Google search box, Google can then display a collection of suggested searches complementing the original search. For example:

Using a screen capture tool to copy this list as an image, one can then take advantage of the OCR functionality that is built into Microsoft OneNote 2016 to extract the list as text.

OCR capabilities are also available within the Greenshot freeware utility (http://getgreenshot.org/).

The GISP Input Workbook for this topic search looks like this:

Note that the report title must not duplicate the search string.

Note also that the search strings shown here could have used advanced Google search operators.

Link to the Input Workbook

GISP Reports

Two Important Definitions

Before describing the details of the GISP Reports, there are two terms that need to be defined.

A Direct URL is a URL collected by GISP that points directly to a web page discovered by GISP.

A Primary URL is the leftmost part of a Direct URL with the scheme prefix https:// or http:// removed, for example www.guru99.com.

The VBA code to extract a Primary URL from a Direct URL is as follows:
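
The listing itself is not reproduced in this copy of the post. As a stand-in, here is a minimal sketch, not the original GISP code, of how such an extraction could look. The function name PrimaryUrl is hypothetical, and the sketch assumes that the "leftmost part" means everything up to the first "/", "?" or "#" that follows the prefix.

```vba
Option Explicit

' Hypothetical helper (not the original GISP code): returns the Primary URL
' of a Direct URL, i.e. the leftmost part with http:// or https:// removed.
Public Function PrimaryUrl(ByVal directUrl As String) As String
    Dim s As String
    Dim pos As Long
    Dim delim As Variant

    s = Trim$(directUrl)

    ' Drop the scheme prefix, compared case-insensitively.
    If LCase$(Left$(s, 8)) = "https://" Then
        s = Mid$(s, 9)
    ElseIf LCase$(Left$(s, 7)) = "http://" Then
        s = Mid$(s, 8)
    End If

    ' Assumption: the Primary URL ends at the first "/", "?" or "#".
    For Each delim In Array("/", "?", "#")
        pos = InStr(1, s, delim)
        If pos > 0 Then s = Left$(s, pos - 1)
    Next delim

    PrimaryUrl = s
End Function
```

Typed into the Immediate window, `? PrimaryUrl("https://www.example.com/some/page.html")` prints www.example.com. A URL without a prefix is returned unchanged apart from the trimming of the path.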

Reporting Search String Counts

The Search String Counts report shows the effectiveness of each search string.

Link to Search String Counts Report

This Topic Search returned 172 Primary URLs.

The first page of the Search String Counts report is shown here.

All counts greater than or equal to 5 are highlighted.
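
The counting behind a report like this can be sketched with a Scripting.Dictionary. This is an illustration under stated assumptions, not the GISP implementation: the names HostPart, CountPrimaryUrls and HIGHLIGHT_MIN are hypothetical, the input is assumed to be a one-column, 1-based array of Direct URLs (the shape that Range.Value returns for a multi-cell column), and the Dictionary object is only available on Windows.

```vba
Option Explicit

' The report highlights counts of 5 or more.
Private Const HIGHLIGHT_MIN As Long = 5

' Hypothetical helper: strips the scheme and everything after the host.
Private Function HostPart(ByVal url As String) As String
    Dim s As String, pos As Long
    s = Trim$(url)
    pos = InStr(1, s, "://")
    If pos > 0 Then s = Mid$(s, pos + 3)
    pos = InStr(1, s, "/")
    If pos > 0 Then s = Left$(s, pos - 1)
    HostPart = s
End Function

' Counts how often each Primary URL occurs in a one-column 2-D array.
Public Function CountPrimaryUrls(directUrls As Variant) As Object
    Dim counts As Object
    Dim r As Long
    Dim key As String

    Set counts = CreateObject("Scripting.Dictionary")
    counts.CompareMode = vbTextCompare   ' host names are not case-sensitive

    For r = LBound(directUrls, 1) To UBound(directUrls, 1)
        key = HostPart(CStr(directUrls(r, 1)))
        ' Reading a missing key yields Empty, which adds up as 0.
        If Len(key) > 0 Then counts(key) = counts(key) + 1
    Next r

    Set CountPrimaryUrls = counts
End Function

Public Sub DemoCountPrimaryUrls()
    Dim urls(1 To 3, 1 To 1) As Variant
    Dim counts As Object
    Dim key As Variant

    urls(1, 1) = "https://www.example.com/a.html"
    urls(2, 1) = "http://www.example.com/b.html"
    urls(3, 1) = "https://www.example.org/"

    Set counts = CountPrimaryUrls(urls)
    For Each key In counts.Keys
        Debug.Print key, counts(key), _
            IIf(counts(key) >= HIGHLIGHT_MIN, "highlight", "")
    Next key
End Sub
```

In a workbook the array would typically come straight from a column of collected URLs, read in one step with Range.Value rather than cell by cell. Sorting the result by count is left out of the sketch.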

The three main features of this report are:

  1. It identifies the more effective search strings, which gives a starting point for expanding the Topic Search with additional search strings.
  2. By sorting the Primary URLs from the most to the least frequently found, it identifies the outstanding resources for studying a Topic.
  3. Reviewing the list of Primary URLs for sites that are not frequently visited or not widely known leads to the discovery of new resources.

For example, some new resources:

www.bluepecantraining.com

www.guru99.com

www.aakaksatop.club

Since the report is presented as a PDF, each URL reported is a hyperlink.

Clicking on the URL will then open your browser to the web page.

Note that not all the Primary URLs will link back to a web page; this depends on the setup of the web site.

Reporting Direct URLs

The Direct URLs report shows each full URL together with the search strings that returned it.

Link to the Direct URLs Report

The list of Direct URLs is sorted to follow the list of the Primary URLs.

This Topic Search returned 399 Direct URLs.

The first few rows of the Direct URLs report are shown here:

Each Direct URL is reported along with the search strings associated with it as well as the title of the web page.
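
Collecting the search strings for each Direct URL can be sketched as follows. This is an illustration only, not the GISP code: the function name SearchStringsByUrl is hypothetical, and the input is assumed to be a two-column, 1-based array with the search string in column 1 and the Direct URL in column 2.

```vba
Option Explicit

' Hypothetical helper: maps each Direct URL to the list of search
' strings that returned it, joined with "; ".
Public Function SearchStringsByUrl(hits As Variant) As Object
    Dim byUrl As Object
    Dim r As Long
    Dim url As String
    Dim phrase As String

    Set byUrl = CreateObject("Scripting.Dictionary")

    For r = LBound(hits, 1) To UBound(hits, 1)
        phrase = CStr(hits(r, 1))
        url = CStr(hits(r, 2))
        If Not byUrl.Exists(url) Then
            byUrl.Add url, phrase
        ElseIf InStr(1, "; " & byUrl(url) & "; ", _
                     "; " & phrase & "; ", vbTextCompare) = 0 Then
            ' Append the phrase only if it is not already listed.
            byUrl(url) = byUrl(url) & "; " & phrase
        End If
    Next r

    Set SearchStringsByUrl = byUrl
End Function
```

The keys keep the default case-sensitive comparison because the path part of a URL can be case-sensitive. Page titles could be carried along in a second dictionary keyed the same way.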

Note that the web page titles are most useful when reviewing the YouTube URLs.

Both reports were exported from the GISP Composite Topic Search Report Workbook. This workbook can be used to generate other customized reports.

Link to the Composite Topic Search Report Workbook
