When executing a Google search, one is presented with a statistic that reports the number of results that were found and the time it took to find them. For example:
This statistic is frustrating because it shows that there is a whole universe of results out there but only the smallest fraction of results is presented for examination.
GISP – Google Image Search Processor
This frustration has motivated me to develop an Excel/VBA process that attempts to expand the collection of search results.
The process is built on two main ideas:
- Formulate a “Composite Topic Search”.
- Make use of the Google Image Search engine.
A “Composite Topic Search” is a search that starts with a collection of search phrases exploring various aspects of the topic of interest. The individual search results are collected and organized into a composite report.
Making the individual searches using the Google Image Search Engine returns an interesting collection of results.
I have joined these two ideas together into the Google Image Search Processor (GISP).
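The collecting step of a Composite Topic Search can be pictured as a lookup table from each URL to the search phrases that returned it. The following is only a minimal sketch under stated assumptions, not GISP's actual code: the names AddResult and DemoComposite, the example URLs, and the example search phrases are all hypothetical, and Scripting.Dictionary is assumed to be available (Windows only).

```vba
Option Explicit

' Hypothetical sketch: record that searchString returned url.
' composite maps each URL to a nested dictionary whose keys are
' the search strings that found that URL.
Public Sub AddResult(ByVal composite As Object, ByVal url As String, _
                     ByVal searchString As String)
    Dim hits As Object

    If composite.Exists(url) Then
        Set hits = composite(url)
    Else
        Set hits = CreateObject("Scripting.Dictionary")
        composite.Add url, hits
    End If

    ' Using the search string as a key stores it once per URL,
    ' however often the same pair is reported.
    hits(searchString) = True
End Sub

' Illustrative data only; none of these values come from a real search.
Public Sub DemoComposite()
    Dim composite As Object
    Set composite = CreateObject("Scripting.Dictionary")

    AddResult composite, "https://www.example.com/page", "vba array to range"
    AddResult composite, "https://www.example.com/page", "vba range to array"
    AddResult composite, "https://www.example.org/other", "vba array to range"

    Debug.Print composite.Count                                  ' 2 distinct URLs
    Debug.Print composite("https://www.example.com/page").Count  ' 2 search strings
End Sub
```

Keeping the search strings as dictionary keys makes it straightforward to report, for each URL, which phrases found it.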
How to start GISP – the Topic Search Workbook
GISP begins by preparing an Excel Workbook containing a collection of search phrases that capture the essence of the desired Topic. A preliminary set of search strings can be extracted from Google.
For example, consider the following topic:
how to exchange worksheet data with VBA arrays
After the search string is entered into the Google search box, Google can then display a collection of suggested searches complementing the original search. For example:
Using a screen capture tool to copy this list as an image, one can then take advantage of the OCR functionality that is built into Microsoft OneNote 2016 to extract the list as text.
OCR capabilities are also available within the free Greenshot utility (http://getgreenshot.org/).
The GISP Input Workbook for this topic search looks like this:
Note that the report title must not duplicate a search string.
Note also that the search strings shown here could have used advanced Google search operators, such as quoted exact phrases or the site: operator.
Two Important Definitions
Before describing the details of the GISP Reports, there are two terms that need to be defined.
A Direct URL is a URL collected by GISP pointing directly to the web page that was discovered by GISP.
A Primary URL is the leftmost part of a Direct URL with the protocol prefix HTTPS:// or HTTP:// removed.
The VBA code to extract a Primary URL from a Direct URL is as follows:
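A minimal sketch of such an extraction is given here, assuming that the "leftmost part" is the host name, that is, everything before the first slash that follows the prefix. The function name PrimaryUrl is hypothetical, and this is not necessarily the code that GISP itself uses.

```vba
Option Explicit

' Hypothetical helper: returns the Primary URL for a Direct URL.
' Assumption: the Primary URL is the host name, i.e. the text between
' the https:// or http:// prefix and the next "/".
Public Function PrimaryUrl(ByVal directUrl As String) As String
    Dim s As String
    Dim slashPos As Long

    s = Trim$(directUrl)

    ' Strip the protocol prefix, ignoring case.
    If LCase$(Left$(s, 8)) = "https://" Then
        s = Mid$(s, 9)
    ElseIf LCase$(Left$(s, 7)) = "http://" Then
        s = Mid$(s, 8)
    End If

    ' Keep only the text before the first remaining "/".
    slashPos = InStr(1, s, "/")
    If slashPos > 0 Then s = Left$(s, slashPos - 1)

    PrimaryUrl = s
End Function
```

For example, PrimaryUrl("https://www.example.com/vba/arrays.html") returns www.example.com. A URL without a prefix or without a path is returned unchanged apart from trimming.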
Reporting Search String Counts
The Search String Counts report shows the effectiveness of each search string.
This Topic Search returned 172 Primary URLs.
The first page of the Search String Counts report is shown here.
All counts greater than or equal to 5 are highlighted.
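The highlighting rule can be expressed in a few lines of VBA. This is only a sketch, assuming the counts sit in a worksheet range; the procedure name HighlightCounts, the yellow fill colour, and the range address in the usage line are hypothetical.

```vba
Option Explicit

' Hypothetical sketch: fill every cell whose count is at least threshold.
Public Sub HighlightCounts(ByVal countRange As Range, _
                           Optional ByVal threshold As Long = 5)
    Dim cell As Range

    For Each cell In countRange.Cells
        ' Skip blanks, text, and error values.
        If Not IsEmpty(cell.Value) And IsNumeric(cell.Value) Then
            If cell.Value >= threshold Then
                cell.Interior.Color = vbYellow   ' assumed highlight colour
            End If
        End If
    Next cell
End Sub
```

It could be called as HighlightCounts ActiveSheet.Range("B2:K50"), where the address stands in for wherever the counts are stored.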
The three main features of this report are:
- It identifies the more effective search strings, which gives a starting point for expanding the Topic Search with additional search strings.
- By sorting the Primary URLs from the most frequently found to the least frequently found, it identifies the outstanding resources for studying a Topic.
- Reviewing the list of Primary URLs and looking for those that are not frequently visited or not widely known leads to the discovery of new resources.
For example, some new resources:
Since the report is presented as a PDF, each URL reported is a hyperlink.
Clicking on the URL will then open your browser to the web page.
Note that not all Primary URLs will link back to a web page; this depends on how the web site is set up.
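The ranking of Primary URLs by how often they were found can be sketched as a tally followed by a descending sort. This is an illustration under stated assumptions rather than GISP's own code: the names TallyPrimaryUrls and WriteSortedTally are hypothetical, the input is assumed to be a one-dimensional array of Primary URL strings, and Scripting.Dictionary is assumed to be available (Windows only).

```vba
Option Explicit

' Hypothetical sketch: count how often each Primary URL occurs.
Public Function TallyPrimaryUrls(ByVal primaryUrls As Variant) As Object
    Dim counts As Object
    Dim i As Long
    Dim key As String

    Set counts = CreateObject("Scripting.Dictionary")
    counts.CompareMode = vbTextCompare   ' treat host names case-insensitively

    For i = LBound(primaryUrls) To UBound(primaryUrls)
        key = CStr(primaryUrls(i))
        ' Reading a missing key yields Empty, so Empty + 1 starts the count at 1.
        If Len(key) > 0 Then counts(key) = counts(key) + 1
    Next i

    Set TallyPrimaryUrls = counts
End Function

' Hypothetical sketch: write the tally as two columns starting at topLeft,
' then sort the rows so the most frequently found Primary URL comes first.
Public Sub WriteSortedTally(ByVal counts As Object, ByVal topLeft As Range)
    Dim keys As Variant
    Dim i As Long
    Dim target As Range

    If counts.Count = 0 Then Exit Sub

    keys = counts.Keys   ' zero-based array of the dictionary keys
    For i = 0 To counts.Count - 1
        topLeft.Offset(i, 0).Value = keys(i)
        topLeft.Offset(i, 1).Value = counts(keys(i))
    Next i

    Set target = topLeft.Resize(counts.Count, 2)
    target.Sort Key1:=target.Columns(2), Order1:=xlDescending, Header:=xlNo
End Sub
```

Reading such a list from the top shows the most frequently found resources, and reading it from the bottom is one way to look for the less widely known ones.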
Reporting Direct URLs
The Direct URLs report shows each full URL together with the search strings that returned it.
The list of Direct URLs is sorted to follow the list of the Primary URLs.
This Topic Search returned 399 Direct URLs.
The first few rows of the Direct URLs report are shown here:
Each Direct URL is reported along with its associated search strings and the title of the web page.
Note that the web page titles are most important when reviewing the YouTube URLs.
Both reports were exported from the GISP Composite Topic Search Report Workbook. This workbook can be used to generate other customized reports.