elasticsearch

Learn to use Elasticsearch with patent data.

Enric Escorsa https://github.com/wipo-analytics
04-09-2021

Elasticsearch

Introduction

Elasticsearch is an open source data management platform that is interesting primarily because of its rapid ingestion and indexing of different data types and its fast, powerful search capabilities. It is based on Lucene, an open source search library developed by Apache, which enables indexing and searching across all the textual elements in our data, and it does so in near real time.
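To give an intuition of what indexing text for search means, here is a minimal sketch of an inverted index, the kind of data structure that Lucene-style search relies on. It is a toy illustration only, not Lucene's or Elasticsearch's actual implementation; the function names and the sample patent titles are invented for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical patent titles, invented for illustration only.
docs = {
    1: "Edible composition comprising cannabinoids",
    2: "Beverage containing cannabinoids and flavouring",
    3: "Method for producing an edible film",
}
index = build_inverted_index(docs)
print(search(index, "edible"))                 # documents 1 and 3
print(search(index, "cannabinoids beverage"))  # document 2
```

Because the index maps words to documents ahead of time, a query only needs to look up its own words instead of scanning every document, which is one reason searches stay fast as the data grows.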

The Elastic ecosystem also includes companion tools such as Kibana for data visualization. With Kibana we can create and customize dashboards from our searches and analyses.

Elasticsearch and Kibana, together with the data collection tools Logstash and Beats, form the Elastic Stack, also known as the ELK Stack (Elasticsearch, Logstash, Kibana).

How to work with Elasticsearch and Kibana

We first need to install Elasticsearch and Kibana. To do so, we go to the Elastic website and download the installation files for both Elasticsearch and Kibana.

We then unzip the Elasticsearch and Kibana folders and save each of them in a convenient directory on our machine.

To start Elasticsearch on Windows we go to the bin folder inside the Elasticsearch folder and execute the file named elasticsearch.bat. We can verify that the program is running by opening our browser and entering localhost:9200 in the address bar. A plain page will appear showing some setup details in JSON format, such as the node name, the cluster name and the version number, which confirms that Elasticsearch is running.
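The same check can be done from a script instead of the browser. The following is a minimal sketch using only the Python standard library; it assumes a local node answering over plain HTTP on the default port 9200 without authentication, and the function names are our own. More recent Elasticsearch versions enable security by default, in which case HTTPS and credentials would be needed and this sketch would simply report the node as unreachable.

```python
import json
import urllib.error
import urllib.request


def summarize_root_response(body):
    """Extract node name, cluster name and version from the root JSON body."""
    info = json.loads(body)
    return {
        "name": info.get("name"),
        "cluster_name": info.get("cluster_name"),
        "version": info.get("version", {}).get("number"),
    }


def check_elasticsearch(url="http://localhost:9200", timeout=5):
    """Return a summary dict if the node answers, or None if it cannot be reached.

    The URL is an assumption: a local, unsecured node on the default port.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return summarize_root_response(response.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a reply that is not valid JSON.
        return None
```

Calling `print(check_elasticsearch())` prints the summary when the node is up and `None` otherwise.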

To run Kibana we follow the same steps: we go into the bin folder inside the Kibana folder and execute the kibana.bat file.

Similarly, to see Kibana in action we enter localhost:5601 in our browser's address bar. Kibana will load and appear on our screen. This is where we will work on our searches and analyses and where we will be able to create dashboards.

Reading our data from a CSV

Let’s imagine that we have a CSV file containing patent data that we have obtained from a search and we want to quickly visualize the data and contents that are included.

As an example, we will look at the Cannabinoid Edibles dataset, available here

There is an upload file button at the bottom right corner of the initial screen. Let’s go ahead and click on it, then drag our file or browse our directories to locate it and import it. We will then see a pre-import page that summarizes the contents of the file.

When importing, Elasticsearch analyzes and indexes the data, so the variables (columns) are automatically identified and their values counted.

Here we can reassign names to column variables if needed.
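To illustrate what identifying and counting variables can look like, here is a small sketch that profiles the columns of a CSV file: it guesses a coarse type for each column and counts filled and distinct values. This is not how Kibana's file upload is implemented; the function names, the column names and the sample rows (including the publication numbers) are invented for illustration.

```python
import csv
import io


def guess_type(values):
    """Guess a coarse type for a column from its non-empty values."""
    non_empty = [v for v in values if v.strip()]
    if not non_empty:
        return "empty"
    for caster, label in ((int, "integer"), (float, "number")):
        try:
            for value in non_empty:
                caster(value)
        except ValueError:
            continue  # at least one value does not fit, try the next type
        return label
    return "text"


def profile_csv(handle):
    """Return type, filled-value count and distinct count for each column."""
    reader = csv.DictReader(handle)
    columns = {name: [] for name in (reader.fieldnames or [])}
    for row in reader:
        for name, values in columns.items():
            values.append(row.get(name) or "")
    profile = {}
    for name, values in columns.items():
        filled = [v for v in values if v.strip()]
        profile[name] = {
            "type": guess_type(values),
            "count": len(filled),
            "distinct": len(set(filled)),
        }
    return profile


# Invented sample rows; a real export would be opened with open(path, newline="").
sample_csv = (
    "publication_number,title,year\n"
    "XX0000001A,Edible composition,2019\n"
    "XX0000002A,Infused beverage,2020\n"
    "XX0000003A,,2020\n"
)
profile = profile_csv(io.StringIO(sample_csv))
for column, stats in profile.items():
    print(column, stats)
```

A quick profile like this can help decide which columns deserve clearer names before importing.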

The next step is to import our data by clicking on the import button.

We will have to name our data and save it. Our data is now ready for exploration and visualization.
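As an alternative to the upload button, a CSV file can also be loaded from a script through Elasticsearch's `_bulk` endpoint, which expects newline-delimited JSON: one action line followed by one document line per record. The sketch below builds such a payload with the Python standard library. The index name `patents`, the sample rows and the function names are assumptions made for illustration, and `send_bulk` assumes a local, unsecured node on port 9200.

```python
import csv
import io
import json
import urllib.request


def csv_to_bulk_payload(handle, index_name):
    """Convert CSV rows to the newline-delimited JSON expected by _bulk."""
    lines = []
    for row in csv.DictReader(handle):
        # Action line: index the next document into index_name.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Document line: the CSV row as a JSON object (all values as strings).
        lines.append(json.dumps(row))
    if not lines:
        return ""
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


def send_bulk(payload, url="http://localhost:9200/_bulk"):
    """POST the payload to a local node (assumed unsecured) and return the reply."""
    request = urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Invented sample rows; a real export would be opened with open(path, newline="").
sample_csv = (
    "publication_number,title,year\n"
    "XX0000001A,Edible composition,2019\n"
    "XX0000002A,Infused beverage,2020\n"
)
payload = csv_to_bulk_payload(io.StringIO(sample_csv), "patents")
print(payload)
```

With Elasticsearch running, `send_bulk(payload)` would submit the records; the reply includes an `errors` flag worth checking before moving on to Kibana.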

Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/wipo-analytics, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Escorsa (2021, April 9). WIPO Patent Analytics: elasticsearch. Retrieved from https://wipo-analytics.github.io/posts/2021-04-09-elasticsearch/

BibTeX citation

@misc{escorsa2021elasticsearch,
  author = {Escorsa, Enric},
  title = {WIPO Patent Analytics: elasticsearch},
  url = {https://wipo-analytics.github.io/posts/2021-04-09-elasticsearch/},
  year = {2021}
}