Chapter 1 Introduction

Patent analytics is a growing field that encompasses the analysis of patent data, analysis of the scientific literature, data cleaning, text mining, machine learning, geographic mapping and data visualisation. The WIPO Patent Analytics Handbook provides an introduction to advanced methods and tools for patent analytics. The Handbook complements the WIPO Manual on Open Source Patent Analytics, which provides an introduction to tools and methods in patent analytics.

The Handbook focuses on more advanced methods and approaches using commercial and free tools and databases. The fields of patent search, patent statistics and patent analytics have been transformed in recent years by the growing availability of free and commercial databases and software for data mining, data visualisation and geographic mapping. The increasing availability of a wide range of web services or Application Programming Interfaces (APIs) for access to patent data and the scientific literature, together with cloud computing services for machine learning or geocoding, means that today a patent analyst has access to an unprecedented and cost-effective range of tools to facilitate their work.

Chapter 2 focuses on researching the scientific literature as a foundation for in-depth patent research and analysis. The chapter begins by highlighting the growing accessibility of scientific publications and data arising from an increasing emphasis on open access publication. It then turns to the role of exploratory searches of the scientific literature in defining keyword search strategies, before exploring the main issues that arise when working with the scientific literature and how they can be addressed. The chapter concludes by considering strategies for joining together the scientific literature and patent literature.

Chapter 3 addresses the geocoding of patent data to develop geographic maps of patent and related data. Increasingly, it is possible to link different types of data on the same map using online geolocation services and to present the results in interactive maps. This chapter will discuss the principal patent data fields that are available for mapping and provide illustrations from sources such as the USPTO and the ASEAN marine patent landscape report. The chapter will also discuss the strengths and weaknesses of geolocation services such as the Google Maps API, including the noisy nature of patent name and address fields, methods for regularising address data and the challenges involved in validating the georeferenced data returned from geolocation web services.

Chapter 4 addresses methods for counting patent data as a basis for creating descriptive patent statistics and statistical models. Methodologies for patent counts have received remarkably little attention outside a highly specialised literature, and this chapter aims to provide a step-by-step introduction to the issues involved in developing descriptive patent statistics. The chapter concludes by illustrating how trends in demand for patent rights can be identified across multiple countries.

Chapter 5 addresses the importance of understanding patent classification systems as the key tool for supporting patent analytics. The patent system is supported by a range of classification schemes that are designed to assist patent examiners with identifying and retrieving patent documents. These classification schemes commonly take the form of alphanumeric codes organised from general to specific categories. Chapter 5 discusses the use of the International Patent Classification (IPC) and the closely related Cooperative Patent Classification (CPC) in patent analytics. The chapter provides an in-depth introduction to the IPC, with a case study of using the IPC to examine patent activity for animal genetic resources, and concludes with a discussion of the growing use of classification systems in technology mapping.
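As a brief illustration of how an alphanumeric classification code runs from general to specific, the following minimal Python sketch splits a full IPC symbol into its hierarchical levels. The function name ipc_levels, the regular expression and the example symbol are assumptions made for illustration only; they are not taken from the Handbook, and the pattern covers only full IPC symbols of the form "A01K 67/027".

```python
import re

# Assumed pattern for a full IPC symbol: section (A-H), two-digit class,
# subclass letter, main group number, "/" and subgroup number.
IPC_PATTERN = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def ipc_levels(symbol):
    """Split a full IPC symbol into levels, from general to specific."""
    match = IPC_PATTERN.match(symbol.strip().upper())
    if match is None:
        raise ValueError(f"not a full IPC symbol: {symbol!r}")
    section, cls, subclass, group, subgroup = match.groups()
    prefix = section + cls + subclass
    return {
        "section": section,                        # most general level
        "class": section + cls,
        "subclass": prefix,
        "main_group": f"{prefix} {group}/00",      # main groups end in /00
        "subgroup": f"{prefix} {group}/{subgroup}",  # most specific level
    }


# Example symbol chosen only to show the structure of the code.
levels = ipc_levels("A01K 67/027")
for name, value in levels.items():
    print(f"{name:>10}: {value}")
```

Truncating symbols to a shared level in this way (for example, the subclass) is one simple way to aggregate patent counts by technology area, although the appropriate level depends on the question being asked.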

Chapter 8 discusses the important role that patent citations play in patent analytics and the strengths and weaknesses of different approaches to patent citation analysis. The chapter begins with a description of the two types of patent citation (backward and forward citations), the sources of patent citations and their impacts, before considering different approaches to citation counts based on citations of individual documents and citations of patent families.
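The difference between counting citations of individual documents and citations of patent families can be sketched with a small Python example. The document identifiers, family identifiers and function names below are entirely hypothetical and serve only to show how the two counts can diverge; this is not the method used in the Handbook, and it ignores complications such as citations within a family.

```python
from collections import defaultdict

# Hypothetical forward-citation pairs: (citing document, cited document).
citations = [
    ("DOC-C1", "DOC-A1"),
    ("DOC-C2", "DOC-A1"),  # DOC-C2 is in the same family as DOC-C1
    ("DOC-D1", "DOC-A2"),  # DOC-A2 is in the same family as DOC-A1
]

# Hypothetical mapping from each document to its patent family.
family_of = {
    "DOC-A1": "FAM-A", "DOC-A2": "FAM-A",
    "DOC-C1": "FAM-C", "DOC-C2": "FAM-C",
    "DOC-D1": "FAM-D",
}


def document_level_counts(pairs):
    """Count distinct citing documents for each cited document."""
    citers = defaultdict(set)
    for citing, cited in pairs:
        citers[cited].add(citing)
    return {doc: len(docs) for doc, docs in citers.items()}


def family_level_counts(pairs, families):
    """Count distinct citing families for each cited family."""
    citers = defaultdict(set)
    for citing, cited in pairs:
        citers[families[cited]].add(families[citing])
    return {fam: len(fams) for fam, fams in citers.items()}


print(document_level_counts(citations))
print(family_level_counts(citations, family_of))
```

In this toy data the three document-level citations collapse to two family-level citations, because two of the citing documents belong to the same family while the cited documents are members of a single family.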

Chapter 6 provides an introduction to text mining as a powerful tool in the patent analyst's toolbox. Building on the discussion in Chapter 2, the chapter moves through the basics of text mining with patent data and concludes with a growing emphasis on machine learning approaches such as the popular Word2Vec algorithm.

Chapter 7 focuses on the opportunities presented by machine learning to advance patent analytics. Machine learning or artificial intelligence approaches are increasingly being applied to text classification, named entity recognition and image classification. The application of machine learning in patent analytics remains at an early stage, with the USPTO pioneering the application of machine learning algorithms to inventor and applicant name cleaning, while Clarivate Analytics has recently applied machine learning to enhance the cleaning of applicant names. In future years we are likely to see the application of machine learning across the spectrum of patent analysis tasks. However, it can be very difficult to separate the hype around machine learning and artificial intelligence from the reality of what is available and achievable now. This chapter aims to assist with navigating these exciting but at times confusing and overhyped opportunities.

The final chapter concludes this edition of the WIPO Patent Analytics Handbook with a discussion of the possible future(s) of patent analytics in the context of increasing access to patent and related data at scale and the rise of machine learning.