The WIPO Patent Analytics team have developed a number of datasets for use in patent analytics publications and training workshops. The datasets are freely accessible through a repository on the Open Science Framework here.
Datasets are organised around publications and themes. The datasets can be downloaded as individual zip files at the links provided below.
drones: A legacy dataset of Clarivate Analytics patent records for drone technology. This dataset is no longer updated and is used purely for training in approaches to patent counts and the use of Derwent Innovation data. Download the zip file here. The data is also available in the drones data package for R users, which can be accessed at https://wipo-analytics.github.io/drones/.
dronesr: A recent dataset of scientific and patent publications from the Lens open access database. This dataset contains a full spectrum of scientific and patent literature on drone technology and is current until December 2021. To download all files, use this link. Use the following links for individual zipped sets: literature, patents. Public collections for these sets are also available on the Lens for exploration.
The dronesr data package for R users can be installed from https://poldham.github.io/dronesr/.
Handbook: The datasets used in each chapter of the WIPO Patent Analytics Handbook, including large patent datasets from US PatentsView, are available from the repository here and are organised by chapter. Many of these datasets are in the R .rda format. The sets for the classification and text mining chapters include large files, so you may wish to download them individually.
Manual: The datasets used in the WIPO Manual on Open Source Patent Analytics are available for the chapters on Tableau Public and Gephi. They can be downloaded as a single zip file here.
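Since the datasets above are distributed as zip files, the following is a minimal Python sketch of one way to unpack a downloaded archive and see what it contains. It uses only the standard library. The function names `extract_dataset` and `group_by_extension` and the file name `drones.zip` are hypothetical and chosen for illustration; the layout of files inside each archive is not described here, so the sketch makes no assumptions about it.

```python
import zipfile
from pathlib import Path


def extract_dataset(zip_path, dest_dir):
    """Extract a downloaded dataset archive and return the extracted file paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        # Directory entries end with "/"; keep only regular files.
        names = [n for n in archive.namelist() if not n.endswith("/")]
    return sorted(dest / n for n in names)


def group_by_extension(paths):
    """Group file paths by lowercase suffix, for example '.csv' or '.rda'."""
    groups = {}
    for p in paths:
        groups.setdefault(p.suffix.lower(), []).append(p)
    return groups


# Example usage (assumes a file named drones.zip was downloaded manually):
#   files = extract_dataset("drones.zip", "data/drones")
#   for ext, items in group_by_extension(files).items():
#       print(ext, len(items))
```

Grouping by extension gives a quick view of which files are in the R .rda format, which are best opened in R, and which are in formats that other tools can read directly.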
If you see mistakes or want to suggest changes, please create an issue on the source repository.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/wipo-analytics, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".