Tutorial 4

Obtaining accurate Google Trends data

Date: Wednesday, 26 June
Time: 16:15-18:15
Room: Aula 2.1

In recent years, Google Trends (GT) has been used to predict real-world variables in fields such as finance, medicine, politics and economics. Despite this widespread use in the literature, several authors have raised concerns about the accuracy of the data reported by Google and about how inaccuracies might affect the reproducibility of studies that rely on GT data. This tutorial presents a method for understanding, identifying and treating these accuracy issues.

The method is first explained in detail from a theoretical standpoint and then applied in a case study with real-life data, so that participants can try it out and learn it hands-on. The main purpose of the workshop is to help establish a set of good practices for handling GT data, in order to enhance the reproducibility of future scientific work that makes use of it.

Outline

  1. Introduction
    1.1. What is Google Trends?
    1.2. Sampling process
  2. Methodological Issues in Google Trends
    2.1. Selection of search terms
    2.2. Changing patterns of total searches
    2.3. Changes in Google’s algorithms
    2.4. Comparability across terms
    2.5. Inconsistencies across time frequencies
    2.6. Inconsistencies derived from sampling
  3. Reducing inconsistencies by averaging extractions
    3.1. Method and graphical evidence
    3.2. Practical case part I: Retrieving Google Trends data (RStudio)
    3.3. Practical case part II: Processing and averaging Google Trends data
  4. Calculating the necessary extractions
    4.1. Practical case part III: Obtaining the necessary extractions
    4.2. Limitations
    4.3. Discussion
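
As a rough illustration of the idea behind item 3 of the outline (averaging extractions), the sketch below takes several extractions of the same query and computes their pointwise mean, together with the spread across extractions at each period. It is a minimal sketch under stated assumptions and not the presenter's procedure: the tutorial itself works in R, Python is used here only for illustration, the function names `average_extractions` and `pointwise_spread` are hypothetical, and the numbers are made up rather than real GT data.

```python
from statistics import mean, stdev


def average_extractions(extractions):
    """Return the pointwise mean of several equally long series.

    Each inner list stands for one extraction of the same query over the
    same periods (hypothetical data layout, assumed for this sketch).
    """
    if not extractions:
        raise ValueError("need at least one extraction")
    length = len(extractions[0])
    if any(len(series) != length for series in extractions):
        raise ValueError("all extractions must cover the same periods")
    # zip(*extractions) groups the values observed for each period.
    return [mean(values) for values in zip(*extractions)]


def pointwise_spread(extractions):
    """Return the sample standard deviation across extractions per period."""
    if len(extractions) < 2:
        raise ValueError("need at least two extractions to measure spread")
    return [stdev(values) for values in zip(*extractions)]


# Made-up values on the 0-100 GT scale: three extractions, four periods.
extractions = [
    [40, 55, 100, 62],
    [44, 51, 100, 58],
    [42, 53, 97, 60],
]

averaged = average_extractions(extractions)
spread = pointwise_spread(extractions)

for period, (avg, sd) in enumerate(zip(averaged, spread), start=1):
    print(f"period {period}: mean={avg:.2f} sd={sd:.2f}")
```

The spread column gives a simple view of how much individual extractions disagree at each period, which is the kind of inconsistency that averaging is meant to dampen. How many extractions are needed is the subject of item 4 and is not addressed by this sketch.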

Target Audience

This tutorial is aimed mainly at researchers who are interested in working with GT data in any of its applications. It is a methodological workshop intended to improve the use of GT data and to provide attendees with a set of good practices for handling this type of data.

No prerequisites or previous knowledge are strictly needed, as all the materials will be provided and explained by the presenter. However, a basic background in R, RStudio or any other programming language will help participants follow the tutorial.

Presenter

Eduardo Cebrián has a background in Economics (B.Sc., M.Sc., and Ph.D. Candidate at the Universitat Politècnica de València). Since 2022, he has been an associate professor of Economics, Statistics and Econometrics at the Business Department of the Universidad Europea de Valencia. He has also collaborated on various research projects on the use and applications of online information in the field of economics. His research interests include Google Trends and the use of online information, tourism economics and public policy evaluation.