Google search volumes (GSV), commonly known as Google Trends, provide a readily available and free source of real-time data. Researchers across disciplines, including political science (Stephens-Davidowitz, 2014), sociology (Gross & Mann, 2017; Swearingen & Ripberger, 2014), and health sciences (Tkachenko et al., 2017), have used these data to answer a range of questions (for a recent overview, see Jun et al., 2018). In economics, GSV are used to forecast private consumption (e.g., Vosen & Schmidt, 2011; Woo & Owen, 2019) or unemployment (e.g., Smith, 2016, for the United Kingdom; González-Fernández & González-Velasco, 2018, for Spain; and Maas, 2019, for the United States).

Compared to data traditionally used in business cycle analysis, GSV data are available in real time, on a daily basis, and for many countries, regions, and even some large cities. These features are especially useful when other data are lacking or are only available with important time lags. For instance, GSV are useful during crises, which are typically characterized by rapid changes in the real economy and an increased demand for the latest information about the economy. A recent example is the outbreak of the COVID-19 pandemic, during which macroeconomic conditions sometimes changed daily.

Although GSV data are in principle available on a daily basis for any country or region of the world, much of the prior research has focused on large countries and used monthly rather than weekly or daily GSV data. This may explain why the literature has so far failed to identify an important limitation of GSV data: the raw data at the daily level are inconsistent across different time frequencies (e.g., daily vs. ...). As a result, daily data fail to capture long-run trends. Researchers therefore face a trade-off between using high-frequency daily series and using time-consistent series, in which search volumes are comparable at different and distant points in time. Business cycle analysis and forecasting models, however, typically require data spanning more than a decade.

A second limitation of the publicly available GSV data arises from random sampling by Google. Depending on the underlying "population size," this introduces substantial sampling variation, affecting the results. While some of the prior research based on GSV data has addressed this issue in practice (e.g., D'Amuri & Marcucci, 2017; Matsa et al., 2017; McLaren & Shanbhogue, 2011; Narita & Yin, 2018; Vosen & Schmidt, 2012), we document the magnitude of the problem (as in Carrière-Swallow & Labbé, 2013) and how it varies with population size.

The two limitations arise from a combination of the following three factors: (i) For privacy reasons, Google only provides an index of search volumes, rather than the actual number of searches. Scaling of the index varies with the chosen time window: within each window, the index lies in the range between zero and 100. Only long time windows allow comparing magnitudes of specific events over time or studying long-run trends. (ii) GSV default to weekly (monthly) data for time spans longer than 9 months (5.25 years). In combination, (i) and (ii) imply that it is not possible to directly extract long-run daily GSV data from Google Trends that are consistent with the long-run trend captured by the monthly data. (iii) For a chosen time frame, GSV data are based on a random sub-sample drawn by Google. We show that in small countries or sub-national regions, the sampling variation of the returned GSV series is substantial. Even in large administrative units, there can be substantial sampling noise for keywords with limited search volume. For example, in an area with one million inhabitants, the standard deviation of the daily results for "recession" can be as large as its mean value.

To overcome these issues, we present a two-step procedure to construct frequency-consistent daily, weekly, and monthly series of GSV over a long period, which at the same time reduces sampling noise. In a first step, we address the sampling variation by drawing multiple random samples of GSV data and averaging the series. In a second step, we combine the information from the monthly and weekly series into a single daily series that is consistent with the weekly and monthly series. For this purpose, we apply Chow and Lin's (1971) disaggregation routine twice: first, to disaggregate the monthly series of GSV to weekly data and, second, to disaggregate these weekly results to daily data. This ensures consistency across the three frequencies.
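To make the two-step procedure described in this text more concrete, the following is a minimal sketch in Python, not the authors' implementation. It averages repeated noisy draws of a daily series (step one) and then applies a Chow and Lin (1971) style regression-based disaggregation of a weekly series to daily data (one of the two disaggregation passes in step two). Everything here is an illustrative assumption: the function names `average_samples` and `chow_lin_fixed_rho` are hypothetical, the data are synthetic rather than real GSV, each week is treated as the mean of exactly seven days, and the autoregressive parameter `rho` is fixed at an arbitrary value, whereas the original routine estimates it.

```python
import numpy as np


def average_samples(samples):
    """Step 1 (sketch): average repeated draws of one series (rows = draws)."""
    return np.asarray(samples, dtype=float).mean(axis=0)


def chow_lin_fixed_rho(y_low, x_high, period, rho=0.5):
    """Step 2 (sketch): disaggregate y_low with a high-frequency indicator.

    Assumption: rho is fixed by hand; Chow and Lin's routine estimates it.
    Each low-frequency value is treated as the mean of `period` observations.
    """
    y_low = np.asarray(y_low, dtype=float)
    n_low = y_low.size
    n_high = n_low * period
    # Regressors: a constant and the high-frequency indicator
    X = np.column_stack([np.ones(n_high), np.asarray(x_high, dtype=float)])
    # Aggregation matrix mapping high-frequency values to block means
    C = np.kron(np.eye(n_low), np.full((1, period), 1.0 / period))
    # AR(1) covariance of the high-frequency residuals, up to a scale factor
    idx = np.arange(n_high)
    V = rho ** np.abs(idx[:, None] - idx[None, :])
    CX = C @ X
    W = np.linalg.inv(C @ V @ C.T)
    # GLS estimate of the regression coefficients at the low frequency
    beta = np.linalg.solve(CX.T @ W @ CX, CX.T @ W @ y_low)
    # Distribute the low-frequency residuals over the high-frequency periods
    resid_low = y_low - CX @ beta
    return X @ beta + V @ C.T @ W @ resid_low


# Synthetic illustration (not real GSV data)
rng = np.random.default_rng(0)
n_weeks, period = 8, 7
true_daily = 50 + 10 * np.sin(np.arange(n_weeks * period) / 5.0)
# Ten noisy draws standing in for repeated random samples of the daily series
draws = true_daily + rng.normal(0, 10, size=(10, true_daily.size))
daily_avg = average_samples(draws)
# Synthetic weekly series: weekly means of the underlying daily series
weekly = true_daily.reshape(n_weeks, period).mean(axis=1)
daily_consistent = chow_lin_fixed_rho(weekly, daily_avg, period)
weekly_check = daily_consistent.reshape(n_weeks, period).mean(axis=1)
```

By construction, the weekly means of `daily_consistent` reproduce `weekly`, which is the sense in which the disaggregated series is consistent across frequencies. Applying the same function first from monthly to weekly data would require an assumption about how weeks map to months, which this sketch leaves out.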