youtube-survey
This data source is based on a short survey about COVID-19-like illness run by the Delphi group at Carnegie Mellon. Youtube directed a random sample of its users to these surveys, which were voluntary. Users age 18 or older were eligible to complete the surveys, and their survey responses are held by CMU. No individual survey responses are shared back to Youtube.
This survey was a pared-down version of the COVID-19 Trends and Impact Survey (CTIS), collecting data only about COVID-19 symptoms. CTIS is much longer-running and more detailed, also collecting belief and behavior data. CTIS also reports demographic-corrected versions of some metrics. See our surveys page for more detail about how CTIS works.
The two surveys report some of the same metrics. Although these metrics are nominally the same, values for the same dates differ between the two surveys for unknown reasons.
As of late April 2020, the number of Youtube survey responses we received each day was 4-7 thousand. This was not enough coverage to report at finer geographic levels, so these indicators only report at the state level. The survey ran from April 21, 2020 to June 17, 2020, collecting about 159 thousand responses in the United States in that time.
We produce influenza-like and COVID-like illness indicators based on the survey data.
The survey contains the following 5 questions:
We define COVID-like illness (fever, along with cough, shortness of breath, or difficulty breathing) and influenza-like illness (fever, along with cough or sore throat) for use in forecasting and modeling. Using this survey data, we estimate the percentage of people (age 18 or older) who have a COVID-like illness or an influenza-like illness in a given location on a given day.
Signals | Description
raw_cli and smoothed_cli | Estimated percentage of people with COVID-like illness
raw_ili and smoothed_ili | Estimated percentage of people with influenza-like illness
Influenza-like illness or ILI is a standard indicator, and is defined by the CDC as: fever along with sore throat or cough. From the list of symptoms from Q1 on our survey, this means a and (b or c).
COVID-like illness or CLI is not a standard indicator. Through our discussions with the CDC, we chose to define it as: fever along with cough or shortness of breath or difficulty breathing. From the list of symptoms from Q1 on our survey, this means a and (c or d or e).
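The two definitions can be written as boolean checks on the Q1 symptom letters. The following is a minimal sketch, not Delphi's code: it assumes each response is represented as a set of the selected letters, and the function names `has_ili` and `has_cli` are hypothetical.

```python
# Symptom letters follow the a-e labels used for Q1 in the text above.
# Representing a response as a set of letters is an assumption of this sketch.
ILI_COMPANIONS = frozenset({"b", "c"})       # ILI: a and (b or c)
CLI_COMPANIONS = frozenset({"c", "d", "e"})  # CLI: a and (c or d or e)


def has_ili(symptoms):
    """Return True when the response matches a and (b or c)."""
    selected = set(symptoms)
    return "a" in selected and bool(selected & ILI_COMPANIONS)


def has_cli(symptoms):
    """Return True when the response matches a and (c or d or e)."""
    selected = set(symptoms)
    return "a" in selected and bool(selected & CLI_COMPANIONS)


# Four illustrative (made-up) responses.
responses = [{"a", "b"}, {"a", "d"}, {"b", "c"}, {"a", "c"}]
ili_flags = [has_ili(r) for r in responses]
cli_flags = [has_cli(r) for r in responses]
```

Because symptom c appears in both definitions, a response with a and c counts toward both ILI and CLI, while a response without a counts toward neither.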
Symptoms alone are not sufficient to diagnose influenza or coronavirus infections, and so these ILI and CLI indicators are not expected to be unbiased estimates of the true rate of influenza or coronavirus infections. These symptoms can be caused by many other conditions, and many true infections can be asymptomatic. Instead, we expect these indicators to be useful for comparison across the United States and across time, to determine where symptoms appear to be increasing.
Estimation
Estimating Percent ILI and CLI
Estimates are calculated using the same method as CTIS. However, responses to the Youtube survey are not weighted.
Smoothing
The smoothed versions of all youtube-survey signals (with smoothed prefix) are calculated using seven-day pooling. For example, the estimate reported for June 7 in a specific geographical area is formed by collecting all surveys completed between June 1 and 7 (inclusive) and using that data in the estimation procedures described above. Because the smoothed signals combine information across multiple days, they have larger sample sizes and hence are available for more locations than the raw signals.
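The pooling window can be illustrated with a short sketch. It assumes daily counts are available as a mapping from date to (responses, positives), and it uses a plain unweighted proportion as a stand-in for the estimation step; the actual estimator follows the CTIS method, which is not reproduced here. The name `pooled_percent` is hypothetical.

```python
from datetime import date, timedelta


def pooled_percent(daily_counts, target_day, window_days=7):
    """Pool responses from the window ending on target_day (inclusive).

    daily_counts maps a date to (n_responses, n_positive).
    Returns (percent, pooled_sample_size); percent is None with no data.
    The unweighted proportion is a simplification, not the CTIS estimator.
    """
    start = target_day - timedelta(days=window_days - 1)
    total = 0
    positive = 0
    for day, (n_responses, n_positive) in daily_counts.items():
        if start <= day <= target_day:
            total += n_responses
            positive += n_positive
    if total == 0:
        return None, 0
    return 100.0 * positive / total, total
```

For a target day of June 7, the window starts on June 1, so responses from May 31 are excluded from the pooled estimate.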
These indicators have a lag of 2 days. Reported values can be revised for one day (corresponding to a lag of 3 days), due to how we receive survey responses. However, these revisions tend to be associated with minimal changes in value.
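As a rough illustration of the timing, the sketch below computes when a value for a given day would first appear and when it would stop changing. It assumes that a lag of 2 days means the value for a day is first published two days later; that reading, and the function names, are assumptions rather than documented behavior.

```python
from datetime import date, timedelta

REPORTING_LAG_DAYS = 2    # lag stated in the text above
REVISION_WINDOW_DAYS = 1  # values can be revised for one further day


def first_available(reference_day):
    """Day on which the value for reference_day is assumed to first appear."""
    return reference_day + timedelta(days=REPORTING_LAG_DAYS)


def final_on(reference_day):
    """Day on which the last expected revision (lag of 3 days) is published."""
    return reference_day + timedelta(
        days=REPORTING_LAG_DAYS + REVISION_WINDOW_DAYS
    )
```

Under these assumptions, a value for June 7 would first appear on June 9 and could still change on June 10.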
Limitations
When interpreting the signals above, it is important to keep in mind several limitations of this survey data.
Whenever possible, you should compare this data to other independent sources. We believe that while these biases may affect point estimates – that is, they may bias estimates on a specific day up or down – the biases should not change strongly over time. This means that changes in signals, such as increases or decreases, are likely to represent true changes in the underlying population, even if point estimates are biased.
Privacy Restrictions
To protect respondent privacy, we discard any estimate that is based on fewer than 100 survey responses. For signals reported using a 7-day average (those beginning with smoothed_), this means a geographic area must have at least 100 responses in 7 days to be reported.
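The threshold amounts to a filter on sample size. Here is a minimal sketch, assuming each estimate is a dictionary with a `sample_size` field; the field names and the function `filter_reportable` are hypothetical and do not describe the actual pipeline.

```python
MIN_RESPONSES = 100  # threshold stated in the text above


def filter_reportable(estimates, min_responses=MIN_RESPONSES):
    """Keep only estimates based on at least min_responses responses."""
    return [row for row in estimates if row["sample_size"] >= min_responses]


# Illustrative (made-up) rows; geo_value and value are hypothetical fields.
rows = [
    {"geo_value": "pa", "value": 1.2, "sample_size": 250},
    {"geo_value": "wy", "value": 0.9, "sample_size": 99},
    {"geo_value": "vt", "value": 1.0, "sample_size": 100},
]
reportable = filter_reportable(rows)
```

An estimate based on exactly 100 responses is kept, since only estimates based on fewer than 100 responses are discarded. For smoothed signals, `sample_size` would be the pooled 7-day count.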
This restriction affects some items more than others, and some geographic areas more than others, particularly areas with smaller populations. The effect is less pronounced with smoothed signals, since responses are pooled across a longer time period.
Source and Licensing
These indicators aggregate responses from a Delphi-run survey that is hosted on the Youtube platform. The data is licensed as CC BY-NC.