Dynamic Demographic Filter

In How Effective is Mandatory Disclosure? An Experimental Evaluation of Term Substantiation, I introduce a dynamic demographic filter for empirical studies conducted on Amazon’s Mechanical Turk platform. The filter reduces sampling bias by automatically screening potential participants on demographic criteria before they begin the study and rejecting those who would render the participant sample unrepresentative. It tracks the demographic characteristics of participants who have successfully completed the study and determines in real time whether each new worker may participate.

Unlike traditional prescreening, the filter costs researchers nothing for rejected workers. Workers are warned in advance and advised not to proceed unless they accept the risk of rejection and nonpayment after completing the short demographic survey. Experience shows that workers are willing to take this risk in exchange for an immediate decision on eligibility, and that a sufficiently large payment for participation still attracts a substantial number of them. The filter is freely available to the research community and integrates into Qualtrics with no programming knowledge required (support for other survey platforms depends on their technical capabilities). Contact me to use the filter in your study.

Filter Operation

The filter operates as follows:

  1. The researcher chooses the demographic categories over which he or she wishes to obtain a representative sample (e.g., age, sex, education, and income), as well as a target sample size for the study.
  2. Population proportions for either the joint or marginal distributions are obtained from U.S. census microdata (see below for a discussion of the distinction between joint vs. marginal distributions).
  3. The researcher easily integrates the filter into his or her Qualtrics survey.
  4. Mechanical Turk workers wishing to participate in the study are warned that participation and payment are contingent upon being found eligible after answering a few short questions, but the qualification criteria are not disclosed in advance. Workers are specifically advised not to proceed if they are unwilling to assume this risk.
  5. For each aspect of a worker’s demographic profile, the filter examines whether the number of participants with those characteristics who have completed the study exceeds the “cap” for this study.  The “cap” is computed by multiplying the target sample size by the population proportion (again, see below for the distinction between joint vs. marginal distributions).
  6. The worker is rejected if the “cap” is exceeded and accepted if not.
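
Steps 5 and 6 can be illustrated with a minimal Python sketch. It is not the filter's actual Qualtrics implementation; the class and method names are hypothetical, and it assumes that a cap counts as "exceeded" when admitting one more worker would push the number of completed participants past it (the description above does not specify rounding or tie handling).

```python
from collections import Counter


class QuotaFilterSketch:
    """Hypothetical illustration of the cap check; not the author's code."""

    def __init__(self, proportions, target_n):
        # Cap for each group = target sample size x population proportion.
        self.caps = {group: target_n * share for group, share in proportions.items()}
        # Number of participants in each group who have completed the study.
        self.completed = Counter()

    def is_eligible(self, group):
        # Assumption: reject if one more completion would exceed the cap.
        # Groups with no known population proportion get a cap of zero.
        return self.completed[group] + 1 <= self.caps.get(group, 0)

    def record_completion(self, group):
        # Called only after a worker has successfully completed the study.
        self.completed[group] += 1
```

With a target sample size of 4 and proportions of 0.5 for each of two groups, each cap is 2, so the third worker from either group would be rejected while workers from the other group could still be accepted.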

Joint vs. Marginal Distributions

The filter can operate in either “joint” or “marginal” distribution mode. In joint mode, population proportions are obtained for each unique combination of demographic characteristics, and the proportion for a worker’s particular combination determines the “cap” that applies to that worker. Cross-tabulations of the U.S. population along several demographic characteristics may be obtained from the American Community Survey microdata. By using the filter in joint mode, researchers ensure that the participant sample reflects the combinations of demographic characteristics found in the population and that correlations between characteristics in the population (e.g., between education and income) are preserved in the participant sample. The downside is that the number of unique combinations grows multiplicatively with each new demographic category (age, sex, etc.) and response group (e.g., 18-25, 26-40, etc.), so response groups must be combined to ensure that the resulting cells are not too small for practical use on Mechanical Turk.
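
As a rough illustration of joint mode, the Python sketch below cross-tabulates a list of microdata-style records into joint population proportions and counts how many cells a given design produces. The record layout, the `weight` field, and both function names are assumptions made for this example, not the format of the American Community Survey files or part of the filter itself.

```python
from collections import defaultdict
from math import prod


def joint_proportions(records, categories):
    """Weighted population share of each unique combination of categories."""
    totals = defaultdict(float)
    for rec in records:
        # A cell is one combination, e.g. ("F", "18-25").
        cell = tuple(rec[c] for c in categories)
        # Hypothetical per-record survey weight; defaults to 1 if absent.
        totals[cell] += rec.get("weight", 1.0)
    grand_total = sum(totals.values())
    return {cell: w / grand_total for cell, w in totals.items()}


def n_cells(groups_per_category):
    """Number of unique combinations: the product of the group counts."""
    return prod(groups_per_category)
```

For example, four categories with 5, 2, 4, and 5 response groups give `n_cells([5, 2, 4, 5])`, which is 200 cells. With a target sample of 500, the average cap is then only 2.5 participants per cell, which is why response groups often need to be combined.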

In “marginal” distribution mode, the filter considers only the population proportions for each demographic category separately and accepts a worker only if every one of the worker’s characteristics remains under its respective “cap.” This mode sacrifices an exact representation of the population across each unique combination of demographic categories in exchange for a greater number of participants overall in each of the categories and response groups. It can lead to sampling bias if correlations between demographic characteristics in the Mechanical Turk population are not representative of those in the U.S. population as a whole, but as a practical matter it permits filtering on many more response groups than the joint approach. In How Effective is Mandatory Disclosure? An Experimental Evaluation of Term Substantiation, I employ the filter in “marginal” mode to ensure that each overall demographic group is sufficiently represented, but researchers using the filter are free to choose whichever mode they prefer.
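
The marginal check can be sketched in Python as follows. This is an illustration under stated assumptions rather than the filter's actual code: the class name and data layout are hypothetical, and a cap is treated as exceeded when one more completion would push the count past it.

```python
from collections import Counter


class MarginalFilterSketch:
    """Hypothetical illustration of marginal mode; not the author's code."""

    def __init__(self, marginals, target_n):
        # marginals maps category -> {response group: population proportion}.
        self.caps = {
            cat: {group: target_n * share for group, share in groups.items()}
            for cat, groups in marginals.items()
        }
        # One running count per category, keyed by response group.
        self.completed = {cat: Counter() for cat in marginals}

    def is_eligible(self, profile):
        # profile maps category -> the worker's response group.
        # The worker must fit under the cap in every category at once.
        return all(
            self.completed[cat][profile[cat]] + 1 <= self.caps[cat].get(profile[cat], 0)
            for cat in self.caps
        )

    def record_completion(self, profile):
        for cat in self.caps:
            self.completed[cat][profile[cat]] += 1
```

With a target of 4 and two evenly split categories (sex and age), a sample of two young women and two older men satisfies every marginal cap even though it contains no young men or older women. This is the kind of distortion in the correlations that joint mode rules out.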

If you would like to use the filter in your study, please contact me.