08 Quantitative Data Acquisition and Management

Summary

Many measures of political science concepts already exist and are ready for you to obtain and use. Use theory to identify the unit of analysis for your study, then determine the population and sample. Be sure to capture appropriate variation in the DV, and be alert for selection bias in how cases enter the sample. Problems of validity and reliability can seriously undermine your analysis, so use your theory to match indicators carefully to concepts and minimize that risk. Think through the data collection process and plan ahead to maximize efficiency; if possible, gather all data for control variables and robustness checks in a single sweep. Consider using replication data sets to avoid reinventing the wheel. Much data, particularly for standard indicators of common concepts, is freely available online through a variety of sources, and your library probably also subscribes to other quantitative databases. Collecting new data is substantially more time-consuming than using previously gathered data, but it is often necessary to test novel theories. Whether you use pre-gathered data or novel data, be sure to define your data needs list before beginning data collection, allow sufficient time, and document and back up everything.
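If you like to plan in code, the advice above about defining a data needs list and gathering everything in a single sweep can be sketched roughly as follows. This is only an illustration, not a prescribed workflow: the variable names (`conflict_onset`, `regime_type`, and so on), the units "A" and "B", and the helper functions are all hypothetical placeholders. The sketch builds one empty row per unit-year, lists the cells still to be collected, and checks whether the DV actually varies.

```python
from itertools import product

# Hypothetical data needs list: every name here is a placeholder, not a real source.
data_needs = {
    "dv": "conflict_onset",
    "iv": "regime_type",
    "controls": ["gdp_per_capita", "population"],
    "robustness": ["regime_type_alt"],
}


def build_skeleton(units, years, needs):
    """Return one empty row per unit-year with a column for every planned variable."""
    columns = [needs["dv"], needs["iv"], *needs["controls"], *needs["robustness"]]
    return [
        {"unit": unit, "year": year, **{col: None for col in columns}}
        for unit, year in product(units, years)
    ]


def missing_cells(rows):
    """List (unit, year, variable) triples that still need to be collected."""
    return [
        (row["unit"], row["year"], col)
        for row in rows
        for col, value in row.items()
        if col not in ("unit", "year") and value is None
    ]


def has_variation(rows, column):
    """True if the column takes at least two distinct non-missing values."""
    return len({row[column] for row in rows if row[column] is not None}) > 1


# Two hypothetical units observed in 2000, 2001, and 2002: 6 unit-year rows.
rows = build_skeleton(["A", "B"], range(2000, 2003), data_needs)
print(len(rows), "rows;", len(missing_cells(rows)), "cells left to collect")
```

Listing every variable up front, including those for robustness checks, is what lets you fill each row in one pass through your sources instead of returning to them later.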

Articles

Web Extras: Pretesting Your Data Collection Procedure and Coding Rules
Web Extra: Sample Codebooks & Data Collection Forms
Web Extra: CowPlus Online
Creating a Data Set

Vocab Flashcards

[qdeck random=”true”]

[q] Measure that indicates the ‘spread’ of a distribution; varies by level of measurement

[a] Dispersion

[q] Assumption, often implicit, that all other variables are held at constant values when we consider the effect of changing one IV's value on the DV

[a] Ceteris paribus assumption

[q] Document accompanying a published or replication dataset explaining coding rules, variable values, and related information for that data; sometimes called documentation

[a] Codebook

[q] Process of applying measurement rules to evidence to produce data

[a] Coding

[q] Author designated in a coauthored piece as the contact point for inquiries

[a] Corresponding author

[q] Dataset structure where observations are each country in each year; common in comparative and international politics research

[a] Country-year

[q] Cutoff point in the distribution of a test statistic such as χ² or t to determine statistical significance

[a] Critical value

[q] Centralized storage facility for collected data; best known are ICPSR and Harvard Dataverse

[a] Data archive

[q] Large-scale study of the American population conducted yearly (1972–1994) or in alternating years (1996–present)

[a] General Social Survey

[q] Visual depiction of information and relationships

[a] Graphic organizer

[q] Measure calculated on data where multiple coders review a given case; indicates how well the coders agree on the values of the variables; sometimes called inter-reader reliability rating

[a] Inter-coder reliability ratings

[q] Multiple units observed at multiple points in time; also called time series cross section data

[a] Panel data

[q] The complete set of relevant cases for a theory

[a] Population

[q] Evaluating data collection or coding instruments against a small sample of cases or sources prior to beginning full-scale data collection

[a] Pretesting

[q] Subset of the population obtained/used for analysis

[a] Sample

[q] Exploiting someone else’s data by publishing analysis based on it before the collector is able to do so; considered very inappropriate professional behavior

[a] Scooping

[q] Variables that we believe influence the DV along with our IV(s) of interest; must be accounted for in any qualitative or quantitative analysis to obtain accurate results

[a] Control variables (CVs)

[q] Pair of states or other actors (used as a unit of analysis)

[a] Dyad

[q] The item that constitutes one observation in a quantitative study: decision, individual opinion, country, dyad, state-year, etc.

[a] Unit of analysis

[q] Spatial and temporal scope of a theory or study

[a] Domain

[q] Characteristic of an indicator: the indicator captures the concept of interest and nothing else

[a] Validity

[q] Process of identifying a valid observable indicator for the concepts expressed in a theory

[a] Operationalization

[q] The result of analyzing data that suffer from a selection effect

[a] Selection bias

[q] Natural or man-made processes produce an observed sample that is a biased subset of the underlying population; not all cases have an equal chance of entering the observed sample

[a] Selection effect

[q] A single instance of the phenomenon under investigation

[a] Observation

[q] Additional model specifications estimated using alternate indicators of key concepts to determine that findings hold across different operationalizations of those concepts

[a] Robustness checks

[q] Mathematical altering of the scale of a variable to create a more linear relationship

[a] Transformation

[q] Statistics that summarize and describe characteristics of data (mean, median, mode, standard deviation, etc.) without reference to, or generalization beyond, the available data itself

[a] Descriptive statistics

[q] Branch of statistics that generalizes from samples to populations

[a] Inferential statistics

[q] Characteristic of continuous distributions; concerned with distance between mean and median values

[a] Skew

[/qdeck]
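The flashcard on inter-coder reliability ratings does not say which statistic to compute, so treat the following as one possible sketch rather than the book's method. It shows simple percent agreement and Cohen's kappa, a widely used chance-corrected measure for two coders. The two lists of codes are made-up illustration data.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of cases on which the two coders assigned the same value."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must code the same non-empty set of cases.")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Agreement corrected for chance: (observed - expected) / (1 - expected)."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement: probability both coders pick the same category at random,
    # given how often each coder used each category.
    expected = sum(counts_a[cat] * counts_b[cat] for cat in counts_a) / (n * n)
    if expected == 1:
        # Both coders used a single identical category for every case.
        return 1.0
    return (observed - expected) / (1 - expected)


# Made-up codes from two hypothetical coders for the same ten cases.
coder_a_codes = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
coder_b_codes = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]

print(percent_agreement(coder_a_codes, coder_b_codes))
print(round(cohens_kappa(coder_a_codes, coder_b_codes), 3))
```

Kappa comes out lower than raw agreement here because two coders assigning a binary code would agree on some cases by chance alone.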

Review Quiz

[qwiz random=”true” random_mc=”true”]

[q] A(n) _____ indicator captures only the concept of interest and nothing else.

[c] precise

[c] reliable

[c] immutable

[c*] valid

[c] robust

[q] A(n) _____ indicator produces values that can be replicated over time, across cases, and across coders.

[c] precise

[c*] reliable

[c] immutable

[c] valid

[c] robust

[q] Factors that we know influence the outcome of interest but which are not of research interest to us are called ____; they must be included in our models to avoid omitted variable bias.

[c*] control variables

[c] dummy variables

[c] robustness checks

[c] inflated variables

[c] lagged variables

[q random_mc=”false”] Your ____ is the thing you observe as one row in your dataset.

[c] level of analysis

[c] mode of analysis

[c*] unit of analysis

[c] none of the above

[q] In a robustness or sensitivity check, an analyst ____.

[c*] substitutes different measures of a concept into a model

[c] uses different software to re-analyze the data

[c] drops observations one at a time

[c] carefully documents all the different variable values present in the data

[/qwiz]

Site contents (c) Leanne C. Powner, 2012-2026.
Background graphic: filo / DigitalVision Vectors / Getty Images.
Cover graphic: Cambridge University Press.