Sep 30, 2009

What do the values in time method mean?

  • Longitudinal - Data collected repeatedly over time to study change in a population.
  • Longitudinal:Cohort/Event-based - Data collected over time about a group of individuals that are connected in some way or have shared some significant experience within a given period. Examples: birth, disease, education, employment, family formation.
  • Longitudinal:Trend/Repeated Cross-section - Studies different samples/groups of people from the same population at several points in time; conclusions are drawn for the population. Examples: public opinion polls, election studies, etc.
  • Longitudinal:Panel - Data collected over time from, or about, the same sample of respondents.
  • Longitudinal:Panel:Continuous - Reports from the panel are collected on a regular basis.
  • Longitudinal:Panel:Interval - Measurements are taken only when information is needed.
  • Time Series - Data are collected repeatedly over time to study change in observations.
  • Time Series:Continuous - Phenomena are measured at every instant of time. Examples: lie detectors, electrocardiograms, etc.
  • Time Series:Discrete - Measurements are taken at (usually regularly) spaced intervals. Examples: macroeconomics (weekly share prices, monthly profits, sales, etc.), meteorology (daily rainfall, hourly temperature, etc.), sociology (crime figures, employment figures, etc.).
  • Cross-sectional - Data about a population are obtained only once.
  • Cross-sectional ad-hoc follow-up - Data collected at one point in time to complete information collected in a previous cross-sectional study; the decision to collect follow-up data is not included in the study design.

Please note that ICPSR has not yet retroactively applied the time method field to all 8000+ studies we archive.

Sep 1, 2009

How do I use the Recode Syntax analysis tool?

The Recode Syntax tool is found with the online analysis utilities. Users must first define a variable name to generate the recode. In this example, we are choosing to create a recode that cross-classifies race and age. It creates a dummy variable that takes on the value of 1 if the respondent identifies themselves as Black and over 18 years of age.
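The logic of this example recode can be sketched in a few lines of Python. This is only an illustration of the cross-classification, not the code the tool produces; the variable names and the code used for "Black" are assumptions, so check the study's codebook for the real ones.

```python
# Assumed code for "Black" in a hypothetical race variable.
BLACK_CODE = 2

def black_over_18(race, age):
    """Return 1 if the respondent is coded Black and is over 18, else 0."""
    return 1 if race == BLACK_CODE and age > 18 else 0

# Three made-up respondents, used only to exercise the function.
respondents = [
    {"race": 2, "age": 34},
    {"race": 2, "age": 17},
    {"race": 1, "age": 52},
]
dummy = [black_over_18(r["race"], r["age"]) for r in respondents]
print(dummy)
```

Note that "over 18" is written as a strict comparison, so a respondent who is exactly 18 receives 0.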

Screen Shot

Users can specify the content of the recode with a series of dropdown variables, logical operators, and mathematical functions. The choices are available in the dropdown values linking the expressions.

Screen Shot

To add variables, click the "Add" button on the right side of the screen. Each click adds another variable with which the user can build recodes. Below, race is added to age to generate the recode of interest. Users must complete both sides of the recode in order to create two sets of recode statements.

Screen Shot

Users can continue to add recodes to the syntax by clicking the "Recode Syntax" button.

Users must then choose a statistical package and click the "Submit Code" button. The code will then appear in the window and can be either printed or downloaded. Again, this tool does not generate recodes but rather the software code needed to generate them. The code must be run in the appropriate software using the dataset from which it was generated.

Screen Shot

Creative Commons License This work is licensed under a Creative Commons Attribution-Noncommercial 3.0 United States License.

Jun 26, 2009

How does ICPSR manage versioning?

  • What triggers a new edition or version of a study?

    • A change in any of the data and/or documentation files.

    • The addition or withdrawal of data and/or documentation files.

  • How and where is such a change to a study documented?

    In the metadata record:

    • Version field (in version notation "ICPSRXXXXX-v3")

    • Version history field (collect.changes, which provides a text description of what has changed, and a datestamp)

    • Citation display includes the version statement

  • What happens to unchanged files (if changes don't apply to all files)?

    ICPSR does not currently version at the individual file level - our version statement references the collection as a whole. If only one file of a multiple file collection changes, the collection version changes.

  • Are previous editions/versions kept?

    Yes, through a back-up system and a searchable 'browse archive' feature available to authorized staff.

  • Are these made available to users?

    Upon request only, previous versions can be made available to users.

  • Clarification on terminology: do we use 'edition', 'version', or other terms?

    • ICPSR uses 'version' exclusively. (Historically, ICPSR used three different terms: edition, version, and release, but these have all been rolled into the single term "version" and the notation "ICPSRXXXXX-v3".)

    • What do we mean by "version"? A form or variant of the original ICPSR-archived data collection.
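For users who track versions in their own scripts, the version notation can be split into its study number and version number. The Python sketch below assumes the study number is always five digits, as the "ICPSRXXXXX-v3" pattern suggests; the function name is hypothetical.

```python
import re

# Assumes five digits after "ICPSR", following the "ICPSRXXXXX-v3" notation.
VERSION_PATTERN = re.compile(r"^ICPSR(\d{5})-v(\d+)$")

def parse_version(label):
    """Split a label such as 'ICPSR08863-v2' into (study number, version)."""
    match = VERSION_PATTERN.match(label)
    if match is None:
        raise ValueError("not in ICPSRXXXXX-vN notation: " + label)
    # Keep the study number as text so leading zeros are preserved.
    return match.group(1), int(match.group(2))

study, version = parse_version("ICPSR08863-v2")
print(study, version)
```

Because the version statement refers to the collection as a whole, a single label is enough to identify every file in that release.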

Jun 23, 2009

Why and how should I cite data?

Why should I cite data?

Citing data files in publications based on those data is important for several reasons:

  • Other researchers may want to replicate research findings and need the bibliographic information provided in citations to identify and locate the referenced data.

  • Citations appearing in publication references are harvested by key electronic social sciences indexes, such as Web of Science, providing credit to the researchers.

  • Data producers, funding agencies, and others can track citations to specific collections to determine types and levels of usage, thus measuring impact.

Where do I find the citation?

Citations for ICPSR data can be found in the following locations:

  1. Study descriptions that appear on the Web site
  2. File manifest
  3. PDF study description file

Both the file manifest and the PDF study description file are automatically included with every download. Thus, every download is accompanied by a copy of the standard citation that can be copied and pasted with ease.

What do the citations look like?

Here are some examples:

ABC News, and The Washington Post. ABC News/Washington Post Poll, May 2007 [Computer file]. ICPSR24588-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2009-04-17. doi:10.3886/ICPSR24588

United States Department of Commerce. Bureau of the Census, and United States Department of Labor. Bureau of Labor Statistics. Current Population Survey: Annual Demographic File, 1987 [Computer file]. ICPSR08863-v2. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2009-02-03. doi:10.3886/ICPSR08863

Johnston, Lloyd D., Jerald G. Bachman, Patrick M. O'Malley, and John E. Schulenberg. Monitoring the Future: A Continuing Study of American Youth (12th-Grade Survey), 2007 [Computer File]. ICPSR22480-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2008-10-29. doi:10.3886/ICPSR22480

Hall, David, Clement Leduka, Michael Bratton, E. Gyimah-Boadi, and Robert Mattes. Afrobarometer Round 3: The Quality of Democracy and Governance in Lesotho, 2005 [Computer file]. ICPSR22203-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2009-05-19. doi:10.3886/ICPSR22203

Note that we also include a DOI (Digital Object Identifier) at the end of each citation. A DOI is a unique persistent identifier for a published digital object, such as an article or a study, providing a link to the article or study. This means that if you publish an article using ICPSR data and you include the DOI in the data citation, you make it easy for other researchers to get back to the original data.
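A DOI becomes a working link when it is appended to the address of the public DOI resolver, https://doi.org/. The short Python sketch below turns the DOI from the first citation above into such a link; the function name is hypothetical.

```python
def doi_to_url(doi):
    """Turn a DOI such as 'doi:10.3886/ICPSR24588' into a resolver link."""
    doi = doi.strip()
    # Citations write the identifier with a "doi:" prefix; drop it if present.
    if doi.lower().startswith("doi:"):
        doi = doi[4:]
    return "https://doi.org/" + doi

link = doi_to_url("doi:10.3886/ICPSR24588")
print(link)
```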

How can I let ICPSR know about my publication?

Users of ICPSR data are required to send us bibliographic citations for each completed manuscript or thesis abstract. This allows us to provide funding agencies with essential information about use of archival resources and facilitates the exchange of information about the research activities of principal investigators.

Email bibliography@icpsr.umich.edu to submit citations for inclusion in our Bibliography.

How do I submit a citation for a publication I have written using your data?

If you have published work based on our data, or if you know of data-related literature that is not in our bibliography, please send the citation to bibliography@icpsr.umich.edu

View the citations in the Bibliography of Data-Related Publications.

Jun 18, 2009

When I attempt to uncompress the files I downloaded from your site, WinZip complains that the file name is insensible. How can I uncompress the file?

The total path length (not file name length) has to be less than 255 characters. Our file names can be lengthy. If the path to which you wish to extract your files is also lengthy, then WinZip will fail.

Extract your files to the root directory of your hard drive. For example, extract the files to c:/ instead of c:/User/My Documents/Various Social Science Projects On Which I Work/ICPSR Data/.
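To see why the extraction directory matters, the following Python sketch adds up the length of the full path. It uses the 255-character limit quoted above; the long file name is a made-up stand-in for a lengthy archive member, and the function name is hypothetical.

```python
import ntpath  # Windows path rules, usable on any operating system

MAX_PATH_LENGTH = 255  # the limit quoted in the answer above

def fits_path_limit(extract_dir, file_name):
    """Return True if the directory plus file name stays under the limit."""
    return len(ntpath.join(extract_dir, file_name)) < MAX_PATH_LENGTH

# Hypothetical 204-character file name, for illustration only.
long_name = "x" * 200 + ".txt"

deep_dir = (r"C:\User\My Documents"
            r"\Various Social Science Projects On Which I Work\ICPSR Data")

print(fits_path_limit("C:\\", long_name))
print(fits_path_limit(deep_dir, long_name))
```

The same file name passes when extracted to the root directory and fails when extracted to the deeply nested one.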

How do I use a SAS setup file to import ASCII data?

Setup files contain the syntax or program code to read raw data (ASCII) into a statistical package. The instructions below demonstrate how to use SAS setup files in a Windows environment.

These instructions assume that you have already downloaded the ASCII data and SAS setup file from the Internet. If you have a compressed version of a file, you will have to decompress it before using the setup file.

Note: In order to successfully use setup files, you must know the exact location (i.e., full pathname, such as C:\My Documents\Data) and filename (e.g., da9999.txt) of the files that you obtained from ICPSR.

Instructions

  1. Download the SAS setup file from the ICPSR Web site.

  2. Most of the files downloaded from the ICPSR Web site will be compressed. You will have to decompress the files using WinZip or other decompression software. More information about decompressing files can be found at the help page, How do I decompress the files I download from your site? Once the SAS setup file has been downloaded and decompressed, rename the file to add a '.sas' extension. This will allow SAS to recognize the file as a SAS syntax file.

    Screen Shot

  3. Open SAS for Windows.

    Screen Shot

  4. Open the SAS setup file in the SAS Program Editor window.

    • Click on File and then Open to get an Open File dialog box.

    • At the top of the box, where it says Look In, choose the path where the SAS setup file is located.

    • At the bottom of the box, set Files of Type to All Files.

    • You will then see a list of all files in the directory you selected. Either double-click on the SAS setup file or click once on the name of your chosen file (the name will appear after File Name) and then click on Open.

    Screen Shot

    • Since the SAS setup file is a text file, SAS will display the file in the SAS Program Editor.

    Screen Shot

  5. Most ICPSR setup files contain a header that describes the contents of the file. Once you have opened the setup file in the SAS Program Editor, read the ICPSR header, if present, for important information about the file.

    Screen Shot

  6. After reading the header, scroll to the DATA command. Add a dataset name for your data to this command line, if you want it located in the temporary SAS library 'work.' Please consult SAS documentation if you want the dataset saved in a permanent SAS library.

    Screen Shot

  7. Scroll to the INFILE command. Replace the text that says physical-filename or file-specification with the full path and name of the data file you extracted from the downloaded file.

    • It is important that you include the full path (e.g., C:\My Documents\Data); otherwise SAS may not be able to locate the file. For example, if you downloaded the data for ICPSR 2992 into the directory C:\My Documents\Data and you called the file da2992.txt, then the INFILE command should read:

      INFILE 'C:\My Documents\Data\da2992.txt' LRECL=30;

      (Note that the LRECL varies by study and the correct number will already be provided in the SAS setup file.)

    Screen Shot

  8. If there are PROC FORMAT, FORMAT, or MISSING VALUE RECODE commands in the setup file, ICPSR usually places SAS comment delimiters before (/*) and after (*/) the appropriate section, which means that SAS will not automatically read these commands. If you want SAS to read and execute these commands, you should remove the set of comment markers for each section.

    Screen Shot

  9. Scroll to the end of the setup file. If a RUN command is not already there, then type one in. Make sure the command ends with a semicolon.

    Screen Shot

  10. You are now finished editing the SAS setup file. Run the statements by clicking on Run > Submit.

    Screen Shot

  11. The log file will show the commands that SAS processed, as well as any error messages.

    Screen Shot

  12. The data can now be used for analysis. If you are using SAS System for Windows Release 7.0 or higher, you can view the data file in the SAS Table Editor. Go to the Tools menu bar and select Table Editor. Once the Table Editor window appears, click on File and Open to open your newly-created data file. The data file will be located in the Work library unless you changed the library reference prior to running the setup file. Click on the data file and then on Open to see the data displayed in the Table Editor.

    Screen Shot

  13. Users should be aware that a SAS dataset created in the Work library will be discarded at the end of the SAS session. To save a SAS dataset for subsequent SAS sessions you must assign the file a two-level name. The first level is the library name and the second level is the dataset name. This can be done in Windows by selecting Save As... under the File menu in the VIEWTABLE window, creating a new library using the Create New Library icon, then specifying a data table name and clicking on Save. Please refer to your SAS manual or SAS System Help for more information about saving SAS data sets.

    Screen Shot

  14. For further help with the SAS System for Windows, consult the HELP menu on the top toolbar of SAS or refer to your SAS manual.

How do I use an SPSS setup file to import ASCII data?

Setup files contain the syntax or program code to read undelimited data (ASCII) into a statistical package. The instructions below demonstrate how to use SPSS setup files in a Windows environment.

These instructions assume that you have already downloaded the ASCII data and SPSS setup file from the Internet. If you have a compressed version of a file, you will have to decompress it before using the setup files.

Note: In order to successfully use setup files, you must know the exact location (i.e., full pathname, such as C:\My Documents\Data) and filename (e.g., da9999.txt) of the files that you downloaded.

Instructions

  1. Download the SPSS setup file from the ICPSR Web site.

  2. Most of the files downloaded from the ICPSR Web site will be compressed. You will have to decompress the files using WinZip or other decompression software. Once the SPSS setup file has been downloaded and decompressed, rename the file to add the '.sps' extension. This will allow SPSS to recognize the file as an SPSS syntax file. Do not save the setup file on your local machine with the ".txt" extension because SPSS for Windows will try to read it as a data file rather than a syntax file.

    Screen Shot

  3. Open SPSS for Windows.

    Screen Shot

  4. Open the SPSS setup file in the SPSS for Windows Syntax Editor.

    • If a dialog window of shortcuts opens, close it by clicking the 'Cancel' button.

    • Click on File and then Open to get an Open File dialog box.

    • At the top of the box, where it says Look In, choose the path where the SPSS setup file is located.

    • At the bottom of the box, set Files of Type to All Files to see a listing of all files in a particular directory or to Syntax (*.sps) if you saved the setup file with an '.sps' extension.

    • You will then see a list of files in the directory you selected. Either double-click on the SPSS setup file or click once on the name of your chosen file (the name will appear after File Name) and then click on Open.

    Screen Shot

    • Since the SPSS setup file is a text file, SPSS will open a new Syntax Editor window to display the file.

    Screen Shot

  5. If you try to open the SPSS setup file and you are prompted with a dialog box that says Opening File Options, then press Cancel. SPSS is trying to read the setup file as a data file rather than a syntax file. This is likely to happen if your setup file has a ".txt" extension. You can either rename the file and remove the ".txt" extension or you can open the setup file in an editing program and copy and paste the text into the SPSS for Windows Syntax Editor.

    Screen Shot

  6. Most ICPSR setup files contain a header that describes the contents of the file. Once you have opened the setup file in the SPSS for Windows Syntax Editor, read the ICPSR header, if present, for important information about what is contained in the file.

    Screen Shot

  7. After reading the header, scroll to the DATA LIST command. Replace the text that says physical-filename or file-specification with the full path and name of the data file extracted from the downloaded file.

    • It is important that you include the full path (e.g., C:\My Documents\Data); otherwise SPSS may not be able to locate the file. For example, if you extracted the data for ICPSR 2992 into the directory C:\My Documents\Data and you called the file da2992.txt, then the DATA LIST command should read:

      DATA LIST FILE="C:\My Documents\Data\da2992.txt" /

    Screen Shot

  8. If there is a MISSING VALUES command in the setup file, ICPSR usually places an SPSS comment delimiter (*) before the command line, which means that SPSS will not read this command. If you want SPSS to read this command, you should delete the asterisk and be sure that the command starts in the first column of the line.

    Screen Shot

    Some SPSS setup files also contain a missing value RECODE command. This command may also have an SPSS comment marker (*) at the beginning of the line. If you want SPSS to read this command, you should delete the asterisk and be sure that the command starts in the first column of the line. When both a MISSING VALUES and missing value RECODE command are present in the same SPSS setup file, only one of the two commands should be executed. Choose MISSING VALUES if you want to retain the missing values in the data, but have them designated as missing values by SPSS for analysis purposes. Choose missing value RECODE if you want missing values converted to system missing. Please note that the missing value RECODE command may collapse several different missing values for one variable into system missing.

  9. Scroll to the end of the setup file. If an EXECUTE command is not already there, then type one in. Start the command in the first column of a new line and end the line with a period.

    Screen Shot

  10. You are now finished editing the SPSS setup file. Run the statements by clicking on Run -> All. The status bar at the bottom of the screen will show the commands that SPSS is processing. When SPSS has completed executing the commands, the status bar will display the message "SPSS for Windows Processor is ready."

    Screen Shot

  11. When the processor is finished, go to the Window menu and choose SPSS for Windows Data Editor to see the data. Any error messages will be printed in a log file in the SPSS Output window.

    Screen Shot

  12. If you do not see the data appear in the SPSS Data Editor, check the status bar in the lower left corner of the screen. If the status bar says Transformations Pending, go to the Transform Menu and click on Run Pending Transformations. This is usually necessary when you do not have an Execute command at the end of the setup files.

    Screen Shot

  13. Once you have read the data into the SPSS Data Editor, you may then start subsequent sessions using the imported data. You can bypass having to import the data with the SPSS setup files every time you want to access the data by saving the imported data on storage media; go to the File menu and click on Save As ... to save the file as either an SPSS system or portable file. Specify the directory where you would like to store the file using the Save in: box, enter a Filename, and choose the type of file you would like the data saved as. You can then begin subsequent SPSS sessions by opening the saved file from the Data Editor window.

    Screen Shot

  14. For further help with SPSS for Windows, consult the HELP menu on the top toolbar of SPSS or refer to the SPSS for Windows Base System User's Guide.

What are setup files?

Many of our data collections that contain ASCII data files are accompanied by setup files that allow users to read the text files into statistical software packages. Since a visual interpretation of alphanumeric data files is inefficient, statistical software is needed to define, manipulate, extract, and analyze variables and cases within data files. For many of our data collections, we currently provide setup files for SAS, SPSS, and Stata, three of the more commonly used statistical software packages in the social sciences.

The following instructions explain the different components of SAS, SPSS, and Stata setup files. Setup files for certain collections may not contain all of the commands listed below.

SAS Setup Files

SAS setup files can be used to generate native SAS file formats such as SAS datasets, SAS xport libraries, and transport files. Our SAS setup files generally include the following SAS sections. Click on each section to see an example taken from ICPSR 6512 (Capital Punishment in the United States, 1973-1993).

  1. PROC FORMAT: Creates user-defined formats for the variables. Formats replace original value codes with value code descriptions. Not all variables necessarily have user-defined formats.
  2. DATA: Begins a SAS data step and names an output SAS dataset.
  3. INFILE: Identifies the input data file to be read with the input statement. Users must replace the "physical-filename" with host computer-specific input file specifications. For example, users on Windows platforms should replace "physical-filename" with "C:\06512-0001-Data.txt" for the data file named "06512-0001-Data.txt" located on the root directory "C:\".
  4. INPUT: Assigns the name, type, and decimal specification (if any) for each variable in the data file, and specifies its beginning and ending column locations.
  5. LABEL: Assigns descriptive labels to all variables. Variable labels and variable names may be identical for some variables.
  6. FORMAT: Associates the formats created by the PROC FORMAT step with the variables named in the INPUT statement.
  7. MISSING VALUE RECODES: Sets user-defined numeric missing values to missing as interpreted by the SAS system. Only variables with user-defined missing values are included in the statements.

SPSS Setup Files

SPSS setup files can be used to generate native SPSS file formats such as SPSS system files and SPSS portable files. SPSS setup files produced by ICPSR generally include the following SPSS sections. Click on each section to see an example taken from ICPSR 6512 (Capital Punishment in the United States, 1973-1993).

  1. DATA LIST: Assigns the name, type, and decimal specification (if any) for each variable in the data file, and specifies its beginning and ending column locations. Users must replace the "physical-filename" with host computer-specific input file specifications. For example, users on Windows platforms should replace "physical-filename" with "C:\06512-0001-Data.txt" for the data file named "06512-0001-Data.txt" located on the root directory "C:\".
  2. VARIABLE LABELS: Assigns descriptive labels to all variables. Variable labels and variable names may be identical for some variables.
  3. VALUE LABELS: Assigns descriptive labels to codes in the data file. Not all variables necessarily have assigned value labels.
  4. MISSING VALUES: Declares user-defined missing values. Not all variables in the data file necessarily have user-defined missing values. These values can be treated specially in data transformations, statistical calculations, and case selection.
  5. MISSING VALUE RECODE: Sets user-defined numeric missing values to missing as interpreted by the SPSS system. Only variables with user-defined missing values are included in the statements.

Stata Setup Files

Stata setup files can be used to generate native Stata DTA files. Stata setup files produced by ICPSR generally include the following Stata sections. Click on each section to see an example taken from ICPSR 6512 (Capital Punishment in the United States, 1973-1993).

  1. FILE SPECIFICATIONS: Assigns values to local macros that specify the locations of the files used to build a Stata system file. Users must replace the placeholder file names with host computer-specific file specifications. For example, users on Windows platforms should replace "raw-datafile-name" with "C:\06512-0001-Data.txt" for the data file named "06512-0001-Data.txt" located on the root directory of "C:\". Similarly, the "dictionary-filename" should be replaced with "C:\06512-0001-Stata_dictionary.dct". The "stata-datafile" specification should be replaced with the path and name under which you wish to store the Stata system file.
  2. INFILE COMMAND: Reads the columnar ASCII data into a Stata system file.
  3. VALUE LABEL DEFINITIONS: Defines descriptive labels for the individual values of each variable.
  4. MISSING VALUES: Replaces numeric missing values (e.g., -9) with generic system missing ".". By default the code in this section is commented out. Users wishing to apply the generic missing values should remove the comment markers at the beginning and end of this section. Note that Stata allows you to specify up to 27 unique missing value codes.
  5. SAVE OUTFILE: This section saves out a Stata system format file. There is no reason to modify it if the macros in Section 1 were specified correctly.
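The effect of a missing value recode is the same in all three packages: a user-defined code stops counting as a real value. The Python sketch below illustrates the idea with the -9 example above; the variable and its values are made up, and Python's None stands in for the package's system-missing value.

```python
# -9 is the example missing value code mentioned above.
MISSING_CODES = {-9}

def recode_missing(values, missing_codes=MISSING_CODES):
    """Replace user-defined missing codes with None (system missing)."""
    return [None if v in missing_codes else v for v in values]

# Made-up ages, one of which carries the missing code.
ages = [34, -9, 52]
recoded = recode_missing(ages)

# Statistics should use only the valid values.
valid = [v for v in recoded if v is not None]
mean_age = sum(valid) / len(valid)
print(recoded, mean_age)
```

Without the recode, the -9 would be averaged in as if it were a real age, which is why the setup files offer these sections.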

How do I interpret a record from an ASCII data file?

Our data files are usually distributed as columnar ASCII files that consist of rows and columns of alphanumeric characters. Since ASCII data files are simply text files, they can be opened in any word processing program or Internet browser. However, the alphanumeric characters are not meaningful without the help of a codebook or setup files to identify the columns of the ASCII data file as particular variables.

This example illustrates how to interpret an ASCII data file for ICPSR 2737, Capital Punishment in the United States, 1973-1997.

The data file consists of 6,819 cases or observations, which in this example are inmates under sentence of death or inmates who were executed. Example 1 shows the first 10 lines of data in this file. The first observation, or line of data, is highlighted in red.

Example 1: The first case or line of data in the data file

Screen shot of columns of numbers, first row highlighted in red

The data file is a fixed-format data file with a logical record length of 81. This means that each line consists of 81 characters. These 81 characters correspond to 37 variables or data items. Example 2 illustrates that each line of data in the file is 81 characters long.

Example 2: Each record is the same length (81 characters wide)

Screen shot of columns of numbers, first and last columns highlighted in yellow
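A quick way to confirm that a downloaded fixed-format file is intact is to check that every line has the expected record length. The Python sketch below does this for a record length of 81; the sample lines are synthetic placeholders, not data from the study, and the function name is hypothetical.

```python
LRECL = 81  # logical record length given above

def lines_with_wrong_length(lines, lrecl=LRECL):
    """Return the 1-based numbers of lines that are not lrecl characters long."""
    return [
        number
        for number, line in enumerate(lines, start=1)
        if len(line.rstrip("\r\n")) != lrecl  # ignore the line terminator
    ]

# Synthetic lines: the second one is one character short.
sample = ["1" * 81 + "\n", "2" * 80 + "\n", "3" * 81 + "\n"]
problems = lines_with_wrong_length(sample)
print(problems)
```

For a real file, pass an open file object instead of the sample list; an empty result means every record has the expected length.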

In order to know which columns comprise particular variables, it is necessary to refer to the codebook (PDF 234K). The following examples illustrate how to read the first ten variables from this ASCII data file, beginning with the first record (row) and counting from left to right:

VARIABLE 1

V1-ICPSR STUDY NUMBER: This variable is positioned in column locations 1 through 4 and contains the value "2737" for each record. This value represents the 4-digit ICPSR archival study number assigned to this data collection.

Example 3: Variable 1 in Columns 1-4

Screen shot of columns of numbers, first four characters highlighted in yellow

VARIABLE 2

V2-ICPSR EDITION NUMBER: This variable is positioned in column location 5 and contains the value "1" for each record. This value represents the ICPSR edition number assigned to the data collection.

Example 4: Variable 2 in Column 5

Screen shot of columns of numbers, fifth character in each row highlighted in yellow

VARIABLE 3

V3-ICPSR PART NUMBER: This variable is positioned in column location 6 and contains the value "1" for each record. This value represents the ICPSR part number assigned to the data file within the data collection.

Example 5: Variable 3 in Column 6

Screen shot of columns of numbers, sixth character in each row highlighted in yellow

VARIABLE 4

V4-ICPSR SEQUENTIAL ID: This variable is positioned in column locations 7 through 10 and contains the value "1" for the first record. This value represents the first sequential case identification number and is used to uniquely identify a given record in the data file.

Example 6: Variable 4 in Columns 7-10

Screen shot of columns of numbers, second column highlighted in yellow

VARIABLE 5

V5-REPORT YEAR: This variable is positioned in column locations 11 through 14 and represents the reporting year. The first record, highlighted in red, contains the value "0", which represents a reporting year prior to 1973. The fifth record, also highlighted in red, contains the value "1973", which represents the actual year of the event.

Example 7: Variable 5 in Columns 11-14

Screen shot of columns of numbers, third column highlighted in yellow

VARIABLE 6

V6-INMATE ID: This variable is positioned in column locations 15 through 18 and contains the value "8" for the first record. This value represents a four-digit inmate identification number.

Example 8: Variable 6 in Columns 15-18

Screen shot of columns of numbers, fourth column highlighted in yellow

VARIABLE 7

V7-STATE: This variable is positioned in column locations 19 through 20 and contains the value "1" for all 10 records in this example. This value represents the FIPS state code for Alabama.

Example 9: Variable 7 in Columns 19-20

Screen shot of columns of numbers, first character of fifth column highlighted in yellow

VARIABLE 8

V8-Q3 SEX: This variable is positioned in column location 21 and contains the value "1" for the first 10 records. This code identifies the sex of these inmates as "male".

Example 10: Variable 8 in Column 21

Screen shot of columns of numbers, second character of fifth column highlighted in yellow

VARIABLE 9

V9-Q4A RACE: This variable is positioned in column 22 and contains the value "2" for the first record. This code identifies the race of this inmate as "Black".

Example 11: Variable 9 in Column 22

Screen shot of columns of numbers, third character of fifth column highlighted in yellow

VARIABLE 10

V10-HISPANIC ORIGIN: This variable is positioned in column 23 and contains the value "2" for the first record. This code identifies the Hispanic origin of this inmate as "Non-Hispanic".

Example 12: Variable 10 in Column 23

Screen shot of columns of numbers, fourth character of fifth column highlighted in yellow

To locate the column positions for the remaining variables for this study, see the codebook for CAPITAL PUNISHMENT IN THE UNITED STATES, 1973-1997.

This example illustrates that a visual interpretation of the data record is inefficient. Statistical software packages such as SAS, SPSS, and Stata can interpret data files and subset the variables and/or cases as needed.
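The column logic walked through above can also be written out as a short program. The Python sketch below slices the first ten variables out of a record using the column locations from the codebook. The record itself is reconstructed from the values described above and assumes that short values are right-justified and padded with spaces, which may not match the actual file; the short variable names are used only as labels.

```python
# 1-based, inclusive column locations for the first ten variables.
COLUMNS = {
    "V1": (1, 4),    # ICPSR study number
    "V2": (5, 5),    # ICPSR edition number
    "V3": (6, 6),    # ICPSR part number
    "V4": (7, 10),   # ICPSR sequential ID
    "V5": (11, 14),  # report year
    "V6": (15, 18),  # inmate ID
    "V7": (19, 20),  # state
    "V8": (21, 21),  # sex
    "V9": (22, 22),  # race
    "V10": (23, 23), # Hispanic origin
}

def parse_record(line):
    """Slice one fixed-format record into a dict of variable values."""
    return {
        name: line[start - 1:end].strip()  # convert to 0-based slicing
        for name, (start, end) in COLUMNS.items()
    }

# First 23 columns of the first record, rebuilt from the values in the text.
first_record = ("2737" + "1" + "1" + "   1" + "   0"
                + "   8" + " 1" + "1" + "2" + "2")
values = parse_record(first_record)
print(values)
```

A setup file performs the same job for all 37 variables and also attaches variable and value labels.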

Jun 17, 2009

I've forgotten my password. How do I get a new one?

Just go to the MyData login page. At the bottom of the page is a link titled Forgot your password? You'll be asked to enter your email address, and then a new password will be sent to you. After you get the email, you'll probably want to change your password to something easily remembered.

Why can't I print my PDF codebook?

Some versions of the Acrobat software can't properly manage PDFs created in older versions of the software. Our processors test documentation files to ensure compatibility with the latest version of the free Acrobat Reader. We suggest downloading the latest version of the free Acrobat Reader from Adobe's Web site.

How do I use Excel to import tab-delimited ASCII data?

SAMHDA produces and makes available for download ASCII data files in two formats. The first of these is a fixed-format data file (da99999-9999.txt) to be used in conjunction with a setup file for SAS, SPSS, or Stata. The second format is a tab-delimited data file (da99999-9999.tsv).

Note: The Import Wizard for SAS, SPSS, and Stata can read the tab-delimited file into the statistical package. However, if you are using one of these statistical packages, SAMHDA encourages you to use the fixed-format (.txt) data file and read in the data with its accompanying setup file.

Warning: An error will occur if you try to read in a data file with more than 65,536 cases or 256 variables. These are the maximum limits that an Excel spreadsheet can handle.

Instructions

1. Download the tab-delimited ASCII data file from the SAMHDA Web site.

2. Most of the files downloaded from the ICPSR Web site will be compressed. You will have to decompress the files using WinZip or other decompression software. More information about decompressing files can be found at the help page, How do I decompress the files I download from your site?

3. Open Excel for Windows.

Screen Shot

4. Open the tab-delimited ASCII data file.

Screen Shot

  • Click on File and then Open to get an Open File dialog box.
  • At the top of the box, where it says Look In, choose the path where the tab-delimited ASCII data file is located.
  • At the bottom of the box, set Files of Type to All Files.
  • You will then see a list of all files in the directory you selected. Either double-click on the .tsv file or click once on the name of your chosen file (the name will appear after File Name) and then click on Open.

5. This will open Excel's text Import Wizard Step 1 of 3.

Screen Shot

  • Make sure the button for Delimited is marked and the box for "Start import at row" is set to 1.
  • Then click on Next.

6. Go to Import Wizard Step 2 of 3.

Screen Shot

  • Select Tab in the Delimiters option box.
  • Then click on Next.

7. Go to Import Wizard Step 3 of 3.

Screen Shot

  • Leave every column set to General. You do not have to do anything in this step. SAMHDA studies do not contain string or date variables.
  • Then click on Finish.

8. Review imported data file.

Screen Shot

You have now completed importing the data file. Row 1 will contain the names of the variables. Column A will be the CASEID variable. To confirm that the import worked properly, scroll across and down to check the number of variables and cases imported. Compare these numbers against those provided by SAMHDA in the file manifest. This file can be accessed by going to the bottom of the study's Description and Citation or Browse Documentation pages.
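The number of variables and cases can also be counted before opening the file in Excel. The Python sketch below reads a tab-delimited file whose first row holds the variable names and compares the counts with the Excel limits given in the warning above. The sample data are made up (only the CASEID column comes from the description above), and the function name is hypothetical.

```python
import csv
import io

EXCEL_MAX_ROWS = 65536    # limits quoted in the warning above
EXCEL_MAX_COLUMNS = 256

def summarize_tsv(handle):
    """Return (variable names, number of cases) for a tab-delimited file."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)               # row 1 holds the variable names
    n_cases = sum(1 for _ in reader)    # every remaining row is one case
    return header, n_cases

# Made-up sample standing in for a downloaded .tsv file.
sample = io.StringIO("CASEID\tAGE\tSEX\n1\t34\t1\n2\t52\t2\n")
header, n_cases = summarize_tsv(sample)
n_variables = len(header)

# The header occupies one spreadsheet row in addition to the cases.
fits_in_excel = (n_cases + 1 <= EXCEL_MAX_ROWS
                 and n_variables <= EXCEL_MAX_COLUMNS)
print(n_variables, n_cases, fits_in_excel)
```

For a real download, replace the sample with an opened file, for example open(path, newline=""), and compare the counts with the file manifest.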
