Datalab: SPSS

1. Getting started

The IBM SPSS software platform offers advanced statistical analysis, a vast library of machine learning algorithms, text analysis, open source extensibility, integration with big data and seamless deployment into applications.

 

 

Available for: Windows / macOS

License: Generally available to all students and faculty

Faculty can download and install SPSS from the Software Center on their BI laptop.

Students can download SPSS from the BI Software Pages:

 

1. Go to www.portal.bi.no and log in with your s-number and password.

2. Go to the menu in the top right corner and click Digital services.



3. Select Software


 

 

4. Scroll down until you find SPSS. Click the link «Download the software from here».


 

 

5. You will see a list of all software available to faculty and students at BI. Choose the newest version of SPSS: IBM SPSS Statistics 29.


 

6. You will then see a window with several tabs arranged from left to right.

It is important that you read the information in these tabs to learn how to download the software.
In the first tab, About, you will find information about how long the current license is valid.

 

 

7. Click the next tab - License agreement. Read the information and accept the license agreement.


 

 

8. When you have accepted the license agreement, select the software for Windows or macOS (choose one of the tabs).

 

9. FOR WINDOWS
Read the instructions and click the link Receive license-key and download the Windows 64-bit version of IBM SPSS Statistics 29.



 

 

10. You will then receive the license key and the zip file. Again, remember to read the installation instructions.


 

Opening SPSS

  • The first time you open SPSS, you will see a welcome dialog. On the left, it shows recent files and sample files.
     
  • If you don’t want this dialog to show up every time, tick the box at the bottom left that says “Don’t show this dialog in the future”.
     

Welcome window in SPSS

 

 

New Dataset

If you click on [New Dataset] in the top left corner, you open a new dataset. You see the rows and the columns of the SPSS data window. By default, this is the window that shows up when you open the SPSS application. SPSS normally has two or three windows open simultaneously, each showing a different aspect of your work.

  • This is your data window (with data view and variable view).

In the data window, the toolbar gives you some important options. The most frequently used toolbar buttons are:

  • “Recall recently used dialogs” (icon in red circle to the left) – click it to see the dialogs you have used recently.

  • “Value labels” (the "A-1" icon in red circle to the right) – one of the great things about SPSS. Click it to show the value labels in the data view instead of the underlying codes.

 

Icons: recall recently used dialogs and value labels

Data View / Variable View

Probably the most important feature of the data window is its two tabs: data view, which shows the data values themselves, and variable view, which shows information about each variable in the dataset: its name, its type, its labels and so on.

Data view and variable view

 

Output Window

The output window is where SPSS puts the results: numerical results, graphs, and any logs of the commands used to produce them. On the left side of the output window there is a navigation pane that makes it easy to jump to the different elements of the output.

Output window

If you click the icon with the big star in the output window, it takes you back to the data window:

Output window: star icon

Introduction

SPSS syntax files make you more efficient and organized as an SPSS user.

What are SPSS syntax files? They are text files that contain a series of commands for SPSS to execute. Instead of manually clicking through menus and dialog boxes, you can automate your analyses by running these syntax files.

Why you should get familiar with SPSS Syntax Files:

  1. Save Time: Once you've created a syntax file for a specific analysis, you can reuse it as many times as you need. You won't have to try to remember what you did days or weeks back. 

  2. Enhance Reproducibility: Syntax files serve as a record of your analysis steps. By sharing these files with others, or keeping them for your future self, you ensure that your analyses can be easily reproduced, fostering transparency.

  3. Customize Your Analyses: SPSS syntax files give you control over your analyses. You can fine-tune every aspect of your statistical procedures, allowing you to tailor them to your specific research needs.

 

Getting started is easy:

  • Open SPSS and perform your analysis as you normally would using the menus and dialog boxes.
  • Instead of running the analysis immediately, click the "Paste" button in the dialog box.
  • SPSS will generate the corresponding syntax for your analysis in a syntax window. You can save this syntax file for future use.

See more detailed descriptions below:

 

Syntax Window

There is an optional third window in SPSS: the syntax window. Depending on how SPSS is set up, the output window may already show the written command behind each result. You can read the syntax there, but you can also keep it in a separate syntax file.

  • Go to FILE - NEW - SYNTAX. A blank window will open (this is a programming window). 

If you click PASTE when conducting an analysis, SPSS writes the analysis as a command in the syntax file, and you can rerun it as often as you like. This allows you to recreate your commands, share them, copy and modify them.

If you save the syntax you used on an SPSS data file, you have a record of everything you have done (statistical analyses, visualizations etc.). Later on you can open the syntax file to remind yourself what you did and how, run it again, or copy from it to carry out the same procedure on another SPSS file. In many cases your syntax file will also be relevant to attach to a publication or assignment as documentation of the steps in your analyses.

 

 

Variable Labels in Syntax

  1. Capitalized command: VARIABLE LABELS
  2. Write the variable name followed by a space, and then the label in quotation marks.
  3. Period at the end of the command. 
  4. Hit PLAY to run the command. 
  5. Return to data window to see your changes.
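
The steps above might look like this in a syntax file. This is only a minimal sketch: the variable names age and educ and the label texts are hypothetical and should be replaced with names from your own dataset.

```spss
* Hypothetical variables age and educ, replace with your own variable names.
VARIABLE LABELS age "Age of respondent in years".
VARIABLE LABELS educ "Highest completed education".
```

After running the commands, the new labels should appear in the Label column in variable view.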
     

 

 

 

Rename Variables in Syntax

  1. Capitalized command: RENAME VARIABLES
  2. Write the original variable name followed by a space, then =, followed by a space again, and then write the new variable name.
  3. Period at the end of the command. 
  4. Hit PLAY to run the command. 
  5. Return to data window to see your changes.
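
As an illustration of the steps above, a rename command could be written as follows. The names q1 and satisf are hypothetical. The sketch wraps the old = new pair in parentheses, a form SPSS accepts and that is also used when several variables are renamed in one command.

```spss
* Hypothetical example that renames q1 to satisf.
RENAME VARIABLES (q1 = satisf).
```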

 

 

 

 

 

Value Labels in Syntax (for categorical variables)

  1. Capitalized command: VALUE LABELS
  2. Write the variable name (as written in the data window), then press enter (new line). 
  3. Write the value followed by a space and then the label in quotation marks. 
  4. To label values for several variables in the same command, put a slash before the next variable name on a new line.
  5. Return to data window to see your changes. 
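
A sketch of what such a command could look like is shown here. The variables gender and educ, their codes and the label texts are all hypothetical. As with the other commands, the period comes only once, at the very end, and you run the command with PLAY.

```spss
* Hypothetical variables gender and educ with made-up codes and labels.
VALUE LABELS gender
  1 "Female"
  2 "Male"
  /educ
  1 "Primary school"
  2 "Secondary school"
  3 "University".
```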

 

 

 

 

 

 

 

 

 

 

 

2. Preparing your Data

When you are working with data in any program, it is important to specify what kind of data you are working with. This affects the kinds of operations you can do with the variables. In SPSS, you specify this in a few different ways, for example through data types and levels of measurement.

Data Types and Measures

First look at the row of variable names at the top of the spreadsheet. Next to each name there is an icon that indicates the level of measurement:

  • The three circles in different colors indicate a nominal variable
  • The three ascending bars in different colors indicate an ordinal variable
  • The ruler indicates a scale variable

 

If we go to variable view at the bottom left, we can see each of the variables in the data set. 

  1. The first thing you want to specify is the type of the variable. Usually, most of the variables in a table are numeric. But there can also be string variables (and other types as well). String variables (or character variables) are variables with values that are treated as text.

  2. Next, you have the option to specify the width of the column and the decimals. You can give a label to the variable and to the values.

Level of measurement

Scale: A quantitative variable

Ordinal: Ordered categories

Nominal: Unordered categories


Recode Variables

In data window: 

  • TRANSFORM - RECODE INTO DIFFERENT VARIABLES


     
  • Choose variable and type a new name under NAME.
  • Go to OLD AND NEW VALUES 
  • Type old and new value, and then click ADD. 
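
The same kind of recoding can also be written as a RECODE command in a syntax file. The sketch below assumes a hypothetical scale variable age and creates a hypothetical new variable agegroup; the cut points are invented for illustration.

```spss
* Hypothetical example that groups age into three categories in a new variable.
RECODE age (LOWEST THRU 29=1) (30 THRU 59=2) (60 THRU HIGHEST=3) INTO agegroup.
EXECUTE.
```

Because the result goes INTO a different variable, the original age values are left unchanged.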

 

Understanding the Importance of Weighting Cases in SPSS

In many cases in survey research, not all elements in a sample have an equal probability of being selected. This is where weighting cases in SPSS comes into play. Weighting allows you to give different levels of importance to different observations in your dataset, ensuring that your analysis reflects as accurately as possible the characteristics of the entire population, not just the sample.

What is Case Weighting?

Case weighting, or population weighting, is a statistical technique used to adjust the influence of each case (individual or unit) in your dataset based on its relative importance or representation in the population you are studying. The purpose is to correct for any sampling biases or uneven probabilities of selection, making your analysis more accurate and representative.

How to Apply Case Weighting in SPSS:

Here we describe how you can activate a weight variable that is already in your dataset.

Go to Data - Weight Cases.

Click "Weight cases by" and select the weight variable. If you want to deactivate weighting, remove the variable and click "Do not weight cases".

You will now see "Weight On" at the bottom right of the window, in the weight status area.

You can also use the WEIGHT command to activate or deactivate weighting. Write WEIGHT BY followed by the name of the weight variable to activate weighting, and WEIGHT OFF to deactivate it.
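
In a syntax file, the two commands could be used like this. The variable name wgt is hypothetical and stands for whatever the weight variable is called in your dataset.

```spss
* Hypothetical weight variable wgt, activate weighting.
WEIGHT BY wgt.
* Run your weighted analyses here, then switch weighting off again.
WEIGHT OFF.
```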

 

 

3. Explore your Data

When you have your data in SPSS and have done all the preparations, you can start exploring your data with basic descriptive analysis as a foundation for further research. 

 

Frequencies

The easiest way to start looking at your data is to start with frequencies. The frequency table tells you how common each of the categories is, both in frequency and percent.

ANALYZE → DESCRIPTIVE STATISTICS → FREQUENCIES

For categorical variables (nominal and ordinal): 

 

But the Frequencies command can handle more than nominal and ordinal variables.
For quantitative variables:

  • Click Statistics to choose, for example, the mean and the standard deviation
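
For reference, the menu choices above correspond roughly to the following syntax. The variable names gender and age are hypothetical, and the exact command that SPSS pastes from the dialog may contain additional subcommands.

```spss
* Frequency table for a hypothetical categorical variable gender.
FREQUENCIES VARIABLES=gender.
* For a hypothetical quantitative variable age, skip the table and request statistics.
FREQUENCIES VARIABLES=age
  /FORMAT=NOTABLE
  /STATISTICS=MEAN STDDEV.
```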

 

 

Descriptives


If you have quantitative data, you will want some basic descriptive statistics, such as the mean or the standard deviation. It is possible to get these from Frequencies, but SPSS has a dedicated command for this: Descriptives.

ANALYZE → DESCRIPTIVE STATISTICS → DESCRIPTIVES

  • Choose variables
  • Click OK
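
Written as syntax, a Descriptives run could look like the sketch below. The variable names age and income are hypothetical, and the statistics listed are just one possible selection.

```spss
* Hypothetical quantitative variables age and income.
DESCRIPTIVES VARIABLES=age income
  /STATISTICS=MEAN STDDEV MIN MAX.
```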

 

What you get: 

EXPLORE (if you want to investigate one variable at a time)

In data window: 

  • ANALYZE - DESCRIPTIVE STATISTICS - EXPLORE
  • Move a variable to the "Dependent List"
  • Click OK




     

We get several numerical descriptions for one variable in the output window: 

  • First: A table that tells us how many valid and missing cases we have. 
  • Second: the mean and standard error, 95% confidence interval for mean, median, variance, standard deviation, etc. 
  • Third: Stem & Leaf Plot
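
The Explore procedure is called EXAMINE in syntax. The following is a minimal sketch for one hypothetical variable age. The syntax pasted from the dialog usually contains a few more subcommands than shown here.

```spss
* Hypothetical variable age, descriptive statistics plus a stem-and-leaf plot.
EXAMINE VARIABLES=age
  /PLOT=STEMLEAF
  /STATISTICS=DESCRIPTIVES
  /CINTERVAL 95.
```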






     

 

Explore Relations between Variables using Crosstabs


ANALYZE - DESCRIPTIVE STATISTICS - CROSSTABS

  1. Variable in columns
  2. Variable in rows
  3. Go to Cells and choose Column under Percentages; deselect Observed under Counts (if you want only the percentage distribution).
  4. Continue and OK
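
As a sketch, the crosstab described above could be requested in syntax as follows. The variables educ (rows) and gender (columns) are hypothetical. The row variable is written before BY and the column variable after it.

```spss
* Hypothetical variables educ in rows and gender in columns, column percentages only.
CROSSTABS
  /TABLES=educ BY gender
  /CELLS=COLUMN.
```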


 

Custom Tables

 

ANALYZE - TABLES - CUSTOM TABLES

  1. Variable in columns
  2. Variable in rows
  3. Under summary statistics, choose columns percent and column N% (if you want percentage distribution)
  4. Click Add and OK


 

 

 

 

4. Analysing Data

Analyzing data is at the heart of what SPSS does best. SPSS offers a toolkit for conducting a wide range of statistical analyses. In this section, we provide introductions to some of the most commonly used statistical methods available in SPSS.


What is a Chi-square Test? 

The chi-square test is a statistical test for examining the association between categorical variables. It is used in the analysis of contingency tables (also known as crosstabs: tables in matrix format), which display the frequency distribution of the variables. Chi-square tests are used a lot in the analysis of surveys.

 

Conducting Chi-square tests in SPSS:

 

Go to:

CROSSTABS: Chi-square test – the whole table (not individual percentages)

 


  1. Analyze → Descriptive Statistics → Crosstabs
  2. Variable in columns
  3. Variable in rows
  4. Go to Statistics → choose Chi-square → click Continue
  5. OK
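
In syntax, the chi-square test is requested with the STATISTICS subcommand of CROSSTABS. The sketch below uses the hypothetical variables educ and gender.

```spss
* Hypothetical variables educ in rows and gender in columns, with a chi-square test.
CROSSTABS
  /TABLES=educ BY gender
  /STATISTICS=CHISQ
  /CELLS=COUNT.
```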

 

 

 

 

CROSSTABS: Test of proportions
 

  1. Analyze → Descriptive Statistics → Crosstabs
  2. Variable in columns
  3. Variable in rows
  4. Go to Cells and under Z-test choose [compare column proportions] and [adjust p-values] (Bonferroni method) and [columns] under percentages. Remove [counts observed].
  5. Continue and OK
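
A possible syntax version of these steps is sketched below, again with the hypothetical variables educ and gender. On the CELLS subcommand, the keyword BPROP requests the comparison of column proportions with Bonferroni-adjusted p-values.

```spss
* Hypothetical variables, column percentages with a Bonferroni-adjusted z-test.
CROSSTABS
  /TABLES=educ BY gender
  /CELLS=COLUMN BPROP.
```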

 

 

CUSTOM TABLES: Chi-square test
Analyze → Tables → Custom Tables

  1. Variable in columns
  2. Variable in rows
  3. Go to Test statistics
  4. Under Tests, choose [Compare column proportions]
  5. Under Identify significant differences, choose [In a separate table] and [Display significance values]
  6. Under Significance level, choose [Adjust p-values] and [Bonferroni]
  7. Choose [Include multiple response variables in tests]
  8. Choose [Tests of independence (Chi-square)]
  9. OK

What is Correlation analysis? 

Correlation analyses are used to examine whether changes in one continuous variable are associated - or correlated - with changes in one or more other continuous variables. It is a statistical method which assesses the strength and direction of relationships between variables.

A widely used measure of correlation is Pearson's correlation coefficient, which quantifies the linear relationship between two continuous variables.

How to conduct correlation analyses in SPSS? 

 

For Pearson's correlation coefficient

go to

 

Analyze -> Correlate -> Bivariate.

 

Select the variables you want to include in the analysis and click the arrow to transfer them to the Variables box.

 

Make sure "Pearson" is checked. 
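
The same analysis can be sketched in syntax with the CORRELATIONS command, which computes Pearson coefficients. The variable names age and income are hypothetical.

```spss
* Hypothetical continuous variables age and income, Pearson correlation with two-tailed test.
CORRELATIONS
  /VARIABLES=age income
  /PRINT=TWOTAIL NOSIG
  /MISSING=PAIRWISE.
```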

 

 

What is Regression Analysis? 

Regression analysis examines the relationships between a dependent variable and one or more independent variables. It is a statistical method which helps you predict a response variable (the dependent variable) based on one or more explanatory (independent) variables.

Regression analysis encompasses various types, including linear and logistic regressions. Linear regression is suitable when the dependent variable is numeric, and it aims to predict a continuous outcome. In contrast, logistic regression is used when the dependent variable is categorical, often representing a binary outcome, such as True or False.

Both may be simple (only one independent variable) or multiple (several independent variables). 

How to conduct regression analyses?

 

For Linear Regression analysis 

 

Go to: 

 

Analyze -> Regression -> Linear

 

Select the dependent variable and click the arrow

 

Do the same for the independent variable(s)

 

Click OK
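
As a minimal sketch, a multiple linear regression could be written in syntax as shown below. The dependent variable income and the independent variables age and educ are hypothetical. The command pasted from the dialog normally includes further subcommands for statistics and criteria.

```spss
* Hypothetical model that predicts income from age and educ.
REGRESSION
  /DEPENDENT income
  /METHOD=ENTER age educ.
```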

Factor Analysis

If you have, for instance, 10 variables that measure approximately the same thing, you probably do not need all 10. If you can find out how the variables go together, you may be able to combine those 10 into a factor that is more useful than the individual variables.
Factor analysis is based on the covariance between the different variables.

1) ANALYZE → DIMENSION REDUCTION → FACTOR

2) Choose a collection of variables that may have something in common. 

3) Click OK (for default analysis)
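
A default analysis of this kind could be sketched in syntax as follows. The item names item1 to item6 are hypothetical placeholders for your own variables. The sketch requests principal components extraction without rotation, which is assumed here to match the dialog defaults.

```spss
* Hypothetical items item1 to item6, principal components extraction without rotation.
FACTOR
  /VARIABLES item1 item2 item3 item4 item5 item6
  /EXTRACTION PC
  /ROTATION NOROTATE.
```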


 

Communalities
What we get first is a table with communalities. When each variable is standardized, it has one unit of variance, and that is the Initial value. The Extraction value shows how much of that variance the variable has in common with the others, and that feeds into the rest of the analysis.

 

Total variance explained
If each of the 12 variables in the example has one unit of variance, there are 12 units of variance in total. But if the variables run parallel to each other, you can find a single factor that accounts for more than one unit of variance. That is what the table "Total variance explained" shows.
In the column "Cumulative %", the extracted factors together account for over 3/4 of the variance in the original variables. So in the example we are able to go from 12 variables to 4 factors. That is a benefit because there are fewer things to deal with, and the factors are probably going to be more stable, more reliable, and more generalizable than the individual variables would be.

 

 

Component matrix

To see what goes into these 4 components, we look at the component matrix. The four components are listed across the top and the 12 variables down the side, and the numbers in the table are like correlation coefficients: they range from -1 (negative one) to +1 (positive one). High absolute values indicate a strong relationship between that variable and that component. For example, under component 1 we have -.759, which means that Facebook and component 1 are strongly (negatively) connected. Others are less connected, for example -.080 between GDPR and component 4.
 

 

 

5. Visualisations

Data visualisation is an essential aspect of data exploration, analysis, and communication. Presenting your data graphically often uncovers patterns that may remain hidden in raw numbers. In this section, we offer guidance on some frequently used data visualization techniques available in SPSS.

 

Graphs → Chart builder

  1. Drag in a template chart.
    You have a choice of variations: use Bar for a bar chart.
  2. Drag the variables onto the axes of the chart.

 

 

 

 

Bar chart example: Clustered bar

 

 

Element properties: Choose for example “Percentage” or Count.
Click [Set parameters]
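
A clustered bar chart can also be produced from syntax. The sketch below uses the older GRAPH command rather than the syntax that the Chart Builder itself pastes, so the resulting chart may look slightly different. The variables educ and gender are hypothetical.

```spss
* Legacy GRAPH command, not Chart Builder output, with hypothetical variables educ and gender.
GRAPH
  /BAR(GROUPED)=COUNT BY educ BY gender.
```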

 


 

For line charts, follow the steps for bar charts but choose Line instead.

Line graph for one variable

  1. Right-click the table in the output window and click [Edit]

  2. Choose the variable you want, then right-click and choose [Create graph] and [Line]

  3. In chart editor: Click [Elements] and [Show data labels]

Box plots, also known as box-and-whisker plots, provide a concise summary of the distribution and variability of numerical data. Box plots highlight the central tendency, spread, and potential outliers within your dataset. If you are examining continuous variables, you may visualize the distributional characteristics effectively using a box plot.

Go to: Graphs → Chart builder

Drag in a template chart – choose Boxplot

 

Use a Simple Boxplot (the one to the left) when you want to visualize the distribution of a single continuous variable in relation to another variable. Simple boxplots are ideal for providing insights into the spread, central tendency, and potential outliers of one continuous variable, making them suitable for comparing the distribution of that variable across different groups or categories. 

 

Opt for a Clustered Boxplot (the one in the middle) when you need to explore how the distribution of a continuous variable varies across two categorical variables at once. Clustered boxplots display one group of boxes per category on the x-axis, split into clusters by a second categorical or grouping variable.

 

Start putting the variables in. 
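
If you prefer syntax, a simple boxplot of one continuous variable across groups can also be obtained from the Explore procedure instead of the Chart Builder. The sketch below assumes a hypothetical continuous variable income and a hypothetical grouping variable educ.

```spss
* Boxplot via EXAMINE rather than Chart Builder, with hypothetical variables income and educ.
EXAMINE VARIABLES=income BY educ
  /PLOT=BOXPLOT
  /STATISTICS=NONE
  /NOTOTAL.
```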

Read more about the concise information a Boxplot provides here

6. More