Motivation

This course enables you to transform data into persuasive and evidence-based visualizations. Visualizations are persuasive if they motivate actions in an intended audience. Visualizations are evidence-based if they are reproducible, functional, and truthful.

This course introduces and discusses the fundamental design principles and visualization technologies that allow you to design, implement, and critique persuasive and evidence-based visualizations. In a data-rich environment, where decision-makers often drown in data but thirst for insight [1], mastering this course equips you with a moderate level of data literacy.

Data literacy is the ability to interpret, construct, and convey arguments through the functional and truthful visual presentation of data. Data literacy is a vital skill in our data-driven world. The chances are high that you will be interpreting and designing data visualizations throughout your career. The level of data literacy offered through this course allows you to establish a competitive advantage in Silicon Valley and the global marketplace.

Learning objectives

You will learn to design and implement visualizations and critique the persuasiveness and evidence of visualizations. Upon successful completion of this course you will:

  • Understand the conceptual and technological fundamentals of data visualization.
  • Analyze and critique the persuasiveness and evidence of data visualizations.
  • Implement persuasive and evidence-based visualizations.

Course Logistics

Textbooks

There is no required textbook for this class. Data visualization is a fluid topic that is covered in arts, design, and technology. You will find a lot of “conventional wisdom” out there (including in the books below). Please consume information with a critical mind.

I consider the following books as my ‘common core’ of contemporary data visualization:

The philosophy of data visualization:

  • Tufte (2001): The Visual Display of Quantitative Information, Graphics Press.
    • This book is a classic introduction to visual data representations.
    • This book is available as a hard copy in the SCU library.
  • Tufte (2006): Beautiful Evidence, Graphics Press.
    • This book contains showcases that illustrate the thinking behind high-quality data visualizations.
    • This book is available as a hard copy in the SCU library.

The concepts of data visualization:

  • Cairo (2012): The Functional Art: An Introduction to Information Graphics and Visualization, New Riders.
    • This book provides you with the conceptual background of data visualization.
    • This book is available online in the SCU library.
  • Cairo (2016): The Truthful Art: Data, Charts, and Maps for Communication, New Riders.
    • This book is the sequel to the first one and focuses on the ‘truthful’ part.
    • This book is available online in the SCU library.

The technology of data visualization:

  • Munzner (2014): Visualization Analysis and Design, CRC Press.
    • This book offers a technical introduction to the elements of effective data visualization.
    • This book is available online in the SCU library.
  • Grolemund & Wickham (2017): R for Data Science, O’Reilly. (in particular chapter 3 on data visualization)
    • Everything you need to know for working with R.
    • This book is available online.
  • McKinney, W (2012): Python for Data Analysis, O’Reilly. (in particular chapter 9 on plotting and visualization)
    • Everything you need to know for working with the Pandas library and Python.
    • This book is available online in the SCU library.
  • VanderPlas (2016): Python Data Science Handbook, O’Reilly. (in particular chapter 4 on visualization with Matplotlib)
    • A complete walkthrough of Data Analysis in Python.
    • This book is available online.

The practice of data visualization:

  • Nussbaumer Knaflic (2015): Storytelling with Data, Wiley.
    • This book focuses on the use of data visualization in the professional environment.
    • This book is available online in the SCU library.
  • Wexler, Shaffer, Cotgrave (2017): The Big Book of Dashboards, Wiley.
    • This book is an excellent collection of dashboards and the reasoning behind them.

Technology

The hands-on elements in this course use Tableau Desktop, Jupyter/Python, R, and D3.js. I believe that experiencing data visualization through a variety of technologies helps you to identify commonalities and important differences.

  • Tableau Desktop is an analytics platform for enterprise data.
  • The Jupyter/Python environment helps you to gather, clean, wrangle, and reproducibly visualize data. The easiest way to get started is to use Google Colab, which is a web-based Jupyter environment with Google Drive and GitHub support.
  • The R environment also helps you to gather, clean, wrangle, and visualize data in a reproducible fashion. RStudio also provides a (beta) cloud version.
  • D3.js is a JavaScript library for creating data products for the web. The easiest way to get started is to use Observable, which is a web-based development environment for D3.js.

In the class meetings and assignments, we will work with all of these technologies. You are also free to use other technologies. Please discuss such plans with me before using different technologies.


PLEASE NOTE: I expect you to have Tableau Desktop installed on your laptop and R and Jupyter/Python ready to go.


It is my goal to spend the classroom time on conceptual and hands-on issues of data visualization. Therefore, we will only spend a minimal amount of time explaining how to use Tableau, Python, and JavaScript. I will introduce and explain all the code we need for the class meetings. If you want to use technology in your assignments but do not know it yet, the following resources will help you to get up to speed.

Tableau:

  • https://www.tableau.com/learn/training - This is a great resource to get answers on “how-to” questions.
  • Stirrup et al. (2016): Tableau: Creating Interactive Data Visualizations, Packt. - This book is available online in the SCU library.

Python, Jupyter, R, and D3.js:

  • https://jakevdp.github.io/WhirlwindTourOfPython - A great introduction to Python.
  • Rossant, C (2015): Learning IPython for interactive computing and data visualization, Packt. This book is available online in the SCU library.
  • Lander (2017): R for Everyone: Advanced Analytics and Graphics - A great introduction to R.
  • Meeks (2015): D3.js in Action, Manning.

Communication

I am committed to your learning success. Please feel free to contact me with any questions regarding this course. If I am not able to help you myself, I will forward your request to someone who can.

  1. If you have general questions about course material, assignments, etc. please use the Slack workspace for the course. You will find the invitation link in Camino.
  2. Before you write an email, please search, read, and comment in the Slack workspace.
  3. If you send me an email that contains questions of interest to the whole class, I will answer them in the Slack workspace.
  4. My office hours are Mondays and Wednesdays from 6:00 PM to 7:00 PM. Please make an appointment in the Office Hours Calendar. I am also available after each class.
  5. Please make an appointment for any meeting, whether during or outside of my office hours. Every meeting request must include a specific agenda. I am available via Slack (preferred), phone, Zoom, or face-to-face.
  6. I post all course material, course information, announcements, and updates on Camino. On Camino, you will also find the class recordings. Please make sure that your correct email address is listed in Camino so that you do not miss important information.
  7. I maintain a Class Log (accessible only with SCU ID) that contains all the links, resources, and whiteboard drawings that I use or create during the class meetings.

Class Meetings

Class meetings are Mondays and Wednesdays, 7:35 PM to 8:50 PM in Lucas Hall 307.

This course is centered around a reflective and practical approach to data visualization. Mondays are devoted to discussing design principles in data visualization. Wednesdays are lab sessions. During the lab session, you have the opportunity to work on a data visualization case study that allows you to practice the application of the design principle and technology. At the end of each lab session, you may be asked to present your results.

Course Schedule

The following table shows the tentative schedule for the quarter and the assignments per week. Numbers in parentheses denote the maximum number of points you can achieve for each assignment. Submissions are always due on Sunday at 11:59 PM Pacific Time.

Week | Class Meeting | Topic | Reader Project | Individual Project | Team Project
1 | April 1 | Introduction | - | - | -
1 | April 3 | Lab | R1 (3) | - | -
2 | April 8 | Self Study (no class!) | - | - | -
2 | April 10 | Self Study (no class!) | R2 (3) | - | -
3 | April 15 | Analytic Design | - | - | -
3 | April 17 | Lab | R3 (3) | IP1 (10) | -
4 | April 22 | The Audience Model | - | - | -
4 | April 24 | Lab | R4 (3) | IP2 (10) | -
5 | April 29 | Visual Arguments | - | - | -
5 | May 1 | Lab | R5 (3) | IP3 (10) | -
6 | May 6 | Goals, Questions, Metrics | - | - | -
6 | May 8 | Lab | R6 (3) | - | -
7 | May 13 | The Data Pixel Ratio | - | - | -
7 | May 15 | Lab | R7 (3) | - | TP1 (10)
8 | May 20 | Situational Awareness | - | - | -
8 | May 22 | Lab | R8 (3) | - | TP2 (10)
9 | May 27 | Memorial Day (no class!) | - | - | -
9 | May 29 | Lab | R9 (3) | - | -
10 | June 3 | The Truth Continuum | - | - | -
10 | June 5 | Lab | R10 (3) | - | TP3 (10)
Total = 100 points | | 10 (Self-Study) | 30 | 30 | 30

Assignments

“What it boils down to is one percent inspiration and ninety-nine percent perspiration.” (Thomas Edison)

Your mastery of the learning objectives will be examined through contributions to a class reader, a self-study project, an individual project, and a team project. There will be no exams.

The following table links the learning objectives of this class with the assignments and shows the maximum number of points that you can achieve with each assignment towards the final grade.

Learning Objective | Assignment | Max. Points
Understand the conceptual and technical fundamentals of data visualization. | Class Reader | 30
Analyze and critique the persuasiveness and evidence of existing data visualizations. | Self-Study Project | 10
Implement persuasive and evidence-based visualizations. | Individual Project, Team Project | 30 + 30
Total | | 100

The final grade distribution is as follows.

Points | Letter Grade
94-100 | A
90 to <94 | A-
87 to <90 | B+
84 to <87 | B
80 to <84 | B-
77 to <80 | C+
74 to <77 | C
70 to <74 | C-
<70 | F
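
As a minimal sketch, the grade table above can be read as a lookup from points to a letter grade. The function name `letter_grade` is illustrative only, and the treatment of boundary values (for example, exactly 90 points counts as A-) is my reading of the table rather than an additional rule.

```python
# Lower bounds (inclusive) taken from the grade table in this syllabus.
# Assumption: a score exactly on a boundary earns the higher grade.
GRADE_BOUNDS = [
    (94, "A"), (90, "A-"), (87, "B+"), (84, "B"),
    (80, "B-"), (77, "C+"), (74, "C"), (70, "C-"),
]


def letter_grade(points: float) -> str:
    """Return the letter grade for a total between 0 and 100 points."""
    if not 0 <= points <= 100:
        raise ValueError("points must be between 0 and 100")
    # Bounds are sorted from highest to lowest, so the first match wins.
    for lower_bound, letter in GRADE_BOUNDS:
        if points >= lower_bound:
            return letter
    return "F"


example = letter_grade(91.5)  # "A-"
```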

My grading criteria are as follows:

  • A grades (4.0) reflect work that meets all assignment objectives at the highest possible level and sometimes goes beyond that. The submitted work is of superior quality and could be presented to the target audience with no or minimal revisions. Typically, no more than 40% of participants in a course receive an A grade.
  • B grades (3.0) reflect work that meets all assignment objectives at a level that is above average but not exceptional. The submitted work shows high levels of competency and could be presented to the target audience with some editing.
  • C grades (2.0) reflect work that meets all course objectives at an average level but is not exceeding expected standards. The submitted work lacks a clear in-depth understanding of the subject and could be presented to the target audience only with extensive editing. Typically, at least 5% of participants in a course receive a C grade.
  • F grades (0.0) reflect work that does not meet course objectives and is below minimum standards. Submissions are late without prior consultation with the instructor, miss the assignment objectives, or show a clear lack of learning progress. Also, repeated violations of the academic integrity standards result in an overall F grade.

I reserve the right to change the grading to accommodate special circumstances and opportunities. Any changes, however, will be discussed and announced in class and on Camino.

Class Reader

The class reader is the virtual extension of the classroom. You use the class reader to collaboratively develop a deeper understanding of the conceptual and technical fundamentals of data visualization.

Your objective is to contribute in a meaningful way to the class reader every week. A meaningful contribution is defined as the following set of activities:

  • Add and annotate a unique reference to the class reader. The annotation should consist of a summary of the key points, a critical analysis, and a personal reflection.
  • Write a component of the class reader that combines several annotated references in a useful manner.
  • Evaluate and critique an existing component of the class reader.
  • Substantially improve an existing component of the class reader.
  • Organize components into coherent and consistent structures (chapters, sub-chapters, tables, lists).

When contributing to the class reader, make sure that you understand the requirements of academic integrity as outlined below.

The structure of the class reader is as follows:

  1. Fundamentals
    • Theoretical background of data visualization
    • Contemporary research results
  2. Case Studies
    • Description and replication of great examples of data visualization
  3. Patterns
    • Reusable solutions to everyday data visualization questions
    • Applied by multiple members of the course
  4. Ethics
    • Implications of (good and bad) data visualization
    • The role of data visualization in politics, society, and business

In the spirit of great examples of collaborative writing, we use GitHub to organize the writing process. You will use branches, projects, issues, pull requests, and wikis to manage your work efficiently.

Please note that this is a project that has been started in previous quarters.

I will evaluate your contribution to the class reader on a weekly basis (Week 1 to 10) using the following criteria.

Criteria | Metrics | Max. Points
Quantitative Activity | Commits, Additions, Deletions, Issue Handling, Wiki Contributions | 0.5
Continuous Integration | Management of the publication cycle | 0.5
Qualitative Activity | Quality of Content, Arguments, and Reflection as reported in the GitHub comments | 2

I will grade you based on the results in the class reader GitHub repository on Sunday, 11:59 PM each week.

Self-Study Project

We will have no class sessions in the second week of the quarter. Instead, you will spend this week exploring the heterogeneous world of data visualization. The topic for the self-study project is climate change caused by human activity.

Your objective is twofold:

  1. Collect five distinctly different visualizations of climate change (at least one visualization should argue AGAINST the claim that climate change is caused by human activity). Review, compare, and contrast the five visualizations. To do so, first develop a literature-based evaluation framework and use it to evaluate your five visualizations. Second, write an overall assessment and conclusion.
  2. Replicate the following visualization of climate change to the best of your abilities: Warming stripes. The data is linked in the article. This visualization must not be part of the five visualizations that you collected above.
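
To make the idea behind the warming stripes concrete, here is a minimal sketch of the core encoding: one vertical stripe per year, colored from blue (cooler) to red (warmer). It uses only the Python standard library and writes SVG markup. The function names, the simple blue-white-red scale, and the placeholder anomaly values are my assumptions for illustration; they are not the original article's palette or data, and in a notebook you would more likely use a plotting library.

```python
def anomaly_to_hex(anomaly: float, max_abs: float) -> str:
    """Map a temperature anomaly to a blue-white-red hex color."""
    # Clamp to [-1, 1] so that outliers cannot leave the color scale.
    t = max(-1.0, min(1.0, anomaly / max_abs))
    fade = round(255 * (1 - abs(t)))  # 255 at zero anomaly, 0 at the extremes
    if t >= 0:
        r, g, b = 255, fade, fade  # warm years shade toward red
    else:
        r, g, b = fade, fade, 255  # cool years shade toward blue
    return f"#{r:02x}{g:02x}{b:02x}"


def warming_stripes_svg(anomalies, stripe_width=4, height=100) -> str:
    """Return an SVG document with one colored stripe per anomaly value."""
    # Fall back to 1.0 if every anomaly is zero to avoid dividing by zero.
    max_abs = max(abs(a) for a in anomalies) or 1.0
    rects = [
        f'<rect x="{i * stripe_width}" y="0" width="{stripe_width}" '
        f'height="{height}" fill="{anomaly_to_hex(a, max_abs)}"/>'
        for i, a in enumerate(anomalies)
    ]
    width = stripe_width * len(anomalies)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        + "".join(rects)
        + "</svg>"
    )


# Placeholder values for illustration only; use the data linked in the article.
placeholder_anomalies = [-0.4, -0.2, 0.0, 0.3, 0.6]
svg_markup = warming_stripes_svg(placeholder_anomalies)
```

Saving `svg_markup` to a file with an `.svg` extension lets you open the result in a browser. A faithful replication still requires the real data and careful choices about palette and baseline period.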

Your submission consists of a GitHub project that contains all material (visualizations, text, etc.). Make sure that you use appropriate means for referencing material that you have used for your project (see the academic integrity policies below).

I will evaluate your self-study project based on the following criteria.

Criteria | Metrics | Max. Points
Content | Understandability (1), Completeness (1) | 2
Evaluation Framework | Structure (1), References (1) | 2
Persuasiveness | Clarity (1), Argumentation (1) | 2
Visualization | Effort (1), Replication (1) | 2
Style | Professionality (1), Originality (1) | 2
Total | | 10

If you have any questions about the self-study project, use the Slack workspace. The self-study project is due on Sunday, April 14, 2019, 11:59 PM.

Individual Project

You pursue two objectives with the individual project:

  1. Transform an unknown dataset into visualizations. This activity will allow you to learn how to explore data and iteratively develop effective visualizations.
  2. Get comfortable using Tableau. Tableau is an important tool for data visualization. Completing the individual project will provide you with the skills to comfortably use Tableau.

The topic for the individual project is the City of Chicago’s Automated Speed Enforcement Program. You will find the data here: https://data.cityofchicago.org/Transportation/Speed-Camera-Violations/hhkd-xvj4.

The following table provides an overview of the deliverables for the individual project.

Project Phase | Due | Max. Points
Data Exploration | April 21, 2019 (11:59 PM) | 10
First Version | April 28, 2019 (11:59 PM) | 10
Revised Version | May 5, 2019 (11:59 PM) | 10
Total | | 30

PLEASE NOTE: It is vital for you to start early and discuss intermediate results with me. I will not accept late submissions without prior notice or a doctor’s note. I am aware that sometimes life goes crazy, but please notify me in advance, and we will work it out.


Data Exploration

During data exploration, you should get familiar with the dataset. Your objective is to develop five distinct visualizations using Tableau that provide an effective overview of the data. Think about the following questions:

  • What could be the core message of the data?
  • What are the important descriptors of the data?
  • What are the important changes over time?
  • Who could be interested in the data?

This list of questions is not complete and its sole purpose is to get you thinking.

You will submit a link to a Tableau Public Project.

I will evaluate your data exploration based on the following criteria.

Criteria | Metrics | Max. Points
Content | Understandability (1), Completeness (5), Distinctiveness (1) | 7
Persuasiveness | Clarity (1), Argumentation (1) | 2
Style | Originality (1) | 1
Total | | 10

First Version

The first version of your individual project documents your first attempt at a dashboard. The first version should:

  • Visualize three aspects of the data in an interesting, non-trivial, and somewhat unexpected fashion for the mayor of Chicago.
  • Document the “Making-of” (details of your development process, data wrangling steps, your reasoning, detours, literature, etc.).
  • Include a roadmap of future features and enhancements.

You will submit a GitHub repository that includes links to a Tableau Public Project.

I will evaluate the first version based on the following criteria:

Criteria | Metrics | Max. Points
Finding 1 | Persuasiveness (1), Content (1) | 2
Finding 2 | Persuasiveness (1), Content (1) | 2
Finding 3 | Persuasiveness (1), Content (1) | 2
Dashboard | Structure (1), Argumentation (1) | 2
Style | Originality (1), Professionality (1) | 2
Total | | 10

Revised Version

The revised version of your individual project documents your individual mastery of the course. It should substantially improve on the first version, following the roadmap you developed for the first version, and should:

  • Include the final version of your data product.
  • Include documentation of your data product.
  • Make use of advanced and interactive features of Tableau.

You will submit a GitHub repository that includes links to a Tableau Public Project.

I will evaluate the result of this phase based on the following criteria:

Criteria | Metrics | Max. Points
Improvement | Persuasiveness (1), Evidence (1), Structure (1) | 3
Audience | Argumentation (1), Fit (1) | 2
Dashboard | Structure (1), Interactivity (1) | 2
Integration | Originality (1), Effort (1), Professionality (1) | 3
Total | | 10

Team Project

The objective of the team project is to collaboratively develop a data product that is repeatable, inspectable, reusable, and diffable (i.e., you can see changes over time). A data product makes a complex data-driven argument using several data visualizations. You will work in teams of up to five students.

The topic for the team project is Gun Violence in the United States. We use the following data product as the inspiration: https://www.vox.com/policy-and-politics/2017/10/2/16399418/us-gun-violence-statistics-maps-charts. Please note that this just serves as an inspiration and starting point. You are expected to go beyond this example.

The challenge of a team project is to organize your team, hold one another accountable, and complement your skills and interests. At the end of the team project, your teammates will evaluate your contributions to the project. This evaluation may influence your grade for the team project.

The following table provides an overview of the deliverables for the team project.

Project Phase | Due | Max. Points
Exploratory Data Analysis | May 18, 2019 (11:59 PM) | 10
First Version | May 26, 2019 (11:59 PM) | 10
Revised Version | June 9, 2019 (11:59 PM) | 10
Total | | 30

PLEASE NOTE: It is vital for you to start early and discuss intermediate results with me. I will not accept late submissions without prior notice or a doctor’s note. I am aware that sometimes life goes crazy, but please notify me in advance, and we will work it out.


Exploratory Data Analysis

During the exploratory data analysis, your objective is twofold. First, you should collect, clean, and integrate data. Second, you should establish a thorough understanding of the content and the limitations of your data.

The exploratory data analysis must:

  • be completely reproducible.
  • be documented.
  • be free of errors and warnings.
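
One possible way to work toward these requirements in a Python notebook is sketched below: fix the random seed so that repeated runs give the same result, and turn warnings into errors so that they cannot go unnoticed. The helper `run_reproducibly` and the example function `sample_analysis` are hypothetical names for illustration. Note that `random.seed` only affects Python's built-in `random` module; other libraries have their own seeding mechanisms.

```python
import random
import warnings


def run_reproducibly(analysis, seed=42):
    """Run `analysis` with a fixed seed and with warnings raised as errors."""
    # Seeds only Python's built-in random module (assumption: the analysis
    # draws its random numbers from it).
    random.seed(seed)
    with warnings.catch_warnings():
        # Inside this block, any warning is raised as an exception.
        warnings.simplefilter("error")
        return analysis()


def sample_analysis():
    """Hypothetical stand-in for one step of an exploratory analysis."""
    return [random.randint(0, 9) for _ in range(3)]


first_run = run_reproducibly(sample_analysis)
second_run = run_reproducibly(sample_analysis)
```

Because both runs start from the same seed, `first_run` and `second_run` contain the same values. A step that emits a warning would stop with an exception instead of passing silently.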

I will evaluate your exploratory data analysis based on the following criteria.

Criteria | Metrics | Max. Points
Data description | Understandability (1), Completeness (1) | 2
Data coverage | Volume (1), Creativity (1), Quality Assessment (1) | 3
Data preparation & use | Clarity (1), Explanations (1), Integration (1) | 3
Style | Professionality (1), Effort (1) | 2
Total | | 10

First Version

In this phase, you will develop the first version of your data product. You should achieve the following:

  • Develop a narrative that connects at least three interesting, non-trivial, and somewhat unexpected aspects of the topic.
  • Document the “Making-of” (Details of your development process, data wrangling steps, your reasoning, detours, literature, etc.)
  • Include a roadmap of future features and enhancements.

I will evaluate the first version based on the following criteria:

Criteria | Metrics | Max. Points
Narrative | Evidence (1), Coverage (1) | 2
Finding 1 | Persuasiveness (1), Content (1) | 2
Finding 2 | Persuasiveness (1), Content (1) | 2
Finding 3 | Persuasiveness (1), Content (1) | 2
Style | Creativity (1), Professionality (1) | 2
Total | | 10

Revised Version

The final data product must be online by the deadline. The final data product should consist of two items:

  • One reproducible and self-contained deliverable (a Jupyter notebook, an R notebook, an Observable notebook, a webpage, etc.).
  • A short video (< 90 seconds) that summarizes the key points of your data product.

I will evaluate the revised version based on the following criteria:

Criteria | Metrics | Max. Points
Progress | Improvements in Data Analysis (1), Improvements in Visualizations (1), Improvements in Narrative (1) | 3
Video | Content (1), Effectiveness (1), Originality (1) | 3
Professionality | Style (1), Structure (1), Polishing (1), Originality (1) | 4
Total | | 10

How to get an A in this course

I firmly believe that the mastery of data visualization requires constant practice. You will ace this course if you:

  • Adhere to the academic integrity standards outlined below.
  • Participate in the class discussions, ask questions, and share experiences.
  • Support your teammates.
  • Show intermediate results early and often.
  • Start early on the assignments and seek continuous feedback from me and other sources.
  • Continuously think about why you are doing something in your assignments. This is far more important than what you are doing.
  • Answer the ‘boss question’ before submitting any deliverable: Would you be comfortable sending your submission as is to your boss or a recruiter? If your answer is yes, please submit. If your answer is no, revise before you submit.

Academic Integrity

The Academic Integrity pledge is an expression of the University’s commitment to fostering an understanding of and commitment to a culture of integrity at Santa Clara University. The Academic Integrity pledge, which applies to all students, states:

“I am committed to being a person of integrity. I pledge, as a member of the Santa Clara University community, to abide by and uphold the standards of academic integrity contained in the Student Conduct Code.”

You are expected to uphold the principles of this pledge for all work in this class. For more information about Santa Clara University’s academic integrity pledge and resources about ensuring academic integrity in your work, see www.scu.edu/academic-integrity.

In particular, I expect that you give credit to any material (including but not limited to journal articles, web articles, blog posts, images, data sets, and any media) that you have used for completing any assignment in this class. Being able to give credit by referencing sources consistently and correctly is evidence of mastery of a topic. It shows that you can construct original arguments that are backed with verifiable evidence. Failing to give credit is a sign of inadequate learning progress. It shows that you have not understood the topic well enough to formulate your own arguments in relation to already existing ideas.

During your work in this class, you will use, modify, or extend digital content that you have found online. You will also use libraries, APIs, code snippets, and data sets that have been created by others. In every piece of work (presentations, assignments, etc.), you must acknowledge work, source code, data sets, and any other content that was not produced by you. Acknowledgments must be easily identifiable, inseparable from your content, and must not violate licenses.

Failure to provide appropriate acknowledgments will result in an F grade for that assignment. Repeated failure to provide appropriate acknowledgments will result in an F grade for the entire course.

During the first class, we will discuss this digital content policy. After this class, I will strictly enforce this policy. If you have doubts, contact me.

Course Conduct

My responsibility

I will support you in your learning in this class and beyond to the best of my abilities. If I am not able to help you myself, I will identify someone who can. I will evaluate your contribution solely based on the standards set by this syllabus. Changes to the syllabus will be highlighted, discussed during class sessions, and will be published on Camino.

Your responsibility

By enrolling in this class, you agree to the requirements stated in this syllabus. You will operate with integrity in your dealings with me and your fellow students. You will engage the learning materials with appropriate attention and dedication and maintain your engagement when challenged by difficult learning activities. You will contribute to the learning of others, and you will perform to the standards set by this syllabus.

Mutual respect is the foundation of this course. No one will be criticized for being wrong. Appropriate conduct includes honesty, self-respect, respect for others, and compliance with university policies and standards. Computers in the classroom should be used only for completing course-related work and for taking notes; cell phones must be turned off or muted.

Attendance Policy

Please let me know via email during the first two weeks of the course if you have any conflicts between a course element (class meeting, assignment) and another vital commitment (another course, work, university-related extracurricular activities, religious commitments). At my discretion, I will provide you with alternative means to complete the course element.

I am aware that many of you have multiple commitments. You should attend at least 80 percent of all scheduled class meetings. If you miss more than 20 percent of scheduled classes, you will receive a reduction by one letter grade.

University Policies

Disability Resources

If you have a disability for which accommodations may be required in this class, please contact Disabilities Resources (Benson Hall 216, 408-554-4109) as soon as possible to discuss your needs and register for accommodations with the University. If you have medical needs related to pregnancy, you may also be eligible for accommodations. If you have already arranged accommodations through Disabilities Resources, please discuss them with me during my office hours as soon as possible.

While I am happy to assist you, I am unable to provide accommodations until I have received verification from Disabilities Resources. If you are in doubt of whether you are eligible for accommodations, I encourage you to contact Disabilities Resources (Benson Hall 216, 408-554-4109). The Disabilities Resources office would be grateful for an advance notice of at least two weeks.

Accommodations for Pregnancy and Parenting

In alignment with Title IX of the Education Amendments of 1972, and with the California Education Code, Section 66281.7, Santa Clara University provides reasonable accommodations to students who are pregnant, have recently experienced childbirth, and/or have medical needs related to childbirth. Pregnant and parenting students can often arrange accommodations by working directly with their instructors, supervisors, or departments. Alternatively, a pregnant or parenting student experiencing related medical conditions may request accommodations through Disabilities Resources (Benson Hall 216, 408-554-4109).

Discrimination and Sexual Misconduct (Title IX)

Santa Clara University upholds a zero-tolerance policy for discrimination, harassment, and sexual misconduct. If you (or someone you know) have experienced discrimination or harassment, including sexual assault, domestic/dating violence, or stalking, I encourage you to tell someone promptly. For more information, please consult the University’s Gender-Based Discrimination and Sexual Misconduct Policy at http://bit.ly/2ce1hBb or contact the University’s EEO and Title IX Coordinator, Belinda Guthrie, at 408-554-3043. Reports may be submitted online through https://www.scu.edu/osl/report/ or anonymously through Ethicspoint at https://www.scu.edu/hr/quick-links/ethicspoint/

Acknowledgment

This syllabus was inspired by Aleszu Bajak’s syllabus, Jeffrey Shaffer’s data visualization with Tableau course, and earlier versions of CS171 at Harvard.


  1. Loosely based on Naisbitt, J. (1982): Megatrends, Warner Books.