Project 3: TV Time

Description and Motivation:

The purpose of this visualization is to let users analyze the TV show "The Good Place". We built the application to answer questions about its characters and episodes visually, focusing on dialog and relationships.

Our visualization answers:

How often does each character speak?
How much do they say?
What do characters tend to talk about?
How often do they share a scene with other characters?
Do these patterns change between seasons or within a season?

Video Demo:

Data:

The data came from a GitHub repository originally used to analyze "The Good Place" for the Medium article "Text Analysis of The Good Place", published on October 19, 2020. The data we used from this source was already preprocessed into CSV format with the columns episode, character, and line. It covers the transcripts of all 4 seasons.

Data Source: https://github.com/daratanxe/good-place

Since the data came in this format, only light preprocessing was required. Upon researching the series, we found how many episodes were in each season and used this information to add a season field to the data.
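The season assignment could be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: it assumes each row's episode value is a 1-based running episode number across the whole series, and `seasonForEpisode`, `addSeasonField`, and `episodesPerSeason` are hypothetical names. The counts used in the usage note are placeholders, not the show's real episode counts.

```javascript
// Sketch only: assumes `episode` is a 1-based running number across the series.
// `episodesPerSeason` is a hypothetical input, e.g. [countS1, countS2, ...].
function seasonForEpisode(episode, episodesPerSeason) {
  if (!(episode >= 1)) return null; // rejects 0, negatives, and NaN
  let lastEpisodeOfSeason = 0;
  for (let i = 0; i < episodesPerSeason.length; i++) {
    lastEpisodeOfSeason += episodesPerSeason[i];
    if (episode <= lastEpisodeOfSeason) return i + 1; // seasons are 1-based
  }
  return null; // episode number beyond the known seasons
}

// Returns new row objects with a `season` field added; the input is not mutated.
function addSeasonField(rows, episodesPerSeason) {
  return rows.map(row => ({
    ...row,
    season: seasonForEpisode(Number(row.episode), episodesPerSeason),
  }));
}
```

With placeholder counts, `addSeasonField(rows, [2, 3])` would tag episodes 1-2 as season 1 and episodes 3-5 as season 2. If the CSV encodes episodes differently (for example as a season/episode string), the lookup would need to be adapted.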

The force-directed graph needed the data in a different format, so once we had categorized the characters, we converted the CSV file contents into that format. We did this by reading the script.csv file dynamically and building an object that contains the appropriate Nodes and Links for whichever filter is applied. An example of how we structured the data can be found below. (Note: the JSON file below was not used in the final product; it is only an example of how we constructed the data format.)

JSON format: https://github.com/keerthi-sekar/CS5124-Project3/blob/main/Project3-TV-Time/data/thegoodplace.json

The file contains 2 arrays, Nodes and Links, which are used to lay out the graph in the UI.
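As a rough sketch of that structure, the function below assembles a nodes array and a links array from a list of character names and a list of pair counts. The function name `buildGraph`, the input shape `{ a, b, count }`, and the field names `id` and `value` are assumptions made for illustration; `source` and `target` follow the convention d3-force uses for links.

```javascript
// Sketch only: `characters` is an array of names and `pairCounts` is an array
// of { a, b, count } objects (a hypothetical intermediate format).
function buildGraph(characters, pairCounts) {
  const known = new Set(characters);
  const nodes = characters.map(name => ({ id: name }));
  const links = pairCounts
    // Keep only pairs that actually interact and whose ends are both nodes.
    .filter(p => p.count > 0 && known.has(p.a) && known.has(p.b))
    .map(p => ({ source: p.a, target: p.b, value: p.count }));
  return { nodes, links };
}
```

An object of this shape can be handed to a D3 force simulation, with `d3.forceLink(links).id(d => d.id)` resolving the `source` and `target` names to node objects. Rebuilding the object from the filtered rows each time a filter changes is one way to keep the graph in sync with the rest of the dashboard.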

Visualization Components:

Overall Dashboard contains:

Bar charts for the line-count breakdown by character and by episode (C-Goals)
TV cast infographic connecting the actors' faces and names to the character names in the data (C-Goals)
Interactive word cloud identifying the most frequent words and phrases (B-Goals)
Force-directed graph – shows the relationships between characters based on scene count and interaction (A-Goals)
Search for a word or phrase (A-Goals)

Clicking a bar or item in any one chart updates ALL the visualizations. You can also filter at the top by season or phrase.

The character bar chart shows the number of lines each character has, across the whole series by default or narrowed by season or episode. Each bar is clickable to filter the dashboard further. The tooltip shows which character is highlighted and their line count. We show only the top 13 characters in this bar chart and throughout our visualization because of the nature of the show we are analyzing: as the chart below shows, there are really 6 main characters, and the rest are much less prominent. Beyond these 13 characters, the others in the show had a negligible number of lines, so we chose to exclude them.
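The line counts behind such a chart could be computed along these lines. This is a hedged sketch, not the project's code: `topCharacters` is a hypothetical helper, and it assumes each row is an object with a `character` field, one row per spoken line.

```javascript
// Sketch only: counts rows per character and keeps the n characters with the most lines.
function topCharacters(rows, n) {
  const counts = new Map();
  for (const row of rows) {
    counts.set(row.character, (counts.get(row.character) || 0) + 1);
  }
  return Array.from(counts, ([character, lines]) => ({ character, lines }))
    .sort((a, b) => b.lines - a.lines) // most lines first
    .slice(0, n);
}
```

Calling `topCharacters(rows, 13)` on the full script would give the 13 bars; calling it on rows already filtered by season or episode would give the narrowed view.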

The episode timeline bar chart covers every episode of Seasons 1-4. The dashboard filters apply to this chart, and its bars are intentionally color-coded by season. Each bar is selectable and can act as its own filter. The tooltip shows the line statistics for the individual episode.

The word cloud was built from our Project 2 word cloud code. Light, bright colors mark the less frequent phrases, while dark colors mark the most frequent. Word size works together with the color scale to convey the proportions. English prepositions and filler words were filtered out.

The tooltips show how many times each word occurs under the current filters. You can also click a word to filter the dashboard by that word.
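A minimal sketch of the counting step behind a word cloud like this one is shown below. The stop-word list is a short hypothetical stand-in for the prepositions and filler words that were removed, `wordFrequencies` is an illustrative name, and the tokenizer is deliberately simple (ASCII letters and straight apostrophes only).

```javascript
// Hypothetical, abbreviated stop-word list; the real list would be much longer.
const STOP_WORDS = new Set([
  'a', 'an', 'the', 'and', 'or', 'of', 'to', 'in', 'on', 'at',
  'for', 'with', 'is', 'it', 'i', 'you', 'um', 'uh',
]);

// Sketch only: returns [{ text, count }] sorted from most to least frequent.
function wordFrequencies(rows, stopWords = STOP_WORDS) {
  const counts = new Map();
  for (const row of rows) {
    const words = row.line.toLowerCase().match(/[a-z']+/g) || [];
    for (const word of words) {
      if (stopWords.has(word)) continue; // skip prepositions and filler words
      counts.set(word, (counts.get(word) || 0) + 1);
    }
  }
  return Array.from(counts, ([text, count]) => ({ text, count }))
    .sort((a, b) => b.count - a.count);
}
```

The resulting counts can then be mapped to font sizes and colors through a scale before being passed to the layout. Running the function on the filtered rows is what would make the tooltip counts depend on the active filter.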

The force-directed graph shows how the characters interacted with each other, based on shared scenes. You can drag the nodes to see how the others repel, attract, and rearrange, as seen in these pictures. The node tooltip shows which character you are dragging, and the tooltip on a connecting line shows the number of scenes shared by the 2 connected characters. Since our data did not include scenes, we estimated who shared scenes from how often characters said each other's names.
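The name-mention heuristic could look something like the sketch below. It is an assumption-laden illustration, not the project's implementation: `countNameMentions` is a hypothetical helper, it assumes the `character` field holds the same name other characters say aloud, and it uses plain case-insensitive substring matching, which can over-count (for example a name that appears inside a longer word).

```javascript
// Sketch only: for each line spoken by a tracked character, count one
// interaction for every other tracked character whose name appears in it.
// Pairs are undirected, so "A mentions B" and "B mentions A" share a key.
function countNameMentions(rows, characters) {
  const counts = new Map();
  for (const row of rows) {
    if (!characters.includes(row.character)) continue;
    const text = row.line.toLowerCase();
    for (const other of characters) {
      if (other === row.character) continue;
      if (!text.includes(other.toLowerCase())) continue;
      const key = [row.character, other].sort().join('|');
      counts.set(key, (counts.get(key) || 0) + 1);
    }
  }
  return counts; // Map of "NameA|NameB" -> number of mentioning lines
}
```

Each entry of the returned map corresponds to one link in the graph, with the count available for the link tooltip.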

Beyond the filter and interaction options on each chart, you can also filter by season and phrase at the top of the dashboard. An example is shown below.
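In code, the season and phrase filters could be expressed roughly like this. The sketch assumes rows carry a numeric `season` field and a `line` string; `filterRows` and its option names are hypothetical.

```javascript
// Sketch only: keeps rows matching the selected season (if any) and
// containing the search phrase (if any), case-insensitively.
function filterRows(rows, { season = null, phrase = '' } = {}) {
  const needle = phrase.trim().toLowerCase();
  return rows.filter(row =>
    (season === null || row.season === season) &&
    (needle === '' || row.line.toLowerCase().includes(needle))
  );
}
```

For example, `filterRows(rows, { season: 3, phrase: 'what the fork' })` would return only the Season 3 lines containing that phrase. One way to get linked views is to recompute every chart's data from this single filtered array and redraw whenever a filter changes.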

Design Sketches:

We chose the word cloud option for the B-goals because we had built a word cloud for Project 2, so we had experience and code we could reuse.

Discoveries and Findings:

Analyze a character’s personality and interactions in the show, for example how Michael interacts with the main cast.

Michael interacts with most of the characters at some point.

Jason Mendoza doesn’t have any lines in the first 2 episodes of Season 1. This suggests that he became a main character gradually rather than being one from the start.

“Love” was said 66 times in Season 3. Chidi, Eleanor, Tahani, and Michael were connected through conversations about love, as were Janet and Jason, and, surprisingly, Doug and Shawn.

The phrase “what the fork” is common throughout the series because the “Good Place” has a cursing filter. When we filter by that phrase, we see that it wasn’t spoken at all in Season 3, which suggests that the characters were not in the “Good Place” during Season 3.

Process:

Libraries & Resources Used: HTML, CSS, JavaScript, D3 (v6), d3-cloud (https://www.npmjs.com/package/d3-cloud)

Deploy Code: Code can be deployed using localhost (instructions below)

How to run code:

Option 1:

Go to the GitHub source code link (top of webpage)
Clone the repo or download and unzip
In a terminal, cd into the downloaded/cloned directory
In the same terminal: run `python -m http.server 8000` (for Python 3)
- Python 2: run `python -m SimpleHTTPServer 8000`
- Must download Python if not already installed

Navigate to http://localhost:8000 in a browser

Option 2:

Go to the release page link (top of webpage)
Click on “Source Code” (.zip or .tar based on your machine)
Extract the zip/tar file
In a terminal, cd into the extracted directory
In the same terminal: run `python -m http.server 8000` (for Python 3)
- Python 2: run `python -m SimpleHTTPServer 8000`
- Must download Python if not already installed
Navigate to http://localhost:8000 in a browser

Option 3:

Instead of using Python, you can also use the Live Server extension for Visual Studio Code.

Install Live Server in VS Code
Open the project's index.html
Click “Go Live” and it will automatically take you to a browser with localhost connected.

Division of Work:

Keerthi: Basic UI Layout charts, Wordcloud (visual and processing data), Force Directed Diagram

Anna: Data research/processing, Filtering, Linking, Tooltips, Click Interaction, and the search features