Assignment 5

Building networks is a complicated process that requires analytical thinking and an understanding of multidimensional data. Programs like Gephi make it possible to parse the complex relationships within that data. I worked with Gephi briefly during my foundation seminar freshman year, but not nearly as in depth as we are now. Initially, working with Gephi in this class was intimidating, as there was so much to learn and it doesn’t seem as user-friendly as the previous platforms we have worked with. However, after spending a few days playing with the program and finding new things, I’ve found that Gephi’s many features allow for a powerful visualization. I decided to build my own dataset on the demographics of the Bucknell Women’s Cross Country team, with the intent of analyzing the majors studied by the women of all four class years and finding any trends within the data. I began by sending out a survey to collect the information and then created the CSV file containing the data. At first glance, I noticed that biochemistry was a frequent major. This was exciting for me, as I was interested in how Gephi would display this common theme across team members.

Because Gephi only maps relationships between nodes of the same type, portraying what I want viewers to understand has been particularly difficult for me. When discussing the purpose of nodes and edges, Graham says, “Everything about a network pivots on these two building blocks” (Graham 202). In networks, nodes can be further defined by attributes rather than seen as just dots. In my network, each node represents a team member, and the edges are the undirected relationships between them. Because the connections I chose to make between people (major) have no beginning or end, the relationships can be considered “rhizomatic relationships” (Lima 44). Rhizomes acknowledge multiplicity in data. They “connect any point to another in a way that allows for a flexible network to emerge” (Lima 44). Gephi was able to create a flexible network that I could parse further to produce the relationships I portrayed. I filtered by degree and got an average degree of 28.8, which is the average number of edges adjacent to each node.
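
To make the average degree figure concrete, here is a minimal Python sketch (it is not Gephi's own code). It assumes the network is stored as a list of undirected node pairs, and the four-person roster is hypothetical. The team size of 30 comes from the class-year counts given later in this post (7 + 10 + 8 + 5), and the edge count it derives is only what the reported 28.8 would imply if that figure is exact.

```python
from itertools import combinations

def average_degree(nodes, edges):
    """Average degree of an undirected graph.

    Each edge touches two nodes, so the degrees sum to 2 * len(edges).
    """
    if not nodes:
        return 0.0
    return 2 * len(edges) / len(nodes)

# Hypothetical roster in which every pair of members is connected.
roster = ["A", "B", "C", "D"]
complete_edges = list(combinations(roster, 2))
print(average_degree(roster, complete_edges))  # 3.0

# Working backwards from the figures in this post (an inference, not survey data).
team_size = 7 + 10 + 8 + 5                       # freshmen + sophomores + juniors + seniors
implied_edges = round(28.8 * team_size / 2)      # edges implied by an average degree of 28.8
max_edges = team_size * (team_size - 1) // 2     # edges if every pair were connected
print(team_size, implied_edges, max_edges)
```

Under these assumptions the network is only a few edges short of fully connected, which fits the observation that every team member is linked to nearly every other.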

There is no modularity in my data set, as all team members were already connected by being part of the same team. In a way, the team is its own “small world.” Each member (node) is connected to the others through an edge, simply because they are a part of the same “world.” When choosing which layout to apply, I began with the circular layout because it is simple and easy to understand visually. Graham states, “It is easy to become hypnotized by the complexity of a network, to succumb to the desire of connecting everything and, in doing so, learning nothing” (Graham 201). Therefore, I knew that I wanted my network visualization to be as simple as possible. Nodes are colored based on major. The edges occupying the rest of the visualization are the connections between each member of the team. I then rearranged the original layout by dragging nodes of the same major next to each other, which makes the frequency of each major easier to see. My network formed the following:

overview mode
preview mode

It is clear from this overview that there is a dominant major among team members, which happens to be biochemistry. I can’t infer any reasoning behind this from my data set, but I thought it was an interesting trend that I wanted to look at more closely. I have also learned that because the edges between nodes have no arrows, the relationships are undirected: no relationship points from one member to another, so each is equally significant.

To add depth to a network, Gephi offers metrics and filters that provide more insight into the overall structure. I chose to first partition by major, which allowed me to visualize how many people belong to each major category. Those who shared the same major had more connections to each other than to those of different majors. Because I chose to look into the frequency of the biochemistry major throughout the team, below is the visualization displaying the nodes directly linked to each other through the biochemistry “edge.” Filtering my data through partitioning by major allowed me to create the following:
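
As a rough illustration of what partitioning and filtering by major amount to, here is a small Python sketch. It is not how Gephi implements its partition or filter features. The names, the majors assigned to them, and the `same_major_edges` helper are all hypothetical, and the real survey data is not reproduced here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical survey rows: (name, major).
members = [
    ("Ana", "Biochemistry"),
    ("Bea", "Biochemistry"),
    ("Cam", "Biology"),
    ("Dee", "Biochemistry"),
    ("Eve", "Economics"),
]

# Partition: how many members fall into each major category.
major_counts = Counter(major for _, major in members)

def same_major_edges(rows, major):
    """Return the undirected edges linking every pair of members who share `major`."""
    names = [name for name, m in rows if m == major]
    return list(combinations(names, 2))

# Filter: keep only the edges among biochemistry majors.
biochem_edges = same_major_edges(members, "Biochemistry")

print(major_counts.most_common(1))  # [('Biochemistry', 3)]
print(biochem_edges)                # [('Ana', 'Bea'), ('Ana', 'Dee'), ('Bea', 'Dee')]
```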

partition filter

I then decided to partition by class year to look at what majors were studied by girls of different years, in particular, biochemistry. Because the nodes are all different colors, I found that of the seven freshmen, no majors were shared:

Of the ten sophomores, biochemistry was a common major between four people, represented by the four pink nodes closely connected to each other.

Of the eight juniors, three of them are studying biochemistry.

Lastly, of the five seniors, only one studies biochemistry. However, two of them study biology, which is represented by the connected blue nodes.

As Lima said, network visualization allows for the portrayal of “intangible structures that are invisible and undetectable to the human eye” (Lima 80). By filtering the data in the way that I did, I was able to visualize something that may not have been so obvious before. Gephi allowed me to explore the cross country team’s academic side, something that I don’t consider very often. I liked discovering that biochemistry seems to be a subject of interest among my teammates, even though my data does not explain why.

Assignment 3

Lima’s Chapter Two discusses the transition from trees to networks in the world of data visualization. He describes network visualization as a “ubiquitous data sphere,” containing tangled networks of nodes and links that represent huge volumes of data while using optimal screen space. Networks are an omnipresent structure, presenting data as non-hierarchical and autonomous, almost as an art of multidimensional behaviors from which we can reveal new knowledge and patterns as well as abstract new meanings.

I used the sample data set in Palladio to analyze the people in the data set by place of death, according to the number of people who died at each location. I added a point-based map layer from the data, focusing on place of death. Through my visualizations, I found that more deaths occurred in Paris, Moraco, New York, and London than in any other locations in the set. I drew this conclusion from the larger node sizes at those locations, since I sized the points by number of people. The visualization allows the viewer to hover over each node, and a tooltip then indicates how many people from the data died in that location. Examples are included below.
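
The counting behind point sizes like these can be sketched in a few lines of Python. This is only an illustration of sizing by number of people, not Palladio's implementation. The records and the `place_of_death` field name are hypothetical, and the counts are made up rather than taken from the sample data set.

```python
from collections import Counter

# Hypothetical records standing in for rows of the sample data set.
people = [
    {"name": "Person 1", "place_of_death": "Paris"},
    {"name": "Person 2", "place_of_death": "Paris"},
    {"name": "Person 3", "place_of_death": "Paris"},
    {"name": "Person 4", "place_of_death": "London"},
    {"name": "Person 5", "place_of_death": "London"},
    {"name": "Person 6", "place_of_death": "New York"},
]

# One count per place; a map layer can scale each point by this number.
deaths_by_place = Counter(person["place_of_death"] for person in people)

# Largest points first, with the value a tooltip would report.
for place, count in deaths_by_place.most_common():
    print(f"{place}: {count}")
```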

Palladio Map: Place of Death

Furthermore, space and time are two significant concepts in the digital humanities. Meirelles writes that our concepts and corresponding visuals are organized around the difference between linear and cyclical times. Linear time can be visualized with timelines, which typically represent historical time. Timelines contain chronological and sequential narratives of historical events, using space to communicate temporal distance through intervals. Mapping time and events in this uniform way enables viewers to easily compare time intervals. In the digital humanities, digital timelines enable navigation through time by sliding back and forth along the structure. The inclusion of historical context and the ability to filter data by certain thresholds make digital timelines an effective, sometimes detailed method for representing events over time.

I then used Palladio’s timeline tool to represent dates of death according to the number of people, grouped by place of death. This created an easy-to-read, interactive visualization that let me see the number of deaths in each place during certain time periods according to the data set. Taller bars correspond to a greater number of deaths, and hovering over a bar highlights it and indicates which place it represents. Below, I have inserted timeline screenshots focusing on London, New York, and Paris to demonstrate this.

Palladio: London Timeline
Palladio: New York Timeline
Palladio: Paris Timeline

Timeline JS

Timeline JS is an open-source tool that enables anyone to build visually rich, interactive timelines. Creating a timeline with this program can be as simple as using a Google spreadsheet. After doing some research, I decided to use Timeline JS to home in on three significant events that occurred in London, New York, and Paris and that may have caused an increased death count and node size at certain times. The timeline is embedded above so that viewers can interact with it, gain an understanding of the events at hand, and consider possible reasons for the death rates in certain places at the indicated times in the data set.

Johanna Drucker

The means by which a graphic produces meaning is known as graphical expression. Drucker (2016) emphasizes that knowing how to read visualizations as graphical expression is crucial. One particular form of graphical information is the columnar form of the spreadsheet. Discrete boxes and the grouping of data into columns allow for a meaningful result. For this assignment we used data organized by cells, rows, and columns in a spreadsheet to generate visualizations from which we can document a system of relations to create meaning.

Additionally, Drucker argues that “almost all information visualizations are reifications or mis-information.” I agree with her claim, as I believe that visualizations are representations that seem like presentations. Viewers are often presented with a situation that is further removed from the original work. Therefore, I feel the visualizations that I created with Palladio are representations, rather than knowledge generators, of the data sample that I chose to use. They are generated using only the specific data in the provided sample, with no outside knowledge. However, the viewer can draw conclusions, similar to the ones in this post. In that case, viewers can generate their own knowledge from the visualization, similar to how I used destructive historical events to justify higher death rates in London, New York, and Paris.

Assignment 2

Slave Location

Voyant: Slave Narratives
Tableau: Slave Population in U.S.

I used Voyant to analyze the seven slave narratives, which name locations in the U.S., by creating a dreamscape visualization that resembles a map. The map focuses on the states mentioned throughout the narratives, with circles representing each state. The bigger the circle, the more frequently the state is mentioned in the narratives. Voyant can also visualize links between states that are mentioned together in the texts. The visualization clearly shows the circles clustered on the East Coast and in the general Northeast area, because the authors mention these areas in their narratives. Bigger, darker circles represent areas mentioned more often in the texts. Links between states that an author mentioned together are represented by the arrows on the map. I used Tableau to create the second visualization. The program used data from the US Slavery 1860 database to display the number of slaves in the United States on a map. A higher population of slaves is represented by darker hues of orange, as shown in the legend that I included. Although Voyant and Tableau visualize two different kinds of data, textual and quantitative, both programs seemed to display the same takeaway from the data. I thought it was interesting to notice the similarity between the two maps, in the sense that the slave narratives and the slavery database focused on similar areas of the country.

Creating two different visualizations with different programs and data to portray a central idea takes some thought. I knew to use Voyant for the slave narratives because it is a tool for textual analysis, allowing me to sort through the words in all seven texts. At the same time, it was able to pick out which words were states, helping to create the map visualization. Tableau, being more of a tool for quantitative data, allowed for the creation of visualizations from databases. The program darkens the color of shaded states based on density. The greater the number of records accounted for in the data, the darker the shade of the state on the map.

However, I think that the different abilities of Voyant and Tableau did not work well together to present the data I wanted, namely the location and population of slaves in the United States during that specific period. Tableau created a better representation of the slave population in the United States, spanning all the way from the Northeast to the South, and even over to the West. Anyone with a general idea of slavery in the U.S. would look at the visualization with an understanding of why certain states are shaded the way they are. On the other hand, Voyant’s visualization of states mentioned in the slave narratives is a misrepresentation of the slave population. The map’s nodes mainly focus on the Northeastern states, with few connections to the Southern states, which were a prominent part of slavery during the time period. This could be due to a focus on northern states throughout the slave narratives, but I can’t determine the reason for that focus. Therefore, I think that using the slave narratives to create visualizations of the slave population in the States is misleading and inaccurate. Rather, using the US Slavery 1860 database gives a better representation of slavery overall during this time period in history.


Children

Voyant: Line Graph
Tableau: Bar Graph (Child’s age and gender)
tooltip: boy
tooltip: girl

Once again, I used Voyant, this time to portray the correlation between three terms throughout the seven slave narratives: child, young, and sold. One could assume that “sold” refers to being sold as a slave, so I thought it would be interesting to take notice of how often young children are related to the word “sold” throughout the texts. From the line graph I created with Voyant, an association can be made between the three terms, as their lines tend to follow the same trend throughout the narratives. Although it is sad to think about, the strong correlation displayed by the visualization makes it clear that children being sold into slavery was a prominent topic of discussion throughout the texts.

Using age and sexage data from the African Names Database, I then used Tableau to create a bar graph that displays the average age of children in slavery. The bar graph focuses on children by only pulling data from the “boy” and “girl” sexages. I utilized the tooltip again by adding the number of records in order to show just how many young children were a part of slavery during the time. This visualization furthers my point regarding young children being sold into slavery. If the average age of children in slavery was around nine or ten years old, there were definitely much younger children involved. Data visualization tools like Voyant and Tableau allow for the analysis of data in ways that people may not have considered before. For example, you may have never known just how young slaves were if it were not for Tableau!
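
The aggregation behind a bar graph like this one can be sketched in Python. This is only a sketch of the filter-then-average idea, not Tableau's implementation and not the real database: the records are made up, the `sexage` and `age` field names are assumed from the description above, and `average_age_by_group` is a hypothetical helper.

```python
from collections import defaultdict

# Made-up records; the field names are assumptions based on the description above.
records = [
    {"sexage": "Boy", "age": 10},
    {"sexage": "Boy", "age": 8},
    {"sexage": "Girl", "age": 9},
    {"sexage": "Girl", "age": 11},
    {"sexage": "Man", "age": 25},
]

def average_age_by_group(rows, groups):
    """Return {group: (average age, number of records)} for the chosen groups only."""
    sums = defaultdict(int)
    counts = defaultdict(int)
    for row in rows:
        group = row["sexage"]
        if group in groups:  # filter to children, as the bar graph does
            sums[group] += row["age"]
            counts[group] += 1
    return {group: (sums[group] / counts[group], counts[group]) for group in counts}

children = average_age_by_group(records, {"Boy", "Girl"})
print(children)  # {'Boy': (9.0, 2), 'Girl': (10.0, 2)}
```

The second value in each pair plays the role of the number of records shown in the tooltip.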

Gender

Voyant: Bar Graph
Tableau: Pie Chart

I used Voyant to showcase how much more dominant men were than women in the narratives. The bar graph displays the relative frequencies of the words “man” and “woman” throughout the texts. It is very clear that, for the most part, men are discussed much more than women are. We can see that in Turner’s text, women were actually completely left out. Without Voyant’s easy-to-use program, it would be a lot harder to make this assumption about a set of texts without reading them. This tool allows for a general analysis of something in a simple, time-efficient way.
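
For readers curious about what a relative frequency is, here is a minimal Python sketch. It assumes relative frequency means a word's occurrences divided by the total number of words in a text; Voyant's own tokenization and scaling may differ. The two short texts are invented stand-ins, not excerpts from the narratives, and `relative_frequency` is a hypothetical helper.

```python
import re
from collections import Counter

def relative_frequency(text, term):
    """Occurrences of `term` divided by the total word count of `text`."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return Counter(words)[term] / len(words)

# Invented stand-in texts.
texts = {
    "Narrative A": "The man spoke to the woman, and the man left.",
    "Narrative B": "A man and another man walked north.",
}

for title, text in texts.items():
    man = relative_frequency(text, "man")
    woman = relative_frequency(text, "woman")
    print(f"{title}: man={man:.3f}, woman={woman:.3f}")
```

A text that never uses a word, like "Narrative B" with "woman" here, gets a relative frequency of zero for it, which is what a missing bar in the graph represents.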

Using Tableau and the African Names Database, I was able to create a pie chart portraying the genders and ages of slaves in the data set. I thought it was interesting to compare this chart to my Voyant visualization and, once again, see how insignificant females were in comparison to males at this point in time. Males (man and boy) are represented by green and blue colors, while females (woman and girl) are represented by red and pink colors. The chart shows that men are the most dominant group of slaves, followed by boys, women, and girls. The tooltip in Tableau allows the viewer to more clearly see the order of the most common gender and age groups.

After working with Voyant and Tableau, I now understand what each does and for what kind of data visualization they fit best. Voyant allows us to present qualitative data in an interactive, visually appealing way. Similarly, Tableau allows us to work with quantitative data sets to present information. Voyant is extremely useful for finding keywords in a corpus. Additionally, one can compare one work to another without having to read either piece, which can be very useful when considering long narratives. I found Tableau to be a little more complicated than Voyant, but after exploring its possibilities more, it became more familiar. To me, Tableau is much more impressive than Voyant in the sense that it compares such specific elements of each data set. In addition, it creates previews of what your visualizations may look like with the “show me” tab, which is something that I really liked about it. After becoming more familiar with the program, I was able to illustrate very interesting concepts.

Tanya Clement

The process of corpus construction and the creation of visualizations using Voyant and Tableau has verified Tanya Clement’s concept of “differential reading.” She observed that the use of visualization platforms “combines the video streams from these cameras, and the resulting images duplicate a multidimensional viewpoint. That we are aware it is a virtual reality keeps us mindful of the processes we use to produce it, but the experience of this encompassing vantage point allows for a feeling of justice or authenticity that is based on plausible complexities, not simplified and immutable truths.” These processes allowed me to see features that I might not have seen with other platforms. While creating visualizations and using different data to display similar concepts, I was given the chance to see data from different perspectives, creating the multidimensional viewpoint she discusses in her observation of visualization platforms. This viewpoint is important because it lets the viewer experience multiple perspectives and gain a thorough understanding of the subject at hand. This can lead to a feeling that the created visualizations are authentic.

Assignment 1

“Art in Odd Places”

David Bunde created this visualization so that artists of the Lower East Side in Manhattan may have the chance to explore the role public space plays in society in terms of artists displaying their projects. I was drawn to this visualization because I live in close proximity to the location at hand. Viewers, specifically artists, can interact with the visualization in a way that allows them to examine and understand the relationships between creators, locations, and projects through three corresponding columns. The method the author used to connect each aspect of the visualization to the others enables the viewer to see the information they want very easily. The visualization exemplifies the concept of a dynamic visualization in many ways. Firstly, one can interact by hovering over any listed artist’s name, any location, or any project and see its connection to the other nodes. Lima described the world wide web as a “Tangled network of nodes and links, embodying an enormous volume of data” (Lima 56). Bunde uses artists’ names, locations in the East Village, and project names as nodes and links them together through the interactive visualization. In these ways, we can interact with the material and gain an understanding of it.

“I have a headache”

The unknown author created this visualization to illustrate an array of over-the-counter drugs available for headaches. What drew me to this piece were the bright colors used for nodes as well as the branches from a central foundation. Being a network, it displays the many decision-making processes one may go through to choose a brand of medication. It uses branches, similar to those of a tree graph discussed in class, to radiate from the central headache and “branch off” into a chain of decisions. In chapter one, when discussing tree graphs, Meirelles writes that “In a nutshell, hierarchical systems are ordered sets where elements and/or subsets are organized in a given relationship to one another, both among themselves and within the whole” (Meirelles 17). Suggested medications in the visual are related to factors like age, brand, dosage, etc. Although this work is not interactive, it still provides the viewer with a chart packed with hypothetical choices one may face. I feel that if the author made this visual interactive and available to be viewed from different perspectives, it would be much more useful for the people interested. Furthermore, it would contribute to new ways of understanding the material.

“SELFIECITY”

“Selfiecity” is an interactive visualization that illustrates different themes and trends in the selfies that people from all around the world take. I chose to examine it because in today’s world, social media and selfies are a very prominent part of society. Du Bois describes data visualization in his Visualizing Black America as, “the rendering of information in a visual format to help communicate data while also generating new patterns and knowledge through the act of visualization itself” (Du Bois 8). The visualization is very organized, allowing the viewer to easily explore and find patterns in data sets. The use of image plots helps display and categorize the pictures. Categories of the images include types of selfies and different poses. It even includes and compares selfies from different cities, ages, and genders. Viewers can easily access and see the data without drawing bias from it. Additionally, it builds on our understanding of the subject with easy-to-read graphs and theoretical essays.

The visualization utilizes both quantitative and qualitative metrics to allow for multiple perspectives to be taken into account. This use of dynamic data gives viewers the opportunity to interact with and know the data.

Although it is a very well-constructed visualization, I did find a privacy issue in that the authors didn’t ask permission to use the pictures in their study. This is a common ethical problem in research and case findings. In addition, the authors may not have accounted for all selfies. Most times, there are populations of people who fail to be represented in data samples due to biases or simply not being on social media.

“Mapping the Republic of Letters”

“Mapping the Republic of Letters” uses visualization to display Voltaire’s correspondence networks across continents and to further the understanding of his connections to certain people and places. The networks included were social networks created by scientific academies and physical networks created by travel. I feel this is a well-developed, interactive work that answers questions people may have regarding these networks. The method used to present the data allows viewers to interpret and draw their own conclusions from a multi-dimensional perspective. This dynamic method eliminates any possibility of bias from the author in the visualization due to direct manipulation of the graphical objects and statistical properties.

Blog Post 1

Data literacy is important in all aspects of our world. It presents to people statistics, world issues, navigation, and so much more. It provides us with data that can be applied to real-world problems, whether in health, inequality, the economy, or the environment. Data visualization renders information in a way that helps communicate the data to people. However, data ethics are of utmost importance. Bias, accountability, and transparency are common issues in data ethics. Keeping bias out of certain areas of research can lead to accountability and transparency for all races, genders, and social groups. This can lead to the design of data sets and systems that work towards equality and fairness. Du Bois was an important figure in the beginning of data literacy and visualization. He led his students to collect and analyze data on black communities so that they could think sociologically and go on to compile charts and graphs based on their findings. These charts and graphs could then be used to analyze the experiences of black people, especially in Philly, around the beginning of the twentieth century. Their findings showed that black population and fertility rates were increasing.

The following examples from viz.wtf portray misuses of data visualization.

In this image, the creator violates correct representation of numbers, as the images aren’t directly proportional to the quantities represented.
In this image, the labeling is not very clear, as the lines are all the same color and there is no clear way to follow them. There are no clear events on the x axis to explain the extreme changes in the y axis either, making it hard to interpret the visualization well.