Hannah-Graber-Portfolio

Critique by Design: Reimagining NY State Schools’ Internet Data

Within the scope of this course, I have learned how to effectively critique data visualizations and create some of my own, but for this assignment, I will be doing both to critique an existing visualization via redesign. On this page, I will post the initial visualization and critique, delve into user research and feedback on new sketches I’ve created, and recreate the original visualization according to the feedback I gave and received.

Step One: Original Data Visualization

Below is the original data visualization I chose to critique via redesign. The graph comes from a New York State Education Department slide deck outlining spring 2020 digital equity survey results. The data were submitted by New York State schools “to the best of their ability and knowledge” in June and July of 2020. I picked this visualization because I am interested in digital equity as a policy topic and, in my opinion, this particular visualization didn’t fully represent the gravity of internet access barriers. I know that it’s very important for schools to understand the challenges their students face at home (including connecting to the internet to complete schoolwork) and I felt as though this visualization didn’t effectively tell the data’s story.

Step Two: Critique

Describe your overall observations about the data visualization here. What stood out to you? What did you find worked really well? What didn’t? What, if anything, would you do differently?

Upon first glance, the data visualization was confusing and felt chaotic due to the many colors, data labels, and labels on the x-axis. I didn’t immediately know what “NRC” meant from the title, so it took a bit of time to understand what was being said with the data. The “#N/A” category on the x-axis felt lazy and led me to question if the person organizing the data was reliable. In terms of things that worked well, the big, bold title is easy to read and draws my eye to the top left corner immediately. I think it’s wise that the creator disaggregated the data by different ‘type’ of school to help identify that schools—and the students at those schools—may have differing needs as they relate to internet access. The color palette feels natural to the eye as we tend to see red-green color schemes in everyday life.

That said, the bright colors make it hard for me as a reader to identify what’s important about the data. Does red signify the “worst” barrier? Not in every category. Additionally, the data are hard to interpret because the y-axis is expressed in percentages, but the data labels show raw counts to convey sample sizes. This discrepancy could work for a more advanced audience, but I think it is generally confusing. The categories of school type also feel disparate and somewhat unrelated, which makes me wonder if there is a type of school I should be focusing on as a viewer. As for preliminary changes, some things that come to mind are to cut down on the types of schools represented, delete the #N/A data, label the data as percentages to reduce the mental interpretation the audience has to do, and change the color scheme to highlight certain data over others.

Who is the primary audience for this tool? Do you think this visualization is effective for reaching that audience? Why or why not?

For context, this visualization is part of a larger slide deck exploring the spring 2020 digital equity survey results for the New York State Education Department. Thus, the intended audience seems to be staff that work in the department who want to learn more about the demography of their students (perhaps administrators like principals). I think this visualization is somewhat effective for reaching the audience. On one hand, the data is helpful for learning about students’ challenges to internet connectivity. On the other hand, the data isn’t presented in a particularly clear way that would leave staff with an impression about the important components of the data.

How successful was this method at evaluating the data visualization you selected? Are there measures you feel are missing or not being captured here? What would you change? Provide 1-2 recommendations (color, type of visualization, layout, etc.)

I think Stephen Few’s Data Visualization Effectiveness Profile was very successful at capturing the various aspects of quality and I wouldn’t add any additional measures. As for changing the existing measures, I think the “aesthetics” category should explicitly mention the use of color as a highlight (e.g., does the use of color effectively highlight the most important data?). Additionally, I might change the “usefulness” category to ask whether the visualization includes TOO much data (e.g., is the entirety of data presented in this visualization useful to the intended message?).

Steps Three and Four: Redesign Process and User Research

To aid in my redesign process, I created three separate sketches that display the data differently than the original visualization, and differently than one another, by iterating slightly on each sketch. To gather feedback and test the readability of my sketches, I asked two separate users a series of questions that remained the same for all three visualizations. The users have the following demographics:

User 1: Male, 27 years old, software engineer
User 2: Female, 52 years old, sixth grade middle school counselor

Below are the sketches and feedback.

Redesign Option 1 and User Feedback

To begin, I knew I had to do a fair amount of cleaning in terms of data presentation and overall chart junk. I immediately decided to remove the data for “None” and “Not Reported” from each school type category and cut down on the number of categories represented in this visualization. It seemed to me that the categories most homogeneous with one another were those referencing the “need” of schools rather than their location. Thus, I kept “low needs,” “average needs,” and both “high needs” school types in my sketch.

Further, I decided that instead of focusing on all barriers to internet access, I could highlight the data on cost as a barrier. Ostensibly, cost as a barrier to internet access is something the district can more directly impact (as opposed to internet availability). The proportions of students who experienced barriers to internet via “availability” or “other” were represented with shades of gray, while the data of interest (the proportion of students for whom cost was a barrier) were represented with green.

Redesign Option 2 and User Feedback

My second sketch has many similarities with the first sketch in terms of type of graphic, layout, and axes. However, for this redesign, I wanted to highlight the majority barrier for each type of school rather than only cost, because the proportion of students in rural high-needs schools experiencing a lack of internet availability is a significant piece of the data’s story. I chose to omit the “other” barrier option (because it says little about the root problem) and instead focus on cost and availability. The majority proportion for each type of school is represented as a solid bar while the other source of inaccess is represented with dotted lines. I still wanted to show the proportion of the other type of inaccess without highlighting it, per se.
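The “majority barrier” logic behind this sketch can be illustrated with a few lines of Python. This is only a minimal sketch: the counts below are hypothetical placeholders rather than the survey’s actual figures, the school-type labels and the helper names `barrier_shares` and `majority_barrier` are my own, and the shares are computed only among the cost and availability responses (an assumption that follows from dropping “other”).

```python
# Hypothetical placeholder counts per school type; NOT the survey's real figures.
counts = {
    "Low needs": {"cost": 30, "availability": 10},
    "Average needs": {"cost": 45, "availability": 15},
    "Rural high needs": {"cost": 20, "availability": 60},
    "Urban/suburban high needs": {"cost": 70, "availability": 10},
}


def barrier_shares(barrier_counts):
    """Convert raw counts into percentage shares of the reported barriers."""
    total = sum(barrier_counts.values())
    return {barrier: 100 * n / total for barrier, n in barrier_counts.items()}


def majority_barrier(barrier_counts):
    """Return the barrier with the largest count for one school type."""
    return max(barrier_counts, key=barrier_counts.get)


# Build one summary row per school type: percentage shares plus the
# barrier that would be drawn as the solid (highlighted) bar.
summary = {
    school: {
        "shares": barrier_shares(school_counts),
        "majority": majority_barrier(school_counts),
    }
    for school, school_counts in counts.items()
}

for school, info in summary.items():
    shares = ", ".join(f"{b}: {pct:.0f}%" for b, pct in info["shares"].items())
    print(f"{school} -> highlight {info['majority']} ({shares})")
```

Working from percentage shares rather than raw counts keeps the bar labels on the same scale as the axis, which was one of my main complaints about the original chart.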

Redesign Option 3 and User Feedback

Finally, I tried to represent said ‘majority’ proportions per school type in a visualization other than a bar chart. I was attempting creativity but I’m not satisfied with the product. I agree with the users who provided feedback below in that this matrix is confusing due to a lack of data rather than too much data like in the original visualization. For instance, should the data be read across the rows or columns?

I do like the use of color for the left-justified labels denoting low, medium, and high need schools. Also, the rural high-needs school data is clearly highlighted without the use of additional color.

User Research Takeaways

A few key patterns have emerged in the feedback from the two users. First, the way I’ve represented the data in the titles and axes is confusing and leaves it unclear who is struggling with access: the schools themselves or the students attending them? Similarly, the representation of low, average, and high needs schools should be made clearer by removing text from the axis or using color/bold to differentiate type.

Rural and urban high needs schools should be sub-labels under a greater “high needs” label to make it clearer that those two columns of data are distinct from the other two (low and average needs). Further, a larger label would help the reader pay attention to what makes the two types of schools different, which will hopefully make the rural/urban distinction clearer. My classmates suggested that I remove words from the x-axis labels to further clarify the meaning of the graph and reduce unnecessary text.

Finally, there was some feedback about the organization of the data in terms of what gets included on the chart. I will base my final chart on the layout in sketch 2, keeping low and average needs school data and maintaining the position of rural high needs schools on the axis. While I agree with User 1 that moving the rural data to the rightmost position would further mark it as ‘other,’ I think the differently colored majority data sandwiched between average needs and urban high needs schools is more powerful in showing its uniqueness. Additionally, my classmates suggested that instead of representing the ‘minority’ data with dashed lines, I represent both cost and availability data with solid colored bars to reduce confusion.

Step Five: Building a Solution

Finally, below is my updated redesign:

I attempted to clarify the message of the data in the title and cut down on text within the visualization to reduce confusion resulting from chart junk. I picked colors that don’t have a clear association with one another, yet still effectively highlight important facets of cost and availability barriers. I did find that I was slightly limited by the functionality of Flourish. For example, I wanted to emphasize certain words within the y-axis labels (i.e., bold urban/suburban and rural and potentially change the color of high, average, and low needs text) but I was only able to edit the entire label.

Overall, I believe this final redesign is both effective on its own (sans comparison within the context of this course) and addresses the shortcomings of the original visualization and subsequent sketches.
