The important point is methodological. We asked OpenAlex for grouped yearly counts, saved those counts to a CSV, and built the page from that smaller dataset. The project stayed tractable because the question was narrow and the intermediate data remained inspectable.
open-access-viz/
+-- fetch_data.py     # Python script to pull data
+-- oa_data_raw.csv   # Yearly OA counts used by the chart
+-- index.html        # Interactive visualization
+-- tutorial.html     # This page
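fetch_data.py is not reproduced on this page. As a rough sketch of the kind of request described above (grouped yearly counts saved to a CSV), something like the following would work. The filter name, the response shape ({"group_by": [{"key": ..., "count": ...}]}), and the CSV column names are assumptions made for illustration and should be checked against the OpenAlex documentation; the sample responses are hand-written, not real OpenAlex numbers.

```python
import csv
import io
from urllib.parse import urlencode

BASE_URL = "https://api.openalex.org/works"


def build_url(oa_only=False):
    """Build a works query grouped by publication year."""
    params = {"group_by": "publication_year"}
    if oa_only:
        # Assumed filter name for restricting the count to open access works.
        params["filter"] = "open_access.is_oa:true"
    return f"{BASE_URL}?{urlencode(params)}"


def counts_by_year(response):
    """Map year -> count, assuming a {"group_by": [{"key", "count"}]} shape."""
    return {int(group["key"]): group["count"] for group in response["group_by"]}


def to_csv_text(total_counts, oa_counts):
    """Write one row per year; the column names are hypothetical."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["year", "total_works", "oa_works"])
    for year in sorted(total_counts):
        writer.writerow([year, total_counts[year], oa_counts.get(year, 0)])
    return buffer.getvalue()


# Hand-written stand-ins for two decoded API responses (not real data).
sample_total = {"group_by": [{"key": "2021", "count": 250},
                             {"key": "2020", "count": 200}]}
sample_oa = {"group_by": [{"key": "2020", "count": 90},
                          {"key": "2021", "count": 125}]}

print(build_url(oa_only=True))
print(to_csv_text(counts_by_year(sample_total), counts_by_year(sample_oa)))
```

In a real script each URL would be fetched and JSON-decoded before being passed to counts_by_year. The sample dictionaries keep the sketch runnable offline, which is also a convenient way to test the parsing step before making any requests.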
The Conversation
The conversation was short because the question was narrow and the output was concrete. In this kind of work, brevity is often a good sign. Here is a simplified version of how it went:
Me
I'd like to show data cleaning and visualization of a dataset. Something relevant to scholarly communications.
Claude Code
A few workable options: OpenAlex data on Open Access trends, DOAJ journal data, or Retraction Watch data. Which direction do you want?
Me
Let's do OpenAlex, the rise of open access.
Claude Code
I'll fetch yearly OA counts from OpenAlex, save them to a CSV, and build an interactive visualization from that dataset.
[Claude Code then wrote a Python script, ran it, inspected the output, and created the HTML visualization]
Step-by-Step Breakdown
1. Describe your goal
Name the question, the data source, and the output. Here is the kind of prompt that works well:
Example prompt
"I want to visualize the growth of Open Access publishing over time using data from OpenAlex. Fetch yearly counts, save them to a CSV I can inspect, and build an interactive HTML page to display the results."
2. Review and iterate
Claude Code will usually:
Ask clarifying questions if needed
Write the code
Run it (with your permission)
Show you the results
Then you review the output and correct it. "Make the colors more accessible." "Add tooltips with exact values." "Show the CSV before you build the chart." The corrections are small because the scope is small.
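A request like "Show the CSV before you build the chart" amounts to a quick sanity check on the intermediate data. Here is a minimal sketch of such a check. It assumes hypothetical column names (year, total_works, oa_works) and uses an inline sample in place of the real oa_data_raw.csv, whose actual schema is not shown on this page.

```python
import csv
import io

# Hypothetical sample standing in for oa_data_raw.csv; the column names
# are assumptions, not the real file's schema.
SAMPLE_CSV = """year,total_works,oa_works
2019,1000,400
2020,1100,495
2022,1200,600
"""


def check_rows(rows):
    """Return a list of human-readable problems found in the yearly rows."""
    problems = []
    years = [int(row["year"]) for row in rows]
    # Flag gaps between consecutive years.
    for prev, cur in zip(years, years[1:]):
        if cur != prev + 1:
            problems.append(f"gap between {prev} and {cur}")
    # Flag rows where the OA count exceeds the total count.
    for row in rows:
        if int(row["oa_works"]) > int(row["total_works"]):
            problems.append(f"{row['year']}: oa_works exceeds total_works")
    return problems


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for row in rows:
    print(row)
print(check_rows(rows))
```

With the sample above, the check reports the missing 2021 row. Finding that kind of gap in the CSV is much cheaper than finding it later as a strange dip in the chart.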
3. Understand what was built
Do not stop at "it works." Ask Claude Code to explain the code and the data assumptions:
Example follow-up
"Explain how the fetch_data.py script works, what API calls it makes, and how you calculated the OA percentages."
Key Concepts
The data pipeline
Fetch: Pull data from an API (OpenAlex, in this case)
Clean: Fill missing values and calculate the fields the chart needs
Transform: Organize the data into the shape the visualization expects
Visualize: Turn it into a page someone else can read
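The Clean and Transform stages can be sketched in a few lines. This is an illustration under stated assumptions, not the project's actual code: the row fields (year, total_works, oa_works), the choice to fill missing counts with 0, and the output shape (parallel lists of years and percentages) are all hypothetical.

```python
import json

# Hypothetical rows as they might come out of the CSV; None marks a
# missing value. These are not real OpenAlex numbers.
RAW_ROWS = [
    {"year": 2021, "total_works": 1200, "oa_works": 600},
    {"year": 2019, "total_works": 1000, "oa_works": 400},
    {"year": 2020, "total_works": 1100, "oa_works": None},
]


def clean(rows):
    """Fill missing counts with 0 and add an oa_percent field."""
    cleaned = []
    for row in rows:
        total = row["total_works"] or 0
        oa = row["oa_works"] or 0
        # Guard against division by zero when the total is missing.
        percent = round(100 * oa / total, 1) if total else 0.0
        cleaned.append({"year": row["year"], "total_works": total,
                        "oa_works": oa, "oa_percent": percent})
    return cleaned


def transform(rows):
    """Reshape row records into parallel lists, sorted by year."""
    ordered = sorted(rows, key=lambda row: row["year"])
    return {
        "years": [row["year"] for row in ordered],
        "oa_percent": [row["oa_percent"] for row in ordered],
    }


chart_data = transform(clean(RAW_ROWS))
print(json.dumps(chart_data))
```

Filling a missing count with 0 is only one possible choice, and it shows up in the chart as a drop to zero. Depending on the question, dropping the row or marking the year as unknown may be more honest, which is why this is a decision worth asking Claude Code to explain.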
Reusable Structure
This four-stage structure sits behind many small data projects. The source and the chart change, but the sequence of fetch, clean, transform, and visualize usually carries over.
The companion guide Working with APIs and External Data takes up the request side of this workflow in more detail: reading the docs, making small test calls, saving responses, and deciding when the browser is the wrong place to run the request.
Ideas for your own projects
Analyze publication output by year or department
Track Open Access rates for a campus, lab, or field
Compare journal costs against OA availability
Build a small reporting page for a stakeholder group
Visualize research output by topic or collaboration pattern
Data sources for your own projects
OpenAlex — free scholarly metadata API (what we used)