Building Bay Whales
Notes on making 20 years of whale stranding data legible
For years, whales and orcas showed up in my dreams, especially during the hardest stretches of my life. I’ve always been in love with the sea, and I came to think of these dreams as a kind of guidance. That love is part of why the whale strandings have always hit me hard. When something upsets me at that level, I usually feel a pull to do something about it, even when it seems out of reach.
I’m not in marine science, so for a long time I didn’t know how to help. But this year, after I started building digital tools with Claude Code, I realized I could finally do something. I knew I was going to build a tool that would make the patterns in the data visible: not just individual strandings reported one at a time, but the full picture across years and species.
NOAA tracks reported whale strandings on the West Coast through the Marine Mammal Stranding Network. Most of the Bay Area data comes from The Marine Mammal Center (TMMC) and the California Academy of Sciences. The data exists, and it’s public. But it lives somewhere almost no one outside the field looks. TMMC posts about strandings one at a time on its website. There’s no way to look at the bigger picture, to see all of them on a map, year by year, and ask what the data is telling us.
That’s the gap I wanted to close. The goal: take 20 years of stranding data and make it something a researcher, a journalist, or anyone curious about the Bay could explore in five minutes.
The stakes here aren’t abstract. The last decade has seen significant whale mortality events on the West Coast: gray whale die-offs, humpback entanglements, marine heatwaves coinciding with stranding spikes. These deaths are often shaped by overlapping pressures: warming water, vessel strikes, entanglement, harmful algal blooms.
A public tool can’t fix any of that. But it can make the trend visible, and visibility is where most environmental conversations have to start.
The principle
I went into this with one clear rule for myself: don’t let the design get ahead of the data.
Most data tools don’t fail because the data is bad. They fail because the storytelling around the data slowly stops matching what the data actually says.
So I built the project around a few principles.
The data tells the story. The interface gets out of the way. If something on screen wasn’t earning its place, I cut it.
Every number in a caption got checked in code, not in writing. “216 strandings.” “10 heatwave years.” “41% with a confirmed human cause.” Each one was verified against the live dataset in Python before it shipped. When I changed the boundary of the Bay polygon, every count that depended on it got re-checked and updated. I never let myself write a sentence that sounded right without confirming the number behind it.
I iterated on color, layout, and copy as much as I needed to, and I let myself remove things that weren’t working, even when I’d put real time into building them.
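The caption checks described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Stranding` fields, the toy polygon, the sample records, and the helper names (`in_polygon`, `check_captions`) are all assumptions made for the example. The idea is that each caption stores its expected number next to a function that recomputes it, so a change to the Bay polygon surfaces every caption that no longer matches.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Stranding:
    # Hypothetical record shape; the real dataset's fields may differ.
    year: int
    species: str
    lat: float
    lon: float
    human_cause_confirmed: bool

Polygon = list[tuple[float, float]]  # (lon, lat) vertices, in order

def in_polygon(lon: float, lat: float, polygon: Polygon) -> bool:
    """Ray-casting test: count polygon edges crossed by a ray going east."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > lat) != (y2 > lat):  # edge straddles the point's latitude
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

def pct_human(records: list[Stranding]) -> int:
    """Rounded percentage of records with a confirmed human cause."""
    confirmed = sum(r.human_cause_confirmed for r in records)
    return round(100 * confirmed / len(records))

Caption = tuple[int, Callable[[list[Stranding]], int]]

def check_captions(records: list[Stranding], polygon: Polygon,
                   captions: dict[str, Caption]) -> list[str]:
    """Return one message per caption whose number no longer matches the data."""
    subset = [r for r in records if in_polygon(r.lon, r.lat, polygon)]
    problems = []
    for name, (expected, compute) in captions.items():
        actual = compute(subset)
        if actual != expected:
            problems.append(f"{name}: caption says {expected}, data says {actual}")
    return problems

# Toy data only; none of these values come from the real dataset.
SAMPLE = [
    Stranding(2019, "gray", 37.80, -122.40, True),
    Stranding(2019, "gray", 37.90, -122.45, False),
    Stranding(2021, "humpback", 37.85, -122.35, True),
    Stranding(2021, "fin", 36.60, -121.90, False),  # outside TOY_BAY
]
TOY_BAY = [(-122.6, 37.5), (-122.0, 37.5), (-122.0, 38.2), (-122.6, 38.2)]
WIDER_BAY = [(-122.6, 36.0), (-121.5, 36.0), (-121.5, 38.2), (-122.6, 38.2)]
CAPTIONS = {
    "total strandings": (3, len),
    "percent with confirmed human cause": (67, pct_human),
}

print(check_captions(SAMPLE, TOY_BAY, CAPTIONS))    # captions match
print(check_captions(SAMPLE, WIDER_BAY, CAPTIONS))  # boundary moved: mismatches
```

Keeping the expected value and the computation side by side means a boundary change cannot silently leave a stale number in the copy: the second call reports both captions as out of date.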
What I built
Bay Whales is at baywhales.org. It shows every reported whale stranding in the San Francisco Bay Area since 2005: 216 records, drawn from NOAA’s West Coast Region database. You can filter by species, scrub through years on a vertical timeline, click any pin to read about that specific stranding, and read nine curated stories that reframe the data in different ways.
The stack is React, TypeScript, and react-leaflet, built with Claude Code. But the more interesting decisions were design ones, not technical ones.
The intro screen took several rounds. The opening view sets the tone for the whole tool: first impression, framing, weight. I tried morphing animations and more elaborate transitions, but they read as overworked for a project about whale strandings. I cut back to a simple opacity crossfade and a clear count: 216 whale strandings, in the Bay Area since 2005. Three seconds, then it gets out of the way.
The color palette went through many iterations. Pin color carries species information, so the colors had to do a lot of work. They had to be distinguishable for viewers with and without color vision deficiencies. They had to hold up across hundreds of pins clustered on a map without becoming visual noise. And the tones needed to match the weight of the subject: a palette that felt right for the data without leaning either decorative or alarmist. I went through muted, vivid, brand-aligned, and colorblind-safe versions. The set I landed on: violet for gray whales, blue for humpbacks, lime for fin whales, olive for everything else.
The species filter went through a structural change. I started with checkboxes, but they implied ‘click to enable,’ when the actual default is the opposite: everything on, click to narrow. I switched to a pill model that matched how people were actually thinking about the data.
Each of the nine stories is a single config entry. That made it possible to keep iterating on individual stories without breaking the others, and to add new ones easily, which mattered because the rail kept growing as I noticed more patterns in the data.
The pattern rail
The bottom of the tool has nine curated stories. Seven of them are patterns: spring is the peak season for strandings, gray whales started entering the Bay more often after a certain year, three humpbacks crossed into the Bay for the first time, marine heatwave years overlap with stranding spikes, and strandings cluster near refineries along an industrial corridor in the East Bay, among others.
The other two are what I’m calling methodology pills: short notes that tell you how to read the data, rather than pointing out a pattern in it.
One says: most causes aren’t confirmed. Only 41% of these strandings have a confirmed human cause. For the rest, either the cause is unknown or the body was too decomposed to tell. If you walk away from the map thinking “human activity is causing whale deaths,” that’s a framing the data allows, but the methodology pill makes you sit with the uncertainty.
The other says: Marin County looks deadlier, but that’s because it’s monitored more. TMMC is in Sausalito. Cal Academy is in San Francisco. The reason Marin shows more strandings isn’t necessarily that more whales are dying there. It’s that more people are looking. The map can’t tell you about the strandings nobody recorded.
I added these pills because every data tool steers the reader somewhere. I wanted this one to steer toward the gaps.
The data is real, but it’s also incomplete and uneven, and a tool that doesn’t tell you that is doing you a disservice.
The marine heatwave story
One of the stories took longer than the rest: warm seas, more strandings.
Marine heatwaves, long stretches of unusually warm ocean temperatures, have become more common on the West Coast over the past decade. NOAA’s California Current Marine Heatwave Tracker classifies them by year. When I overlaid the heatwave years against the stranding data, the pattern was visible: 151 of the 216 records in the dataset, about 70%, fall during years NOAA classified as heatwave years.
That’s the headline number, and because it’s the kind of claim that gets shared, it needed careful verification. I checked everything: the list of heatwave years (initially 9, then 10 after re-reading the classification and adding 2024), and the 70% number against the live dataset. The palette also went through four versions before landing on burgundy. With a story this stark, the risk is letting the colors say more than the data does.
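As a rough sketch of how the overlap figure can be recomputed, assuming the stranding years and the heatwave years are available as plain Python collections: the function names and the toy years below are hypothetical, and the real heatwave list would come from NOAA's classification rather than from this example. The second helper is an addition for context rather than something described above. It gives the fraction of calendar years that are heatwave years, a baseline the record share could be read against.

```python
def heatwave_share(record_years, heatwave_years):
    """Return (records in heatwave years, total records, share as a fraction)."""
    if not record_years:
        raise ValueError("no records to summarize")
    hits = sum(1 for year in record_years if year in heatwave_years)
    return hits, len(record_years), hits / len(record_years)

def share_of_years(first_year, last_year, heatwave_years):
    """Fraction of calendar years in [first_year, last_year] that are heatwave years."""
    span = range(first_year, last_year + 1)
    return sum(1 for year in span if year in heatwave_years) / len(span)

# The arithmetic behind the headline number: 151 of 216 is about 70%.
assert round(100 * 151 / 216) == 70

# Toy inputs only; these are not the real stranding or heatwave years.
TOY_YEARS = [2010, 2015, 2015, 2016]
TOY_HEATWAVE = {2015, 2016}

hits, total, share = heatwave_share(TOY_YEARS, TOY_HEATWAVE)
print(f"{hits} of {total} records ({share:.0%}) fall in heatwave years")
print(f"{share_of_years(2010, 2016, TOY_HEATWAVE):.0%} of years are heatwave years")
```

Reading the two numbers together guards against over-reading the overlap: if a large share of the years in the dataset are heatwave years, a large share of records will fall in them regardless.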
Working through the heatwave story is what got me thinking about what comes next.
What’s next
Bay Whales is the first in a series of public-interest tools I’m building this year, each one taking a piece of environmental data and making it legible.
The bigger question I’m sitting with is whether this approach has real applications beyond a single project. Could a tool like this be useful inside an organization, like a research lab, a conservation group, or a regulator, and not just for a public audience? Could the same methodology work for other regions, other species, other kinds of environmental data where the records exist but aren’t legible?