Mapping the Bears Ears National Monument from Obama to Trump: Discovery to Prototype

As part of the discovery phase of a small data viz project, I researched the boundaries of Obama’s Bears Ears national monument compared to Trump’s reduced version. I wanted to display this in Leaflet, and expected the process to take about 30 minutes. The monument boundary files were in an unfamiliar format, and it took me close to three hours. The gig ended up going to someone else, but I wanted to document the process in case it helps anyone, especially future me.

If you want to skip to the end, I created a GitHub Gist that you can view on bl.ocks.org.

Step one was to find the boundaries. The best resource I’ve found for Bears Ears is a blog post from Harvard University’s Center for Geographic Analysis, “Mapping Bears Ears National Monument.” They link to two GIS boundary resources from the U.S. Bureau of Land Management (BLM): Obama’s National Monument Extent and Trump’s National Monument Extent.

The first hang-up was the unfamiliar format of the BLM files. They were in a Geodatabase format, with a .gdb file extension. That was new to me, but I found that ogr2ogr could convert it to GeoJSON, because ogr2ogr converts anything. But the GeoJSON didn’t show up in Leaflet. On closer inspection, I realized the BLM files were in a projected coordinate system, not latitude and longitude. Some Google searches showed that I had to add ‘-t_srs WGS84’ to reproject the output. Final ogr2ogr conversion:

> ogr2ogr -skipfailures -f "GeoJSON" -t_srs WGS84 bears_ears_obama.json NLCS_NM_NCA_historic.gdb

A similar command created the ‘bears_ears_trump.json’ GeoJSON file.
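A quick way to tell whether a converted file still needs reprojecting is to check that every coordinate falls inside valid longitude and latitude ranges. Here is a minimal Python sketch of that check. The function names and the sample coordinates are made up for illustration and are not part of the original workflow.

```python
def iter_positions(coords):
    """Yield [x, y] positions from arbitrarily nested GeoJSON coordinate arrays."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords
    else:
        for part in coords:
            yield from iter_positions(part)


def looks_like_lon_lat(feature_collection):
    """Return True if every position is within longitude/latitude bounds."""
    for feature in feature_collection["features"]:
        geometry = feature.get("geometry") or {}
        for x, y, *_ in iter_positions(geometry.get("coordinates", [])):
            if not (-180 <= x <= 180 and -90 <= y <= 90):
                return False
    return True


# Made-up sample data: one ring in projected meters, one in degrees.
projected = {"type": "FeatureCollection", "features": [
    {"type": "Feature", "properties": {},
     "geometry": {"type": "Polygon",
                  "coordinates": [[[620000.0, 4150000.0], [621000.0, 4150000.0],
                                   [621000.0, 4151000.0], [620000.0, 4150000.0]]]}}]}
degrees = {"type": "FeatureCollection", "features": [
    {"type": "Feature", "properties": {},
     "geometry": {"type": "Polygon",
                  "coordinates": [[[-109.9, 37.6], [-109.8, 37.6],
                                   [-109.8, 37.7], [-109.9, 37.6]]]}}]}

print(looks_like_lon_lat(projected))
print(looks_like_lon_lat(degrees))
```

If the check fails, the file is probably still in a projected coordinate system, which is why Leaflet draws nothing where you expect it.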

This gave me something I could show in Leaflet. However, recall that the BLM files showed the “National Monument Extent.” That means there’s more there than just Bears Ears. So I opened the files in QGIS to remove the extra stuff.

Once the extra features were deleted, I selected Layer -> Save As to create a new GeoJSON file. I also trimmed Trump’s BLM file using QGIS.
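The same trimming can be scripted instead of done by hand in QGIS. This is a minimal sketch, assuming each feature carries a name-like property. The property name "NAME" and the sample features are hypothetical, so check the actual attribute table of the BLM file before relying on it.

```python
def keep_named_features(feature_collection, property_name, wanted_text):
    """Return a new FeatureCollection with only features whose property contains wanted_text."""
    wanted = wanted_text.lower()
    kept = [
        feature for feature in feature_collection["features"]
        if wanted in str((feature.get("properties") or {}).get(property_name, "")).lower()
    ]
    return {"type": "FeatureCollection", "features": kept}


# Hypothetical sample: geometry left empty because only the properties matter here.
monuments = {"type": "FeatureCollection", "features": [
    {"type": "Feature", "properties": {"NAME": "Bears Ears National Monument"}, "geometry": None},
    {"type": "Feature", "properties": {"NAME": "Some Other Monument"}, "geometry": None},
]}

trimmed = keep_named_features(monuments, "NAME", "bears ears")
print(len(trimmed["features"]))
```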

So now we just have what we need, but the combined size of the two GeoJSON files is almost 2 megabytes. Too large for the web. Luckily MapShaper will let you upload a boundary file in multiple formats, and simplify it. (MapShaper will also handle some conversions if you don’t want to figure out ogr2ogr, but it would not have done this conversion since it does not handle Geodatabase files.)
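To give a feel for what boundary simplification does, here is a minimal Python sketch of the Douglas-Peucker algorithm, one common line simplification method. It is an illustration only, not MapShaper's code, and the tolerance and sample points are made up. The tolerance is in the same units as the coordinates, so degrees for a WGS84 file.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment, e.g. a closed ring whose first and last points match.
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify_line(points, tolerance):
    """Drop points that lie within tolerance of the simplified line (recursive)."""
    if len(points) < 3:
        return list(points)
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        dist = point_segment_distance(points[i], points[0], points[-1])
        if dist > max_dist:
            max_dist, index = dist, i
    if max_dist <= tolerance:
        return [points[0], points[-1]]
    left = simplify_line(points[: index + 1], tolerance)
    right = simplify_line(points[index:], tolerance)
    return left[:-1] + right  # the split point appears in both halves


line = [(0, 0), (1, 0.01), (2, 0), (3, 1), (4, 0)]
print(simplify_line(line, 0.1))
```

The nearly collinear point at (1, 0.01) is dropped while the sharp corner at (3, 1) survives, which is why a simplified boundary can be much smaller and still look the same at web map zoom levels.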

Now the combined file sizes are one-tenth of the original, much better for the web. Let’s check it out in Leaflet. Here’s the first pass. Obama’s Bears Ears is outlined in white, while Trump’s replacement is blue. This is how the styling came from BLM. The two layers even looked different in QGIS.

It’s nice that Trump’s stands out, but Obama’s is hard to see. Let’s make Obama’s outline black.

And there it is. Once again, you can view the raw code on GitHub Gist and view the result on bl.ocks.org.

Packing and Cracking: A visual tour of Pennsylvania’s Congressional Districts

Pennsylvania’s U.S. congressional districts were fairly equitable in the 2000s, and multiple seats switched parties through the decade. In 2004 Republicans won 12 seats, in 2008 Democrats won 12 seats, and in 2010 Republicans once again won 12 seats. In those elections, Democrats held six seats all three years, Republicans held seven seats all three years, four swapped R-D-R, one swapped R-D-D, and one swapped D-D-R.

Read the rest on Medium.

Analyzing out-of-state political donations with Neo4j

I competed in the graphs-4-good hackathon at Neo4j’s GraphConnect conference. The random guys at my table and I were the only team to use the FEC data, and we won the award for “Best use of Cypher.”

We got to use a special sandbox that was set up at NICAR, a conference for data journalists. My team decided to look at in-state vs. out-of-state donations. This is the Neo4j model after trimming unnecessary nodes, such as “Occupation.”

The Candidate node has a property of “state,” which we compare to the state code reached through this relationship: (c:Contributor)-[:LIVES_IN]->(s:State)

The FEC data includes everyone running in a federal election, so everyone who mounts a primary campaign is in there. To keep it simple, we decided to focus on incumbent Senators in 2016, which you can see set in the WHERE clause below.

Since we need to sum up two different things, it was easiest to separately set new properties to hold the totals. This query sets the in-state total:

MATCH (c:Contributor)-[mc:MADE_CONTRIBUTION]->(cb:Contribution)
MATCH (cb)-[mt:MADE_TO]->(fk:FECCommittee)
MATCH (fk)-[fun:FUNDS]->(cd:Candidate)
MATCH (c)-[l:LIVES_IN]->(s:State)
WHERE cd.election_year = 2016
AND cd.incumbent = 'I'
AND cd.office = 'S'
AND s.code = cd.state
WITH cd, sum(cb.amount) as total
SET cd.in_state = total
RETURN cd.name, cd.state, cd.in_state
ORDER BY total desc;

To set the out-of-state field, you need to edit the final AND line, and the SET and RETURN lines. Here are the last five lines of that query.

. . .
AND s.code <> cd.state
WITH cd, sum(cb.amount) as total
SET cd.out_of_state = total
RETURN cd.name, cd.state, cd.out_of_state
ORDER BY total desc;

Finally, we put them together and calculated the percentage of out-of-state donations with this query:

MATCH (cd:Candidate)
WHERE cd.election_year = 2016
AND cd.incumbent = 'I'
AND cd.office = 'S'
AND cd.in_state > 0
WITH cd, cd.in_state + cd.out_of_state as total
RETURN cd.name, cd.state, cd.in_state, cd.out_of_state, total, cd.out_of_state*100/total as perc_out
ORDER BY perc_out desc;
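The arithmetic in that last query is easy to check outside the database. Here is a minimal Python sketch of the same percentage and ranking, using made-up candidate names and totals rather than real FEC figures. One thing to watch for: if the stored amounts are integers, Cypher's / does integer division, so the percentage in the query would be truncated, whereas Python's / below always returns a float.

```python
def percent_out_of_state(in_state, out_of_state):
    """Share of donations from outside the candidate's state, as a percentage."""
    total = in_state + out_of_state
    if total == 0:
        return None  # no donations on record, so the percentage is undefined
    return out_of_state * 100 / total


def rank_by_out_of_state(candidates):
    """Add perc_out to each candidate and sort from highest to lowest."""
    rows = []
    for candidate in candidates:
        perc = percent_out_of_state(candidate["in_state"], candidate["out_of_state"])
        if perc is not None:
            rows.append({**candidate, "perc_out": perc})
    return sorted(rows, key=lambda row: row["perc_out"], reverse=True)


# Made-up totals for illustration only.
sample = [
    {"name": "Candidate A", "state": "PA", "in_state": 750, "out_of_state": 250},
    {"name": "Candidate B", "state": "DE", "in_state": 250, "out_of_state": 750},
]
ranked = rank_by_out_of_state(sample)
print([(row["name"], row["perc_out"]) for row in ranked])
```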

This is only a proof-of-concept. It only includes filings for about three months of donations, not the entire election cycle. But I’m hoping to apply it to a full election cycle data set soon.

 

Presentation: Census Data Demystified

I presented at BarCamp Philly.

Slides are available here.

List of links within the slides:

I like to learn by copying and editing. If you do as well then pick one of my mapping blocks at bl.ocks.org/enactdev and follow these steps:

  • Clone the block’s Gist.
  • Replace the GeoJSON file with yours.
  • Replace the data JSON file with yours.
  • Rename the comparison fields in index.html (NOTE: I match a field in the GeoJSON object with the index of the data JSON dictionary).
  • Rename the title and hover text in index.html.
  • Don’t forget to recenter (also in index.html)!
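The matching step in the note above, where a field in each GeoJSON feature is looked up as a key in the data dictionary, can be sketched in Python like this. The blocks themselves do this in index.html. The field name "GEOID", the "data" property, and the sample values here are hypothetical.

```python
def attach_data(feature_collection, data_by_key, join_field):
    """Copy each feature's matching data record into its properties.

    Returns the list of join keys that had no match, which is handy for
    spotting mismatched field names after swapping in your own files.
    """
    missing = []
    for feature in feature_collection["features"]:
        key = str(feature["properties"].get(join_field))
        if key in data_by_key:
            feature["properties"]["data"] = data_by_key[key]
        else:
            missing.append(key)
    return missing


# Hypothetical sample: two areas, but data for only one of them.
areas = {"type": "FeatureCollection", "features": [
    {"type": "Feature", "properties": {"GEOID": "10001"}, "geometry": None},
    {"type": "Feature", "properties": {"GEOID": "10003"}, "geometry": None},
]}
data = {"10001": {"population": 1000}}

unmatched = attach_data(areas, data, "GEOID")
print(unmatched)
```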

 

Google Analytics Visualizations: Venn diagram of pageview overlap by session and map of weighted pageviews by state

While working with Google Analytics for Compass Red I’ve developed some fun visualizations. Here is a Venn diagram of overlapping pageviews to a group of Good-Better-Best products. (Product names removed due to non-disclosure agreement.)

This is built by segmenting pageviews. You need three segments, one for each pair of pages: Good-Better, Good-Best, and Better-Best. Within any of those segments you can also get the number of people who visited the third page, which gives you the number visiting all three. Getting the number of individual visitors to each single page is easy.
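Once you have the single-page totals, the three pairwise overlaps, and the all-three count, the seven regions of the Venn diagram follow from inclusion-exclusion. Here is a minimal Python sketch with made-up visitor counts. The function and key names are illustrative.

```python
def venn_regions(good, better, best, good_better, good_best, better_best, all_three):
    """Split totals and overlaps into the seven exclusive regions of a 3-set Venn diagram."""
    return {
        "good_only": good - good_better - good_best + all_three,
        "better_only": better - good_better - better_best + all_three,
        "best_only": best - good_best - better_best + all_three,
        "good_better_only": good_better - all_three,
        "good_best_only": good_best - all_three,
        "better_best_only": better_best - all_three,
        "all_three": all_three,
    }


# Made-up counts for illustration only.
regions = venn_regions(good=100, better=80, best=60,
                       good_better=30, good_best=20, better_best=15, all_three=10)
print(regions)
print(sum(regions.values()))  # total distinct visitors across the three pages
```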

This next visualization looks at the number of pageviews per 100,000 people in each state.

Population by state comes from the U.S. Census.
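The weighting itself is a single division per state. A minimal Python sketch, using made-up pageview and population numbers rather than real Google Analytics or Census figures:

```python
def pageviews_per_100k(pageviews_by_state, population_by_state):
    """Scale raw pageviews to a rate per 100,000 residents, skipping states without a population."""
    return {
        state: views * 100_000 / population_by_state[state]
        for state, views in pageviews_by_state.items()
        if population_by_state.get(state)
    }


# Made-up numbers for illustration only.
views = {"DE": 1900, "PA": 12800, "ZZ": 5}
population = {"DE": 950_000, "PA": 12_800_000}

rates = pageviews_per_100k(views, population)
print(rates)
```

Without this step, a map of raw pageviews mostly just shows where the most people live.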

Visually explore Delaware’s latest Census data

There are two views, both offering the same Census measurements. One displays Delaware by county, and the other breaks New Castle County up into four areas. (Note: Those areas are set by Census, and are called PUMAs, or Public Use Microdata Areas. Basically areas with enough of a population to be statistically meaningful. Minimum population is 100,000.)

Here are some interesting things I found. You can easily recreate the screenshots below, or create your own views.

You can definitely see the influx of retirees to Sussex County. Our southernmost county accounted for almost half of the population increase in Delaware since 2012, drawing in 16,861 new residents out of 34,973 for the entire state. In that same timespan, Sussex’s median age rose by two years to 48.7, and is ten years higher than the average age in New Castle County. Lastly, the median year a home in Sussex was constructed rose from 1991 to 1998, and is 14 years later than Delaware’s average of 1984.

Overall, New Castle County population went up, but the Wilmington area and south NCCo went down.

Per-capita income dipped in northern NCCo, but it is still by far the highest, almost $9,000 higher than the county average. Per-capita income rose the most in southern NCCo, by 16%, or almost $5,000.

I’m going to open source the project, but need to clean some things up first. For example, the median home construction year is treated like a number, so the year 1973 is printed as 1,973. There’s also work to be done connecting the population with various metrics, for example linking population to the number living in poverty to get the poverty rate.
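Both cleanup items are small. Here is a minimal Python sketch of one way to handle them, assuming a lookup of which metrics are years. The metric name "median_year_built" and the poverty figures are hypothetical, and the project itself may end up solving this differently.

```python
# Hypothetical set of metric names that hold years rather than counts.
YEAR_METRICS = {"median_year_built"}


def format_metric(name, value):
    """Print years without a thousands separator and everything else with one."""
    if name in YEAR_METRICS:
        return str(int(value))
    return f"{value:,}"


def poverty_rate(in_poverty, population):
    """Percentage of the population living in poverty, or None if population is zero."""
    if not population:
        return None
    return in_poverty * 100 / population


print(format_metric("median_year_built", 1973))
print(format_metric("population", 34973))
print(poverty_rate(125, 1000))  # made-up counts
```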

This post originally appeared on the Open Data Delaware blog.