Packing and Cracking: A visual tour of Pennsylvania’s Congressional Districts

Pennsylvania’s U.S. congressional districts were fairly equitable in the 2000s, and multiple seats switched parties through the decade. In 2004 Republicans won 12 seats, in 2008 Democrats won 12 seats, and in 2010 Republicans once again won 12 seats. Across those three elections, Democrats held six seats every time, Republicans held seven seats every time, four swapped R-D-R, one swapped R-D-D, and one swapped D-D-R.
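As a quick sanity check, the seat histories above can be tallied to confirm they add up to the 12-seat results for each year. This is only a sketch: the pattern strings and the helper name `seats_won` are my own shorthand, not anything from the original analysis.

```python
# Seat histories for the 2004, 2008, and 2010 elections, taken from the
# paragraph above. Each key is one letter per election (D or R); each value
# is the number of seats with that history.
patterns = {"DDD": 6, "RRR": 7, "RDR": 4, "RDD": 1, "DDR": 1}

def seats_won(party, election_index):
    """Count seats won by `party` in the election at `election_index` (0, 1, 2)."""
    return sum(n for history, n in patterns.items() if history[election_index] == party)

total_seats = sum(patterns.values())

print("Total seats:", total_seats)
print("2004 Republican seats:", seats_won("R", 0))
print("2008 Democratic seats:", seats_won("D", 1))
print("2010 Republican seats:", seats_won("R", 2))
```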

Read the rest on Medium.

Analyzing out-of-state political donations with Neo4j

I competed in the graphs-4-good hackathon at Neo4j’s GraphConnect conference. The random guys at my table and I were the only team to use the FEC data, and we won the award for “Best use of Cypher.”

We got to use a special sandbox that was set up at NICAR, a conference for data journalists. My team decided to look at in-state vs. out-of-state donations. This is the Neo4j model after trimming unnecessary nodes, such as “Occupation.”

The Candidate node has a “state” property, which we compare to the state in this relationship: (c:Contributor)-[:LIVES_IN]->(s:State)

The FEC data includes everyone running in a federal election, so everyone who mounts a primary campaign is in there. To keep it simple, we decided to focus on incumbent Senators in 2016, which you can see set in the WHERE clause below.

Since we need to sum up two different things, it was easiest to separately set new properties to hold the totals. This query sets the in-state total:

MATCH (c:Contributor)-[mc:MADE_CONTRIBUTION]->(cb:Contribution)
MATCH (cb)-[mt:MADE_TO]->(fk:FECCommittee)
MATCH (fk)-[fun:FUNDS]->(cd:Candidate)
MATCH (c)-[l:LIVES_IN]->(s:State)
WHERE cd.election_year = 2016
AND cd.incumbent = 'I'
AND = 'S'
AND s.code = cd.state
WITH cd, sum(cb.amount) as total
SET cd.in_state = total
RETURN, cd.state, cd.in_state
ORDER BY total desc;

To set the out-of-state field, you need to edit the final AND line, and the SET and RETURN lines. Here are the last five lines of that query.

. . .
AND s.code <> cd.state
WITH cd, sum(cb.amount) as total
SET cd.out_of_state = total
RETURN, cd.state, cd.out_of_state
ORDER BY total desc;

Finally, we put them together and calculated the percentage of out-of-state donations with this query:

MATCH (cd:Candidate)
WHERE cd.election_year = 2016
AND cd.incumbent = 'I'
AND = 'S'
AND cd.in_state > 0
WITH cd, cd.in_state + cd.out_of_state as total
RETURN, cd.state, cd.in_state, cd.out_of_state, total, cd.out_of_state*100/total as perc_out
ORDER BY perc_out desc;
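To make the arithmetic in that last query concrete, here is a minimal Python sketch of the same percentage calculation. The function name and the sample totals are hypothetical, made up purely for illustration; they are not real FEC figures.

```python
def percent_out_of_state(in_state, out_of_state):
    """Return out-of-state donations as a percentage of all donations.

    Mirrors the query's expression out_of_state * 100 / total, but returns
    None when there are no donations at all, to avoid dividing by zero.
    """
    total = in_state + out_of_state
    if total == 0:
        return None
    return out_of_state * 100 / total

# Hypothetical (in_state, out_of_state) totals, NOT real FEC data.
sample_totals = {
    "Candidate A": (25_000, 75_000),
    "Candidate B": (60_000, 40_000),
}

# Sort descending by percentage, like ORDER BY perc_out desc.
ranked = sorted(
    ((name, percent_out_of_state(i, o)) for name, (i, o) in sample_totals.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, perc_out in ranked:
    print(f"{name}: {perc_out:.1f}% out of state")
```

One thing to watch for: if the amounts are stored as integers, Cypher's `/` performs integer division, so the query's percentage would be truncated to a whole number. Multiplying by 100.0 instead of 100 is one way around that.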

This is only a proof of concept: it covers filings for about three months of donations, not the entire election cycle. But I’m hoping to apply it to a full election-cycle data set soon.


Presentation: Census Data Demystified

I presented at BarCamp Philly.

Slides are available here.

List of links within the slides:

I like to learn by copying and editing. If you do as well, pick one of my mapping blocks at and follow these steps:

  • Clone the block’s Gist.
  • Replace the GeoJSON file with yours.
  • Replace the data JSON file with yours.
  • Rename the comparison fields in index.html (NOTE: I match a field in the GeoJSON object with the index of the data JSON dictionary).
  • Rename the title and hover text in index.html.
  • Don’t forget to recenter (also in index.html)!
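The matching step in the NOTE above is the part most likely to trip people up, so here is a rough Python sketch of the idea: look up each GeoJSON feature's key field in a dictionary of data keyed by the same value. The blocks themselves do this in JavaScript inside index.html; the field name `area_id`, the keys, and the values below are all hypothetical.

```python
# Hypothetical GeoJSON: each feature carries an "area_id" property.
geojson = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"area_id": "A", "name": "Area A"}, "geometry": None},
        {"type": "Feature", "properties": {"area_id": "B", "name": "Area B"}, "geometry": None},
    ],
}

# Hypothetical data JSON: a dictionary indexed by the same identifier.
data = {
    "A": {"population": 1000},
    "B": {"population": 2500},
}

def join_data(geojson, data, key_field):
    """Attach the matching data record to each feature's properties.

    Features whose key is missing from `data` get None, which makes
    mismatched identifiers easy to spot.
    """
    for feature in geojson["features"]:
        key = feature["properties"][key_field]
        feature["properties"]["data"] = data.get(key)
    return geojson

joined = join_data(geojson, data, "area_id")
for feature in joined["features"]:
    print(feature["properties"]["name"], feature["properties"]["data"])
```

If a region renders with no value after you swap in your own files, a mismatch between the GeoJSON field and the data dictionary's keys (for example, numbers versus strings) is a good first thing to check.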


Google Analytics Visualizations: Venn diagram of pageview overlap by session and map of weighted pageviews by state

While working with Google Analytics for Compass Red I’ve developed some fun visualizations. Here is a Venn diagram of overlapping pageviews to a group of Good-Better-Best products. (Product names removed due to non-disclosure agreement.)

This is built by segmenting pageviews. You need three segments: Good-Better, Good-Best, and Better-Best. From each of those three you can get the number of people visiting all three, and of course getting the number of individual visitors to each single page is easy.
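Once you have the counts, the seven regions of the Venn diagram follow from simple subtraction. Here is a minimal sketch, assuming you have already pulled the three single-page counts, the three pairwise overlaps, and the all-three count from your segments. The function name and the numbers are hypothetical, not the real client data.

```python
def venn_regions(good, better, best, good_better, good_best, better_best, all_three):
    """Split overlapping visitor counts into the seven exclusive Venn regions."""
    return {
        "good_only": good - good_better - good_best + all_three,
        "better_only": better - good_better - better_best + all_three,
        "best_only": best - good_best - better_best + all_three,
        "good_better_only": good_better - all_three,
        "good_best_only": good_best - all_three,
        "better_best_only": better_best - all_three,
        "all_three": all_three,
    }

# Hypothetical counts for illustration only.
regions = venn_regions(
    good=100, better=80, best=60,
    good_better=30, good_best=20, better_best=15,
    all_three=10,
)
for name, count in regions.items():
    print(f"{name}: {count}")

# The exclusive regions should sum to the number of distinct visitors.
print("distinct visitors:", sum(regions.values()))
```

The pairwise counts here include visitors who saw all three pages, which is why the all-three count is subtracted from each pair and added back to each single-page region.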

This next visualization looks at the number of pageviews per 100,000 people in each state.

Population by state comes from the U.S. Census.
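The weighting itself is a one-line calculation. Here is a minimal sketch with made-up numbers; the state labels, pageview counts, and populations are placeholders, not real Google Analytics or Census figures.

```python
def pageviews_per_100k(pageviews, population):
    """Normalize a raw pageview count to pageviews per 100,000 residents."""
    return pageviews * 100_000 / population

# Hypothetical (pageviews, population) pairs for illustration only.
states = {
    "State X": (450, 900_000),
    "State Y": (1_200, 6_000_000),
}

for state, (views, population) in states.items():
    print(f"{state}: {pageviews_per_100k(views, population):.1f} pageviews per 100,000 people")
```

In this example the smaller state has fewer raw pageviews but a higher weighted rate, which is the kind of difference a population-weighted map is meant to surface.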

Visually explore Delaware’s latest Census data

There are two views, both offering the same Census measurements. One displays Delaware by county, and the other breaks New Castle County up into four areas. (Note: Those areas are set by the Census Bureau and are called PUMAs, or Public Use Microdata Areas. They are areas with enough population to be statistically meaningful; the minimum population is 100,000.)

Here are some interesting things I found. You can easily recreate the screenshots below, or create your own views.

You can definitely see the influx of retirees to Sussex County. Our southernmost county accounted for almost half of the population increase in Delaware since 2012, drawing in 16,861 new residents out of 34,973 for the entire state. In that same timespan, Sussex’s median age rose by two years to 48.7, and is ten years greater than the average age in New Castle County. Lastly, the median year a home in Sussex was constructed rose from 1991 to 1998, and is 14 years later than Delaware’s average of 1984.
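The "almost half" figure is easy to verify from the two numbers quoted above. This short sketch just redoes that arithmetic; the variable names are mine.

```python
# Figures quoted in the paragraph above.
sussex_growth = 16_861   # new Sussex County residents since 2012
state_growth = 34_973    # new Delaware residents since 2012

# Sussex County's share of the statewide population increase, in percent.
share = sussex_growth * 100 / state_growth
print(f"Sussex share of state growth: {share:.1f}%")

# Gap between Sussex's median construction year and the state figure.
year_gap = 1998 - 1984
print(f"Construction year gap: {year_gap} years")
```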

Overall, New Castle County population went up, but the Wilmington area and south NCCo went down.

Per-capita income dipped in northern NCCo, but it is still by far the highest, almost $9,000 above the county average. Per-capita income rose the most in southern NCCo, by 16%, or almost $5,000.

I’m going to open source the project, but I need to clean some things up first. For example, the median home construction year is treated like a number, so the year 1973 is printed as 1,973. There’s also work to be done connecting the population with various metrics, for example linking population to the number living in poverty to get the poverty rate.
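Both cleanup items are small. Here is a rough sketch of how they might look, assuming each metric is identified by a name. The field names (`median_year_built`, `population`) and the poverty numbers are hypothetical and not taken from the project's code or from Census data.

```python
# Hypothetical set of metric names that hold years rather than quantities.
YEAR_FIELDS = {"median_year_built"}

def format_value(field, value):
    """Format a metric for display: years as-is, other numbers with commas."""
    if field in YEAR_FIELDS:
        return str(int(value))
    return f"{value:,}"

def poverty_rate(in_poverty, population):
    """Return the share of the population living in poverty, in percent."""
    return in_poverty * 100 / population

print(format_value("median_year_built", 1973))
print(format_value("population", 34973))

# Hypothetical counts for illustration only.
print(f"{poverty_rate(12_000, 100_000):.1f}%")
```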

This post originally appeared on the Open Data Delaware blog.