Analyzing out-of-state political donations with Neo4j

I competed in the graphs-4-good hackathon at Neo4j’s GraphConnect conference. The random guys at my table and I were the only team to use the FEC data, and we won the “Best use of Cypher” award.

We got to use a special sandbox that was set up at NICAR, a conference for data journalists. My team decided to look at in-state vs. out-of-state donations. This is the Neo4j model after trimming unnecessary nodes, such as “Occupation.”

The Candidate node has a “state” property, which we compare to the State node at the end of this relationship: (c:Contributor)-[:LIVES_IN]->(s:State)

The FEC data includes everyone running in a federal election, so everyone who mounts a primary campaign is in there. To keep it simple, we decided to focus on incumbent Senators in 2016, which you can see set in the WHERE clause below.

Since we need to sum up two different things, it was easiest to separately set new properties to hold the totals. This query sets the in-state total:

// Follow each contribution from donor to committee to candidate
MATCH (c:Contributor)-[mc:MADE_CONTRIBUTION]->(cb:Contribution)
MATCH (cb)-[mt:MADE_TO]->(fk:FECCommittee)
MATCH (fk)-[fun:FUNDS]->(cd:Candidate)
// Look up the state the donor lives in
MATCH (c)-[l:LIVES_IN]->(s:State)
WHERE cd.election_year = 2016
AND cd.incumbent = 'I'
AND cd.office = 'S'
// Keep only donors who live in the same state as the candidate
AND s.code = cd.state
WITH cd, sum(cb.amount) as total
SET cd.in_state = total
RETURN cd.name, cd.state, cd.in_state
ORDER BY total desc;

To set the out-of-state field, you need to edit the final AND line, and the SET and RETURN lines. Here are the last five lines of that query.

. . .
AND s.code <> cd.state
WITH cd, sum(cb.amount) as total
SET cd.out_of_state = total
RETURN cd.name, cd.state, cd.out_of_state
ORDER BY total desc;

Finally, we put them together and calculated the percentage of out-of-state donations with this query:

MATCH (cd:Candidate)
WHERE cd.election_year = 2016
AND cd.incumbent = 'I'
AND cd.office = 'S'
// Skip candidates with no in-state total
AND cd.in_state > 0
WITH cd, cd.in_state + cd.out_of_state as total
RETURN cd.name, cd.state, cd.in_state, cd.out_of_state, total, cd.out_of_state*100/total as perc_out
ORDER BY perc_out desc;
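For readers who don't have the sandbox handy, here is a minimal Python sketch of the same arithmetic the three queries perform: bucket each donation as in-state or out-of-state, sum the buckets per candidate, then compute the out-of-state percentage. The candidate names, states, and amounts are made up for illustration, and the function name is my own; none of it comes from the FEC data.

```python
from collections import defaultdict

# Hypothetical rows: (candidate, candidate_state, donor_state, amount).
# These values are invented for illustration, not FEC data.
contributions = [
    ("Candidate A", "DE", "DE", 500),
    ("Candidate A", "DE", "NY", 1500),
    ("Candidate B", "PA", "PA", 900),
    ("Candidate B", "PA", "CA", 100),
]


def out_of_state_share(rows):
    """Return {candidate: (in_state, out_of_state, percent_out)}."""
    totals = defaultdict(lambda: [0, 0])  # [in_state, out_of_state]
    for candidate, cand_state, donor_state, amount in rows:
        bucket = 0 if donor_state == cand_state else 1
        totals[candidate][bucket] += amount

    result = {}
    for candidate, (in_state, out_state) in totals.items():
        total = in_state + out_state
        # Guard against a candidate with no donations at all.
        percent_out = out_state * 100 / total if total else 0.0
        result[candidate] = (in_state, out_state, percent_out)
    return result


shares = out_of_state_share(contributions)
# Highest out-of-state percentage first, like ORDER BY perc_out desc.
for name, (in_state, out_state, pct) in sorted(
    shares.items(), key=lambda kv: kv[1][2], reverse=True
):
    print(f"{name}: in={in_state} out={out_state} out-of-state={pct:.1f}%")
```

Unlike the Cypher version, this does both sums in one pass, which is easy in Python but was more awkward for us to express in a single query at the hackathon.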

This is only a proof-of-concept. It only includes filings for about three months of donations, not the entire election cycle. But I’m hoping to apply it to a full election cycle data set soon.

 

Presentation: Census Data Demystified

I presented at BarCamp Philly.

Slides are available here.

List of links within the slides:

I like to learn by copying and editing. If you do too, pick one of my mapping blocks at bl.ocks.org/enactdev and follow these steps:

  • Clone the block’s Gist.
  • Replace the GeoJSON file with yours.
  • Replace the data JSON file with yours.
  • Rename the comparison fields in index.html (NOTE: I match a field in each GeoJSON feature with a key of the data JSON dictionary).
  • Rename the title and hover text in index.html.
  • Don’t forget to recenter (also in index.html)!
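The matching step in the note above is the one that usually trips people up, so here is a rough sketch of the idea. The blocks themselves are JavaScript; this is Python only to show the logic, and the field name GEOID, the "value" field, and the sample numbers are assumptions for illustration, not what any particular block uses.

```python
import json

# Hypothetical file contents. GEOID and "value" are assumed field names.
geojson_text = """
{"type": "FeatureCollection", "features": [
  {"type": "Feature", "properties": {"GEOID": "10001", "NAME": "Kent"}, "geometry": null},
  {"type": "Feature", "properties": {"GEOID": "10003", "NAME": "New Castle"}, "geometry": null},
  {"type": "Feature", "properties": {"GEOID": "10005", "NAME": "Sussex"}, "geometry": null}
]}
"""
data_text = '{"10001": {"value": 1}, "10003": {"value": 2}}'


def join_features(geo, data, key_field):
    """Copy data[key] into each feature's properties.

    Returns the keys that had no matching row, which is the first
    thing to check when a map renders with blank shapes.
    """
    missing = []
    for feature in geo["features"]:
        key = feature["properties"][key_field]
        row = data.get(key)
        if row is None:
            missing.append(key)
        else:
            feature["properties"].update(row)
    return missing


geo = json.loads(geojson_text)
data = json.loads(data_text)
missing = join_features(geo, data, "GEOID")
print("features without data:", missing)
```

If the missing list is long, the comparison field and the dictionary keys probably don't line up (for example, numbers in one file and strings in the other).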

 

Google Analytics Visualizations: Venn diagram of pageview overlap by session and map of weighted pageviews by state

While working with Google Analytics for Compass Red, I’ve developed some fun visualizations. Here is a Venn diagram of overlapping pageviews for a group of Good-Better-Best products. (Product names removed due to non-disclosure agreement.)

This is built by segmenting pageviews. You need three segments: Good-Better, Good-Best, and Better-Best. Within any one of those segments, the visits to the third page give you the number of people visiting all three, and of course getting the number of individual visitors to each single page is easy.
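Once you have the counts, turning them into the seven regions of the Venn diagram is just set arithmetic. Here is a small sketch assuming you can export the session IDs that viewed each page; the session IDs and the function name are made up, and Google Analytics itself only hands you the counts, not the sets.

```python
# Hypothetical session IDs that viewed each product page.
good = {"s1", "s2", "s3", "s4", "s5"}
better = {"s3", "s4", "s5", "s6"}
best = {"s5", "s6", "s7"}


def venn_regions(a, b, c):
    """Return the size of each of the seven regions of a three-set Venn diagram."""
    return {
        "a_only": len(a - b - c),
        "b_only": len(b - a - c),
        "c_only": len(c - a - b),
        "a_b_only": len((a & b) - c),  # pairwise overlap minus the center
        "a_c_only": len((a & c) - b),
        "b_c_only": len((b & c) - a),
        "all_three": len(a & b & c),
    }


regions = venn_regions(good, better, best)
for region, count in regions.items():
    print(region, count)

# Sanity check: the seven regions should add up to the union.
print(sum(regions.values()) == len(good | better | best))
```

The same subtraction works on plain counts: each pairwise-only region is the pairwise segment total minus the all-three count.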

This next visualization looks at the number of pageviews per 100,000 people in each state.

Population by state comes from the U.S. Census.
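The weighting is a one-liner, sketched here with placeholder numbers. The state names, pageview counts, and populations below are invented for illustration, not real analytics or Census figures.

```python
def per_100k(pageviews, population):
    """Pageviews per 100,000 residents."""
    return pageviews * 100_000 / population


# Hypothetical (pageviews, population) pairs.
states = {
    "State A": (1200, 950_000),
    "State B": (5000, 12_800_000),
}

weighted = {name: per_100k(views, pop) for name, (views, pop) in states.items()}
for name, rate in sorted(weighted.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {rate:.1f} pageviews per 100,000 people")
```

In this made-up example the smaller state comes out on top even though it has far fewer raw pageviews, which is the whole point of weighting by population.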

Visually explore Delaware’s latest Census data

There are two views, both offering the same Census measurements. One displays Delaware by county, and the other breaks New Castle County up into four areas. (Note: Those areas are set by the Census Bureau, and are called PUMAs, or Public Use Microdata Areas. Basically, they’re areas with enough of a population to be statistically meaningful. Minimum population is 100,000.)

Here are some interesting things I found. You can easily recreate the screenshots below, or create your own views.

You can definitely see the influx of retirees to Sussex County. Our southernmost county accounted for almost half of the population increase in Delaware since 2012, drawing in 16,861 new residents out of 34,973 for the entire state. In that same timespan, Sussex’s median age rose by two years to 48.7, and is now ten years higher than the average age in New Castle County. Lastly, the median year a home in Sussex was constructed rose from 1991 to 1998, and is 14 years later than Delaware’s average of 1984.

Overall, New Castle County population went up, but the Wilmington area and south NCCo went down.

Per-capita income dipped in northern NCCo, but it’s still by far the highest, almost $9,000 above the county average. Per-capita income rose the most in southern NCCo, by 16%, or almost $5,000.

I’m going to open source the project, but need to clean some things up first. For example, the median home construction year is treated like a number, so the year 1973 is printed as 1,973. There’s also work to be done connecting the population with various metrics, for example linking population to the number living in poverty to get the poverty rate.
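Both cleanup items are small. Here is a sketch of how they might look, written in Python just to show the idea; the metric names, the YEAR_METRICS set, and the sample poverty numbers are hypothetical and don't come from the project.

```python
# Hypothetical metric names; the real project may label these differently.
YEAR_METRICS = {"median_year_built"}


def format_value(metric, value):
    """Format a metric for display."""
    # Years are labels, not quantities, so skip the thousands separator.
    if metric in YEAR_METRICS:
        return str(int(value))
    return f"{value:,}"


def poverty_rate(in_poverty, population):
    """Percent of the population living in poverty."""
    return in_poverty * 100 / population


print(format_value("median_year_built", 1973))
print(format_value("population", 34973))
print(f"{poverty_rate(1200, 10000):.1f}%")  # made-up counts
```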

This post originally appeared on the Open Data Delaware blog.

Census visualization: where people commute to in Delaware and the U.S.

I teamed up with my Open Data Delaware co-coder Tom at the Open Bracket Hackathon to create “Worker Flow” maps to display pre-processed census data from the “LEHD Origin-Destination Employment Statistics,” which provide the flow of workers from one census block to another.

The raw census data by block is rolled up into tracts and counties. Here is the worker flow in Delaware by tract:

Here is the entire U.S. by county:

If you look closely, you can see a dark purple dot around DC. Here is the zoomed-in view:

That’s the influx of people from Maryland into DC for the work day!
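The roll-up from blocks to tracts and counties mentioned above relies on how Census block codes are built: a 15-digit block code starts with the 5-digit county code (state plus county) and the first 11 digits identify the tract. Here is a minimal sketch of that aggregation. The block codes and worker counts are made up, and this is not the pre-processing code we actually ran at the hackathon.

```python
from collections import Counter

TRACT_DIGITS = 11   # state (2) + county (3) + tract (6)
COUNTY_DIGITS = 5   # state (2) + county (3)

# Hypothetical rows: (home_block, work_block, workers).
flows = [
    ("100030001001000", "100030002001000", 4),
    ("100030001001001", "100030002001005", 3),
    ("100050501001000", "100030002001000", 2),
]


def roll_up(rows, digits):
    """Sum worker counts by the first `digits` characters of each block code."""
    totals = Counter()
    for home_block, work_block, workers in rows:
        totals[(home_block[:digits], work_block[:digits])] += workers
    return totals


by_tract = roll_up(flows, TRACT_DIGITS)
by_county = roll_up(flows, COUNTY_DIGITS)
print(dict(by_tract))
print(dict(by_county))
```

Keep the codes as strings throughout. Treating them as numbers drops leading zeros for states like Alabama (01) and breaks the prefix trick.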

Track campaign finance at de2016.com

de2016.com has been launched, with only four days to spare before election day.

For each candidate, you can view a heat map of political donations from Delaware by zip code. Donations are from 2015 and 2016. Links: presumed future governor John Carney and Republican upstart Anthony Delcollo.

You can also view a similar page for every committee that received donations since 2015 from the “Committees” link on the left. So even though I didn’t link to the actual candidate, you can still view information about Tom Gordon, or compare Mike Purzycki and Eugene Young.

Candidates for U.S. House submit their donation information to the FEC, so my pages feature less information about Lisa Blunt Rochester and Hans Reigle, but I did map their donations by zip code.

If you click on “Campaign Finance” on the left, you can see which entities made donations to the largest number of candidates, and click through to see which committees received those donations.

Caveat: Campaign finance data is very messy! Typos or slight differences in how campaigns register donor names can result in data that is not properly linked. This data will continue to be cleaned up, and will result in a better and better picture in the future. But, for now, there are plenty of insights available.
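To give a feel for the linking problem, here is a rough sketch of the kind of cleanup that catches the easy cases (case, punctuation, extra spaces) before counting how many committees each donor gave to. The donor names, committee names, and normalization rules are all made up for illustration; this is not the cleanup code behind de2016.com, and it won't catch real typos.

```python
import re
from collections import defaultdict


def normalize(name):
    """Uppercase, drop punctuation, and collapse runs of whitespace."""
    cleaned = re.sub(r"[^A-Z0-9 ]", "", name.upper())
    return " ".join(cleaned.split())


# Hypothetical rows: (donor name as filed, receiving committee).
donations = [
    ("Acme Corp.", "Committee 1"),
    ("ACME CORP", "Committee 2"),
    ("Acme  Corp", "Committee 2"),
    ("Jane Q. Public", "Committee 1"),
]


def recipients_per_donor(rows):
    """Count distinct committees per normalized donor name."""
    recipients = defaultdict(set)
    for donor, committee in rows:
        recipients[normalize(donor)].add(committee)
    return {donor: len(committees) for donor, committees in recipients.items()}


counts = recipients_per_donor(donations)
for donor, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{donor}: {n} committees")
```

Without the normalize step, the three spellings of the same donor would each show up as a separate entity with one recipient apiece.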

If you enjoy de2016.com please share it!