First week at Metis - Project Benson
Challenge
An email from a potential client:
Vinny & Julia - It was great to meet with you and chat at the event where we recently met and had a nice chat. We’d love to take some next steps to see if working together is something that would make sense for both parties. As we mentioned, we are interested in harnessing the power of data and analytics to optimize the effectiveness of our street team work, which is a significant portion of our fundraising efforts. WomenTechWomenYes (WTWY) has an annual gala at the beginning of the summer each year. As we are new and inclusive organization, we try to do double duty with the gala both to fill our event space with individuals passionate about increasing the participation of women in technology, and to concurrently build awareness and reach. To this end we place street teams at entrances to subway stations. The street teams collect email addresses and those who sign up are sent free tickets to our gala. Where we’d like to solicit your engagement is to use MTA subway data, which as I’m sure you know is available freely from the city, to help us optimize the placement of our street teams, such that we can gather the most signatures, ideally from those who will attend the gala and contribute to our cause. The ball is in your court now—do you think this is something that would be feasible for your group? From there we can explore what kind of an engagement would make sense for all of us. Best, Karrine and Dahlia WTWY International
Approach
The goal of this project is to help WTWY figure out the best places and times to deploy their street teams so they can invite more people to the gala. To make a well-grounded recommendation, we used historical MTA turnstile data and American Community Survey (ACS) census data to gain insight into commuting patterns in NYC.
Our target audience is female commuters and women working in tech, so we combined traffic flow from the MTA data with demographic information to identify the best locations for our client. Because many tourists visit NYC and WTWY employees do not work on weekends, we excluded Saturday and Sunday from our recommendation to avoid potential outliers in the ridership data.
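The weekend exclusion described above can be sketched as a simple date filter. This is a minimal illustration using only the Python standard library, not the project's actual code; the `MM/DD/YYYY` date format, the `(date, riders)` row layout, and the function names are all assumptions.

```python
from datetime import datetime

def is_weekday(date_str, fmt="%m/%d/%Y"):
    """Return True for Monday through Friday (the date format is an assumption)."""
    # datetime.weekday() returns 0 for Monday up to 6 for Sunday
    return datetime.strptime(date_str, fmt).weekday() < 5

def drop_weekends(rows):
    """Keep only the (date, riders) rows that fall on a weekday."""
    return [(day, riders) for day, riders in rows if is_weekday(day)]

# Hypothetical rows, for illustration only
sample = [("05/05/2017", 1200), ("05/06/2017", 900), ("05/08/2017", 1100)]
weekday_rows = drop_weekends(sample)
```

Here the Saturday row is dropped and the Friday and Monday rows are kept.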
Data Analysis
While exploring the data and analyzing traffic by day and by hour of day, we noticed inconsistencies in the hourly records caused by turnstile glitches. We therefore removed records with negative ridership or ridership greater than 80,000 per day.
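The cleaning step can be sketched as follows. MTA turnstile counters are cumulative, so a day's ridership is the difference between consecutive readings; a counter reset or glitch then shows up as a negative or implausibly large difference. The 80,000 threshold comes from the text, while the input layout (one reading per day for a single turnstile) and the function names are assumptions made for illustration.

```python
MAX_DAILY_RIDERS = 80000  # upper bound taken from the text above

def daily_ridership(readings):
    """Turn cumulative counter readings into per-day counts.

    `readings` is a list of (date, cumulative_count) tuples for a single
    turnstile, sorted by date (a hypothetical input layout).
    """
    daily = []
    for (_, prev_count), (day, count) in zip(readings, readings[1:]):
        daily.append((day, count - prev_count))
    return daily

def remove_glitches(daily, max_riders=MAX_DAILY_RIDERS):
    """Drop days with negative or implausibly large ridership."""
    return [(day, n) for day, n in daily if 0 <= n <= max_riders]

# Illustrative readings only: the counter resets on d3 and jumps on d5
readings = [("d1", 1000), ("d2", 2500), ("d3", 40), ("d4", 1540), ("d5", 9001540)]
clean = remove_glitches(daily_ridership(readings))
```

In this toy input, the reset on `d3` produces a negative count and the jump on `d5` exceeds the threshold, so only `d2` and `d4` survive.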
The top five MTA stations we recommend are:
- FULTON ST
- 23 ST
- 86 ST
- 59 ST COLUMBUS
- 14 ST-UNION SQ
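A ranking like the one above can be produced by summing cleaned ridership per station and taking the largest totals. The sketch below shows one way to do that with `collections.Counter`; the record layout, the function name, and the placeholder station names and numbers are assumptions and do not reflect the project's actual figures.

```python
from collections import Counter

def top_stations(records, n=5):
    """Sum ridership per station and return the n busiest, highest first."""
    totals = Counter()
    for station, riders in records:
        totals[station] += riders
    return totals.most_common(n)

# Placeholder records, not real MTA figures
records = [
    ("STATION A", 500),
    ("STATION B", 900),
    ("STATION A", 700),
    ("STATION C", 300),
]
ranking = top_stations(records, n=2)
```

The result is a list of `(station, total)` pairs, which makes it straightforward to join with demographic data before settling on a final recommendation.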
Conclusion
- The stations with the highest traffic offer transfers between lines. 23rd Street, for example, connects to several MTA lines and to buses, which ensures high traffic for collecting emails.
- The morning period (8am-12pm) is the best time to collect emails, followed by the 12pm-8pm period.
- Expanding the data to cover multiple years could improve the analysis by reducing the effect of outliers and special events.
- Placing street teams beyond subway stations could improve targeting accuracy, because not everybody commutes to work by train; adding teams in trendy neighborhoods could increase reach.
You can find the project on GitHub.