In the previous post, I used SQL and Python to clean the 2015 NYC taxi and Uber data. In this post, I explore and visualize that data using Pandas and Tableau, and present some interesting patterns in NYC taxi trips.
With the ever-increasing amount of data available and the development of dynamic route optimization algorithms, low-cost shared rides are becoming more and more popular. Uber and Lyft both offer shared ride services, UberPool and Lyft Line, to capture the benefits of the shareconomy. Via, an on-demand shared ride startup, offers flat-rate shuttle service in urban areas and has recently expanded its business to Brooklyn.
Here, I explore the 2015 NYC taxi dataset from Google BigQuery. I start with a big-picture analysis of NYC taxi services, analyze the features of NYC taxi trips, and then focus on local trips within Brooklyn. I built a simplified model (rather than a sophisticated dynamic TSP algorithm) to assess shared ride efficiency in Brooklyn. I found that over 15% of trips within Brooklyn are shareable late on weekend nights. Shared ride efficiency depends largely on the total number of trips, which underscores the importance of scale in the shareconomy.
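To give a rough idea of what a simplified shareability model can look like, here is a minimal sketch. It is an illustration under my own assumptions, not necessarily the exact model used later in this post: two trips count as shareable if they start within a few minutes of each other and both their pickup and dropoff points are close together. The `Trip` class, the `shareable_fraction` function, and the thresholds `max_wait_min` and `max_dist_km` are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Trip:
    pickup_minute: float  # minutes since the start of the time window
    pickup: tuple         # (latitude, longitude) in degrees
    dropoff: tuple        # (latitude, longitude) in degrees


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def shareable_fraction(trips, max_wait_min=5.0, max_dist_km=0.5):
    """Fraction of trips that have at least one compatible partner.

    Assumed rule (illustrative only): pickups within max_wait_min minutes,
    and both pickups and dropoffs within max_dist_km of each other.
    """
    n = len(trips)
    shareable = [False] * n
    # Brute-force pairwise comparison; fine for a sketch, O(n^2) in general.
    for i in range(n):
        for j in range(i + 1, n):
            a, b = trips[i], trips[j]
            if (abs(a.pickup_minute - b.pickup_minute) <= max_wait_min
                    and haversine_km(a.pickup, b.pickup) <= max_dist_km
                    and haversine_km(a.dropoff, b.dropoff) <= max_dist_km):
                shareable[i] = shareable[j] = True
    return sum(shareable) / n if n else 0.0


# Made-up example trips: the first two are close in time and space,
# the third starts far away from both.
example_trips = [
    Trip(0.0, (40.6800, -73.9700), (40.7000, -73.9500)),
    Trip(3.0, (40.6810, -73.9705), (40.7005, -73.9502)),
    Trip(2.0, (40.6500, -73.9000), (40.7000, -73.9500)),
]
fraction = shareable_fraction(example_trips)  # 2 of the 3 trips can be paired
```

Under a rule like this, the chance that any given trip finds a partner grows with the number of trips in the same time window, which is one way to see why scale matters so much for shared rides.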