Lyft Data Science Assignment

Thank you for taking the time to complete Lyft’s Data Science
Assignment!
Assignment
Lyft ridesharing is a two-sided marketplace with drivers and passengers. Every day new drivers
join the platform and existing drivers either drive or they do not. Suppose you are working as a
Data Scientist on the Driver Retention team whose primary goal is to reduce the rate of churn of
activated drivers (a driver becomes ‘activated’ once they complete their first ride).
The team would like to understand churn better. Explore the data to provide the team with a
deeper understanding of churn at Lyft. Your summary should include:
● The definition (with justification) for a driver to be considered churned.
● An assessment of the current business impact of churn on Lyft.
● Insights on factors affecting churn.
● Insights on segments of drivers more likely to churn.
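One possible way to make the first bullet concrete is an inactivity rule: a driver counts as churned if their most recent ride is older than some window. The sketch below is only an illustration of that idea. The function name label_churned, the 28-day default, and the toy dates are assumptions and do not come from the assignment; the threshold in particular should be justified from the data.

```python
from datetime import datetime, timedelta


def label_churned(last_ride_at, as_of, inactivity_days=28):
    """Flag a driver as churned when their last ride is older than the window.

    last_ride_at: dict mapping driver_id -> datetime of the most recent pickup.
    as_of: datetime treated as the end of the observation period.
    inactivity_days: hypothetical threshold, to be justified from the data.
    """
    cutoff = as_of - timedelta(days=inactivity_days)
    return {driver: last < cutoff for driver, last in last_ride_at.items()}


# Toy values for illustration only; not taken from the assignment data.
example = {
    "driver_a": datetime(2016, 6, 20),
    "driver_b": datetime(2016, 4, 1),
}
flags = label_churned(example, as_of=datetime(2016, 6, 27))
print(flags)
```

A rule like this treats drivers who onboarded late in the observation period the same as long-tenured ones, so the window and the cutoff date would need checking against the actual ride history.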
Next, the team would like to size the opportunity of reducing churn in order to prioritize their
roadmap. The team is considering the following two hypotheses:
i. Doubling the number of rides in an activated driver’s first week.
ii. Another hypothesis you recommend.
Using the data, help the team prioritize these two hypotheses. You should cover:
● How big the opportunities are.
● What might be the longer-term consequences on the marketplace of each hypothesis.
● Which segments of drivers are most likely affected by each hypothesis.
● Which hypothesis you have more confidence in.
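To size hypothesis (i), a natural first step is counting each driver's rides in their first week. The sketch below assumes the first week starts at the driver's first pickup (activation) rather than at the onboarding date; that reading, the function name first_week_ride_counts, and the toy timestamps are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def first_week_ride_counts(pickups):
    """Count rides in the 7 days starting at each driver's first pickup.

    pickups: iterable of (driver_id, pickup datetime) pairs.
    Returns a dict mapping driver_id -> number of first-week rides.
    """
    by_driver = defaultdict(list)
    for driver_id, picked_up_at in pickups:
        by_driver[driver_id].append(picked_up_at)

    counts = {}
    for driver_id, times in by_driver.items():
        start = min(times)                # assumed activation moment
        end = start + timedelta(days=7)   # exclusive upper bound
        counts[driver_id] = sum(1 for t in times if t < end)
    return counts


# Toy pickups for illustration only; not taken from the assignment data.
toy_pickups = [
    ("driver_a", datetime(2016, 4, 1, 9, 0)),
    ("driver_a", datetime(2016, 4, 3, 18, 30)),
    ("driver_a", datetime(2016, 4, 8, 9, 0)),   # exactly 7 days later: excluded
    ("driver_b", datetime(2016, 4, 2, 12, 0)),
]
week_one = first_week_ride_counts(toy_pickups)
print(week_one)
```

Comparing churn rates across buckets of these counts gives only a correlational estimate of the opportunity; drivers who ride more in week one may differ in other ways from those who ride less.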
Finally, suppose the team wants to test the following hypothesis: “eliminating the Prime Time
feature will decrease driver churn”. Design an experiment to do so. Your design should include:
● How you will divide observational units into control and treatment, and a description of
the treatment and control conditions.
● What are some potential second-order effects on the experience of drivers and
passengers during this experiment.
● What are the primary and secondary metrics you will track.
● How long you will run the experiment and how you will choose the winning variant.
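For the experiment duration question, a rough sample-size calculation can help bound how long the test needs to run. The sketch below uses the standard normal-approximation formula for comparing two proportions. It assumes drivers are randomized independently, which may not hold here because Prime Time changes affect the shared marketplace. The function name drivers_per_arm and the churn rates of 30% and 27% are hypothetical and are not taken from the data.

```python
from math import ceil
from statistics import NormalDist


def drivers_per_arm(p_control, p_treatment, alpha=0.05, power=0.80):
    """Approximate drivers needed per arm for a two-sided two-proportion test.

    p_control, p_treatment: assumed churn rates in each arm (hypothetical).
    alpha: two-sided significance level.
    power: desired probability of detecting the assumed difference.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_control - p_treatment
    return ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


# Hypothetical churn rates, chosen only to show how the function is called.
n = drivers_per_arm(0.30, 0.27)
print(n)
```

Dividing the required count by the number of newly eligible drivers per week gives a first estimate of the run time; a design that randomizes by region or time block instead of by driver would need a different variance calculation.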
Submission Instructions
1. Please do not write your name on any submission documents.
2. Using the data provided, aim to spend roughly 5-8 hours answering the questions.
3. Prepare a 20-minute presentation for a panel of Data Scientists. At Lyft, we believe
Data Scientists are most effective when they’re telling a story with data. Slides are
typically the most effective format, but you are welcome to use others (e.g.
iPython-markdown, R-markdown, or a Word doc, each converted to PDF) if you prefer.
4. Include all of your working materials (including all code) in a separate PDF.
5. Keep in mind that we will be grading the assignment based on its technical
soundness and depth, business applications and insights, structure and
organization, completeness and polish.
Data Provided
data/driver_ids.csv
    driver_id            Unique identifier for a driver
    driver_onboard_date  Date on which the driver was onboarded
data/ride_ids.csv
    driver_id            Unique identifier for a driver
    ride_id              Unique identifier for a ride that was completed by the driver
    ride_distance        Ride distance in meters
    ride_duration        Ride duration in seconds
    ride_prime_time      Prime Time applied on the ride
data/ride_timestamps.csv
    ride_id              Unique identifier for a ride that was completed by the driver
    ride_picked_up_at    Timestamp for when the driver picked up the passenger
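The files above share keys (driver_id and ride_id), so most analyses start by joining rides to their pickup timestamps. The sketch below shows one way to do that with the standard library. The helper names read_rows and join_rides, the timestamp format, the numeric types, and the inline toy rows are assumptions; the real files may differ and should be inspected first.

```python
import csv
import io
from datetime import datetime


def read_rows(handle):
    """Read a CSV file handle into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def join_rides(ride_rows, timestamp_rows, fmt="%Y-%m-%d %H:%M:%S"):
    """Attach ride_picked_up_at to each ride by ride_id (inner join).

    fmt is an assumed timestamp format; adjust it to match the real file.
    """
    picked_up = {
        row["ride_id"]: datetime.strptime(row["ride_picked_up_at"], fmt)
        for row in timestamp_rows
    }
    joined = []
    for ride in ride_rows:
        ts = picked_up.get(ride["ride_id"])
        if ts is None:
            continue  # ride has no pickup timestamp; skipped in this sketch
        joined.append({
            "driver_id": ride["driver_id"],
            "ride_id": ride["ride_id"],
            "ride_distance_m": float(ride["ride_distance"]),
            "ride_duration_s": float(ride["ride_duration"]),
            "ride_prime_time": float(ride["ride_prime_time"]),
            "picked_up_at": ts,
        })
    return joined


# Toy rows for illustration only. With the real data, open the files instead,
# e.g. open("data/ride_ids.csv", newline="").
toy_rides = io.StringIO(
    "driver_id,ride_id,ride_distance,ride_duration,ride_prime_time\n"
    "d1,r1,1800,320,50\n"
    "d1,r2,5200,900,0\n"
)
toy_timestamps = io.StringIO(
    "ride_id,ride_picked_up_at\n"
    "r1,2016-04-01 09:00:00\n"
)
rides = join_rides(read_rows(toy_rides), read_rows(toy_timestamps))
print(rides)
```

An inner join silently drops rides without a timestamp, so it is worth counting how many rows are lost before relying on the joined table.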

Original source: https://www.cnblogs.com/ddaaff/p/11265759.html
