Ride Hailing Service: Report¶

Benjamen Simon¶

The Problem¶

A new ride hailing service is being launched that connects riders with drivers for trips between the Airport and Downtown. It’ll be active for only 12 months. The riders are charged a fixed rate of \$30 per ride, and I have been tasked to propose a pricing strategy to maximise the profit of the business over the 12 months, given some contraints and assumptions. This document will present my strategy, as well as the thought process and data derived results behind it.

Bottom Line Up Front: The Strategy¶

Given the contraints, assumptions, and data given by the client, the optimal pricing strategy is to pay a fixed rate of \$25.75 per ride, resulting in a profit of \\$4.25 per ride assuming no other costs, and a total average profit across the 12 months of \$50,994.39, with a 95% confidence interval of (\\$48,452.82, \$52,770.18).

The Model: Assumptions and Constraints¶

For each event the ride-hailing system connects one rider request with one driver. The rider is charged a flat \$30, and the driver can be payed any amount, P. The profit per ride is calculated as (\\$30 - P).

The riders:

There are a total of 10,000 riders in the city.
The service begins with 0 riders active.
Each month a maximum of 1,000 new riders can join the service and be eligible to request rides (become active).
In the first month that a rider is active, they request an average of 1 ride per month (with the number of rides requested coming from a Poisson distribution).
In each subsequent month, riders request rides based on a Poisson distribution where lambda is the number of rides that they found a match for in the previous month.
If a rider finds no matches in a month (which may happen either because they request no rides or because they request rides and find no matches), they leave the service and never return.

The drivers:

There are a practically unlimited supply of drivers available.
When a ride is requested, a very large pool of drivers see the request and can choose whether to accept or reject it.
A ride request is accepted with some probability q which is based on the rate of pay P for the ride (the acceptance is generated from a Bernoulli distribution).

The Data:

Data was provided from a similar service denoting what the pay, P, was and whether the ride was accepted.
From the exploratory analysis, there is a strong positive non-linear relationship between the probability of acceptance and the rate of pay. A logistic regression model was fit to the data to estimate the probability that a given rate of pay would be accepted.
These estimated probabilities are used in the simulation.

The Model:

The model is simulated in discrete time with 1-month units.
The population of riders are considered homogeneous and exchangable, but with unique histories.
No features of the driver population are considered, just the probability of acceptance based on the data.
The maximum number (1,000) of riders is assumed to be acquired each month, at the start of the month, so their rate of requests are that of a whole month.

The Models¶

There is a clear trade off between the number of riders active (and the number of requests per rider) and the profit per ride. A higher rate of acceptance leads to more riders active and more ride requests, but lowers the profit per ride, or may even lead to a loss per ride. With this in mind I developed 3 different models to explore this relationship and identify if a non-uniform strategy could increase profits.

Flat Rate¶

In this model all of the drivers are payed at a fixed rate across for every ride across the whole 12 months. This will find the natural balance point between the rate of acceptance and the profit per ride.

Adaptive Rate¶

The system is set up such that a if a rider has no rides accepted during a month then they will leave the service and never return, and thus not generate any additional revenue. A portion of riders will naturally leave the service without any possible remedy if they make no requests in a month, but some will leave because their requests weren't accepted, and these cases are potentially avoidable. If an individual leaves in month 1 because their request wasn't accepted, then we are potentially missing out on 11 months of additional revenue, which is quite substantial.

This model works by attempting to maximise the number of riders who remain active, and then maximise profits on additional rides. If a rider has yet to have an accepted ride that month, the pay for their current request is at a premium (possibly even a loss) to maximise the probability of acceptance and guarantee they will stay active, any additional rides requested by this rider that month are payed at a value to optimise profit.

Bait and Switch Rate¶

As in the adaptive rate, we consider the relationship between the number of active riders and the rate of pay/acceptance rate. As previously noted some riders will naturally leave the service because they make no requests in a month, the likelihood of this is increased the lower the rate of requests for an individual, and the rate of requests of an individual is related to the number of accepted requests of the previous month. Thus this strategy focuses on maximising the rate of requests of users for a period using a low proft/loss pay rate, to keep as many users active as possible, and then switching to a high profit rate. Unfortunately a high rate of requests could take months to build up, and then if the high profit pay rate leads to low accetances, the high request rate will disappear.

The Data¶

Data was provided from a similar service on 1,000 rides, including their pay and whether or not they were accepted. 527 were accepted and 473 were rejected. The mean pay for all rides was \$25.71, with the mean pay for accepted rides being \\$32.08 (\$5.40 min, \\$53.67 max), and the mean pay for rejected rides being \$18.62 (\\$0.00 min, \$43.15 max).

Logistic Combined

The above plot (Figure 1) shows the data, ordered by pay on the x-axis, with random IDs on the y-axis, and coloured by acceptance status, with blue being rejected and yellow being accepted.

Logistic Seperated

The above plot (Figure 2) changes the ID labels to seperate the two groups to better demonstrate the behaviours.

As we can see in Figure 1 there is a clear sigmoid relationship between the pay and the acceptance rate, and Figure 2 shows that the two groups aren't seperable, and the majority of pay values in the middle resulted in some accepted rides and some rejected rides.

Logistic Regression Model¶

I am interested in using this data to inform simulations of the process and dictate whether a propose ride and pay is accepted or rejected. To do this I need an estimate of the probability of acceptance for each pay value. For this, it thus seems appropriate to use a logistic regression model fit to the data to estimate the probability of acceptance, rather than used in its usual function as a categoriser.

Logistic Predicted

The above plot (Figure 3) shows the predicted probabilities of acceptance for all pay values from \$0.00 to \\$60.00.

Figure 3 demonstates the same relationship as we saw in the data, however it is worth noting some caveats. For instance we noted that there were no acceptances below \$5.40. The sample size was not incredibly large but it is reasonable to assume that drivers will not take a job that is below a certain threshold, and would never do a job for free, where as this model predicts the acceptance of jobs at a \\$0.00 pay rate is 0.18%. We also saw similar behaviour at the other extreme. For this reason I believe the model should only be trusted for the interpolated bounds, for instance between \$5.00 and \\$50.00.

The Simulation Algorithm¶

Using the above probabilities I can now build a simulation of each strategy, the results of which can be investigated to identify the optimal parameters of each strategy and the overall best strategy. The general algorothm is as follows, with the stratgy specific changes occuring in step 3.

For each month:

Activate up to 1,000 new riders, ensuring the new total riders activated to date is 10,000 or less.
For all active riders, generate the number of ride requests that month given their personal rate.
For each ride request, choose a pay, and generate the accepted status given the output of the logistic model.
Deactivate any riders that did not have any accepted rides this month.

The Results¶

Overall the fixed rate strategy resulted in the highest profits across the 12 months. The other two models were unable to recover the loses of the rides focusing on acceptance and resulted in a lower overall profit. In cases where the number of active users or accepted rides increased, it was not sufficient to increase profits.

After the simulators were built, a latin hypercube of parameters was constructed, chosen inteligently for the context. Additional values were tested, before narrowing down a likely range of optimal values and running 50 simulations for each. In every case a roughtly hyperbolic relationship can be idenitfied with an obvious global maxima. The large standard deviations mean that there is a lot of overlap in the possible profits of each parameter set, and the best parameter maximises the likelihood of maximal profits, but they could be lower.

Fixed Rate¶

The optimal pay rate given my expirments is \$25.75 per ride, resulting in a profit of \\$4.25 per ride assuming no other costs, and a total average profit across the 12 months of \$50,994.39, with a 95% confidence interval of (\\$49,099.81, \$52,888.97). The average number of requested rides was 21,587 (20,891, 22,282) and the average number of accepted requests was 11,999 (11,553, 12,444).

Fixed: Profit

The above plot (Figure 4) shows the average profit in dollars (y-axis) against the pay in dollars for the fixed rate strategy.

Fixed: Requests and Acceptances

The above plot (Figure 5) shows the average number of requests (blue) and the average number of accepted requests (orange).

In this case we can see a clear maxima at \$25.75. It is possible there is a better maxima at some value between \\$25.50 and \$26.00, especially given the behaviour of the fitted line, but the gains would be minimal. To perfect this investigation I would run far more simulations per pay point and at finer grain intervals.

Adaptive Rate¶

This model has two rates - the initial rate before a user has had an accepted ride that month, and the subsequent rate for all additional requests after the first accepted request. Given my experiments the optimal pair of pay rates is (\$26.00, \\$25.00), resulting in a profit per ride of (\$4.00, \\$5.00). The total average profit across the 12 months was \$50,488.30, with a 95% confidence interval of (\\$48,786.05, \$52,190.55). The average number of requested rides was 21,432 (20,853, 22,011) and the average number of accepted requests was 11,829 (11,443, 12,215).

Adapt: Profit

The above plot (Figure 6) shows the average profit in dollars (y-axis) against the pairs of pay rates in dollars for the adaptive rate strategy.

Adapt: Requests and Acceptances

The above plot (Figure 7) shows the average number of requests (blue) and the average number of accepted requests (orange).

In both plots the pair that results in the highest profit is marked by the red vertical line.

Bait and Switch¶

This model has two pay rates and a change point month parameter - the initial rate increases the users request rate and is used until the change point month, and the subsequent rate for all additional months after. Given my experiments the optimal pair of pay rates and change point are (\$25.00, \\$25.00, 7), though as it is a fixed rate any changepoint month is acceptable, resulting in a profit per ride of \$5.00. The total average profit across the 12 months was \\$50,640.50, with a 95% confidence interval of (\$48,946.83, \\$52,334.17). The average number of requested rides was 19,931 (19,287, 20,575) and the average number of accepted requests was 10,128 (9,789, 10,467).

BaS: Profit

The plot above (Figure 8) shows the average profit in dollars (y-axis) against the pairs of pay rates in dollars for each month change point for the Bait and Switch rate strategy. The set that results in the highest profit is marked by the red vertical line.

BaS: Requests

The plot above (Figure 9) shows the average number of requests for each month changepoint. The set that results in the highest profit is marked by the red vertical line.

BaS: Acceptances

The plot above (Figure 10) shows the average number of accepted requests for each month change point. The set that results in the highest profit is marked by the red vertical line.

Discussion¶

From the above we can evidently see that the optimal strategy for maximising the profit of this service under these assumptions is to have a fixed rate price in the region of \$24 - \\$26, and further simulations could identify a global maxima to two deicimal places. Even the special cases with more freedom to adapt the rate were both optimised by the same fixed rate across the whole period. There are other more complex adaptations of course, such as a bait and switch rate that changed every month, or an adaptive rate that adapted more intelligently or more actively, ie. focusing on those with high request rates. However, I would theorise that given the evidence we have seen so far, and the relationships present in the data, that these would also be optimised by the same fixed rate.

Once the system is in place and more data has been collected, further analysis could offer improved strategies. For instance the probability of a price point being accepted may vary depending on time of day, weekday vs. weekend effects, holidays, end of month, weather, or person to person. Likewise some of the assumptions of the model, such as the highly varying rate of requests month to month, may prove to be invalid.