Adam Davis
4 min read · Feb 12, 2024


How to #7: A/B testing with DoWhy and EconML

“Painting of Gauss” -Christian Jensen

“It is not knowledge, but the act of learning, not possession but the act of getting there, which grants the greatest enjoyment.” - Carl Friedrich Gauss

Causal inference allows us to draw conclusions about cause and effect from data. It can help us estimate how groups would respond to a treatment by measuring its potential impact on the outcome of an experiment. A/B testing is one area of causal inference in which two groups are compared: one group receives a certain treatment, and the other, the control group, receives none. The outcomes of both groups are measured and a conclusion is drawn.

Background

Normally, an A/B test requires a trial or experiment to be completed before the analysis. This can get costly, because the cost of treatment grows with scale. An intent-to-treat strategy can reduce the number of subjects who need to receive the treatment. Intent to treat uses an instrumental variable that is applied to a random segment of the population; i.e., not everybody gets it.
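To make the instrumental-variable idea concrete, here is a minimal sketch in plain Python. It is not the DoWhy or EconML API. It uses synthetic data and assumes a single binary instrument, a binary treatment, and a constant treatment effect; all names and numbers are made up for illustration. The Wald estimator divides the instrument's effect on the outcome by its effect on treatment uptake.

```python
import random
from statistics import mean


def wald_estimate(z, t, y):
    """IV (Wald) estimate: effect of Z on Y divided by effect of Z on T."""
    y_z1 = mean(yi for zi, yi in zip(z, y) if zi == 1)
    y_z0 = mean(yi for zi, yi in zip(z, y) if zi == 0)
    t_z1 = mean(ti for zi, ti in zip(z, t) if zi == 1)
    t_z0 = mean(ti for zi, ti in zip(z, t) if zi == 0)
    return (y_z1 - y_z0) / (t_z1 - t_z0)


# Synthetic data (an assumption for illustration, not the hotel dataset).
rng = random.Random(0)
TRUE_EFFECT = 2.0
z, t, y = [], [], []
for _ in range(20000):
    u = rng.gauss(0, 1)                  # unobserved confounder
    zi = rng.randint(0, 1)               # instrument, assigned at random
    p_treated = 0.1 + 0.5 * zi + 0.3 * (u > 0)
    ti = 1 if rng.random() < p_treated else 0
    yi = 3.0 + TRUE_EFFECT * ti + 1.5 * u + rng.gauss(0, 1)
    z.append(zi)
    t.append(ti)
    y.append(yi)

iv_effect = wald_estimate(z, t, y)
# Naive comparison of treated vs. untreated, which the confounder biases.
naive_effect = (mean(yi for ti, yi in zip(t, y) if ti == 1)
                - mean(yi for ti, yi in zip(t, y) if ti == 0))
print(f"IV estimate: {iv_effect:.2f}, naive estimate: {naive_effect:.2f}")
```

Because the instrument is randomized, it is independent of the unobserved confounder, so the ratio recovers the treatment effect even though the naive treated-versus-untreated comparison does not.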

This experiment uses data from a hotel group and contains fields such as room rate, length of stay, and the type of hotel or resort. The experiment is as follows:

We want to see if giving someone the ability to refund their deposit causes them to become a repeat guest (re-book) and the effect that has on the average length of stay.

Instrumental variable is the refundable deposit.

Treatment is becoming a repeat guest.

Outcome is the length of stay.

The confounding variables in this case are whether the guest previously canceled, how many previous bookings were not canceled, and the type of hotel or resort. Once the impact has been estimated, it can be fine-tuned through closer analysis of where the split happens for treatment.

Causal Graph of Model
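The roles described above can also be written down as a small edge list. The sketch below is plain Python rather than DoWhy's graph format, and the node labels are illustrative, not column names from the dataset. The helper performs a simplified graphical sanity check, assuming the graph is complete: the instrument should affect the treatment, reach the outcome only through the treatment, and have no incoming edges from the confounders.

```python
# Illustrative node labels; they are not column names from the original dataset.
EDGES = [
    ("refundable_deposit", "repeat_guest"),             # instrument -> treatment
    ("repeat_guest", "length_of_stay"),                 # treatment -> outcome
    ("previous_cancellations", "repeat_guest"),         # confounder -> treatment
    ("previous_cancellations", "length_of_stay"),       # confounder -> outcome
    ("previous_bookings_not_canceled", "repeat_guest"),
    ("previous_bookings_not_canceled", "length_of_stay"),
    ("hotel_type", "repeat_guest"),
    ("hotel_type", "length_of_stay"),
]


def reachable(edges, start, blocked=frozenset()):
    """Return all nodes reachable from start without entering blocked nodes."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for src, dst in edges:
            if src == node and dst not in seen and dst not in blocked:
                seen.add(dst)
                stack.append(dst)
    return seen


def looks_like_instrument(edges, instrument, treatment, outcome):
    """Simplified check of the graphical conditions for an instrument."""
    affects_treatment = treatment in reachable(edges, instrument)
    only_via_treatment = outcome not in reachable(
        edges, instrument, blocked={treatment}
    )
    no_incoming_edges = all(dst != instrument for _, dst in edges)
    return affects_treatment and only_via_treatment and no_incoming_edges


print(looks_like_instrument(
    EDGES, "refundable_deposit", "repeat_guest", "length_of_stay"
))
```

If the refundable deposit also affected length of stay directly, the check would fail, which is the situation in which an instrumental-variable estimate is no longer trustworthy.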

First, select the model parameters. Three models are needed because we are using three variables.

Next, initialize and fit the model, identify the estimand, and visualize the model's performance on the test set. This gives us the conditional average treatment effect (CATE).

The estimated CATE is 4.57, which means guests stay 4.57 days longer on average. We don’t yet know which confounding variables affect this, but first we need to try to refute the accuracy of the estimate.

Coefficients Results Table for Model

Refutation

The goal of refutation is to confirm the accuracy of the CATE that the model has found. Each refuter re-estimates the effect and reports a new estimate and a p-value.

We can see that the estimates are close to the original. One way to get them closer is to increase the number of simulations for each refuter. If the p-values are all consistently under .05, then the hypothesis that offering customers a refundable deposit has this effect can be rejected. We could then assume that keeping deposits non-refundable could ultimately increase the number of nights stayed.
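The idea behind one common robustness check, re-estimating on random subsets of the data, can be sketched in plain Python. This is a stand-in for the concept, not DoWhy's implementation. The data is synthetic, and the estimator, subset fraction, and number of simulations are assumptions chosen for illustration.

```python
import random
from statistics import mean


def wald_estimate(rows):
    """IV (Wald) estimate from rows of (instrument, treatment, outcome)."""
    def avg(col, z_val):
        return mean(r[col] for r in rows if r[0] == z_val)
    return (avg(2, 1) - avg(2, 0)) / (avg(1, 1) - avg(1, 0))


def subset_refuter(rows, estimator, fraction=0.8, num_simulations=20, seed=0):
    """Re-run the estimator on random subsets and collect the new estimates."""
    rng = random.Random(seed)
    k = int(len(rows) * fraction)
    return [estimator(rng.sample(rows, k)) for _ in range(num_simulations)]


# Synthetic data (an assumption for illustration, not the hotel dataset).
rng = random.Random(1)
rows = []
for _ in range(5000):
    u = rng.gauss(0, 1)                       # unobserved confounder
    z = rng.randint(0, 1)                     # randomized instrument
    t = 1 if rng.random() < 0.1 + 0.5 * z + 0.3 * (u > 0) else 0
    y = 3.0 + 2.0 * t + 1.5 * u + rng.gauss(0, 1)
    rows.append((z, t, y))

original = wald_estimate(rows)
refuted = subset_refuter(rows, wald_estimate)
print(f"original: {original:.2f}, mean of subset estimates: {mean(refuted):.2f}")
```

A robust estimate should barely move across subsets. Raising `num_simulations` averages over more subsets, which is why more simulations tend to bring the refuted estimate closer to the original.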

Policy Interpretation

Plot the tree interpreter
Model Uncertainty Graph

The tree interpreter shows which segments are predicted to have the strongest impact on the outcome.
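As a rough illustration of how a shallow tree can surface such segments, the sketch below finds the single best split of per-guest effect estimates on one feature. It is plain Python with made-up numbers, not EconML's tree interpreter; the feature name and effect values are hypothetical.

```python
from statistics import mean


def best_split(feature, effects):
    """Find the threshold that best separates low-effect from high-effect units.

    Returns (threshold, mean effect at or below it, mean effect above it),
    chosen to minimize the squared error of a one-split tree.
    """
    values = sorted(set(feature))
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [e for f, e in zip(feature, effects) if f <= threshold]
        right = [e for f, e in zip(feature, effects) if f > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((e - left_mean) ** 2 for e in left)
               + sum((e - right_mean) ** 2 for e in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


# Hypothetical per-guest effect estimates (in days) and a hypothetical feature.
previous_bookings = [0, 0, 1, 1, 2, 3, 4, 5]
estimated_effects = [1.0, 1.2, 0.9, 1.1, 4.0, 4.2, 3.9, 4.1]

threshold, low_mean, high_mean = best_split(previous_bookings, estimated_effects)
print(f"split at {threshold}: {low_mean:.2f} days vs {high_mean:.2f} days")
```

The leaf with the larger mean effect is the segment where the treatment is predicted to matter most.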

Policy interpreter
Policy for Treatment Effect

The policy interpreter shows the result of applying the predicted treatment effect to each class. We can see that the segment that could use more analysis is hotels that aren’t CityHotels. It’s important to remember that the point of this test is to see whether a first-time guest can add more value than a repeat guest. Depending on what matters most to the business, some metrics may be more useful than others.
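A policy of this kind boils down to comparing the predicted effect in each segment against the cost of treating it. Here is a minimal sketch in plain Python, assuming hypothetical segment labels, effect estimates, and a made-up cost threshold; it is not EconML's policy interpreter.

```python
from collections import defaultdict
from statistics import mean


def recommend_by_segment(segments, effects, cost):
    """Recommend treatment in segments whose mean predicted effect exceeds cost."""
    grouped = defaultdict(list)
    for segment, effect in zip(segments, effects):
        grouped[segment].append(effect)
    return {
        segment: "treat" if mean(values) > cost else "do not treat"
        for segment, values in grouped.items()
    }


# Hypothetical predicted effects (in days) per guest; not results from the article.
segments = ["City Hotel", "City Hotel", "Resort Hotel", "Resort Hotel"]
effects = [3.0, 5.0, 0.5, 1.5]

policy = recommend_by_segment(segments, effects, cost=2.0)
print(policy)
```

Changing `cost` is one way to express what matters most to the business: a higher threshold restricts treatment to the segments with the largest predicted gains.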
