Two competing applications of non-parametric inference in practice – “Lazy” vs “Strict” Bootstrapping

At Airbnb, we quantify the performance of a given page via the Page Performance Score (PPS), a holistic measure of latency that incorporates several signals of user-perceived page performance, from rendering time to interactivity delays. Because the event distribution of a given PPS subcomponent is not guaranteed to be Gaussian, neither is the PPS distribution. This presents problems for classic parametric inference in an experimentation context (e.g., when asking “is ∆PPS = PPS_TREATMENT - PPS_CONTROL significant?”).

In an experiment randomized at the user level, we can compute the p-value associated with ∆PPS with a cluster/block bootstrap that resamples whole users, so that each user's events stay together. This follows from a “Strict” interpretation of the Null Hypothesis: that users in Control are not fundamentally different from those in Treatment. However, if users in the two groups are not fundamentally different, then the events those users generate should not be fundamentally different either, so we can instead blend all the events and resample at the performance-event level. This “Lazy” interpretation of the Null Hypothesis ignores the latent structure of the input data and can amplify the False Negative Rate by widening the Null Distribution, but this injection of symmetric noise ultimately generates more conservative p-values at the time of inference.

Which algorithm to apply depends on the business context: (1) a desire for absolute certainty in identifying a significant PPS improvement or regression, or (2) a fast, directional read on the practical impact of a PPS shift. In practice, the “Lazy” approach to bootstrapping results in a substantial speedup, reducing compute resources and runtime by ≈1.2x, with minimal loss in statistical Power compared to the “Strict” method.
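The two resampling schemes described above can be contrasted in a short Python sketch. This is an illustration under stated assumptions, not the implementation presented in the talk: the abstract does not give the PPS formula, so `page_stat` uses a plain mean of event values as a stand-in, and all function names (`page_stat`, `lazy_null`, `strict_null`, `p_value`, `observed_delta`) are hypothetical. Each arm is represented as a list of NumPy arrays, one array of event values per user. The "Strict" variant is assumed here to pool users across arms and resample whole users, while the "Lazy" variant pools individual events.

```python
import numpy as np


def page_stat(events):
    """Hypothetical stand-in for PPS: the mean of the event values."""
    return float(np.mean(events))


def observed_delta(control_users, treatment_users):
    """Observed difference: statistic on Treatment minus statistic on Control."""
    return (page_stat(np.concatenate(treatment_users))
            - page_stat(np.concatenate(control_users)))


def lazy_null(control_users, treatment_users, n_boot, rng):
    """Event-level ("Lazy") null: pool all events, ignoring which user made them."""
    n_c = sum(len(u) for u in control_users)
    n_t = sum(len(u) for u in treatment_users)
    pooled = np.concatenate(control_users + treatment_users)
    deltas = np.empty(n_boot)
    for b in range(n_boot):
        c = rng.choice(pooled, size=n_c, replace=True)
        t = rng.choice(pooled, size=n_t, replace=True)
        deltas[b] = page_stat(t) - page_stat(c)
    return deltas


def strict_null(control_users, treatment_users, n_boot, rng):
    """User-level ("Strict") null: pool users and resample whole users,
    so every event from a sampled user stays together."""
    pooled = control_users + treatment_users
    n_c, n_t = len(control_users), len(treatment_users)
    deltas = np.empty(n_boot)
    for b in range(n_boot):
        c_idx = rng.integers(0, len(pooled), size=n_c)
        t_idx = rng.integers(0, len(pooled), size=n_t)
        c = np.concatenate([pooled[i] for i in c_idx])
        t = np.concatenate([pooled[i] for i in t_idx])
        deltas[b] = page_stat(t) - page_stat(c)
    return deltas


def p_value(observed, null_deltas):
    """Two-sided p-value with a +1 correction so it is never exactly zero."""
    extreme = np.sum(np.abs(null_deltas) >= abs(observed))
    return float((extreme + 1) / (len(null_deltas) + 1))
```

Calling `p_value(observed_delta(c, t), lazy_null(c, t, 1000, rng))` and the same with `strict_null` gives the two p-values side by side. The user-level version has to gather and concatenate per-user arrays on every iteration, which is one plausible source of the extra cost of the "Strict" method, though the abstract does not say where its reported speedup comes from.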


Matthew Schreiner
