The Instacart dataset is a typical “Market Basket” problem faced by online retailers. I chose it because I wanted a relatable problem for my Google Data Analytics capstone project. Instead of the typical “showcase all the technical chops I can do” kind of post, which is great for flexing on the internet but not very useful in a real-world working environment, I’ve written a sort of “data analyst journey”: an example of how a data analyst might investigate a problem, form a hypothesis (customers with high customer lifetime value (CLV) buy different items than customers with low CLV), test it, and report the findings to her manager. You can find the Jupyter notebook here.