Let's talk about Ice Cream Sandwiches!

Shout out to another woman-owned business! These are amazing!!

Today we’re going to talk about the most delicious topic you’ll probably read all year in Analytics: Nightingale Chomps Ice Cream Sandwiches!

If you haven’t seen these yet, you need to run out to your favorite bougie grocery store and get you some. Might I suggest Three Sisters Provisions, if you happen to live on Boston’s South Shore?

This journey, dear reader, walks you through a simple story of data validation via Databricks where, because we invested in an independent record of our sales, our tiny mom and pop shop was able to find a critical reporting error in our POS system’s Item Library, report it, and ultimately make a better data-driven decision around restocking some of our most popular items!

The first lesson! Trust your ice cream sandwich-filled gut.

So, a bit about Ashley and me: She doesn’t have a sweet tooth. I most certainly do. Her inventory generally reflects a more mature set of taste buds, whereas mine are as juvenile as the average Kpop Demon Hunters viewer.

Wait, did he just make a random Kpop Demon Hunters reference? Yes… Yes, he did.

Her first order from Nightingale had Chocolate Blackout, Key Lime Pie, and Strawberry Shortcake. We did pretty solid sales of Chocolate Blackout and Key Lime, but less so the Strawberry Shortcake in our opening weeks.

Sprinkles… mmm, sprinkles.

So one Sunday morning, I took it upon myself to do the reordering for this brand, and I noticed they had other flavors. Some our two boys would most definitely want to try; others, like Mocha Brownie, were right up my alley. One in particular caught my eye and turned out to be the main subject of our data story: Birthday Cake Chomps 4-pack. You see, my 6-year-old has never seen a sprinkle-covered product he doesn’t like. My purchasing theory was that he’s a pretty good proxy for all the other little ones coming into our store with their parents, and that we should probably cater to them. So, on that basis, I ordered our first batch of 7 Birthday Cake Chomps around two weeks ago.

When you’re in the retail biz, one thing I recommend you do, because it’s delicious, is try the new stock. So one night I pulled a box from our order, brought it home, and let the boys try it out. No surprise, they loved it. Fast forward a couple weeks, and we only had one box left in stock. For those math majors at home, that should mean we had actually sold 5 boxes of Birthday Cake Chomps. However, our sales numbers were all off: the system showed only 1 sale from that original order. Our Retail Library confirmed our stock count was accurate, which meant there was no shrinkage at play, but a whole slew of other details were off, from sales to revenue generated.
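For anyone who isn't a math major, the back-of-the-envelope reconciliation looks something like this. It's a minimal sketch in plain Python; the function and parameter names are made up for illustration and aren't part of any POS API.

```python
def expected_units_sold(received: int, on_hand: int, removed_not_sold: int = 0) -> int:
    """Units that must have been sold, assuming no shrinkage.

    received:         units that arrived in the order
    on_hand:          units physically left in stock
    removed_not_sold: units pulled for other reasons (e.g. taste testing at home)
    """
    sold = received - on_hand - removed_not_sold
    if sold < 0:
        raise ValueError("More units accounted for than were received")
    return sold


# Birthday Cake Chomps: 7 ordered, 1 box taken home, 1 box left in the freezer.
print(expected_units_sold(received=7, on_hand=1, removed_not_sold=1))  # 5
```

If the number your POS reports is lower than this and the stock count is right, the sales report is the thing to question.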

Notice a problem? It’s impossible to have 6 orders in 7 days and not also have at least 6 orders in the past 30 and 90 days.

Ralph, failing as hard as Toast’s sales math

The data has changed since I reported the issue, but not in a particularly good way. An example of the inaccuracies appearing today via our Retail Library is the fact that we show only one sale of Chocolate Blackout in the last 7 days, nine in the last 30 days, and then only one again in the last 90 days. That’s unpossible!
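The rule being broken here is simple: a longer lookback window can never contain fewer orders than a shorter one. Here's a minimal sketch of that sanity check in plain Python. The function name and the dictionary layout are my own assumptions for illustration, not anything the POS system exposes.

```python
def window_violations(counts: dict[int, int]) -> list[str]:
    """Flag rolling-window counts that shrink as the window grows.

    counts maps a lookback window in days to the orders reported for it.
    """
    problems = []
    windows = sorted(counts)
    # Compare each window with the next larger one.
    for shorter, longer in zip(windows, windows[1:]):
        if counts[shorter] > counts[longer]:
            problems.append(
                f"{shorter}-day count ({counts[shorter]}) exceeds "
                f"{longer}-day count ({counts[longer]})"
            )
    return problems


# Chocolate Blackout as reported: 1 in 7 days, 9 in 30 days, 1 in 90 days.
print(window_violations({7: 1, 30: 9, 90: 1}))
# ['30-day count (9) exceeds 90-day count (1)']
```

An empty list means the windows are at least consistent with each other, which is not the same as being correct.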

Houston: We have a problem.

In fact, the issue today seems to be a bit worse than it was when I first caught the problem. Here’s the latest sales data from our “system of record” on orders in the last 7, 30, and 90 days.

This also shows up in the currency denominated sales columns.

Now, if I had no independent record of our sales this story would be just about over. I’d have had to trust my ice cream sandwich-filled gut and file a vaguely worded support ticket like “Hey folks, ah, this seems weird can you check in on it?”

However, I had so much more to offer because I am armed with a superpower: a whole retail POS data dump, exported from its API and structured in a (reasonably) clean Delta Table stored in Databricks, all built entirely with Databricks Assistant.

Quick detour, so you know a bit more about me, and why the data I’ll show you next looks a bit weird: I’m a nut for clean data. I also suspect this might be a solid lead for why the data looks off in the way it does. When you use our POS system to scan in new inventory, it makes up the name for a given bar code using an LLM. It’s pretty cool, but it can be a bit inconsistent, probably due to the different ways the product shows up in other retail establishments’ data. One delicious flavor scans in as “Nightingale Mocha Brownie” whereas another scans in as “Nightingale Inc. - CHOMPS Chocolate Blackout 4 pack”. Yuck - Eww, and Gross!

As you might imagine, as a reformed data analyst, this causes me a small degree of distress. I’ve sought to standardize the names of all of our products around a “Brand Name - Product Name (optional size variation)” nomenclature, but I have to do so outside the heat of the moment when melty stock shows up at our doorstep, because no one likes melty ice cream sandwiches!

Once stocked in the freezer and cleaned up, the sandwiches and their data look great, and the clean names help to distinguish between similar products across two different brands. Imagine “Chocolate Chip Cookie” and … “Chocolate Chip Cookie” as compared to “Piping Plover Co - Chocolate Chip Cookie” versus “Chips Ahoy! - Chocolate Chip Cookie”. It’s actually really important when reporting sales and making restocking decisions.
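If you want to enforce the same “Brand Name - Product Name (optional size variation)” pattern in your own item library, a tiny helper goes a long way. This is a minimal sketch in plain Python. The function name is hypothetical, and the Birthday Cake example name is illustrative rather than the exact string in our library.

```python
from typing import Optional


def standard_name(brand: str, product: str, size: Optional[str] = None) -> str:
    """Build a display name as 'Brand - Product (size)'; size is optional."""
    name = f"{brand.strip()} - {product.strip()}"
    if size and size.strip():
        name += f" ({size.strip()})"
    return name


print(standard_name("Piping Plover Co", "Chocolate Chip Cookie"))
# Piping Plover Co - Chocolate Chip Cookie
print(standard_name("Nightingale", "Birthday Cake Chomps", "4-pack"))
# Nightingale - Birthday Cake Chomps (4-pack)
```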

Entropy, as imagined in Kpop Demon Hunters

Ok, so now that you know that I’ve been working against entropy in the form of temperature while also working against, well, entropy in the form of weird results from some AI-embedded naming model, you can see for yourself how our displayNames have evolved over the course of the past month. Also, I still have some work to do, apparently, in cleaning up the names!

So, what you can see here is that we’ve made some sales under legacy names, and then made more sales under newer, standardized names. At the time, the issue seemed to be that the POS system was only reporting on the newer, standardized names. I’m less certain of that today, but it seemed plausible at the time. What I can say is that it’s been delightful using the Databricks Assistant to help explore the data. A good example is this prompt, and the associated code.

This would have taken me significantly longer in a drag-and-drop UI like Alteryx.

Chomps Classic was only introduced in our second order, and it’s currently in the lead in sales and revenue!

So, whereas my “system of record” says I’ve variously only sold one or two Mocha Brownie, I’ve actually sold four, and regardless, my personal favorite Nightingale Ice Cream Sandwich is in danger of being cut from the roster.
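The core idea behind that count is rolling sales under legacy names up into their standardized names before adding anything up. Here's a minimal sketch of it in plain Python. To be clear, this is not the code from my notebook: the mapping, the function name, and the sample rows are all made up for illustration, and the only real number is the Mocha Brownie total of four.

```python
from collections import Counter

# Hypothetical mapping from legacy display names to standardized ones.
CANONICAL = {
    "Nightingale Mocha Brownie": "Nightingale - Mocha Brownie",
}


def units_by_product(line_items):
    """Sum quantities per product, folding legacy names into standard ones.

    line_items is an iterable of (display_name, quantity) pairs.
    """
    totals = Counter()
    for display_name, quantity in line_items:
        # Names without a mapping are assumed to be standardized already.
        totals[CANONICAL.get(display_name, display_name)] += quantity
    return dict(totals)


# Illustrative rows only: the split between old and new names is invented.
sample = [
    ("Nightingale Mocha Brownie", 1),
    ("Nightingale - Mocha Brownie", 2),
    ("Nightingale Mocha Brownie", 1),
]
print(units_by_product(sample))  # {'Nightingale - Mocha Brownie': 4}
```

Counting only the standardized name in this made-up sample would report two sales instead of four, which is the kind of undercount that gets a perfectly good flavor cut from the roster.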

The obvious takeaways from all this are threefold:

  1. You need to externally validate your systems of record.

  2. Databricks is a great platform for even non-coders to do so.

  3. Please go buy some Mocha Brownie Ice Cream Sandwiches from Three Sisters Provisions.

“But Joe,” you might be saying, “I don’t have a team of people who can set up Databricks in the robust, high-security environment my enterprise requires.” Well, do I have the team for you! Give me a ring, and I’ll connect you to the excellent data professionals at Indicium. We’ll get you sorted out, migrated from your legacy platforms, with fresh data landing in your very own Delta Tables on Databricks where you, too, can leverage Databricks Assistant to get fresh answers to your pressing questions about ice cream sandwiches… or whatever else you might care about!