Metadata Can’t Solve Your “Cold Start” Problem

Jun 18

Written By Ryan Mach

When you’re building a recommendation engine, there’s no substitute for strong, reliable, first-party data.

Photo: Ranker

We talk a lot about the great big trove of preference data that powers our Watchworthy app. That’s not just because we’re very proud of that data (though that’s definitely part of it) — it’s because that data is what allowed us to deliver highly precise recommendations to TV viewers from the moment they downloaded the app. Because of that data, we were able to avoid the “cold start” problem.

What is the cold start problem?

If you’re unfamiliar with the term “cold start,” it refers to the difficulty that recommendation apps powered by first-party data have gaining traction in the first few months after release. Marketing a recommender app that doesn’t have any user data yet is like marketing a car with no gas in it: no one will appreciate how fast the car goes if you can’t drive it off the lot.

It’s a problem of circular logic:

How do we offer better recommendations to users?

By using first-party data.

How do we get more first-party data?

By getting more people to download the app.

How do we get more people to download our app?

By offering better recommendations to users.

No one wants to download an app that will offer great recommendations eventually. They want something that will be uniquely useful to them right there, right then. Many apps that offered users targeted TV show recommendations before Watchworthy ended up failing because they initially generated recommendations from metadata, hoping to gradually wean their algorithms off metadata and replace it with first-party user data as more users downloaded the app.

What’s wrong with metadata?

There’s nothing inherently wrong with using metadata in most cases! It can be very useful for certain tasks: say, for example, that you’re trying to organize a catalogue of television content. Developers on the backend can make a somewhat subjective judgment call on what TV shows or movies are about (theme data), how they should make us feel (mood data), and how to categorize them (genre/micro-genre/keyword data). That metadata is useful because it’s sufficient for placing shows into intuitive categories that users can easily search through.

Metadata is not so useful when it comes to building a recommendation engine. Why? Because it doesn’t express the voice of the consumer. For instance, a metadata-driven engine might recommend Vinyl to fans of Empire because, according to the developer-authored metadata associated with them, both of these shows are about the music industry. But since Vinyl was a flop with audiences and Empire went on for six seasons, the user might be pretty disappointed with their metadata-driven recommendation.
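To make the failure mode concrete, here is a minimal sketch of how a metadata-driven recommender might score shows. It is an illustration, not a description of any real product: the show names, the tags, and the choice of Jaccard overlap as the similarity measure are all assumptions made for the example.

```python
# Toy metadata-driven recommender. Titles and tags are hypothetical
# placeholders, not real catalogue data.
METADATA = {
    "show_a": {"music industry", "drama", "family"},
    "show_b": {"music industry", "drama", "period"},
    "show_c": {"sitcom", "family", "workplace"},
}


def jaccard(a: set, b: set) -> float:
    """Share of tags two shows have in common (0.0 when both are untagged)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_by_metadata(liked: str, metadata: dict, top_n: int = 1) -> list:
    """Rank every other show by tag overlap with the show the user liked."""
    scores = {
        title: jaccard(metadata[liked], tags)
        for title, tags in metadata.items()
        if title != liked
    }
    # Highest overlap first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


print(recommend_by_metadata("show_a", METADATA))
```

The score depends only on the tags someone assigned on the backend. Nothing in it reflects whether real viewers of one show actually enjoyed the other.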

We may think we know what makes fans of one show or movie like another, but human taste is often strange and variable, and that’s what makes the task of psychographic profiling so hard. The only way to predict taste accurately is through a statistically meaningful sample of first-party data. And if your brand-new app doesn’t already have a great deal of first-party data to use, your best option is to bring it in from somewhere else.
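As a rough sketch of what seeding a new app with existing preference data could look like, the example below ranks shows by how many fans they share, using votes collected elsewhere. Everything here is hypothetical: the vote records, the field layout, and the cosine-style overlap score are assumptions chosen for illustration, not the method described in the white paper.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical imported votes: (user_id, show, liked). Placeholder data only.
IMPORTED_VOTES = [
    ("u1", "show_a", True), ("u1", "show_c", True), ("u1", "show_b", False),
    ("u2", "show_a", True), ("u2", "show_c", True),
    ("u3", "show_a", True), ("u3", "show_b", False), ("u3", "show_c", True),
    ("u4", "show_b", True), ("u4", "show_d", True),
]


def fans_by_show(votes) -> dict:
    """Map each show to the set of users who voted that they liked it."""
    fans = defaultdict(set)
    for user, show, liked in votes:
        if liked:
            fans[show].add(user)
    return dict(fans)


def co_like_similarity(fans_x: set, fans_y: set) -> float:
    """Cosine-style overlap between two shows' fan sets."""
    if not fans_x or not fans_y:
        return 0.0
    return len(fans_x & fans_y) / sqrt(len(fans_x) * len(fans_y))


def recommend_from_votes(liked: str, votes, top_n: int = 1) -> list:
    """Rank other shows by shared fans; works before the app has any users."""
    fans = fans_by_show(votes)
    if liked not in fans:
        return []  # no preference data for this show yet
    scores = {
        show: co_like_similarity(fans[liked], show_fans)
        for show, show_fans in fans.items()
        if show != liked
    }
    return sorted(scores, key=lambda s: (-scores[s], s))[:top_n]


print(recommend_from_votes("show_a", IMPORTED_VOTES))
```

Because the votes already exist on day one, a brand-new user can get a preference-based recommendation after naming a single show they like, without waiting for the app to accumulate its own data.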

Want to learn more about how we built a TV recommendation engine using Ranker Insights data? We tell the whole story in our Watchworthy white paper, which you can download here for free.

