The Short Tale About The Long Tail

AI models should be simplified so that less technical people can fine-tune and customize them for niche groups and long-tail use cases in other organizations.

Guy Ernest

13.11.2022

A lot has changed since Chris Anderson published his famous 2004 Wired article about the long tail. Back then, Amazon and Netflix were nice ideas benefiting from the growing popularity of the Internet, and Spotify did not yet exist. Companies like these proved that if you understand the value of the long tail and know how to build the technology to support it, you can outperform your traditional hits-only competitors and sometimes grind them to dust.

Is there more to the long tail that can benefit tech companies today beyond e-commerce and entertainment? We can learn much more from a better understanding of the economics of the long tail and the type of new technologies needed to support it. 

What is the long tail?

Hit-driven economics is a creation of an age without enough room to carry everything for everybody. Not enough shelf space for all the CDs, DVDs, and games produced. Not enough screens to show all the available movies. Not enough channels to broadcast all the TV programs, not enough radio waves to play all the music created, and not enough hours in the day to squeeze everything out through either of those sets of slots.
This is the world of scarcity. Now, with online distribution and retail, we are entering a world of abundance. And the differences are profound.
(CHRIS ANDERSON, “The Long Tail”)

Hit-driven economics was based on the simple observation that 80% of sales come from 20% of the items in the catalog, and often 50% come from the top 5% of the items. This phenomenon was treated as a law of nature and is described by its own statistical distribution, the power law. Who wants to fight against nature?

Long tail/Power law diagram from Wikipedia

An example of a power law graph showing popularity ranking. To the right (yellow) is the long tail; to the left (green) are the few that dominate. In this example, the cutoff is chosen so that areas of both regions are equal.
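
To make the 80/20 observation and the equal-area cutoff concrete, here is a minimal Python sketch. It assumes a hypothetical catalog of 10,000 items whose sales follow a Zipf-style power law (sales proportional to 1/rank). The catalog size, the exponent, and the function names are illustrative assumptions, not figures from Anderson's article or from any real retailer.

```python
def zipf_sales(n_items, exponent=1.0):
    """Hypothetical sales per item: rank r sells in proportion to 1 / r**exponent."""
    return [1.0 / (rank ** exponent) for rank in range(1, n_items + 1)]


def head_share(sales, top_fraction):
    """Fraction of total sales contributed by the top `top_fraction` of items."""
    head_count = max(1, round(len(sales) * top_fraction))
    return sum(sales[:head_count]) / sum(sales)


def equal_area_cutoff(sales):
    """Smallest rank at which the head accounts for at least half of all sales."""
    half = sum(sales) / 2
    running = 0.0
    for rank, value in enumerate(sales, start=1):
        running += value
        if running >= half:
            return rank
    return len(sales)


# Assumed catalog: 10,000 items, already sorted from best seller to worst.
sales = zipf_sales(10_000)
top_20_share = head_share(sales, 0.20)
top_5_share = head_share(sales, 0.05)
cutoff_rank = equal_area_cutoff(sales)

print(f"Top 20% of items: {top_20_share:.0%} of sales")
print(f"Top 5% of items:  {top_5_share:.0%} of sales")
print(f"Head and tail are equal in area at rank {cutoff_rank}")
```

Under these assumptions, the top 20% of items account for a bit more than 80% of sales, and the equal-area cutoff falls within the first hundred ranks. A smaller exponent flattens the curve and moves more of the sales into the tail.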

When Jeff Bezos, the founder of Amazon.com, decided to leave his highly lucrative job on Wall Street and build a company on this new Internet thing in 1994, he started by selling books. He didn't choose the most lucrative category at the time, nor the most glamorous. He decided on the category with the best long tail and the one that would benefit the most from the technologies he was about to develop at Amazon.com. Since physical bookstores can only show a dozen hits in the window display and only a few thousand books on their shelves, his online bookstore had to build a personal recommendation engine that would show each user their top matches and a search engine that would search across millions of books in distributed catalogs and warehouses.

Another quality of books makes them a good fit for a long-tail business. Books are very personal, and the value that each individual gives to different books is highly diverse. You will get many different answers if you ask different people what their favorite books are.

Long tail items have higher personal value, from Wikipedia

The tail becomes bigger and longer in new markets (depicted in red). In other words, whereas traditional retailers have focused on the area to the left of the chart, online bookstores derive more sales from the area to the right.

The higher value of personal long-tail items changes the economic calculation significantly. If the long tail can contribute more than the 20% of sales that the 80/20 rule assigns to it, the investment in developing technology that makes long-tail items easy to find and access has a better ROI.

Long Tail Technologies

The Internet enabled the rise of e-commerce and streaming services such as Amazon.com, Netflix, and Spotify. But the Internet was not enough, and these companies had to invest heavily in technologies such as personalization and recommendation systems (see the Amazon.com and Netflix research publications in this domain). They had to invent new ways to deliver items as quickly as possible and at the best possible quality (see the Netflix and Spotify research publications in this domain). Amazon was even willing to cannibalize its own physical book sales, investing heavily in Kindle technologies that let readers get their books immediately. Immediate access had been the physical bookstores' main advantage.

With these and many other technologies, these companies could give their customers a better experience than they used to have in the old world of hit-driven services. The users see a short list of items to choose from, an intuitive interface to browse additional items and search for specific ones, and finally, immediate access to the item. The way to benefit from the long-tail items was now open. 

What is next in long-tail economics

Since the original observation that long-tail items can have a higher value on the personal level ("what are your favorite books?"), the tech companies have discovered additional ways to improve their margins on long-tail items by negotiating better deals with the creators of these items and taking a higher percentage of the sales for themselves. In return, they lowered the barriers to publishing and selling "indie" items and decreased the cost of many of these items. This creates a win-win-win situation where users discover new personalized items, creators find ways to sell their creations, and tech companies increase their profits.

Chris Anderson (and Clay Shirky just before him) was very observant and accurate in his predictions about the future of the companies that he saw innovating in the long-tail direction, such as Amazon.com, Netflix, and Rhapsody (Spotify was yet to come and take over the music market). What he could not foresee was the next wave of user-created content on services such as Instagram, YouTube, Twitch, TikTok, and Medium. Without saying anything about the quality of the content created and distributed on these services, it is clear that the monetary cost of creating and consuming content is now almost zero. The only scarce resource left is people's time and attention.

The other significant aspect that developed dramatically in recent years is the simplicity of creating niche groups that value and promote items on the long tail. There are hundreds of thousands of active subreddits on Reddit and communities on Discord, an endless number of groups on Facebook and WhatsApp, and an unlimited number of hashtags on Twitter and LinkedIn. Each serves a small but dedicated group of people interested in a niche set of items from the long tail. These groups can be scattered around the globe; however, they feel connected over the Internet.

These niche groups and low-cost creations will continue to shape the economics of long-tail markets and give rise to new services that support them.

What is next in long-tail technologies

Much has been written about the advancements in personalization and recommendation systems as they manifest on TikTok (for example, in the NYTimes and the Washington Post). Without arguing about the ethical and political aspects of such an algorithm, it is clear that it can process billions of signals in near-real-time and produce highly accurate match predictions for hundreds of millions of users and hundreds of thousands of clips. It is hard to imagine a much better recommendation system than that.

Similar advancements were made in supply chain and logistics to allow fast and efficient distribution of long-tail physical goods. Examples range from robots picking and packing millions of orders daily in many Amazon fulfillment centers, to drones that will deliver small packages in less than 30 minutes, to self-driving trucks that can distribute goods between cities and countries quickly and efficiently.

Not surprisingly, many long-tail technologies are based on machine learning and artificial intelligence (AI). The main reason is that the scarcest resource is people. Scaling manual work is almost impossible, and people's capacity to perform the tasks needed to operate a long-tail business is the main limiting factor. You can't hire, train, manage, and retain enough people for a long-tail business. The profit from each item diminishes as you go down the long tail, while the manual effort per item stays constant, so at some point handling an item by hand stops making business sense.
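
The break-even argument above can be sketched with simple arithmetic. The snippet assumes, purely for illustration, that profit falls off as 1/rank from a hypothetical best seller and that serving each item costs a fixed amount. All numbers and names are made up for the example.

```python
def last_profitable_rank(top_item_profit, cost_per_item):
    """Deepest catalog rank whose profit still covers a fixed per-item cost.

    Assumes profit(rank) = top_item_profit / rank (a hypothetical power law).
    """
    if cost_per_item <= 0:
        raise ValueError("cost_per_item must be positive")
    # profit(rank) >= cost_per_item  <=>  rank <= top_item_profit / cost_per_item
    return int(top_item_profit // cost_per_item)


TOP_ITEM_PROFIT = 100_000  # hypothetical yearly profit of the best-selling item
MANUAL_COST = 50.0         # hypothetical yearly cost of handling one item by hand
AUTOMATED_COST = 0.5       # hypothetical yearly cost per item once automated

manual_reach = last_profitable_rank(TOP_ITEM_PROFIT, MANUAL_COST)
automated_reach = last_profitable_rank(TOP_ITEM_PROFIT, AUTOMATED_COST)

print(f"Manual handling is profitable down to rank {manual_reach:,}")
print(f"Automated handling is profitable down to rank {automated_reach:,}")
```

With these made-up numbers, manual handling pays off only for the top 2,000 items, while an automated process that is a hundred times cheaper per item reaches 200,000 items.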

You have to use AI!

Currently, machine learning and artificial intelligence technologies are in the hands of very few data scientists. Therefore, very few systems are built today, and those that are built target only very high-value ("hit") problems. There is a need to make it simpler for less technical people to fine-tune and customize AI models to fit the specific requirements of niche groups or long-tail use cases in other organizations.

The large cloud providers offer AI services such as Automatic Speech Recognition (ASR), Natural Language Understanding (NLU), Machine Translation (MT), Question Answering (Q&A), Computer Vision (CV), Chatbot Frameworks, and others. They even offer options to fine-tune and customize the AI models for more specific long-tail use cases. For example, Amazon Transcribe and Google Speech-to-Text offer APIs for ASR customization, and Amazon Rekognition and Azure Vision offer similar APIs for CV.

Nevertheless, there is a need for better and simpler services that make AI technologies more accessible and valuable for enterprise companies to adopt. At the recent Google Cloud Next conference, Google Cloud announced a good step forward in building computer-vision-based systems. The cloud providers will announce many other similar services in the future.

We are still far away from the day you can buy a robot and teach it to cook your grandmother's secret recipes for your family or to organize your closet the way you like it. We are also still far away from the day that each company can hire AI into its work processes with its specific business language, policies, and workflows. This is going to be the next big long-tail revolution. When we move from hit-driven AI models, such as the self-driving-car models of Tesla and Mobileye or the ASR models behind Google Assistant, Amazon Alexa, and Apple Siri, to AI models that serve specific niche needs, the total value of the long tail will be uncovered. This is where aiOla and a few others (for example, Landing.ai in CV) focus their efforts: building the foundation of the second wave of long-tail technologies and businesses.

Takeaways

It is easy to focus our attention on the most popular movie or book, the largest company or customer, the winning sports team or athlete, and ignore the long tail. Yet there is a massive opportunity in the long tail, driven by improvements in AI technologies and by social shifts such as the rising awareness of diversity and of the right to be weird and unique.

Look around and not just "under the lamppost," embrace individualism and the need for personalization, form small groups around niche interests, and create new art and technology to improve people's lives. We can all flourish in the long tail.