AI in the Long Tail
By Kirk Borne  |  Oct 08, 2021
This cogent article by Kirk Borne employs mathematical reasoning to explore how smaller companies’ collective investments (the ‘long tail’) drive the advance of artificial intelligence even more than outlays by bigger players, particularly in healthcare.

COLUMBIA, MARYLAND - Long tail distributions are very useful in many statistical applications. One example well known to statistics fans is Zipf's Law, which applies in many domains. It describes an inverse relation between rank and frequency, e.g., the Nth most common word in a language occurs 1/N times as frequently as the top-ranked word.
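
Written as a formula, the rank-frequency relation described above could be stated as follows. Here f(N) stands for the frequency of the item ranked N; the symbol f is notation assumed for this illustration and does not come from the article.

```latex
f(N) = \frac{f(1)}{N}, \qquad N = 1, 2, 3, \dots
```

Under this relation, the 2nd-ranked word would occur half as often as the top-ranked word, and the 10th-ranked word one tenth as often.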

Zipf’s Law has been applied to the national size distribution of cities, to the distribution in lengths of rivers, to the distribution in sizes of businesses and more. It is no fundamental law, like the law of gravity, but it is an empirically observed relationship that seems to hold up in many situations.

The value of the 1/N function rises steadily as N gets smaller, reaching its peak at the top-ranked item in the list, N=1. Conversely, the value falls as N (the rank) grows larger, and it falls ever more slowly with each new value of N at very large N. That slowly declining stretch is called the long tail of the distribution. It can be very long indeed.

One application of the long tail in e-commerce is that the total (aggregate sum) of sales to all small buyers (high values of N, thus outside the category of highest-ranked buyers) is comparable to the total combined sales to the highest-ranked buyers.

Mathematically, this last statement holds approximately in decade ranges, e.g., the aggregate sum of sales in the range ‘ranks N=1 through N=10’ is roughly equivalent to the aggregate sum of sales in the range ‘ranks N=11 through N=100,’ which is roughly equivalent to the aggregate sum of sales in the range ‘ranks N=101 through N=1,000.’ As long as items are on the ranked list, and the total N keeps increasing, the total volume of sales keeps growing. These approximations are based on an empirically observed relationship, not a natural law of math.
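
The decade-range claim can be checked numerically with a short sketch. It assumes that sales follow the idealized 1/N relation exactly and that the top-ranked item has sales of 1 unit; the function name `decade_sums` is hypothetical and chosen only for this illustration.

```python
def decade_sums(max_exponent=3):
    """Sum 1/N over successive decades of ranks: 1-10, 11-100, 101-1000, ...

    Assumes an idealized Zipf relation where the item at rank N has
    sales of exactly 1/N (the top-ranked item is normalized to 1).
    """
    sums = []
    lower = 1
    for exponent in range(1, max_exponent + 1):
        upper = 10 ** exponent
        # Aggregate sales for all ranks in this decade, inclusive.
        total = sum(1.0 / n for n in range(lower, upper + 1))
        sums.append((lower, upper, total))
        lower = upper + 1
    return sums


for lower, upper, total in decade_sums(4):
    print(f"ranks {lower}-{upper}: {total:.3f}")
```

Under these assumptions the four decades sum to about 2.929, 2.258, 2.298 and 2.302. The first decade is somewhat larger than the rest, while later decades settle close to the natural logarithm of 10 (about 2.303), which is why each decade of the tail contributes roughly the same aggregate volume.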

The content herein is subject to copyright by The Yuan. All rights reserved. The content of the services is owned or licensed to The Yuan. The copying or storing of any content for anything other than personal use is expressly prohibited without prior written permission from The Yuan, or the copyright holder identified in the copyright notice contained in the content.