How many shape characters does Torus need to identify tunes? Currently it uses an average of 12.9 characters, although the number needed varies between 7 and 26.

Looking in more detail, suppose we pick tunes in the database
uniformly at random and give Torus the first `k` characters of
their shape strings, for some fixed `k`. Then we can define two
possible measures:

- the **uniqueness probability** = the probability that a tune is identified uniquely;
- the **resolution** = the reciprocal of the expected number of matches found.
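
As a rough illustration of these two definitions, here is a minimal Python sketch. It is not Torus's own code. It assumes that a tune "matches" when its shape string starts with the `k` characters supplied, and the function names and the toy shape strings are invented for the example.

```python
from fractions import Fraction


def match_counts(shapes, k):
    """For each tune, count the tunes whose shape string starts with
    that tune's first k characters (assumed matching rule)."""
    counts = []
    for shape in shapes:
        prefix = shape[:k]
        counts.append(sum(1 for other in shapes if other.startswith(prefix)))
    return counts


def uniqueness_probability(shapes, k):
    """Probability that a uniformly chosen tune matches only itself."""
    counts = match_counts(shapes, k)
    return Fraction(sum(1 for c in counts if c == 1), len(counts))


def resolution(shapes, k):
    """Reciprocal of the expected number of matches for a uniformly chosen tune."""
    counts = match_counts(shapes, k)
    expected_matches = Fraction(sum(counts), len(counts))
    return 1 / expected_matches


# Made-up shape strings, for illustration only (not real Torus data).
toy_shapes = ["UUDR", "UUDD", "UDRU", "DDRU"]
for k in range(1, 5):
    print(k, uniqueness_probability(toy_shapes, k), resolution(toy_shapes, k))
```

With these four toy strings, `k = 1` gives a uniqueness probability of 1/4 and a resolution of 2/5, and both measures reach 1 at `k = 4`. The pairwise comparison is quadratic in the number of tunes, which is fine for a sketch but would want a prefix index for a large database.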

We can express each of these measures as a number between 0 and 1:
the higher the value, the better Torus is at identifying tunes by
using `k` shape characters. We can then plot these measures
against `k` to get an idea of the number of shape characters we
might need to use:

The graph above suggests some puzzles: What shape should we expect the curves to be? Do they approach each other as the number of tunes becomes large?

This page is maintained by Thomas Bending,
and was last modified on 8 March 2020.

Comments, criticisms and suggestions are welcome.
Copyright © Thomas Bending 2021