English Alphabet Database Submission

Download: 11 samples of English Audio
(If you download this file, please add your voice to the database below.)

We are building the largest possible audio database of people speaking the letters of the English alphabet, free for anyone to use. We hope that this data will aid speech recognition efforts and be put to clever uses that we cannot even foresee. Please help us out: all you need to do is record yourself saying the alphabet!

Your submissions are anonymous. Everybody will be able to hear your voice, but it ends there: we don't track anything about you, and we even generate a random filename for your submission.
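
For the technically curious, random naming of this kind could be sketched as follows. This is a minimal illustration rather than the code this site runs: the function name, the list of accepted extensions, and the 32-character hex format are all assumptions.

```python
import secrets
from pathlib import Path

# Assumption: a hypothetical list of accepted audio extensions.
ALLOWED_EXTENSIONS = {".wav", ".mp3", ".ogg", ".m4a"}


def random_submission_name(uploaded_name: str) -> str:
    """Return a random filename that keeps only the upload's extension."""
    extension = Path(uploaded_name).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported audio type: {extension!r}")
    # 16 random bytes become 32 hex characters; nothing from the
    # original filename (which might contain a name) is kept.
    return secrets.token_hex(16) + extension


if __name__ == "__main__":
    print(random_submission_name("Jane Doe alphabet.WAV"))
```

Because the stored name is drawn from a cryptographically strong random source, it carries no information about the uploader or the original file.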

Instructions

  1. Make a recording of yourself saying each letter of the English alphabet, one after the other, with a short pause between letters. Like this:
  2. Open this web app to record yourself, or use an app on your mobile phone to make a recording. (You may use any recording program you wish.)
  3. Download your recording and then submit it in the form below.

Audio File Upload

Agreement
Note to spammers: your audio spam will be filtered using automated tools we have developed.

Why are we collecting this data?

To have an open and freely available set of recordings of people speaking the letters of the English alphabet, for any purpose anyone can think of. This project seeks to enable projects that would not be possible without such a rich dataset.

If you run a Google search for "number of English speakers", you will see that there are approximately 335 million speakers of English on the planet. With so many English speakers, you might think that there would already be an open, free, and widely available database containing thousands of audio samples of people speaking English letters, but there is not. We want to fix this.

On a more personal level, we wish to collect this data to aid our own work in digital signal processing. With a massive database of people speaking English letters, we think the speech recognition community could use the data to further its research, hopefully culminating in better computer recognition of spoken language.

The current project that we are working on is spoken letter recognition. Wish us luck!

Where can I get a copy of the data?

Download: 11 samples of English Audio
(If you download this file, please add your voice to the database too.)

We will be releasing the data periodically, but we need your help. This database only gets better with more submissions, so we would really appreciate it if you added your voice. When we reach 100 submissions, we will release the next version of the database. Please submit your voice to get us closer to our next release.

We want to maintain a high-quality, trustworthy database. Unfortunately, this means that we need to review all of the files in the database before we can release them. This prevents spammers and trolls from polluting the database with files that are not appropriate for this dataset. Luckily, we have some automated tools to help us sort the good submissions from the bad, but this filtering process will still take time.
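
Our actual tools are not described here, but one simple first-pass check of this kind could look like the sketch below. It assumes WAV input, and the function name and duration bounds are illustrative guesses. It only rejects files that are unreadable or implausibly short or long; it says nothing about what was actually spoken.

```python
import io
import wave

# Hypothetical bounds: 26 letters with short pauses plausibly take
# somewhere between 10 seconds and 2 minutes.
MIN_SECONDS = 10.0
MAX_SECONDS = 120.0


def plausible_alphabet_recording(wav_bytes: bytes) -> bool:
    """Return True if the bytes are a readable WAV file of plausible length."""
    try:
        with wave.open(io.BytesIO(wav_bytes), "rb") as recording:
            frame_rate = recording.getframerate()
            frame_count = recording.getnframes()
    except (wave.Error, EOFError):
        return False  # not a WAV file, or a truncated one
    if frame_rate <= 0:
        return False
    duration = frame_count / frame_rate
    return MIN_SECONDS <= duration <= MAX_SECONDS
```

A check like this is cheap enough to run on every upload, leaving the slower review for files that pass it.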

What license is this database under? / What can I use the audio samples for?

The idea of such a database is that it should be:

  • Freely available to everybody.
  • Usable for both non-commercial and commercial purposes.

For this reason we are releasing the English Alphabet Audio Database under the following license:
English Alphabet Database is licensed under a Creative Commons Attribution 4.0 International License.

If you contribute audio files, be aware that you will be releasing them under this license.