
software development: transcribe speech to text using public APIs

$250-750 USD

Cancelled
Posted more than 7 years ago

Paid on delivery
1. Task

You are asked to write a program that:
- takes an audio file as input,
- chops it into clips at sentence boundaries,
- sends these audio clips one by one to three different public speech recognition services,
- saves the audio clips, together with the text transcribed by those services, into a MySQL database with the columns: timestamp, length, audio clip, Google, Baidu, iFlyTek, flag

where:
- timestamp (4-byte time): start time of the audio clip in the original audio file
- length (4-byte integer): audio clip length in milliseconds
- audio clip (binary): 16-bit, 16 kHz, single-channel PCM
- Google transcription (text): UTF-8
- Baidu transcription (text): UTF-8
- iFlyTek transcription (text): UTF-8
- flag (integer): 0 if all three transcriptions are the same, 1 if two match, 2 if all are different

2. Audio Source

The audio could be in mp3/m4a/aac/ogg/wma format. It is extracted from YouTube videos. Our target is educational lectures. One example is this YouTube video: [login to view URL]. You can extract the audio with [login to view URL]; the downloadable mp3 result is at [login to view URL]. You can use this for any YouTube content.

3. Audio Segmentation

If you view an audio file with a tool (there are many out there), you will visually see the separation between silences and voices. Some silences are merely word boundaries or even just syllable boundaries. The rule we ask you to implement is: cut when either the silence is long enough, or the "sentence" is already 7 seconds long. In the latter case we need to chop at the locally longest silence gap.

I see this sentence boundary identification as the most challenging part for those not familiar with audio signal processing, so I have outlined the logic above. Still, the next question is how to actually calculate "silence".
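For orientation only, one common way to calculate "silence" is to threshold the short-time RMS energy of fixed-size frames, and then apply the two cutting rules above to the resulting silence gaps. The sketch below is not necessarily the method from the linked page. The 20 ms frame length, the RMS threshold of 500, the 400 ms "long enough" gap, and the function names are all assumptions chosen for illustration; only the 16 kHz sample rate and the 7-second limit come from the description.

```python
import math

SAMPLE_RATE = 16000  # 16 kHz mono 16-bit PCM, as in the task definition
FRAME_MS = 20        # assumed analysis frame length


def frame_is_silent(frame, threshold):
    """Treat a frame as silent when its RMS amplitude is below the threshold."""
    if not frame:
        return True
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    return rms < threshold


def find_cut_points(samples, threshold=500.0, min_gap_ms=400, max_len_ms=7000):
    """Return cut positions in milliseconds for a sequence of PCM samples."""
    frame_len = SAMPLE_RATE * FRAME_MS // 1000
    n_frames = len(samples) // frame_len
    silent = [
        frame_is_silent(samples[i * frame_len:(i + 1) * frame_len], threshold)
        for i in range(n_frames)
    ]

    # Collect runs of consecutive silent frames as (start_ms, end_ms) gaps.
    gaps = []
    run_start = None
    for i, is_silent in enumerate(silent + [False]):
        if is_silent and run_start is None:
            run_start = i
        elif not is_silent and run_start is not None:
            gaps.append((run_start * FRAME_MS, i * FRAME_MS))
            run_start = None

    # Zero-length sentinel gap at the end so the tail is also length-checked.
    total_ms = n_frames * FRAME_MS
    gaps.append((total_ms, total_ms))

    cuts = []
    clip_start = 0
    candidates = []  # (gap length, midpoint) of short gaps since the last cut
    for start, end in gaps:
        mid = (start + end) // 2
        # Rule 2: the clip is already too long, so cut at the longest short
        # gap inside the allowed window (hard cut if there is none).
        while mid - clip_start > max_len_ms:
            limit = clip_start + max_len_ms
            window = [(g, m) for g, m in candidates if clip_start < m <= limit]
            cut = max(window)[1] if window else limit
            cuts.append(cut)
            clip_start = cut
        # Rule 1: a long enough silence is a sentence boundary.
        if end - start >= min_gap_ms:
            cuts.append(mid)
            clip_start = mid
            candidates = []
        else:
            candidates.append((end - start, mid))
    return cuts
```

Cutting at the midpoint of a gap leaves a little silence on both sides of each clip. Trimming leading or trailing silence, and tuning the threshold for noisy recordings, are left out of this sketch.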
Please follow up with the methods listed on this page: [login to view URL]

As one of the acceptance criteria for this project, we will randomly select 50 audio clips (using a random number generator on the Internet), listen to them, and confirm that the sentence boundary error rate is less than 5%.

4. Speech Recognition

The three speech recognition engines are:
- Google: [login to view URL]
- Baidu: a Python wrapper for the Baidu Yuyin API: [login to view URL] [login to view URL]
- iFlyTek (Xunfei): "Integrate iflytek SDK to Implement Chinese Voice Recognition in AOSP": [login to view URL]

Note that integration with all three speech recognition engines is required. That is, you need to do three integrations, each with its own complexities, such as applying for a free account and receiving tokens/keys. For both Baidu and iFlyTek, you are encouraged to use Google Translate, as much of the content is in Chinese.

Both Google and Baidu are simple REST APIs, which allows you to implement in essentially any platform and language. The iFlyTek API, however, is really an SDK, and the best example I found is the Android version given above. Taken together, your only choice is an Android application.

5. Implementation

We are open to suggestions, but given the above we expect a pure Android APK implementation. I will first push/copy several extracted/converted audio files onto an Android phone or tablet, then run your Android APK and get the results in a corresponding set of files, either in a MySQL database or simply in CSV format. I will then pull/copy these files back to my computer. You shall provide a way for me to go to a random clip, play its audio, read the transcribed text, place it into, say, the Google web service, and see the results.
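The flag column defined in section 1 reduces to counting how many distinct transcriptions the three services return. A minimal sketch in Python follows. The description only says "the same", so the normalization step (Unicode NFKC, case folding, dropping punctuation and whitespace) is an assumption added here so that trivial formatting differences do not count as disagreement; the function names are hypothetical.

```python
import unicodedata


def normalize(text):
    """Assumed normalization: NFKC, case-fold, drop punctuation and spaces."""
    text = unicodedata.normalize("NFKC", text).casefold()
    return "".join(
        ch for ch in text
        if not ch.isspace()
        and not unicodedata.category(ch).startswith(("P", "Z"))
    )


def agreement_flag(google, baidu, iflytek):
    """Return 0 if all three match, 1 if two match, 2 if all differ."""
    distinct = {normalize(google), normalize(baidu), normalize(iflytek)}
    # 1 distinct value -> 0, 2 distinct values -> 1, 3 distinct values -> 2
    return len(distinct) - 1
```

If exact string equality is what the client intends, `normalize` can be replaced by the identity function without changing `agreement_flag`.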
Project ID: 11391108

About the project

15 proposals
Remote project
Active 8 years ago

15 freelancers are bidding an average of $582 USD for this job
I want to discuss this project with you further. Let me know the most suitable time for you to schedule a meeting. Feel free to message me at any time; I am usually online 14 hours a day on this website, so you will probably get a quick response from me.
$773 USD in 10 days
5.0 (11 reviews)
6.5

Do you have any API in mind to implement?
$555 USD in 5 days
4.7 (8 reviews)
4.6

I am a person with strong analytical ability in Mathematics/Statistics/Economics/Finance, having a BSc. (specialized in Statistics), an MBA (specialized in Finance), and an MSc. (specialized in Financial Mathematics). On-time delivery, clear communication, and hard work are my key attributes. I have performed a number of data analyses for qualitative and quantitative research: Correlation Analysis (Correlation test / Crosstabulation / Chi-square test / Granger Causality test), Variance Analysis (ANOVA/MANOVA), Regression Analysis (Simple Linear Regression / Logistic Regression / Multiple Regression), Time Series Analysis (ARIMA), Econometric Analysis (VAR), Experimental Design (DOE), Factor Analysis (CFA/EFA). I am familiar with SPSS, Stat, MINITAB, Eviews, LISREL, EQS.
$250 USD in 10 days
5.0 (11 reviews)
4.0

I'm interested, but there is no project description, so I don't know what to write here. Message me back with info if you are up for it. Cheers, Alek
$555 USD in 10 days
5.0 (1 review)
3.0

Hello, professional developers with similar expertise here. We are posting our bid as an expression of interest and would appreciate further discussion on the private message board. We are waiting for your message so that we can provide you with a detailed proposal with pricing and timeline.
$526 USD in 10 days
4.4 (1 review)
2.4

A proposal has not yet been provided
$500 USD in 8 days
0.0 (0 reviews)
2.4

About the client

Cupertino, United States
5.0
1
Payment method verified
Member since Aug 29, 2016
