AI Text to Speech Jobs
I am looking for a freelance developer or team to create a local AI avatar system with real-time voice interaction and facial/lip synchronization for Latin American Spanish. We already have a basic avatar that can display responses, but it does not speak or animate facial movements naturally. The goal is to build an avatar that can: speak directly using AI-generated voice (TTS); synchronize mouth/facial movements with speech; simulate realistic modulation using at least the 5 main vowel mouth shapes (visemes/phonemes); run locally (offline or local server environment); allow flexible integration with different AI providers; and work primarily in Latin American Spanish. Main requirements: • Local execution: the system must run locally using CPU/GPU resources. Cloud dependence shou...
I want to bring everyday Hindi-English-Marathi conversations into one streamlined Android app. The idea is simple: I point the camera at a street sign or menu, the app grabs the text with Optical Character Recognition, instantly translates it, and then reads the result back to me through a clear Text-to-Speech engine. The same flow should work when I speak or type a phrase: I receive a fast, accurate translation plus an optional audio playback so I can mimic the correct pronunciation on the spot. I want it to work as a communication tool even for everyday exchanges with auto-rickshaw drivers, sabzi mandi vendors, hawkers, etc. Core flow • Capture text with OCR from photos or the live camera view • Translate bi-directionally between Hindi, English and Marathi • Convert transl...
**AI-Powered YouTube Thumbnail & Hook Generator (ChatGPT Image 2.0 & Gemini 3.1 Pro)**

## **Project Description:**
I am looking for an expert Full-Stack Developer to build a web application that automates the creation of high-converting YouTube thumbnails and viral "hook" titles. This platform will cater to content creators by turning simple ideas into cinematic, "click-worthy" visuals using a specialized template system.

### **Key Technical Requirements:**
* **Image Generation:** Integration with **ChatGPT Image 2.0 (GPT Image 2)** API. The app must support high-resolution (2K) output and **multi-turn editing (inpainting)** to allow users to add or edit elements on the same image.
* **Text & Logic:** Integration with **Gemini 3.1 Pro** API to generate...
I have a series of educational modules that need lively, AI-generated voiceovers. The goal is an energetic tone that keeps learners engaged while still sounding clear and professional. Here’s what I need from you: 1. Select or create an AI voice that fits an upbeat classroom style—nothing robotic or monotone. 2. Generate polished audio for each lesson script I supply (roughly 8-10 minutes per lesson, delivered as high-quality WAV or MP3). 3. Tweak pacing, emphasis, and pronunciation so technical terms come through naturally and the material flows. 4. Deliver a separate, neatly labeled file for every lesson plus a master list of any pronunciation rules you had to set. I’m open to whichever tool—ElevenLabs, PlayHT, Amazon Polly, Google Cloud TTS, or anot...
I run a small-scale internet radio station using radio automation software (AzuraCast and Liquidsoap). I want an AI voice that automatically joins the stream for a three-minute world-news break at the top of each hour. The core requirements are straightforward: • Content source – pull the latest headlines and short summaries on-the-fly from Worthy and Rapture Ready News (RSS is acceptable, but I prefer that the data be read directly from the web page without needing RSS). If one feed is down the system may fall back on any publicly available source you feel fits the same style. • Scheduling in AzuraCast – simple on/off toggle plus a field where I can list specific hours (e.g., 08:00-22:00) when the bulletin should air; outside those hours it stays silent. • Custom intro &n...
realistic AI voice generation. It allows users to create natural-sounding speech from text in multiple languages and voices. Here is a breakdown of what makes their platform stand out: Key Features Text-to-Speech (TTS): Converts written text into high-quality, human-like audio. It excels at capturing nuance, emotion, and natural pacing, making it sound less robotic than traditional TTS engines. Voice Cloning: Allows you to upload a short sample of a real human voice and create a digital replica. You can then use that cloned voice to read any text. Voice Library: A community-driven library where you can browse and use thousands of pre-made voices suited for different tones, ages, and styles. Speech-to-Speech (STS): Lets you speak into a microphone and transform your audio into another...
I'm after someone who is a master at AI tools (Claude, etc.). I have a lot of tasks to be done that I'm too busy to do myself: website creation, SEO, email signature animations, Google My Business review help, and a lot more.
I need a reliable partner who can turn my catalogue of English-language, non-fiction e-books into clear, listener-friendly audiobooks using a high-quality computer voice. The library sits at about 100 titles now—each one between 150 and 300 pages—and I add new books every month, so this is an ongoing arrangement. Standard pronunciation is perfectly acceptable; no specialist jargon guidance is required. What matters most to me is a consistent sound profile across titles and a straightforward, per-book rate that I can factor into my release schedule. Deliverables • One finished audio file (MP3 or WAV) per book, edited for pauses, volume-leveled, and ready for distribution • File named with the exact book title plus author name • Turnaround time and cost c...
I need a production-ready AI agent that plugs straight into my WhatsApp Business number, chats fluently in Arabic, and closes sales on its own. The conversation must feel human, handle objections, and—crucially—offer personalised product recommendations rather than canned replies. Alongside the chat flow, every new lead or confirmed buyer has to be written instantly to my spreadsheet (Google Sheets or another cloud sheet you suggest) as a fresh row containing name, phone, product chosen, order value, and timestamp. No editing of existing rows is required at this stage—just clean “append-only” entries so I can track conversions. Core expectations • WhatsApp Business API or Twilio-for-WhatsApp integration • LLM-driven dialogue (OpenAI, Claude...
I’m implementing a fully-automated sales closer that can greet every new lead, hold a natural back-and-forth conversation, qualify intent, answer common questions, book appointments and then push the entire interaction straight into our CRM for sales follow-up. The assistant must converse on the channels my customers actually use: WhatsApp, Email, Website chat, SMS, LinkedIn DMs, Instagram and Facebook Messenger. For prospects who prefer to talk, an AI voice agent should be able to pick up the phone, speak conversationally, and keep the call summary and recording logged back to the CRM as well. Key workflow requirements • Unified multichannel inbox and bot that responds instantly 24/7, hands off to a live rep when needed, and preserves the full chat or call history. &b...
I want to build a fully automated outbound voice agent on the Sarvam AI platform that can conduct sales calls from start to finish. The core purpose is clear: it must dial prospects, qualify them in real time, generate fresh leads, and, when appropriate, schedule or trigger follow-up calls to keep the conversation alive until conversion. Key features I need working flawlessly: • Lead generation during the first call, with captured data immediately pushed back to my CRM. • Follow-up calls that reference previous interactions, so prospects feel the continuity of a single sales rep rather than a bot. Because this is all outbound, the agent has to handle call pacing, voicemail detection, basic objection handling, and dynamic script branching. Sarvam AI’s speech synthesis and ...
I’m ready to deploy a voice-driven AI that can greet callers, answer their questions, and, when needed, lock in an appointment without human intervention. The most important capability is smooth, accurate handling of customer inquiries, yet I also want the same agent to transition naturally into scheduling. Key objectives • Natural-sounding, 24/7 phone agent that fields FAQs, store hours, service explanations, basic troubleshooting, and similar requests. • Appointment booking that asks for the usual details—name, phone or email, preferred date/time, and reason—then writes the slot straight to my calendar or CRM. • Conversational fluency in more than two languages, with the architecture left open for additional languages we may add later. • Sea...
We are looking for an experienced AI development partner/team to build an Offline AI Learning Engine for deployment in Government school classrooms. The system will include: * AI Teaching Assistant (LLM-based) * RAG (Retrieval-Augmented Generation) using curriculum content * AI-based assessment and quiz generation * Teacher Copilot features * Offline-first deployment on local classroom systems * Central sync with Data Centre Scope of Work * Design and develop RAG-based AI system using provided educational content * Implement local AI model for offline usage * Fine-tune model for: - school-level explanations - bilingual support (English + regional language) - curriculum-bound responses * Build APIs and backend for integration with classroom system * Develop synchronization system...
I need an AI-powered calling workflow that will telephone 250 airports and 250 tourism boards spread across roughly 250 different states or regions worldwide. For each organisation the agent must confirm, on the call, the following: • the full name, direct phone number and email address of the Head of Marketing • the official name of their Advertising & PR agency (or agencies) I already hold many of the numbers, and I will add the rest to the dial list before we start. Every call has to sound natural—human-like, not robotic—and must be recorded end-to-end. I will later review both the MP3 files and the call logs, so clear labelling of each recording is essential. Because calls will span several continents, the voice bot needs to switch fluidly between at le...
I need a sub-60-second, photorealistic video that mirrors the pacing and feel of this reference clip: The piece will demonstrate safe crossing methods at sea and clearly distinguish permitted from restricted areas, all presented in Arabic. I’ll supply the exact Arabic terms and place-names, plus a concise script. Once the wording is locked, we can run it through an AI voice engine for the background narration, so no live voice talent is required—just clean integration and lip-sync where needed. The final result should look as if it was shot with a high-end camera at sea: realistic water physics, believable vessel movement, accurate lighting, and subtle graphic call-outs that echo the style of the reference. Motion design must stay elegant and restrained; visuals carry the me...