YC-backed EzDubs Aims for Translation Consumer Market
2024-12-11
Analysts estimate the global translation services market at around $40 billion. Enterprise services play a major role in that market. For consumers, apps like Google Translate and Apple Translate dominate, yet they do not handle calls and voice messages well.

EzDubs: Transforming Person-to-Person Translation

Y Combinator-backed EzDubs aims to solve person-to-person translation with its app, which supports translation for calls, voice messages, and text in over 30 languages. The company was founded in 2023 by Padmanabhan Krishnamurthy, Amrutavarsh Kinagi, and Kareem Nassar. Krishnamurthy and Kinagi first met during college in Hong Kong and built a project that read lips and translated speech into text for people with hearing loss. In 2021, they moved to Columbia University and began working on video dubbing. There, they met Nassar, who was leading Cisco's Speech AI Group while pursuing a master's degree part-time. Nassar had extensive experience in real-time speech AI products, having created Voicea, which was sold to Cisco. The trio officially started building EzDubs in 2023 with a dubbing model and an early version of a translation tool.

The company initially launched a Twitter/X bot in January 2023 that translated clips on the platform. The bot has amassed over 340,000 followers and receives more than 500 dub requests daily, with translated videos getting over a million daily views. In July 2023, the company launched a bot on WhatsApp that let users translate voice messages and videos. However, users had to forward voice messages to the bot, get the reply translated, and then send it back. To remove that friction, the founders decided to build their own app and released an early version this year.

Krishnamurthy said partners at Y Combinator advised the founders that if they could solve the latency of communicating across languages, a communication platform would have a larger downstream effect than video. That led them to build a communication app instead of focusing solely on video dubbing.

The EzDubs app, available on iOS and Android, offers real-time translation for calls with support for 30 languages. Users can call someone speaking a different language and receive an instant translation.
The other person doesn't need to have the EzDubs app. The app also provides translation for text, voice, and video messages.

Although users can share translated voice or video messages outside the app using a link, the founders believe that platforms like WhatsApp and iMessage are not sufficient for people communicating in multiple languages. EzDubs has noted that many people use its app to make hundreds of calls daily, with an average call time of 17 minutes. The top use cases include people dating across cultures and professionals communicating with locals while abroad.

At its core, EzDubs has two main models: one for voice cloning that preserves the speaker's conveyed emotions, and another for translation. The translation model is designed to handle interruptions and doesn't wait for a person to finish a whole sentence before it starts translating internally.

EzDubs has raised $4.2 million in seed funding led by Venture Highway, which was founded by ex-WhatsApp CBO Neeraj Arora. Other participants include Y Combinator partner Jared Friedman, Replit CEO Amjad Masad, Replit president Michele Catasta, Applied Intuition CEO Qasar Younis, and Ben Firshman, CEO of a16z-backed cloud startup Replicate.

Friedman said the company is on a path to making translation tools easily accessible to users. He believes in the product because of the founders' rich history in speech and language learning. He added that language barriers can hinder efficient communication even within a company, and that EzDubs can remove these barriers.

The company is soon launching a feature that allows users to scan a QR code and initiate an EzDubs call instantly without downloading the app. The founders stated that while apps like Google Translate have a real-time mode, it requires passing the phone back and forth.
Eventually, they aim to make EzDubs a default phone calling app to handle incoming calls as well.

In the coming months, EzDubs plans to build an extension for apps like Google Meet, Microsoft Teams, Zoom, and Slack.
Google's Gemini Now Capable of In-depth Research
2024-12-11
Google is making significant advancements in its chatbot platform, Gemini. With the introduction of "Deep Research," users can now have a powerful research assistant at their fingertips. This feature uses advanced reasoning and long context capabilities to generate comprehensive research briefs. The briefs are not only presented in Gemini apps but can also be exported to Google Docs for further editing. Currently, Deep Research is exclusive to Gemini Advanced, available through the Google One AI Premium Plan at $20 per month.

Unlock the Power of Google's Gemini with Deep Research

How Deep Research Works

When a user poses a question, Deep Research creates a "multi-step research plan" that the user can either revise or approve. Once the plan is approved, Deep Research spends a few minutes refining its analysis: it searches for potentially interesting information, saves it, and then starts new searches based on what it has learned. This process repeats multiple times until a report of key findings is generated. Initially, Deep Research is available only in English on desktop and the mobile web, with plans to expand to Gemini mobile apps in early 2025. Users can access it by selecting the "Gemini 1.5 Pro with Deep Research" option in the model drop-down menu.
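
The search, save, and search-again loop described above can be illustrated with a small Python sketch. This is not Google's implementation, which has not been published; the function names (run_research, toy_search, toy_follow_ups), the round limit, and the toy data are all hypothetical and exist only to show the shape of an iterative research loop that starts from an approved plan.

```python
from typing import Callable, List


def run_research(
    plan: List[str],
    search: Callable[[str], List[str]],
    follow_ups: Callable[[str], List[str]],
    max_rounds: int = 3,
) -> List[str]:
    """Hypothetical sketch: search, save findings, derive new queries, repeat."""
    findings: List[str] = []   # saved information, in discovery order
    seen_queries = set()       # avoid re-running the same search
    queue = list(plan)         # round 1 starts from the approved plan

    for _ in range(max_rounds):
        next_queue: List[str] = []
        for query in queue:
            if query in seen_queries:
                continue
            seen_queries.add(query)
            for result in search(query):
                if result not in findings:
                    findings.append(result)
                    # New searches are based on what was just learned.
                    next_queue.extend(follow_ups(result))
        if not next_queue:
            break              # nothing new to look up, so stop early
        queue = next_queue

    return findings


# Toy stand-ins for a search backend and a follow-up generator (assumptions).
TOY_INDEX = {
    "topic": ["overview of topic", "topic history"],
    "history": ["timeline note"],
}
TOY_FOLLOW_UPS = {"topic history": ["history"]}


def toy_search(query: str) -> List[str]:
    return TOY_INDEX.get(query, [])


def toy_follow_ups(finding: str) -> List[str]:
    return TOY_FOLLOW_UPS.get(finding, [])


report = run_research(["topic"], toy_search, toy_follow_ups)
print(report)
```

In this sketch the round limit and the early exit stand in for whatever stopping rule the real system uses, and the returned list of findings stands in for the final report.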

For example, imagine a student working on a research project. By using Deep Research, they can get a detailed research plan and quickly gather relevant information from across the web. This saves them a lot of time and effort compared to traditional research methods. It acts as a valuable tool to assist in the research process and help students achieve better results.

Potential Harms and Ethical Considerations

While Deep Research is an impressive feat, it also raises several ethical questions. Like all generative AI, it can make mistakes and hallucinate. This can have serious consequences, especially in education. As Jessica Grose pointed out in a recent op-ed in The New York Times, students are increasingly relying on generative AI to outsource brainstorming and writing. They risk losing the ability to think critically and to work through frustration with difficult tasks.

There is also a potential financial impact on publishers. By scraping information from websites and compiling it into briefs, Deep Research could deprive these sites of valuable ad revenue. One study has shown that since the launch of AI Overviews, publishers have seen a 5% to 10% decrease in traffic from search. An expert estimated that AI-generated overviews could lead to more than $2 billion in losses for publishers.

However, Google claims that Deep Research can "connect users to relevant websites they might not have found otherwise so they can dive deeper to learn more." It remains to be seen whether this promise will be fulfilled and if the feature will truly enhance the user's research experience without causing harm to publishers.

Gemini 2.0 Flash: The New Flagship AI Model

Starting today, both free and paying Gemini users will have access to Gemini 2.0 Flash, Google's newest flagship AI model. This is an experimental version optimized for chat, with the full version set to arrive in January. Google claims that 2.0 Flash should deliver better performance across various tasks and faster responses. Users can select it from the Gemini model drop-down on desktop and the mobile web (but not the mobile apps yet).

Although the company cautions that some Gemini features may not be compatible with the experimental model, it still offers exciting possibilities. For instance, in a chat-based scenario, users can expect quicker responses and more accurate answers. This could lead to a more seamless and efficient user experience.

Google's AI Overviews to Handle Math & Coding Queries Soon
2024-12-11
AI Overviews, the summaries provided by Google for certain search queries, is set to handle more complex topics and multimodal/multi-step searches. The newly launched Gemini 2.0 model drives these expanded capabilities, which are expected to enhance search speed and quality. A limited test will start this week, followed by a broad rollout next year.

Unlock New Search Possibilities with AI Overviews

Expanded Capabilities Driven by Gemini 2.0

Google's AI Overviews is on the verge of handling more complex and diverse search queries. The newly launched Gemini 2.0 model drives this expansion. It is designed to deliver improved search speed and quality, enabling users to get answers more quickly. This is a significant step forward in the evolution of search technology.

For example, advanced math questions and coding problems are among the queries AI Overviews is expected to handle more effectively. The aim is that users will no longer have to struggle to find the right information, with Gemini 2.0 serving as the tool that makes those answers possible.

Limited Test and Broad Rollout

A limited test of the new AI Overviews feature is set to begin this week. This is an exciting development as it allows Google to gather valuable feedback and make any necessary adjustments. Once the test is successful, a broad rollout will follow early next year, making these enhanced search capabilities available to a wider audience.

This gradual approach ensures that the technology is refined and optimized before being made widely available. It also shows Google's commitment to providing a seamless user experience. By starting with a limited test, they can identify and address any potential issues early on.

Controversies and Challenges

Since its launch this spring, AI Overviews has been the subject of much controversy. It has gone viral for its dubious statements and questionable advice, such as recommending adding glue to pizza. A recent report from SE Ranking found that it cites websites that are not entirely reliable or evidence-based, including outdated studies and paid product listings.

The main issue is that AI Overviews sometimes has a hard time distinguishing between fact, fiction, satire, and serious information. Over the past few months, Google has made changes to how AI Overviews works, limiting answers related to current events and health topics. However, they acknowledge that the system is not perfect and is constantly evolving.

Despite these challenges, Google believes that AI Overviews has led to a boost in search engagement, especially among the 18-24 age group, which is a key demographic for the company.

Monetization and Antitrust Concerns

Google recently took steps to monetize AI Overviews by adding ads on mobile for certain relevant queries. This has caused concern among publishers who say the feature is negatively affecting their traffic. Google, however, claims to be taking publishers' concerns into account in the development of its AI search experiences.

AI Overviews is also a target in the Justice Department's antitrust lawsuit against Google. The DOJ aims to break up what a judge ruled to be an illegal monopoly in search. One of the key requests is that Google allow sites to opt out of AI Overviews without being penalized in search results.

This highlights the complex issues surrounding AI in search and the need for careful regulation to ensure a fair and competitive marketplace. Google is facing significant challenges as it navigates the evolving landscape of search technology and user expectations.