
Connecting With Computers: The Next Breakthrough?


DeepSeek, the recent artificial intelligence breakthrough from China, showed it is possible to compete with existing generative AI models using efficiently designed neural nets rather than simply making them bigger. The bigger-is-better approach appears to be behind the huge investments that the technology giants are shoveling into ever-larger computer centers, raising two questions about the long-term sustainability of that approach:

  1. Will the larger neural networks surprise us by exhibiting what has been called “artificial general intelligence” (AGI), showing humanlike reasoning capabilities with a speed that puts humans to shame?
  2. If the larger-is-better approach turns out to be an abject failure, is there another application for the increased computing power such an initiative generates that justifies the investment?

Allow me to predict that the increased computing power will fail to create AGI, but that there will in fact be another application for that computing power: digital assistants. With some revisions in how they operate, a new generation of digital assistants, one that will connect computers with humans through language, has the potential to finally deliver on the long-promised concept of augmented intelligence, making us all smarter.

Exponential Growth in Computing Power

Growing computer processing power has long been important to our economy and everyday experiences. In the age of AI, it is easy to forget that earlier applications of computing power harnessed the computer’s simple ability to do math more quickly and remember more data than any human could. These fundamental capabilities have escalated into much more complex applications such as web search; more than 8.5 billion searches are said to be done on Google each day.

The growth in computing power today goes well beyond the doubling every two years predicted by Moore’s Law. Since the early 1970s, the number of transistors that fit on a chip has increased by a factor of roughly 100 million.
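As a quick back-of-envelope check in Python (assuming one doubling every two years, measured from 1971, the year of Intel’s first microprocessor):

```python
# Back-of-envelope: transistor growth under Moore's Law
# (one doubling every two years), from 1971 to roughly today.

start_year = 1971
end_year = 2025
doublings = (end_year - start_year) / 2   # one doubling per two years

growth_factor = 2 ** doublings
print(f"{doublings:.0f} doublings -> a factor of about {growth_factor:,.0f}")
# 27 doublings -> roughly 134 million, consistent with the
# "factor of 100 million" cited above.
```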

But computing power is growing even faster than the number of transistors that fit on a chip. A familiar example is the graphics processing units of companies like Nvidia; those chips do many computations simultaneously rather than executing one instruction at a time as conventional microprocessors do.

Another factor is simply the rapid growth in the number of chips as computer centers get larger. This is further driven by the growth of cloud computing services such as Amazon Web Services and Microsoft Azure, which provide usage-priced access to what are essentially supercomputers, as well as by the computer centers of services that use huge computing power internally, such as Google’s web search or OpenAI’s ChatGPT.

A further trend is competition between countries as AI becomes critical to economies and militaries. International competition suggests we can expect total computing power to continue to grow despite its cost. OpenAI, the developer of ChatGPT, for example, announced the Stargate project in January 2025. Stargate is a new company that says it intends to invest $500 billion over the next four years to build new AI infrastructure for OpenAI in the United States.

The implicit goal of Stargate and similar investments is to take genAI to a new level with a very large neural network. But a bigger net requires more data to learn from. Based on the fundamentals of pattern recognition, the amount of data driving machine learning must grow significantly faster than the number of parameters in a bigger net or collection of nets. Where will reliable data of that magnitude come from? The web and even today’s journalism are full of people taking positions based on misinformation or a political agenda. Training on that material is unlikely to generate AGI that reasons in a way we would call advanced intelligence.
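To make the magnitudes concrete: even under the conservative assumption that training data need only scale in proportion to model size (the widely cited “Chinchilla” heuristic of roughly 20 training tokens per parameter is one such rule of thumb, an assumption rather than a law), the absolute amounts quickly become daunting:

```python
# Rough sketch of training data needed as a model grows, assuming
# the "Chinchilla" rule of thumb of ~20 tokens per parameter
# (a heuristic from the research literature, not a hard law).

TOKENS_PER_PARAM = 20

for params in (7e9, 70e9, 1e12, 10e12):
    tokens = params * TOKENS_PER_PARAM
    print(f"{params:>8.0e} parameters -> ~{tokens:.0e} training tokens")

# A 10-trillion-parameter net would call for on the order of
# 2e14 (200 trillion) tokens.
```

Two hundred trillion curated tokens is more high-quality text than most estimates say the public web contains, which underlines the question of where reliable data at that scale would come from.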

A Better Use for All That Computing Power

Is there another use for all that computing power that will justify the cost? After all, these are general-purpose computers. The big neural net project is just one use of a huge computer center. I believe there is an important alternative. A bit of history that I have documented in detail in my just-published book, The Lost History of “Talking to Computers,” will help us understand that opportunity.

Science fiction has inspired us with the goal of talking to computers, e.g., talking to HAL in 2001 or to the robots in Star Trek. Talking to a digital assistant such as Apple’s Siri, Google Assistant, or Amazon’s Alexa is today’s embodiment of that vision. With smartphones, the assistant can always be available. Other than brain implants, human language is the most direct connection we can have with a computer. The concept of an always-available digital assistant that makes us smarter has for decades caught the imagination of some of our most thoughtful experts.

For example, in a special report in the February 23, 1998, issue of Business Week—more than a quarter-century ago—entitled “Speech technology is the next big thing in computing,” Bill Gates of Microsoft was quoted as saying, “Speech is not just the future of Windows, but the future of computing itself.”

An article in the February 2014 issue of Wired by Vlad Sejnoha, the CTO of Nuance (now part of Microsoft), suggested that AI could stand for “amplification of intelligence.” He asked, “What would life be like, if through the use of their assistants, everyone’s effective IQ jumped by 50 points?”

This vision hasn’t developed as quickly as I and many others expected. Technology that gives us concise answers has proved difficult, so it is common for general digital assistants to say, “Here’s what I found,” default to a web search, and drop out of the conversation. (Generative AI may help avoid this problem.) Another issue is that most companies haven’t provided company-specific “actions” that would let general digital assistants connect to specialized assistants, even though every company feels it must have a website. Current general digital assistants simply aren’t thought of as an alternative to text-based web search.
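To picture what such “actions” would do, here is a minimal hypothetical sketch; the registry and handler names are mine, not any vendor’s API. The general assistant consults company-supplied handlers first and falls back to web search only when none matches:

```python
# Hypothetical sketch of "actions": company-specific handlers that a
# general assistant consults before falling back to web search.
# All names here are illustrative, not any vendor's API.

from typing import Callable

# Registry mapping a keyword a request must mention to a handler
# supplied by a company.
ACTIONS: dict[str, Callable[[str], str]] = {}


def register_action(keyword: str, handler: Callable[[str], str]) -> None:
    """Let a company plug a specialized assistant into the general one."""
    ACTIONS[keyword.lower()] = handler


def answer(request: str) -> str:
    """Route to a registered action if one matches; else fall back."""
    for keyword, handler in ACTIONS.items():
        if keyword in request.lower():
            return handler(request)                   # specialized answer
    return f"Here's what I found on the web for {request!r}..."  # fallback


# A hypothetical company registers its own action:
register_action("pizza", lambda req: "Your usual large pepperoni? (PizzaCo assistant)")

print(answer("Order a pizza for tonight"))     # handled by PizzaCo's action
print(answer("Who won the game last night?"))  # falls back to web search
```

The point of the sketch is the routing: the general assistant stays the single point of contact, while the expertise lives in company-supplied actions, much as every company now maintains its own website.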

Why the lack of such general functionality? For starters, it isn’t possible to always use a voice interface in public due to privacy or politeness issues. The concept of an assistant that becomes a universal user interface to digital systems certainly suffers if the assistant stops being available when other people are around.

The solution to this problem seems obvious: The digital assistant’s instructions should simply be “talk or type with me.” A key point is that the assistant has to act the same and have the same personal information whether it is talking or texting with you; it has to be the same assistant with the same personality. If we adopt this talk-or-type paradigm, the concept of a language user interface (LUI) rather than a voice user interface (VUI) becomes dominant. Will this happen?

The answer appears to be yes. “Apple Intelligence” is a feature of new Apple iPhones. Apple’s website in December 2024 indicated that as part of this initiative, “With a double tap on the bottom of your iPhone or iPad screen, you can type to Siri from anywhere in the system when you don’t want to speak out loud.”

And Microsoft’s Copilot is a digital assistant focused on text, using generative AI to provide answers. However, if you ask Copilot whether it supports speech recognition, it will enable that option, a microphone icon will appear, and you can converse with Copilot by voice.

Google Assistant’s basic operation is by voice. However, on some platforms a keyboard icon, when tapped, lets you type to the Assistant instead.

Samsung’s virtual assistant Bixby helps owners use Samsung’s phones or tablets more efficiently. A user can interact with Bixby using voice, text, or taps. The tap functionality adds another dimension available on touchscreens. Samsung has highlighted that conversing with computers could involve displaying information or images onscreen.

None of these talk-or-type features receives much publicity today. But they may be the start of a full LUI allowing interaction with the same digital assistant by voice or text, supplemented by a screen when applicable. Full support of this functionality could lead to digital assistants as constant companions and bring the concept of augmented intelligence closer to reality: a constant resource that in effect makes us smarter, expanding what we “know” beyond the capacity of human memory.
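To picture what a full LUI might look like structurally, here is a minimal hypothetical sketch (the class and function names are illustrative, not any real assistant’s API): speech is transcribed to text, and both channels converge on one handler holding one store of personal context.

```python
# Hypothetical sketch of a talk-or-type language user interface (LUI).
# Voice and text converge on the same assistant logic and the same
# personal context; all names are illustrative, not a real API.

from dataclasses import dataclass, field


@dataclass
class Assistant:
    """One assistant: one personality, one store of personal context."""
    user_profile: dict = field(default_factory=dict)

    def respond(self, utterance: str) -> str:
        # Single entry point: the reply cannot depend on whether the
        # utterance arrived as speech or as typed text.
        name = self.user_profile.get("name", "there")
        return f"Hi {name}, you said: {utterance!r}"


def transcribe(audio: bytes) -> str:
    """Placeholder for a real speech recognizer."""
    return "what's on my calendar?"


def handle_voice(assistant: Assistant, audio: bytes) -> str:
    return assistant.respond(transcribe(audio))  # same handler as typing


def handle_text(assistant: Assistant, typed: str) -> str:
    return assistant.respond(typed)              # identical path, no voice


assistant = Assistant(user_profile={"name": "Pat"})
print(handle_text(assistant, "what's on my calendar?"))
print(handle_voice(assistant, b"\x00\x01"))      # same answer, same assistant
```

The design point is that modality is purely a front-end concern: everything behind respond, including the user’s personal context, is shared, so the assistant behaves identically whether you talk or type.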

William Meisel has just published The Lost History of “Talking to Computers”: And What It Teaches Us About AI Exuberance. Meisel published the first technical book on machine intelligence, Computer-Oriented Approaches to Pattern Recognition, while a professor of electrical engineering and computer science at USC; founded and ran a speech recognition technology company that lasted 20 years; and for 27 years published a newsletter on commercial developments in speech recognition and language technology, on which the current book is based.
