No one disputes that women are underrepresented in STEM fields. The United Nations Educational, Scientific and Cultural Organization (UNESCO) reports that only 28% of the world’s researchers are women, and that female student enrollment is particularly low in fields such as information and communications technology (3%) and natural science, mathematics, and statistics (5%).
Despite this lack of diversity in STEM fields, women are not only advancing technology in their respective fields but also founding STEM-related businesses. This article highlights three CEOs who have spent years refining technology that controls and manipulates communication signals. Their solutions give speech to the voiceless, listen to and recognize verbal commands, and efficiently amplify and deliver wireless signals that allow millions of people to talk, text, and surf the internet.
Modern communication involves technology. There are cell phone towers, mobile phones, Wi-Fi-connected gadgets, smart speakers, and wireless headsets and earphones. Inside all of them are computer chips and algorithms designed to convert electromagnetic radio waves into wireless signals or audio. How each works is as unique as the companies built around the innovations—and typically requires complex engineering skills combined with remarkable measures of creativity.
Each of these innovators has pushed the limits of her field, finding success in completely new markets or rising above the competition in others. Their technologies shift paradigms.
Rupal Patel, who directs the Communication Analysis and Design Laboratory at Northeastern University in Boston, started her career as a speech pathologist and later went on to obtain her doctorate in speech acoustics. A number of years ago, while attending an assistive technology conference, she caught sight of a little girl having a conversation with a grown man. Neither of them could speak using their own voices, so they relied on speech synthesizers that created audio from words typed on a computer. Patel was shocked to hear that the girl and the man had the same generic computerized voice.
In 2014, Patel founded VocaliD, which uses state-of-the-art machine learning and speech-analysis algorithms to create unique, more realistic synthetic voices. Millions of people have difficulty using their voices to speak, and while not all of them need a computerized device to communicate, many do. These devices typically generate generic voices, some of them similar to the one used by the famous astrophysicist Stephen Hawking.
But it doesn’t have to be that way. It turns out that people who cannot speak are still able to produce utterances from their voice boxes; it’s the vocal tract, the chambers in the head and neck, that don’t work properly to filter the sound into the consonants and vowels of speech.
Patel’s idea, she says, was to record vocalizations from a person who couldn’t speak and filter them through words borrowed from a surrogate of about the same age, size, and gender. The research team at VocaliD used MATLAB® to prototype a method for separating the vocal source from the surrogate’s voice. Patel says every person’s voice has four distinct characteristics: pitch, loudness, breathiness, and nasality, which defines whether the sound resonates more in the head or the chest. Treating each characteristic as essentially high or low yields 16 possible voice types. Pinpointing the voice type of a surrogate and that of an end user enables the speech engineers to find ideal matches.
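The matching idea can be illustrated with a short sketch (hypothetical Python for illustration, not VocaliD’s actual MATLAB code; the high/low encoding is an assumption): treating each of the four characteristics as a binary trait gives 2⁴ = 16 voice types that can be compared directly.

```python
# Illustrative sketch of voice-type matching. Assumes each of the four
# characteristics is reduced to a high/low trait, giving 2**4 = 16 types.
from itertools import product

CHARACTERISTICS = ("pitch", "loudness", "breathiness", "nasality")

def voice_type(profile):
    """Map a profile of high/low traits to one of the 16 voice types."""
    return tuple(profile[c] for c in CHARACTERISTICS)

# Enumerate all 16 possible voice types.
all_types = list(product(("low", "high"), repeat=len(CHARACTERISTICS)))

def is_match(user_profile, surrogate_profile):
    """A surrogate is an ideal donor when the voice types coincide."""
    return voice_type(user_profile) == voice_type(surrogate_profile)
```

In this toy encoding, a surrogate whose four traits all line up with the end user’s is an ideal match; the real system works from measured acoustics rather than hand-labeled traits.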
As Patel and her team were developing the technology, they discovered another group of people who needed synthesized speech just as badly: patients who, because of a disease such as cancer, required surgery that would leave them unable to talk. Knowing they would lose this ability, these patients could record their voices ahead of their hospital stay and use the recordings to generate a synthetic voice that sounds just like them.
Patel tells the story of a Texas man in his 60s who had never smoked but had somehow developed throat cancer. A day or two before his scheduled surgery, he read a magazine article describing VocaliD’s innovation. He emailed Patel immediately and asked if she could help him. She wasn’t sure there was time, but she encouraged him to visit the “Human Voicebank” on VocaliD’s website and record as many samples as he could. He managed to record 1,300 sentences, and Patel and her team were able to reconstruct his voice for use after his surgery.
“It’s exciting that we’re making progress, but I also feel like we’re not reaching enough people,” says Patel.
Reaching more people means generating more revenue, which in turn helps raise awareness. To that end, VocaliD has also been working with corporate clients to create unique voices tied to a product, a company, or even a mode of public transportation, such as a bus or subway system. One project has the team synthesizing the voice of a well-known sportscaster for a commemorative event.
“It’s not just people with disabilities that can benefit from this technology. We need broader industry adoption to push the boundaries of the technology for people with disabilities forward,” says Patel.
"It was the first a-ha moment of recognizing that people who couldn’t speak and used a device to talk were using a limited set of voices," says Rupal Patel, CEO of VocaliD.
With the ubiquity of text-to-audio dictation, digital voice assistants such as Siri and Alexa, and electronic devices like phones, smartwatches, and earphones that respond to voice commands, some predict that the keyboard could largely disappear as an interface within five years. Mouna Elkhatib and her AON team are ready. In 2018, she cofounded AONDevices (AON) to develop robust, low-power, on-chip algorithms that use artificial intelligence (AI) to give battery-powered devices the always-on capability of listening for and responding to voice and audio.
So-called “hearables” not only listen for voice commands but may soon discern environmental sounds relevant to the user. Imagine earphones that know you’re in the street and can alert you when there are sounds you need to pay attention to. Imagine a baby monitor that knows the difference between a gurgle and a cry and can notify the parents that the baby is awake. Imagine a security system that recognizes the sound of breaking glass and raises an alarm.
As the company’s CEO, Elkhatib draws from her extensive career in voice and audio. She worked in leadership roles at Conexant, a semiconductor company that provides products for voice and audio processing; Qualcomm, a semiconductor and telecommunications equipment company; and BrainChip, an artificial intelligence computer solutions company.
“I have a lot of passion for voice and audio,” says Elkhatib, who holds 11 patents and four provisional patents.
She knows the technology inside and out, from the architecture of the computer chip down to the level of the circuits. As a result, she’s constantly on the lookout for ways to improve it. Over the years, one problem has nagged at her: traditional digital signal processing algorithms struggle to recognize audio reliably when the background is very noisy. The problem worsens for always-on, battery-operated devices because the standard algorithms require too much power to resolve the issue, quickly draining the battery.
AON researched the issue and found a solution in AI: deep learning neural networks can be used to solve audio problems in a range of applications.
AON builds these algorithms from scratch and uses MATLAB to develop and optimize a solution for each problem. For instance, the team may feed an algorithm audio data that contains only voice commands and tell it, “This is only voice.” Next, they feed it background noise and tell it, “This is background noise.” Then they may feed it both and ask it to find the voice commands buried in the background noise. As the algorithm gets better at distinguishing commands from background noise, the researchers make the tests more difficult or slim the algorithm down to use less and less power while achieving the same result.
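The labeled-data workflow described above can be sketched with a toy classifier (hypothetical Python, standing in for AON’s proprietary deep learning networks; the two-feature frames and the perceptron are assumptions made purely for illustration):

```python
# Toy sketch of the workflow: train on "voice only" and "noise only"
# examples, then flag the voice frames buried in a mixed stream.
# Hypothetical features and model, not AON's production algorithms.
import random

random.seed(0)

def make_frame(kind):
    # Assumed 2-feature frames: voice frames score higher on both
    # features (e.g., periodicity and spectral peakiness) than noise.
    if kind == "voice":
        return [random.gauss(0.8, 0.1), random.gauss(0.7, 0.1)]
    return [random.gauss(0.2, 0.1), random.gauss(0.3, 0.1)]

# Steps 1 and 2: labeled voice-only and noise-only training data.
data = [(make_frame("voice"), 1) for _ in range(200)] + \
       [(make_frame("noise"), 0) for _ in range(200)]

# A minimal perceptron stands in for the deep neural network.
w, b, lr = [0.0, 0.0], 0.0, 0.1
for _ in range(20):
    for x, y in data:
        pred = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
        err = y - pred
        w = [w[i] + lr * err * x[i] for i in range(2)]
        b += lr * err

# Step 3: find the voice frames in a mixed stream.
mixed = [make_frame("voice"), make_frame("noise"), make_frame("voice")]
flags = [1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0 for x in mixed]
```

The real work, of course, is in extracting robust features from raw audio and shrinking the network until it fits an always-on power budget; the sketch only shows the shape of the train-then-test loop.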
Now they have ultra-low-power algorithms that perform at very high levels, higher than anything achieved with traditional algorithms, says Elkhatib.
As a child in South Korea, Helen Kim, CEO of NanoSemi, loved science. She admired Marie Curie so much that by age 10 she was performing chemistry experiments at home. “My parents allowed me to have a chemistry set and blow up things,” she says. When she was in high school, her parents moved the family to Los Angeles, and Kim became fascinated with computers and electronics. “I was awed by the possibilities of technology beyond pure science,” she says.
Those possibilities set her on a new track that ultimately resulted in a Ph.D. in electrical engineering from Columbia University. She went on to work 12 years at Bell Labs and 10 years at the Massachusetts Institute of Technology’s Lincoln Laboratory, and in 2014, she cofounded NanoSemi, a software company that improves wireless communications, with Alexandre Megretski, Yan Li, and Kevin Chuang.
At the heart of NanoSemi are algorithms designed to improve the performance and efficiency of the radio frequency power amplifiers that deliver wireless signals. These amps deliver familiar signals such as 4G, LTE, and Wi-Fi. But 5G, the next-generation standard, provides faster speeds and more bandwidth. Data rates are expected to be about 40 times higher than those of 4G, and Wi-Fi 6 is expected to be four times faster than the latest version of Wi-Fi, 802.11ac, according to Cisco. Because the arrival of 5G won’t eliminate 3G, 4G, or other wireless standards, electronic devices and equipment will have to accommodate all of these offerings in less space.
That’s a big challenge. Computer chips on mobile phones and computers and in communication base stations are physically limited in how much signal they can amplify. If pushed beyond their limits, amplifiers can create distortions in signals; produce “junk” signals, known as spurs; or even spill signals over into other radio channels, interfering with signals meant to be there. One way to solve these problems is to pack more amplifiers onto an electronic device. But space is limited, and adding more electronics means adding more heat, which in turn requires more energy to keep things cool.
NanoSemi’s solution addresses the physical limitation of amplifiers with algorithms that use predictive machine learning models to adapt to wireless signals in real time. NanoSemi’s team developed an approach that’s able to generate unique math functions within an algorithm to precisely predistort a signal’s input. Adding distortion to the input cancels out any distortion that would have occurred on the output. The result is a clear and reliable signal.
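The predistortion principle can be shown numerically with a stripped-down sketch (illustrative only; the memoryless cubic amplifier model and simple polynomial inverse are assumptions, not NanoSemi’s adaptive machine learning approach):

```python
# Illustrative digital predistortion: distort the input in advance so
# the amplifier's own nonlinearity cancels out.
A = 0.1  # assumed third-order compression coefficient

def amplifier(x):
    # Simple memoryless model: gain compression grows with amplitude.
    return x - A * x**3

def predistort(x):
    # First-order polynomial inverse: pre-expand the signal so the
    # amplifier's compression brings the output back toward the ideal.
    return x + A * x**3

signal = [i / 10 for i in range(-10, 11)]  # normalized input samples

plain_err = max(abs(amplifier(x) - x) for x in signal)
dpd_err = max(abs(amplifier(predistort(x)) - x) for x in signal)
# The predistorted chain has much lower worst-case distortion.
```

In practice the amplifier’s behavior drifts with temperature, frequency, and signal history, which is why NanoSemi’s algorithms learn and adapt the predistortion function in real time rather than fixing it in advance.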
NanoSemi’s team is split into three technical groups: one that creates the algorithms, another to identify physical limitations of radio frequency amplifiers and to validate algorithms, and a third that converts the finished algorithm into a design that can be embedded on a semiconductor chip. Kim says the first two teams use MATLAB to create and validate those algorithms as well as run the test equipment. The final design improves the performance and efficiency of the radio frequency power amplifiers that ultimately deliver wireless signals. “We clean up those signals while pushing the amp power,” says Kim.
NanoSemi’s customers include manufacturers of 5G mobile devices, wireless infrastructure, and signal processing test equipment.
“NanoSemi’s approach is truly ground-breaking in that we are reducing the power consumption while improving the performance,” says Kim. “Not only will your cell phone work better due to the improved connection, but the battery will last longer too.”