Voice programming, the next cutting-edge technology in the field of software development?
Author | Rina Diane Caballar

Translator | Sambodhi
Planning | Liu Yan

In the programming community there is a special group: blind programmers, who rely on screen readers, braille displays, and similar tools to write code. And what about programmers with hand conditions that make a keyboard impossible to use? How do they program?

From speech to code: there are two leading voice programming platforms today, and they offer different ways of "speaking" code to the computer. One, called Serenade, works a bit like a digital assistant: it lets you describe the code you want to write without dictating every instruction verbatim. The other, called Talon, provides more fine-grained control over each line, but also requires a more detailed breakdown of each task dictated to the computer. A simple example in this article is a step-by-step guide to generating Python code in Serenade and Talon that prints "hello" on the screen.

Our conversational interactions with gadgets are becoming more frequent. Old friends like Alexa and Siri have now been joined by in-car assistants like Apple CarPlay and Android Auto, and even by apps that respond to voice biometrics and voice commands. But what if the technology itself could be built by voice? That is the premise of voice programming: a software development method that uses voice instead of keyboard and mouse to write code. On a voice programming platform, programmers "speak" commands to manipulate code, creating custom commands that adapt to and automate their workflows.

Voice programming is not as simple as it seems; there is plenty of complex technology behind it. Serenade, for example, has a speech-to-text engine developed specifically for code. Unlike Google's speech-to-text API, which is designed for conversational speech, Serenade's engine is built for programming.
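The article's running example is producing a Python program that prints "hello". As a minimal sketch, the comments below paraphrase how the two platforms might be spoken to; the phrasings are illustrative assumptions, not the exact command grammars of Serenade or Talon.

```python
# Illustrative paraphrases of how each platform could be addressed
# (assumed phrasings, not the platforms' real grammars):
#
#   Serenade (describe intent):  "add print hello"
#   Talon (dictate each token):  naming the keyword, parens, and
#                                quoted string out loud, piece by piece
#
# Either way, the editor ends up containing this one-line program:
program = 'print("hello")'
exec(program)  # prints: hello
```

The contrast is the point: Serenade infers the syntax from a description, while Talon has you dictate the tokens yourself.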
When a software engineer speaks code aloud, Serenade's engine feeds it to a natural language processing layer, whose machine learning models are trained to recognize common programming constructs and convert them into syntactically valid code. Serenade raised $2.1 million in its seed round in 2020. The company came into being after co-founder Matt Wiethoff was diagnosed in 2019 with repetitive strain injury (RSI), a common occupational condition. "I gave up my position as a software engineer at Quora because I couldn't do the job anymore," he said. "It was either find a job that didn't involve so much typing, or come up with a solution."

Ryan Hileman followed a similar path. After suffering hand pain, he quit his full-time job as a software engineer in 2017 and started building Talon, a hands-free programming platform. "Talon's purpose is to completely replace the keyboard and mouse," he said. Talon has several components: speech recognition, eye tracking, and noise recognition. Its speech recognition engine is based on Facebook's wav2letter automatic speech recognition system, which Hileman extended to handle voice programming commands. Talon's eye tracking and noise recognition, meanwhile, simulate mouse use: the cursor moves on screen according to eye movements, and a popping sound made with the mouth triggers a click. "This kind of sound is easy to make. It's effortless and is recognized at low latency, so this nonverbal way of clicking the mouse is faster and doesn't cause vocal fatigue," Hileman said.

Programming with Talon sounds like speaking another language, software engineer and voice programmer Emily Shea said during a 2019 conference talk.
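Talon's input stack as described above routes three kinds of signals (speech, gaze, mouth noises) to different actions. The toy dispatcher below sketches that idea; the event names, payloads, and returned action strings are all assumptions made for the sketch, not Talon's actual API.

```python
from dataclasses import dataclass

@dataclass
class InputEvent:
    source: str      # "gaze", "noise", or "speech" (assumed labels)
    payload: object  # coordinates, a noise name, or dictated text

def dispatch(event: InputEvent) -> str:
    """Route a hands-free input event to a simulated mouse/keyboard action."""
    if event.source == "gaze":
        x, y = event.payload
        return f"move cursor to ({x}, {y})"   # follow eye movements
    if event.source == "noise" and event.payload == "pop":
        return "click"                        # the mouth-pop click
    if event.source == "speech":
        return f"type {event.payload!r}"      # dictated text or commands
    return "ignore"

print(dispatch(InputEvent("gaze", (640, 360))))  # move cursor to (640, 360)
print(dispatch(InputEvent("noise", "pop")))      # click
```

Splitting the cheap, low-latency click signal (a pop) from the expensive one (speech recognition) is what makes the nonverbal click faster, as Hileman notes.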
Her talk is full of voice commands such as "slap" (press enter), "undo" (delete), "spring 3" (go to the third line of the file), and "phrase name op equals snake extract word paren mad" (which produces the line of code name = extract_word(m)).

Serenade follows a more natural way of speaking code. You can say "delete import" to delete the import statement at the top of a file, or "build" to run a custom build command. You can also say "add function factorial" to create a function that calculates a factorial in JavaScript, and the application handles the syntax, including the "function" keyword, parentheses, and braces, so you don't need to explicitly declare each element.

Voice programming does require a decent microphone, especially for filtering out background noise, though Serenade's models are trained on audio captured by laptop microphones. Running Talon with eye tracking also requires eye-tracking hardware, but Talon can operate normally without it. Open-source voice programming platforms such as Aenea and Caster are free, but they rely on the Dragon speech recognition engine, which users must purchase themselves. However, Caster also supports Kaldi and Windows Speech Recognition: the former is an open-source speech recognition toolkit, and the latter comes pre-installed on Windows.

Tommy MacWilliam, co-founder of Serenade Labs, said the results speak for themselves. "It's so simple to be able to describe what you want to do," he said. "Saying 'move these three lines down' or 'duplicate this method' is smoother than typing or pressing keyboard shortcuts."

Voice programming can also let people with injuries or chronic pain continue their careers. "Being able to use voice just removes my arm from the equation and opens up a less restrictive way to use the computer," Shea said.
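The Talon commands above ("slap", "op equals", and so on) are essentially a grammar mapping spoken tokens to text. The sketch below reconstructs that idea in miniature; the token table, the "snake" formatter, and the reading of "mad" as the letter m from a spoken alphabet are all simplifying assumptions, not Talon's real grammar or implementation.

```python
# Assumed token table for the sketch, loosely echoing the commands above.
GRAMMAR = {
    "slap": "\n",         # press enter
    "op equals": " = ",   # assignment operator
    "paren": "(",
    "close paren": ")",
}

def snake(words):
    """A 'snake' formatter joins dictated words with underscores."""
    return "_".join(words)

def speak(*tokens):
    """Translate spoken tokens into text, passing unknown words through."""
    return "".join(GRAMMAR.get(t, t) for t in tokens)

# Roughly reconstructing "name op equals snake extract word paren mad",
# treating "mad" as the letter m (an assumption about the alphabet):
line = speak("name", "op equals") + snake(["extract", "word"]) \
       + speak("paren", "m", "close paren")
print(line)  # name = extract_word(m)
```

The lookup-table design is why dictating with Talon feels like speaking another language: every operator and formatter has its own short, unambiguous spoken name.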
Programming by voice can also lower the barriers to entry for software development. "If they can think about the code they want to write in a logical and structured way," MacWilliam said, "then we can make machine learning the last mile and turn those ideas into syntactically valid code."

Voice programming is still in its infancy, and whether it is widely adopted will depend on how bound software engineers remain to the traditional keyboard-and-mouse way of coding. But voice programming opens up all kinds of possibilities. Perhaps in the future, brain-computer interfaces will convert what you are thinking directly into code, or into the software itself.

About the author: Rina Diane Caballar is a reporter and former software engineer based in Wellington, New Zealand.

https://spectrum.ieee.org/computing/software/programming-by-voice-may-be-the-next-frontier-in-software-development