Are voice tools finally coming of age?

In 2008/09, when I was at Microsoft working on the search product, I had some exposure to the language team. Back then they thought they were very close to finally nailing voice-to-text transcription. I remember the discussion was along the lines of “we are 95% there, but the last 5% is the most difficult”.

For me, as someone who was never fast on the keyboard, this was great news, but how wrong we all were on the timing. Finally, with the evolution of large language models, ChatGPT and the like, it seems we might be getting there. Not only that, some of the direct translation tools that convert a foreign language into your own are now compelling.

I now use Microsoft's voice tools when writing emails or longer-form documents such as this blog. Yes, you do need to go back and edit, but I am finding the time saved and the ease of getting my thoughts down more and more superior to typing. Auto-punctuation remains a weakness, so I always leave it off, but I'm confident this is going to improve rapidly. Microsoft provides tips on spoken punctuation prompts, which take a bit of getting used to but work well once you get the hang of them.

Gmail also has some new tools in beta and is rumoured to be releasing them soon. Let's hope they use the same prompts so we don't have to relearn everything! Apple is also working on “on-device AI” and has set aside 4 billion U.S. dollars this year for AI servers and back end. I look forward to seeing what Apple releases later this year, both from a privacy point of view and in reducing the lag you often see when using these cloud-based products.

Interestingly, this works well when you're working from home or have your own office. I'm not sure how well these tools will go in an open-plan workspace where you have noise and other voices surrounding you. The mobile products suffer from this, especially on public transport or in open spaces. It will also be interesting to see whether and how these tools affect workplace design and layout.

This is an area where large language models are going to have a huge impact. With the rapid advancement we're seeing in other areas, I can see the use of these voice tools becoming more and more standard by the end of this year.

