BLOG
Are voice tools finally coming of age?
In 2008/09, while working on the search product at Microsoft, I had some exposure to the language team. Back then they thought they were very close to finally nailing voice-to-text conversion. I remember the discussion was along the lines of “we are 95% there, but the last 5% is the most difficult”.
For me, as someone who was never fast on the keyboard, this was great news, but how wrong we all were on the timing. Finally, with the evolution of large language models, ChatGPT and the like, it seems we might be getting there. Not only that, some of the direct translation tools that convert a foreign language into your own are now compelling.
I now use Microsoft tools when writing emails or longer-form documents such as this blog. Yes, you do need to go back and edit, but I am finding the time saved, and the ease of getting your thoughts down, more and more superior to typing. Auto punctuation remains a weakness, so I always leave the auto function off, but I'm confident that this is going to improve rapidly. With Microsoft, there are tips on punctuation prompts, which take a bit of getting used to but work well once you get the hang of them.
Gmail also has some new tools in beta and is rumoured to be releasing them soon. Let's hope they use the same prompts so we don't have to relearn everything! Apple is also working on “on device AI” and has set aside 4 billion U.S. dollars this year for AI servers and back end. I look forward to seeing what Apple releases later this year, both from a privacy point of view and in reducing the lag that you often see when using these cloud-based products.
Interestingly, this works well when I'm working from home, or if you have your own office. I'm not sure how well these tools will go in an open plan workspace where you have noise and other voices surrounding you. The mobile products suffer from this, especially on public transport or in open spaces. It will also be interesting to see whether, and how, these tools affect workplace design and layout.
This is an area where large language models are going to have a huge impact. With the rapid advancement we're seeing in other areas, I can see the use of these voice tools becoming more and more standard by the end of this year.
The re-writing of history
A fascinating article from Wired magazine (link below) highlights potential weaknesses in the LLMs’ data grab.
The scenario, according to the article, is that over 80% of all mainstream news content is now blocked from data collection by ChatGPT and the like. On the other hand, right-wing news outlets such as Fox and Breitbart are welcoming the ingestion of their data.
If an LLM learns from the data it ingests, then there is a legitimate concern that these models will lean on the viewpoints with significantly more sources and volume behind them. So if you have a motivated section of the media and population who are aligned on a certain topic, and who are active in promoting it, could this activity skew the bias of the LLM taking on the data?
If, for example, you have enough sources saying Trump won the election, would the LLM return a result reinforcing this belief in response to related search queries? Taken to another level, if, as is the belief in tech circles, these LLMs eventually replace our reliance on Google, then the re-writing of history becomes very possible!
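To make the volume argument concrete, here is a deliberately crude sketch. It is not how an LLM actually works: the "model" below simply repeats whichever claim appears most often in what it was fed. The corpus, the claim labels and the 80% blocking figure applied to one side are all hypothetical numbers chosen for illustration.

```python
from collections import Counter


def most_common_claim(corpus):
    """Return the claim that appears most often in the corpus."""
    return Counter(corpus).most_common(1)[0][0]


# Hypothetical open web: 80 articles state claim A, 20 state claim B.
open_web = ["claim A"] * 80 + ["claim B"] * 20

# Assumption: 80% of the claim A publishers now block collection,
# so only 16 of their 80 articles can be ingested. Claim B publishers
# remain fully open.
after_blocking = ["claim A"] * 16 + ["claim B"] * 20

print(most_common_claim(open_web))
print(most_common_claim(after_blocking))
```

Nothing about the underlying facts changed between the two runs, only who allowed their content to be collected.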
AI is boring!
One of my favourite articles from 2023 is “The Age of Average” by Alex Murrell (link below).
The main thrust of the article is how technology is making the world bland. Instant global access to the latest in technology, trends and styles has inadvertently sent us down the path of sameness. From coffee houses to Airbnb accommodation to car design… there are too many examples to touch on. We are heading towards product and cultural conformity.
In 2004 an English think tank, the New Economics Foundation (NEF), released research, since updated, called “Clone Town Britain”. The argument is that cities in the UK are losing their local identity, and most city centres are indistinguishable from one another. Whilst driven by scale and economics, it could be argued that those economics are themselves being driven by the trends pointed out by Murrell. In Scandinavia we are much further down this path than the UK.
The AI of today is based on existing data. It is creativity looking backwards. The LLMs driving the AI recommendations we are getting run the risk of flattening out the outliers and generating outputs that are essentially average. Think Spotify recommendations… once you have subscribed for a while it is very hard to escape what Spotify has decided you want to hear!
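A toy sketch of that flattening effect, under heavy assumptions: this is not Spotify's algorithm or any real recommender. Tracks are reduced to a single made-up number (say, tempo), and the hypothetical `recommend` function simply picks whatever in the catalogue sits closest to the average of the listening history.

```python
from statistics import mean


def recommend(history, catalogue, k=3):
    """Pick the k catalogue items closest to the average of the history."""
    centre = mean(history)
    return sorted(catalogue, key=lambda item: abs(item - centre))[:k]


# Hypothetical listening history: four similar tracks and one outlier (180).
history = [100, 104, 98, 102, 180]

# Hypothetical catalogue, including the extremes 60 and 180.
catalogue = [60, 95, 100, 105, 110, 140, 180]

picks = recommend(history, catalogue)
print(picks)
```

Even though the listener clearly enjoyed the outlier at 180, every pick lands in the middle of the range, and the extremes of the catalogue are never suggested.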
The importance of art, diversity and creativity has never been greater.
AI and the future of customer service
Yep, that’s my Kindle nailed to the wall. I had a lot of books, made a number of purchases with Amazon and had their app on all my devices. One attempted purchase went weird and my account was suspended, then cancelled. End of conversation. I tried to contact Amazon in various ways, but to no avail. I still do not know why. It was the phone conversation with Amazon “customer service” that got me thinking… I concluded that the voice on the other end was either AI or being prompted by one.
Companies are rapidly adopting AI in customer service, but more as a cost management solution than to improve the customer experience. The LLMs driving this technology are still very binary, and cannot reason. So if your problem does not fit a prescribed solution, then good luck trying to get help. In a lot of companies, there simply is no escalation procedure. There is a feeling of complete helplessness that follows the type of response I got from Amazon.
Smaller companies are much better at this… they are more agile, and the staff more engaged. As companies get larger, service gets more and more automated and the personal touch (the ability to apply reason to a problem) goes away. Amazon have decided they no longer want me as a customer, so my business goes to bookshop.org and eBay, amongst others.
For my part, I now begin my purchases by contacting customer service first. For anyone reading this who represents a company, I suggest you make sure there are escalation procedures that sit above your AI solutions. Finally, companies must make sure there is a way to capture customer feedback that is reviewed and measured.
AR and retail, and the friction of the user experience
I am still waiting to see the first real AR implementations in retail.
It surprises me that many of the “cutting edge” technical implementations still revolve around scanning QR codes or bar codes.
Ultimately, if we are to use our phones as a layer on top of the retail experience, it has to add value and add to the experience. At the moment, there are simply too many hoops for the customer to jump through, adding friction and raising barriers to adoption. Think download app, find offer, scan code, show code to cashier for offer, etc.
Object recognition has promise, but the layout of the average store does not lend itself to successfully sorting products, let alone intelligent / dynamic feedback and / or gamification of the experience. Dynamic pricing, or using AI to optimise the store, such as the solutions developed by Kvass.ai, works efficiently from the store's perspective, but does not add excitement or engagement for the customer.
I am excited by what I see from Idtag.no and the ability to automate all of the above. Also, moving away from QR codes and the like enhances the experience and makes discovery much easier. As a customer, if I am going to download an app then the benefits must be compelling. Not only that, what is going to keep me coming back? Retention is one of the three pillars of business!
I am interested in other real world examples of AR in retail… ping me if you have any!
/M
Fake News, “you ain’t seen nothing yet!”
As the saying goes, a picture is worth a thousand words… so what about a video?
Search “deep fakes” and you will see where the technology is heading. Then think of this technology in the wrong hands (insert dictator, malicious actor, etc.). During a round table conversation at Kvass.ai, a comment was made that “in the very near future, video will be inadmissible in a court of law because the fakes will be so good”.
In other words, experts will not be able to tell the difference between a fake video and a real one. Put another way, the best technology in the wrong hands could be very disruptive.
Think about it…
Imagine the political power in releasing a timely video to the masses that puts a political opponent in a compromising position. Imagine the power that a well-funded defendant could have in producing “evidence”.
Brace yourselves, it will happen sooner than we think!
/M