
Turn taking and chatbots

Turn taking is a natural part of conversation that we subconsciously engage in so that the discourse flows. Here is an example:

A: “Good morning”

B: “Morning. How are you? Good weekend?”

A: “Yes thanks, and you? How was Brighton?”

For the Cambridge main suite speaking exams, candidates are assessed on their turn-taking ability under the criterion of ‘Interactive Communication’. In other words, this means the candidates’ ability to:

  • Interact with the other candidate easily and effectively.
  • Listen to the other candidate and answer in a way that makes sense.
  • Start a discussion and keep it going with their partner/s.
  • Think of new ideas to add to the discussion.

Along with the onslaught of technological advances came advances in automated responses from portable digital devices. These conversational agents, or dialogue systems, are capable of anything from a single interaction up to around six task-oriented turns. An example of these dialogue agents would be Siri, and an example of a task-oriented interaction would be: “Siri, call Dad”.

Chatbots are not a ‘new’ invention per se. ELIZA, created between 1964 and 1966 at MIT, was a natural language processing computer programme that demonstrated the same characteristics as today’s chatbots, but on a less sophisticated scale and with less complex interaction. The aim of chatbot builders is to create natural language processing programmes that replicate human-human interaction by enabling more turns and therefore extended conversations.

The interesting challenge then becomes how to use each turn as a springboard for the next, and how to ensure that each turn triggers a pre-programmed response, so that the user does not receive a generic message like “I’m sorry, but I’m not sure what you mean by that” when expressing a specific request or producing a turn that is not recognised. The sketch below illustrates the idea.
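To make the fallback idea concrete, here is a minimal, hypothetical sketch in Python of an ELIZA-style rule table: each recognised pattern triggers a pre-programmed response, and anything unrecognised falls through to the generic fallback message. The rules and responses are invented for illustration; a modern dialogue system would use statistical intent classification and dialogue state tracking instead.

```python
import re

# Invented rules for illustration: each pattern maps to a canned response.
RULES = [
    (re.compile(r"\bgood (morning|afternoon|evening)\b", re.I),
     "Morning. How are you? Good weekend?"),
    (re.compile(r"\bhow was (?P<topic>\w+)\b", re.I),
     "{topic} was lovely, thanks for asking."),
    (re.compile(r"\bcall (?P<name>\w+)\b", re.I),
     "Calling {name}..."),
]

FALLBACK = "I'm sorry, but I'm not sure what you mean by that."

def respond(turn: str) -> str:
    """Return the first matching canned response, or the generic fallback."""
    for pattern, template in RULES:
        match = pattern.search(turn)
        if match:
            return template.format(**match.groupdict())
    return FALLBACK

print(respond("Good morning"))        # a recognised social turn
print(respond("Siri call Dad"))       # a task-oriented turn
print(respond("Quantum flapdoodle"))  # unrecognised, so the fallback fires
```

More about chatbots soon!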

Just what do we expect from Chatbots?

Chatbots are the future of conversational intelligence, and can be used to simulate human conversations. But just what do we expect from chatbots? On the one hand are those who firmly believe intelligent systems will erode human interaction in years to come. On the other hand, others revel in the delights of giving Siri instructions to challenge her intelligence and gauge the level of response.

Personally, I feel that the benefits of intelligent systems (chatbots) outweigh the disadvantages, but I am convinced that the advantages will depend on our behaviour and our receptiveness to their merits. AI cynics were delighted when Microsoft’s Tay was manipulated into demonstrating bad behaviour. At last there was proof to substantiate the argument that AI poses severe dangers.

Users of Alexa were slightly disturbed by her random outbursts of laughter, to the extent that her code was re-written to disable the reaction when requested, and to avoid responses to false positives designed to trick her. This all leads to the question of the level of humanness we expect from intelligent systems and chatbots, or, more to the point, the level of humanness we, as ‘humans’, are comfortable accepting from ‘machines’.

AI: a new currency or the next industrial revolution?

A question that has been on my lips recently is whether AI is set to be the next industrial revolution or a new currency of the future.

AI Past

The industrial revolution, as its name denotes, revolutionised modern industry and manufacturing as we know them today. When the internet emerged into public life in the late 1980s, it seemed unimaginable that, less than 30 years later, wireless connections and digital devices would have such a pervasive presence in society. New inventions come and go, and technological innovations are created whether or not they are successful, but in most cases they are shaped by the demands of people.

The origins of AI date back to Turing’s computational machine, more commonly known as the Turing machine, conceived in 1935. The term itself was coined later, in 1955, by McCarthy, who defined AI as “the science and engineering of making intelligent machines, especially intelligent computer programmes”; in other words, trying to understand human intelligence by using computers.

AI Present

During the last 80 years, advances in AI technology have reached astounding levels. AI has clearly had a profound impact on society, to the extent that it has become a tool in all aspects of life: from banking and email pop-ups to ‘personalised’ product selections, intelligent personal assistants such as Siri and Alexa, and chatbots.

AI Future

Both academic and business research and reporting in the field of AI consider it one of the biggest influences on the future of the market and society. Predicted revenues from AI are unprecedented, resulting in extensive funding and investment from private companies and governments alike, which highlights the significance of AI in society. China has recently announced that it is building a $2.1 billion industrial park for AI research. The past year has witnessed an increasing number of nations realising the importance of AI in shaping the economics of the future; some even consider it a currency. Bitcoin stand aside: AI is the new currency.


Speech synthesis, voice recognition and humanoid robots

Speech synthesis, or the artificial production of human speech, had been around long before the Daleks on Doctor Who. Apparently, the first speech-generating device was prototyped in the UK in 1960, in the shape of a sip-and-puff typewriter controller, the POSSUM. Wolfgang von Kempelen preceded all of this with a speaking machine built of leather and wood that had great significance in the early study of phonetics. Today, text-to-speech computers and synthesisers are widely used by those with speech impediments to facilitate communication.
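For readers who want to hear speech synthesis from their own machine, here is a minimal sketch using the pyttsx3 Python library (one offline text-to-speech option among several; install it with pip install pyttsx3). The spoken sentence is just an example, and the voices available will vary by operating system.

```python
import pyttsx3  # offline text-to-speech; pip install pyttsx3

engine = pyttsx3.init()
engine.setProperty("rate", 150)  # slow the speaking rate slightly

# List the synthetic voices the platform offers; these vary by OS.
for voice in engine.getProperty("voices"):
    print(voice.id)

engine.say("Speech synthesis has come a long way since von Kempelen's machine.")
engine.runAndWait()  # block until the utterance has been spoken
```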

Speech-to-text systems became more prominent thanks to the IBM typewriter Tangora, which held a remarkable 20,000-word vocabulary by the mid-1980s. Nowadays speech-to-text has advanced phenomenally, with the Dragon Dictation iOS software being a highly favoured choice. Our world is increasingly dominated by voice automation, from customer service menus by phone to personal assistants like Siri. Banks, too, have used voice and speech recognition for identification purposes since 2014.
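As a taste of how speech-to-text looks from the programmer’s side, here is a small sketch using the Python SpeechRecognition library (pip install SpeechRecognition) rather than Dragon Dictation itself, which is a commercial product. The file name sample.wav is a placeholder, and the free Google Web Speech API backend used here requires an internet connection.

```python
import speech_recognition as sr  # pip install SpeechRecognition

recognizer = sr.Recognizer()

# "sample.wav" is a placeholder; substitute any WAV, AIFF or FLAC recording.
with sr.AudioFile("sample.wav") as source:
    audio = recognizer.record(source)  # read the whole file into memory

try:
    # Send the audio to the free Google Web Speech API backend.
    print(recognizer.recognize_google(audio))
except sr.UnknownValueError:
    print("The speech was unintelligible.")
except sr.RequestError as error:
    print(f"Could not reach the recognition service: {error}")
```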

I’m curious how these systems work: how they are programmed, what corpus is used and which accents are taken into consideration. Why? Because robots fascinate me, and I wonder if it will be possible to “humanize” digital voices to such an extent that humanoid robots will appear more human than ever because of their voice production and recognition capabilities. It seems a far cry from the days of Speak & Spell, the kids’ speech synthesiser of the 80s, but it is looking increasingly probable as advances in AI develop.

Developments have gone as far as Hiroshi Ishiguro’s Geminoid HI-1 android prototype humanoid robot. Ishiguro is a roboticist at Osaka University, Japan, who created the Geminoid robot in 2010 as a life-size replica of himself, using silicone rubber, pneumatic actuators, powerful electronics, and hair from his own scalp.

The Geminoid is basically a doppelganger droid controlled by a motion-capture interface. It can imitate Ishiguro’s body and facial movements, and it can reproduce his voice in sync with his motion and posture. Ishiguro hopes to develop the robot’s human-like presence to such a degree that he could use it to teach classes remotely, lecturing from home while the Geminoid interacts with his classes at Osaka University.

You can see a demonstration of the Geminoid here