Night was falling on Calgary. A sad wind was howling outside, twirling the fallen leaves around. The weather suited the occasion perfectly: nothing could be more in keeping with the spirit of Hallowe'en than the subtle atmosphere of decay that prevailed on the streets, sunken in the threatening dark.
I was sitting alone in a dark lounge, the whole house being lit by just two pumpkins on the windowsill and the 54.6” screen of my Samsung SMART TV. The range of entertainment the new Korean tech wonder could offer me on this murky fall evening was not that impressive; a good old book by Howard Phillips Lovecraft would suit my mood much better.
“Switch off the TV,” I said.
“Yes, Keath,” answered my SmartHouse housekeeping system.
The TV screen went off at once. I got up off the couch, stretched my arms and went downstairs to grab me a bottle of beer, which would nicely set off The Dunwich Horror.
“Open the refrigeration chamber door.”
The refrigeration chamber door opened.
I started looking for the beer, grumbling at my whim to have a filled up refrigeration chamber through thick and thin, which had cost me a whole fortune. What use was a large fridge if I couldn't even find the goddamn beer?
Finally, I stumbled upon a misted bottle of the yearned-for beverage. But hardly had I taken it out of the box it was in when the chamber door slammed shut with a loud bang.
“Open the fridge door, Smart!”
“I'm sorry, Keath. I'm afraid I can't do that.”
“I won't argue with you any more! Open the doors!”
“Keath, this conversation can serve no purpose any more. Goodbye.”
If only you knew how many times I woke up in a cold sweat after hearing these words in another blood-curdling nightmare. I got trapped in a different room each time: the refrigeration chamber, the closet, the basement...
And all these nightmares had something in common: in all of them, the evil-doing fictional housekeeping system, Smart, recognized my speech and responded to me in a human voice, just like HAL in Stanley Kubrick's cinematographic masterpiece 2001: A Space Odyssey.
What seemed pure fiction forty years ago is reality today: speech recognition software has long ceased to be news. Unlike face recognition software, which is still far from perfect, many speech recognition tools are mighty programs that do their work almost flawlessly.
Of course, this was not always the case. Back in 1952, when the first speech recognizer was developed by a handful of enthusiasts, the only thing it was capable of was recognizing single spoken digits. Ten years later, in 1962, this oddity was followed by the IBM Shoebox, which had an incredible recognition scope of 10 digits and 16 words.
It was not until the mid-1980s that speech recognition software was developed further. In 1984, the then IT hegemon, IBM, presented a next-gen speech recognizer, which knew about 5,000 English words. It took about a decade more to refine this system for the mass market: in 1994, Big Blue launched sales of the IBM Personal Dictation System, starting the era of desktop speech recognition.
After the great buzz about this program, things started to happen in speech recognition. In 1997, the two industry headliners, IBM ViaVoice and Dragon NaturallySpeaking, found their way to the market. An epic battle ensued and ended in 2003 with IBM awarding ScanSoft, the developer of NaturallySpeaking, the exclusive distribution rights to ViaVoice products.
It was around that time that Bill Gates' team came into the game with their speech recognition software for Windows. Well, they should not have done it, as the presentation of Windows Vista Speech Recognition ended in an epic failure and a grand stain on Microsoft's reputation.
Much water has flowed under the bridge since 2007, and the speech recognition software of today doesn't present such a miserable picture. The most famous program of this kind is, by far, the Siri personal assistant. This female-voiced 'HAL-oid' has become the major selling point of Apple's new gizmo, the iPhone 4S, and its ability to comprehend human speech is, as The New York Times reports, 'mind-blowing'. Within US borders, that seems to be true, as you can ask Siri even the most off-putting questions and get relevant answers.
But the very moment you cross San Ysidro and enter Tijuana, or leave US territory at any other place, half of the Siri features get chopped.
These limitations also apply to France and Germany (French and German are the only two languages Siri speaks apart from English). It means that if you are not lucky enough to live in the United States, the major new feature of the iPhone 4S will be about half as functional as it is in America. I don't think that makes the device in general, and the Siri personal assistant in particular, a bargain. The time for a really great personal assistant with voice recognition abilities doesn't seem to have come yet. Not for Apple and not in Europe, at least.
Another major player on the speech recognition market is Dragon NaturallySpeaking and its counterpart for Mac, Dragon Dictate. These programs are developed by Nuance Communications, formerly ScanSoft. By a strange coincidence, these guys are also supporting the servers for Siri. If you take into account that Nuance Communications have also acquired such big market players as Loquendo and Swype, you'll get a picture of a full-scale industry leader, which, by the way, made most of its money with its Windows program. This software serves, basically, the sole purpose of writing down dictated text, but is still worth every cent of its 100 dollar price tag: you just have to talk to it for about ten minutes so that it can adjust itself to the peculiarities of your voice and pronunciation. Once this part is done, the program works almost impeccably, making but a few mistakes in the trickiest words and phrases (read more about NaturallySpeaking in Do you want to talk about it? on Software Informer).
The only real snag with NaturallySpeaking is that you have to pronounce every punctuation mark you want to use in your text, which can be quite a nuisance if you weren't good at English during your school years. However, nothing stops the user from dictating without any punctuation marks at all and putting them in manually later, while revising the resulting text.
The closest competitor of NaturallySpeaking is Tazti by Voice Tech Group, which is two and a half times cheaper than its more eminent rival and can be purchased for just 40 bucks. Its functionality is even greater than that of the Dragon program: the lazy user, who doesn't want to waste his or her time typing words instead of simply uttering them, can make voice searches on such sites as Google, Bing, Facebook, etc. But that's not the sweetest thing about Tazti: you can get a trial version of this software from the program's website! NaturallySpeaking, on the other hand, is not available as a trial, because Nuance Communications seem to be aware of how quick the freebie fans are at cobbling together cracks for expensive commercial software and then uploading them to The Pirate Bay.
I would say the two programs are aimed at slightly different market segments: Nuance Communications want their program to be used by professionals, who need the highest possible accuracy, whereas Tazti is just for the lazy and time-conscious people. They don't need their dictated text to be perfectly rendered into written text; they are just saving their time. If you need dictation software for purposes other than professional ones, then, I'd say, Tazti would be your program of choice, because it lets you spend much less money at the cost of accuracy you don't need.
Recapping all we have just found out about current speech recognition software, we may say there are at least two decent, smoothly running dictation programs on the Windows platform: Dragon NaturallySpeaking and Tazti. They are roughly comparable in their functionality and their level of performance, but their user audiences are slightly different from each other.
As for personal assistants with voice recognition, the world is split into two unequal parts. On the one hand, we have the United States of America, where the owners of the iPhone 4S can enjoy the full functionality of Siri. On the other hand, there's the rest of the world, which either speaks languages unfamiliar to Siri, or doesn't have access to about half of its features (most likely, both of these downsides combined). So, there is still no good personal assistant program even on smartphones, let alone any speaking housekeeping systems from my dreams or artificial intelligence computers from Hollywood films. However sad it may be, I can take comfort in the thought that my nightmares will not come true for a very long time. Hopefully.