Massachusetts Innovation Breakfast Friday, 20 November at 8:30am at the IBM Innovation Center in Waltham – Ronald Croen, CEO of Nuance, will be speaking November 19, 2009
Posted by HubTechInsider in events, VUI Voice User Interface.Tags: events, gatherings, meetups, networking, news, topics, VUI, waltham
Massachusetts Innovation Breakfast kicks off this week with a Friday morning (8:30 a.m.- 10:00 a.m.) casual get-together at the IBM Innovation Center in Waltham.
There’s limited space so sign up now. (You must RSVP by the end of the business day Thursday to be part of the group.)
There will be a chance to look around the IBM Innovation Center, and there will be a special guest speaker, Ronald Croen, Founder, CEO and Chairman of Nuance Communications, who will be speaking on Entrepreneurship and Innovation in 2009 and Beyond: Massachusetts vs. Silicon Valley.
There is a stereo MP3 audio recording of this presentation, as well as of the excellent question-and-answer session that followed, posted on this site below:
Entrepreneurship and Innovation in 2009 and Beyond: Massachusetts vs. Silicon Valley – Ronald A. Croen
There are some great videos of Ron Croen’s presentation available here.
What is the Mu-Law PCM voice coding standard used in North American T-Carrier telecommunications transmission systems? June 8, 2009
Posted by HubTechInsider in Definitions, Telecommunications, VUI Voice User Interface.Tags: Bell Telephone Company, IVR, Paul Seibert, Public switched telephone network, Pulse Code Modulation, Telecommunication, Telecommunications, Telephone, VoIP, VUI
Mu-Law encoding is the PCM voice coding standard used in North America and Japan. It is a companding standard: it compresses the signal’s dynamic range before transmission and expands it again at the receiving end. Mu-Law is a PCM (Pulse Code Modulation) encoding algorithm in which the analog voice signal is sampled eight thousand times per second, with each sample represented by eight bits, yielding a raw transmission rate of 64 kbps. Each sample consists of a sign bit, a three-bit segment that specifies a logarithmic range, and a four-bit step offset into that range. The bits of the sample are inverted before transmission. A-Law encoding is the voice coding standard used in Europe.
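The sample layout described above (sign bit, three-bit segment, four-bit step, all bits inverted) can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: it assumes signed 16-bit linear PCM input and uses the bias (132) and clip (32635) constants commonly seen in G.711-style encoders. The function and constant names are made up for this example.

```python
BIAS = 0x84   # 132; assumed constant, shifts every magnitude into a segment
CLIP = 32635  # assumed largest magnitude that still fits once BIAS is added

def linear_to_mulaw(sample: int) -> int:
    """Encode one signed 16-bit PCM sample as one Mu-Law byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    # The segment is the position of the highest set bit, from bit 14 down to bit 7.
    segment = 7
    mask = 0x4000
    while segment > 0 and not (magnitude & mask):
        segment -= 1
        mask >>= 1
    # The four bits just below the leading bit are the step within the segment.
    step = (magnitude >> (segment + 3)) & 0x0F
    # All eight bits are inverted before transmission.
    return ~(sign | (segment << 4) | step) & 0xFF

def mulaw_to_linear(byte: int) -> int:
    """Expand one Mu-Law byte back to an approximate 16-bit PCM sample."""
    byte = ~byte & 0xFF
    sign = byte & 0x80
    segment = (byte >> 4) & 0x07
    step = byte & 0x0F
    magnitude = (((step << 3) + BIAS) << segment) - BIAS
    return -magnitude if sign else magnitude

# 8,000 samples per second times 8 bits per sample gives the 64 kbps channel rate.
BIT_RATE_BPS = 8000 * 8

if __name__ == "__main__":
    for pcm in (0, 1000, -1000, 32767):
        code = linear_to_mulaw(pcm)
        print(pcm, hex(code), mulaw_to_linear(code))
```

Because the segments are logarithmic, quiet samples are reproduced almost exactly while loud samples are rounded more coarsely, which is the point of companding.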
Want to know more?
You’re reading Boston’s Hub Tech Insider, a blog stuffed with years of articles about Boston technology startups and venture capital-backed companies, software development, Agile project management, managing software teams, designing web-based business applications, running successful software development projects, ecommerce and telecommunications.
About the author.
I’m Paul Seibert, Editor of Boston’s Hub Tech Insider, a Boston focused technology blog. You can connect with me on LinkedIn, follow me on Twitter, even friend me on Facebook if you’re cool. I own and am trying to sell a dual-zoned, residential & commercial Office Building in Natick, MA. I have a background in entrepreneurship, ecommerce, telecommunications and software development, I’m the Senior Technical Project Manager at eSpendWise, I’m a serial entrepreneur and the co-founder of Tshirtnow.net.
More Articles From Boston’s Hub Tech Insider:
- Twelve Tips For Agile Project Planning and Estimating
- Eight ways to tell if your Project Team is on the Way Up, or on the Way Down
- The Twenty Laws of Testing Computer Software
- Why Designing for a VUI is harder than designing for a GUI
- The Hub Tech Insider Glossary of Mobile Web Terminology
- The Hub Tech Insider Glossary of Stock Options Terminology
- How many Stock Options should executives at a startup be granted?
- Agile Development In Practice
- What is ‘Management By Walking Around’?
- Boston Area Video Game Companies
- Demandware eCommerce
- How to expand your professional network on LinkedIn
- How to use LinkedIn in your job search
- Twitter and network effects
- How much bandwidth does a smartphone use? How much bandwidth does an Apple iPad use? How much bandwidth does an Apple iPhone use?
- What is Scrum?
- What is a “Use Case”?
- What is a “User Story”?
- What is Indirect Spend?
- What is EDIINT? What is AS2, AS1, AS3 and AS4?
What is the frequency response of the North American Public Switched Telephone Network? June 3, 2009
Posted by HubTechInsider in Telecommunications, VoIP, VUI Voice User Interface, Wireless Applications.Tags: Frequency, Hearing range, Hertz, IVR, Public switched telephone network, Telecommunications, VoIP, VUI
The conventional North American Public Switched Telephone Network, or PSTN, has a frequency response range of roughly 300 Hz to 3,400 Hz. The normal hearing range of humans is typically about 20 Hz to 20,000 Hz. So the conventional telephone transmission system is unable to carry bright, high-frequency tones or deep, low-frequency tones.
But, somewhat surprisingly, because our ears are so used to hearing poor-quality audio over the telephone, our brains actually “fill in” the missing frequencies. One example is the crisp “s” sound in the word “Christmas”, much of whose energy lies above 3,400 Hz. So in effect, telephone audio often sounds better to us than it actually is.
An explanation of the Nyquist Theorem and its importance to Mu-Law Encoding in North American T-Carrier Telecommunications Systems June 2, 2009
Posted by HubTechInsider in Definitions, Fiber Optics, Mobile Software Applications, Telecommunications, VUI Voice User Interface, Wireless Applications.Tags: IVR, Telecommunications, VoIP, VUI
The Nyquist theorem established the principle of sampling continuous signals to convert them to digital signals. In communications theory, the Nyquist theorem states that an analog signal can be properly represented digitally if it is sampled at a rate of at least twice the highest frequency it contains, that is, at least two samples per cycle. So, for example, a 4 kHz analog voice channel must be sampled 8,000 times per second. The Nyquist theorem is the mathematical underpinning of the 8,000 samples-per-second rate used by the Mu-Law encoding technique in T-Carrier transmission systems. T-Carrier is used in North American telecommunications networks. In Europe, where E-carrier transmission systems are used, the A-Law encoding technique relies on the same sampling theorem and the same 8,000 samples-per-second rate. Mu-Law and A-Law differ in their companding curves, not in their sampling theory, which is why the two encodings are similar but incompatible. Shannon’s Law is a separate result that describes the maximum capacity of a noisy channel.
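The arithmetic behind the 8,000 samples-per-second figure, and what goes wrong when a tone exceeds half the sampling rate, can be sketched as follows. This is only an illustration of the sampling rule for pure tones; the function names are invented for this example.

```python
def nyquist_rate(highest_hz: float) -> float:
    """Minimum sampling rate for a signal whose highest frequency is highest_hz."""
    return 2 * highest_hz

def alias_frequency(tone_hz: float, sample_rate_hz: float) -> float:
    """Frequency at which a pure tone appears once it has been sampled.

    Tones at or below half the sampling rate come back unchanged; higher tones
    fold back ("alias") to a lower frequency inside the sampled band.
    """
    nearest_multiple = round(tone_hz / sample_rate_hz) * sample_rate_hz
    return abs(tone_hz - nearest_multiple)

if __name__ == "__main__":
    print(nyquist_rate(4000))           # a 4 kHz voice channel needs 8000 samples/s
    print(alias_frequency(3000, 8000))  # inside the band, stays at 3000 Hz
    print(alias_frequency(5000, 8000))  # above 4 kHz, folds back to 3000 Hz
```

This folding is why the voice channel is band-limited before sampling: anything above 4 kHz would otherwise reappear as a false tone within the band.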
The author of the Nyquist Theorem was Harry Nyquist. Harry worked in the research department at AT&T and later at Bell Telephone Laboratories. In 1924, he published a paper titled “Certain Factors Affecting Telegraph Speed”, which analyzed the correlation between the speed of the telegraph system and the number of signal values it used. Harry refined his paper in 1928, when he republished his work under the title “Certain Topics in Telegraph Transmission Theory”. It was in this paper that Harry expressed the Nyquist Theorem, which established the principle of using sampling to convert a continuous analog signal into a digital signal. Claude Shannon, the author of Shannon’s Law, cited both of Nyquist’s papers in the first paragraph of his classic paper “A Mathematical Theory of Communication”. Harry Nyquist is also known for his explanation of thermal noise, sometimes known as “Nyquist noise”, as well as for AT&T’s 1924 version of a fax machine, called “telephotography”.
His remarkable career included advances in the improvement of long-distance telephone circuits, picture transmission systems, and television. Dr. Nyquist’s professional, technical, and scientific accomplishments are recognized worldwide. It has been claimed that Dr. Nyquist and Dr. Claude Shannon are responsible for virtually all the theoretical advances in modern telecommunications. In addition to his theoretical work, Dr. Nyquist was a prolific inventor and is credited with 138 patents relating to telecommunications during his 37-year career. His accomplishments underscore the excellent preparation in engineering that he received at the University of North Dakota.
Why designing for a VUI is more difficult than designing for a GUI May 11, 2009
Posted by HubTechInsider in Mobile Software Applications, VoIP, VUI Voice User Interface, Wireless Applications.Tags: IVR, Mobile Applications, mobile software, mobile web, Telecommunications, VoIP, VUI
Many automated telephony and IVR vendors advertise that their web-based SaaS offerings make the development, testing, deployment and maintenance of an IVR application easy and straightforward. This confidence in the VUI design abilities of untrained, non-technical business analysts and enterprise services managers is woefully misplaced. A software tool may be easy to use (although all of these web-based SaaS vendors provide VUI tools with horrific interfaces and GUI designs, such as reliance on stone-age Java applets), but only cursory thought, if any at all, has been invested in how these untrained resources should use that tool. This can and often does lead to catastrophic results.
I frequently encounter the mistaken prevailing notion that designing a VUI consists of nothing more than taking a GUI and “simplifying it” for use on the telephone. As the thinking goes, we can all talk on the telephone; not all of us can navigate a complex forms-based web site. But despite this mistaken general impression (perpetuated by IVR and automated telephony vendors and many software development teams within them, as well as their clients), some basic realities persist in shattering these ill-conceived concepts: People can read faster than they can listen with comprehension, speak faster than they can type, and talk much more quickly than they can process the meaning behind spoken words. So even though, based on initial impressions, designing an effective VUI might seem easier than designing a first-rate GUI, the opposite is true: designing a great VUI is far more difficult than designing a GUI.
A VUI is inextricably linked with Time
When a user is navigating a GUI, they can read text at any location on the web page or application screen. The user can skip ahead visually to the section they are interested in. With a VUI, the user is a “prisoner” of the VUI design. The attention is captive: they must listen with (or without) patience to each word before they can hear the one that follows it. With this in mind, some best practices for VUI design emerge:
1. Long prompts are Bad: The longer the prompt, the more the user’s patience is being taxed. Introductory or “tutorial” prompts explaining how the system works may be required for an outbound IVR application, or provided for the benefit of novice users. However, they should not be forced upon returning visitors or upon outbound IVR call recipients who have received similar IVR communications in the past.
2. Long VUI menus are Bad: Again to use the GUI as a contrasting example, on a web page you can present many menu options to the user, even hiding numerous options in a drop-down menu. A VUI menu, on the other hand, should never exceed five or six items at the most.
3. Get to the gist of the communication quickly: Forcing your captive “audience” to listen through introductory marketing copy written into an outbound IVR or inbound VRU script will become annoying very quickly to the user. Script your important information into the beginning of your prompts.
4. Allow ‘barge-in’: Expert users who know how to use the system and know what they want to do desire the ability to speed up the automated interaction with the system. Allow them to issue their commands to the system without forcing them to wait for the system to finish talking.
5. Give expert users global hotwords: Global “hotwords”, or application-level shortcuts, allow users to “cut to the chase”, enabling them to cut through menus and enjoy the feeling of enablement that a responsive VUI system can provide.
6. Allow the user to pause the interaction: The GUI has another crucial advantage over the VUI – the ability to stop and start again exactly where you left off after an indeterminate interval. Providing the exact same level of interaction control is impossible in a VUI, but you can come close. Your VUI design may ask the user for a membership number in a COB (Coordination of Benefits) automated telephony call for a health care provider, or for their account number in an inbound VRU application, or it may want the user to write down a confirmation code or other information. In those cases, design your VUI so that the call recipient or caller can get their pencil and paper ready, find their membership card, and say “continue” when they are ready.
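Several of the practices above can be checked mechanically before a script ever reaches a caller. The sketch below is one hypothetical way to do that in Python. The `Menu` structure, the hotword list, and the 30-word prompt threshold are assumptions made for illustration and do not come from any particular IVR platform.

```python
from dataclasses import dataclass, field

MAX_MENU_ITEMS = 6     # "never exceed five or six items"
MAX_PROMPT_WORDS = 30  # arbitrary threshold chosen for this sketch

# Application-level shortcuts that work in every menu, including pause/continue.
GLOBAL_HOTWORDS = ("main menu", "pause", "continue")

@dataclass
class Menu:
    name: str
    prompt: str                    # the short prompt every caller hears
    options: list = field(default_factory=list)
    barge_in: bool = True          # callers may interrupt playback
    tutorial: str = ""             # optional long introduction for novices

def lint_menu(menu):
    """Return a list of design problems found in one menu."""
    problems = []
    if len(menu.options) > MAX_MENU_ITEMS:
        problems.append("too many options")
    if len(menu.prompt.split()) > MAX_PROMPT_WORDS:
        problems.append("prompt too long")
    if not menu.barge_in:
        problems.append("barge-in disabled")
    return problems

def opening_prompt(menu, returning_caller):
    """Play the tutorial only to callers who have not heard it before."""
    if returning_caller or not menu.tutorial:
        return menu.prompt
    return menu.tutorial + " " + menu.prompt

def resolve(menu, utterance):
    """Check global hotwords first, then the options of the current menu."""
    spoken = utterance.strip().lower()
    if spoken in GLOBAL_HOTWORDS:
        return ("hotword", spoken)
    if spoken in (option.lower() for option in menu.options):
        return ("option", spoken)
    return ("no match", spoken)

if __name__ == "__main__":
    menu = Menu("main", "Say refill, status, or pharmacist.",
                ["refill", "status", "pharmacist"],
                tutorial="Welcome. You can interrupt me at any time.")
    print(lint_menu(menu))
    print(opening_prompt(menu, returning_caller=False))
    print(resolve(menu, "Main Menu"))
```

Checking hotwords before menu options is a deliberate choice here: it lets expert users jump out of any menu without the designer having to repeat the shortcuts in every option list.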
The One-way Temporal Flow of the User
Of course, the spoken word is not only temporally linear, but also one-way. In the same manner in which time is a “one way street”, so is speech a “one way medium”. When you are listening to a prerecorded voice prompt, you can’t hit the nonexistent rewind button on your telephone. A VUI is not like watching a ball game on your DVR or TiVo, either. You can’t easily go back and listen to the prompt again. This is in stark contrast to the GUI world, where the user can jump back and forth within the text on the page or screen. Three simple techniques can help to alleviate this conundrum:
1. Always let the user ask to have the system repeat the prompt: Perhaps the most elementary technique to mitigate the one-way temporal flow of the user is to have the system offer to repeat the last prompt. The user must be made aware of the fact that they can have any prompt repeated to them at any time during the IVR interaction.
2. Make Help available to the user: Information or instructions presented at the beginning of the interaction that are crucial to the caller’s or call recipient’s ability to complete the task must be available to the user at any point in the IVR interaction. Offer help to the user not only at the beginning of the call but also at moments where the user seems to have arrived at an impasse in the interaction. The need to offer help to the user is acute at “no input”, “Out of Grammar (OOG)” or “no match” states.
3. Present a summation of the gathered data: In form-filling dialogs or IVR interactions where the caller is being asked to provide information to the system, a marvelous approach to overcome the one-way temporal flow nature of the IVR interaction is to offer the call recipient or caller a summation of the data that has been gathered from them during the course of the IVR interaction so far.
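The three techniques above (repeat, help, and a running summary) fit naturally into a small form-filling dialog. The following Python sketch is a hypothetical illustration only: the `FormDialog` class, its prompts, and the command words are invented here, and a real system would use a speech grammar rather than plain string matching, so “no match” handling is not modeled.

```python
class FormDialog:
    """Collects a list of fields while supporting repeat, help and summary."""

    def __init__(self, fields, help_text):
        self.fields = list(fields)  # names of the items to collect, in order
        self.help_text = help_text
        self.values = {}
        self.last_prompt = ""

    def next_prompt(self):
        """Ask for the first missing field, or confirm once all are filled."""
        for name in self.fields:
            if name not in self.values:
                self.last_prompt = f"Please say your {name}."
                return self.last_prompt
        self.last_prompt = self.summary() + " Is that correct?"
        return self.last_prompt

    def summary(self):
        """Read back everything gathered from the caller so far."""
        if not self.values:
            return "I have not collected anything yet."
        parts = [f"{name} {value}" for name, value in self.values.items()]
        return "So far I have: " + ", ".join(parts) + "."

    def handle(self, utterance):
        spoken = (utterance or "").strip().lower()
        if spoken in ("", "help"):   # "no input" is treated like a help request
            return self.help_text + " " + self.last_prompt
        if spoken == "repeat":
            return self.last_prompt
        if spoken == "summary":
            return self.summary() + " " + self.last_prompt
        for name in self.fields:     # otherwise fill the first missing field
            if name not in self.values:
                self.values[name] = spoken
                break
        return self.next_prompt()

if __name__ == "__main__":
    dialog = FormDialog(["member number", "zip code"],
                        "You can say repeat, help, or summary at any time.")
    print(dialog.next_prompt())
    for said in ("repeat", None, "12345", "summary", "01760"):
        print(dialog.handle(said))
```

Help and summary both end by replaying the last question, so the caller is never left wondering what the system is waiting for.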
Persistence in a VUI is not visible to the user as in a GUI
Callers or call recipients perhaps show the most frustration when they feel they have lost track of “where they are” in the course of traversing a scripted IVR inbound or outbound interaction. Aggravation mounts as the user becomes increasingly unsure of what to do next, and what the system expects the user to do next. A web page or application screen typically provides a multitude of visual cues, such as a menu tree, a “breadcrumb” navigation path, or something similar. In the VUI world, even something as simple and effective as the URL address bar in a browser is unavailable. Some approaches to mitigate these factors emerge to the experienced VUI designer:
1. Auditorily “Announce” the user’s position in the IVR exchange: In the same manner that a properly designed web page or application screen will tell the user where they are in terms of navigating a site or application, so should a well-designed voice interface let the caller or call recipient know their exact position in the IVR interaction. A simple and effective technique for providing the user with such “mental markers” is to use a word or two to announce this position to the user: “Main Menu”, “Here are the drugs in your prescription refill:”, etc.
2. Audio breadcrumbs: The VUI version of the “breadcrumb navigation” trails that are featured so prominently on web sites in the GUI world can be emulated in the VUI world, where they prove no less useful. Each “voice page” that requires interaction with the user can be associated with a “position page” that announces the user’s position within the dialog tree. “Prescription, Reorder, Address”, as an example, would very nicely indicate to the user that they chose “Prescription”, then “Reorder”, and are now confirming their prescription reorder address on file with the system. A “Go Back” provision or option should be offered to users at these “position page” states.
3. Audio Icons: Auditory icons, or “earcons”, are VUI equivalents of the GUI’s icons. These audio icons can be extremely useful to both the VUI designer and the call recipient or caller, either by announcing to the user that a particular action is about to be undertaken or by positioning the user within an IVR menu structure or transaction path. “Wait audio”, or sounds played to the user to indicate that the system is busy performing a record lookup or other function, can prevent the user from interpreting an extended silence as a system crash or the end of the IVR interaction.
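The audio breadcrumb idea can be modeled as a simple stack of “voice pages”. The sketch below is a hypothetical illustration in Python. The `DialogPath` class and its method names are invented for this example, and the page names reuse the prescription reorder example from the list above.

```python
class DialogPath:
    """Tracks the caller's position in the dialog tree as a stack of voice pages."""

    def __init__(self, root="Main Menu"):
        self._path = [root]

    def enter(self, page):
        """Move down into a voice page and return the new position announcement."""
        self._path.append(page)
        return self.announce()

    def go_back(self):
        """Step back one level; the root menu is never removed."""
        if len(self._path) > 1:
            self._path.pop()
        return self.announce()

    def announce(self):
        """Text for the 'position page' prompt, e.g. 'Prescription, Reorder, Address'."""
        if len(self._path) == 1:
            return self._path[0]
        return ", ".join(self._path[1:])

if __name__ == "__main__":
    path = DialogPath()
    print(path.announce())
    path.enter("Prescription")
    path.enter("Reorder")
    print(path.enter("Address"))
    print(path.go_back())
```

Keeping the path as a stack makes the “Go Back” option trivial to offer at every position page, since the previous position is always one pop away.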
GUIs present one fundamental advantage over VUIs: the user navigating a web page or an application screen has control over the medium, the message, and the interaction itself. Although a poor GUI can make the user feel helplessly confused, a VUI faced with the challenges outlined above has to be near-perfect to prevent the user from abandoning the IVR interaction entirely by the simple and universal act of hanging up the telephone. VUI designers should always be aware of the significant differences between designing an effective and useful GUI and designing an effective and useful VUI. It would be ill-advised to enter into a VUI design task or project of any size while carrying the familiar GUI design assumptions into the endeavor.
Want to know more?
You’re reading Boston’s Hub Tech Insider, a blog stuffed with years of articles about Boston technology startups and venture capital-backed companies, software development, Agile project management, managing software teams, designing web-based business applications, running successful software development projects, ecommerce and telecommunications.
About the author.
I’m Paul Seibert, Editor of Boston’s Hub Tech Insider, a Boston focused technology blog. You can connect with me on LinkedIn, follow me on Twitter, even friend me on Facebook if you’re cool. I own and am trying to sell a dual-zoned, residential & commercial Office Building in Natick, MA. I have a background in entrepreneurship, ecommerce, telecommunications and software development, I’m the Director, Technical Projects at eSpendWise, I’m a serial entrepreneur and the co-founder of Tshirtnow.net.