Speech recognition grows up and goes mobile

IDG NEWS: Having spread from desktops to mobile devices and beyond, voice recognition is no longer a novelty filling niche needs — and it’s spawning a new genre of gadgets.

For three decades this was speech recognition: You would talk to your computer, typically using a head-mounted microphone and either the unpublicized speech-recognition app in Microsoft Windows or a version of Dragon NaturallySpeaking, from Nuance Communications. If you enunciated carefully, words would appear on the screen or commands would be executed.

Today, much-improved speech recognition is being widely deployed, and in the last two years, it has given birth to a new family of consumer products: voice-controlled personal assistants. “It’s an overnight success that was 30 years in the making,” says Adam Marchick, co-founder of VoiceLabs, which provides analytics for voice app developers. “It has finally gotten precise enough to have conversations.”

Like most things in technology, progress in speech recognition can be quantified. In August 2017, Microsoft announced that the word-recognition accuracy of its conversational speech-recognition system had, on industry-standard tests, exceeded the accuracy of professional human transcribers. The average word error rate for professionals on such tests is 5.9%. The Microsoft recognition system achieved 5.1%.
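The article does not say how these tests are scored. As a rough illustration, the sketch below assumes the common definition of word error rate: the number of word substitutions, deletions and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The function name `word_error_rate` and the sample sentences are hypothetical, and this is not presented as the scoring code behind the Microsoft figures.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length (assumed definition)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical usage: one wrong word out of four reference words.
rate = word_error_rate("turn on the lights", "turn on the light")
print(f"{rate:.0%}")
```

Under this definition, a 5.1% rate corresponds to roughly one word error in every 20 words of the reference transcript.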

“It’s like a dream come true,” says Xuedong “X.D.” Huang, a Microsoft technical fellow and head of the company’s Speech and Language Group. “When I started working on speech in 1982 [in graduate school], we were dealing with isolated words and the error rate on conversational speech was about 80%. When I started at Microsoft in 1993, we could not imagine [the software being able to recognize] speech as good as a person.”

“Today, if you speak carefully with a generic accent in a quiet office, you will be getting close to 100% speech-recognition accuracy,” says Vlad Sejnoha, CTO at Nuance.

That level of accuracy means people are going to be talking to their phones more, chatting with robots on customer-service calls with greater ease and effectiveness, and using voice commands to make things happen in their homes and offices.

Cumulative progress

The technology has reached this point through steady slogging, says Sejnoha. “For 15 or 20 years, the primary techniques that we used were statistical, especially hidden Markov models,” says Sejnoha. “We had a variety of models that predicted the likelihood that a particular snippet was something a particular phoneme or word could generate in a particular context. We developed all sorts of variants, and we made reasonably steady progress.”

“In recent years, the traditional statistical methods have been supplanted by deep learning [neural networking] models, which are more flexible and have propelled speech recognition further than before,” he adds. “The result has been an average yearly reduction in error rates of 20% over the last decade.” Speech recognition, he says, is now working out of the box in a wider range of environments. “But there’s still people shouting at cocktail parties,” says Sejnoha, citing one example of an environment where the system still doesn’t work very well.

Sejnoha expects the 20% annual rate of improvement to continue, opening up more noisy environments and special cases. “Understanding multiple languages is increasingly important, not only Mandarin but also French and German place names for GPS drivers. Europe has a lot more borrowed words, and you have to understand things like pronunciation that varies from person to person,” he notes.

Tipping point

While those 20% annual improvements were accruing, the major speech-recognition vendors began making their own engines based on deep learning. Then they began to trust the technology enough to make it the basis of a new consumer product genre, the personal assistant, first as apps (such as Apple’s Siri and Microsoft’s Cortana) and then as stand-alone devices (such as Amazon’s Echo, using the Alexa service, and Google Home, based on the Google Assistant service).

The devices start listening after they are alerted with a voice command such as “OK Google.” They pass the data along to systems in the cloud, where voice recognition takes place.

“The devices are very thin, like Unix terminals. They listen for their name, and that’s it,” explains Marchick. The computer is in the cloud.

“For a long time, speech recognition was focused on computers, but in the last 5 to 10 years the focus has moved to consumer electronics. The first pivotal event was when Apple released Siri. Anything Steve Jobs did was a gold seal endorsing the technology,” adds Todd Mozer, CEO of Sensory, a voice and vision technology company. “The second pivotal event was the release of Alexa-based consumer products with speech recognition, such as the Amazon Echo.”

Voice interactions are skyrocketing. “When we started the business a year ago, there were only 300 apps for these devices. Now, a year later, there are over 16,000.” Previously, with the Amazon Echo on the market, a few million people were making use of voice devices, and 33 million devices are expected to be in use by the end of this year. “Soon there will be seven Echo competitors out there,” says Marchick.

Echo’s competitors, says Marchick, include Google Home; the unreleased Apple HomePod; the unreleased Harman/Kardon Invoke, which will run Microsoft Cortana; and Samsung Bixby for the Samsung Galaxy smartphones; plus at least two Chinese systems.

Spreading the words

But what has proved really important is that the speech-recognition vendors typically offer online development kits that let their engines be harnessed as a service to create apps using a spoken-language interface.

“The exciting thing is these natural-language development toolkits,” says Deborah Dahl, a consultant at Conversational Technologies. “They set them up so that you don’t need to be an expert in speech recognition and natural language to use these tools. It lowers the bar so that the average software developer can try to create a natural language application.”

Sherif Mityas, CIO at Dallas-based restaurant chain TGI Fridays, says his company was able to launch a speech-based interface in about five months using Amazon Lex, the toolkit for Amazon Alexa. It works the same for phone users and Echo users, the only difference being that phone users are usually traveling and want directions, he adds.

“It’s like building a web page,” says Marchick of the app-making process. “You have a lot of services at your disposal, you write code, you test it, and you post it out.”

“The idea is very easy,” notes Dahl. “You can get a bootstrapped app in a couple of days, but then you’ll spend a few weeks getting to all the cases that you have to cover.” For example, for a pizza-ordering app, you would have to capture the size, thickness, toppings, and sauce. The hard part is when there is not a GUI to help the user: You need to think through the design from the user’s end, since they are used to seeing all of the things they can do. “If you don’t have a clear idea of the outcome and did not align the process with that, you will have to go back and rework the system.”

Mityas says the main hurdle for the TGI Fridays developers was getting the menu simplified. There are 15 side dishes on the menu, and having Alexa list them was cumbersome, but the developers found they could have the app list the three most popular items and let the user prompt for a longer list of options, he says.

“In real life you will not predict a lot of what users will say,” Dahl says. “Users are very surprising, so there needs to be a period of tuning.” Users of the pizza-ordering app will ask about breadsticks. They will ask that you not undercook it like last time. The system “will capture that, or fail gracefully.”

To predict what users will say, Next IT, a provider of conversational A.I. systems such as virtual agents for the enterprise, first studies the words that tend to be used in a company’s interactions with the public.

“As a rule of thumb, when we approach a new [business] domain [for a new client], we like to take between 10,000 and 20,000 curated conversations that we can pull data from,” says Next IT President Tracy Malingo. “Those can be chat logs, phone calls, Twitter feeds: any text interaction that involves back-and-forth between the business and the consumer.”

Mityas notes that using speech interactions gives better results than text-based interactions, since users speak freely and establish context that the A.I. can use. Text interactions are often just isolated questions, he adds.

In the end, she says, it takes about the same amount of time to train a virtual agent as it does to train a human agent. “But once it is trained, the virtual one works 24 hours a day, and it never quits answering hundreds of thousands of questions,” Malingo notes.

“If the cost of a phone call with a live agent is a dollar, then chatting on the web with a live agent is 50 cents, and a virtual agent is 5 cents,” explains Malingo. The live agent, meanwhile, can usually do more than one text chat at a time. But the ratio depends on the firm, the industry, and the complexity of the application, she says.

Mityas could supply no cost figures for privately owned TGI Fridays, but he says that using speech had tripled the level of user engagement and doubled online takeout sales in less than a year.


The use of virtual agents does not mean that all the human agents are replaced, says Malingo. What happens is th…