Now we are talking.

Original article: http://econ.st/2iL7HZL

ANY sufficiently advanced technology, noted Arthur C. Clarke, a British science-fiction writer, is indistinguishable from magic. The fast-emerging technology of voice computing proves his point. Using it is just like casting a spell: say a few words into the air, and a nearby device can grant your wish.

The Amazon Echo, a voice-driven cylindrical computer that sits on a table top and answers to the name Alexa, can call up music tracks and radio stations, tell jokes, answer trivia questions and control smart appliances; even before Christmas it was already resident in about 4% of American households. Voice assistants are proliferating in smartphones, too: Apple's Siri handles over 2bn commands a week, and 20% of Google searches on Android-powered handsets in America are input by voice. Dictating e-mails and text messages now works reliably enough to be useful. Why type when you can talk?

This is a huge shift. Simple though it may seem, voice has the power to transform computing, by providing a natural means of interaction. Windows, icons and menus, and then touchscreens, were welcomed as more intuitive ways to deal with computers than entering complex keyboard commands. But being able to talk to computers abolishes the need for the abstraction of a “user interface” at all. Just as mobile phones were more than existing phones without wires, and cars were more than carriages without horses, so computers without screens and keyboards have the potential to be more useful, powerful and ubiquitous than people can imagine today.

Voice will not wholly replace other forms of input and output. Sometimes it will remain more convenient to converse with a machine by typing rather than talking (Amazon is said to be working on an Echo device with a built-in screen). But voice is destined to account for a growing share of people's interactions with the technology around them, from washing machines that tell you how much of the cycle they have left to virtual assistants in corporate call centres. However, to reach its full potential, the technology requires further breakthroughs, and a resolution of the tricky questions it raises around the trade-off between convenience and privacy.

Alexa, what is deep learning?

Computer-dictation systems have been around for years. But they were unreliable and required lengthy training to learn a specific user's voice. Computers' new ability to recognise almost anyone's speech dependably without training is the latest manifestation of the power of “deep learning”, an artificial-intelligence technique in which a software system is trained using millions of examples, usually culled from the internet. Thanks to deep learning, machines now nearly equal humans in transcription accuracy, computerised translation systems are improving rapidly and text-to-speech systems are becoming less robotic and more natural-sounding. Computers are, in short, getting much better at handling natural language in all its forms.

Although deep learning means that machines can recognise speech more reliably and talk in a less stilted manner, they still don't understand the meaning of language. That is the most difficult aspect of the problem and, if voice-driven computing is truly to flourish, one that must be overcome. Computers must be able to understand context in order to maintain a coherent conversation about something, rather than just responding to simple, one-off voice commands, as they mostly do today (“Hey, Siri, set a timer for ten minutes”). Researchers in universities and at companies large and small are working on this very problem, building “bots” that can hold more elaborate conversations about more complex tasks, from retrieving information to advising on mortgages to making travel arrangements. (Amazon is offering a $1m prize for a bot that can converse “coherently and engagingly” for 20 minutes.)

When spells replace spelling

Consumers and regulators also have a role to play in determining how voice computing develops. Even in its current, relatively primitive form, the technology poses a dilemma: voice-driven systems are most useful when they are personalised, and are granted wide access to sources of data such as calendars, e-mails and other sensitive information. That raises privacy and security concerns.

To further complicate matters, many voice-driven devices are always listening, waiting to be activated. Some people are already concerned about the implications of internet-connected microphones listening in every room and from every smartphone. Not all audio is sent to the cloud: devices wait for a trigger phrase (“Alexa”, “OK, Google”, “Hey, Cortana”, or “Hey, Siri”) before they start relaying the user's voice to the servers that actually handle the requests. But when it comes to storing audio, it is unclear who keeps what and when.

Police investigating a murder in Arkansas, which may have been overheard by an Amazon Echo, have asked the company for access to any audio that might have been captured. Amazon has refused to co-operate, arguing (with the backing of privacy advocates) that the legal status of such requests is unclear. The situation is analogous to Apple's refusal in 2016 to help FBI investigators unlock a terrorist's iPhone; both cases highlight the need for rules that specify when and what intrusions into personal privacy are justified in the interests of security.

Consumers will adopt voice computing even if such issues remain unresolved. In many situations voice is far more convenient and natural than any other means of communication. Uniquely, it can also be used while doing something else (driving, working out or walking down the street). It can extend the power of computing to people unable, for one reason or another, to use screens and keyboards. And it could have a dramatic impact not just on computing, but on the use of language itself. Computerised simultaneous translation could render the need to speak a foreign language irrelevant for many people; and in a world where machines can talk, minor languages may be more likely to survive. The arrival of the touchscreen was the last big shift in the way humans interact with computers. The leap to speech matters more.


Vocabulary:

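  • daunting [ˈdɔ:ntɪŋ]
    adj. intimidating; discouraging; off-putting;
    v. intimidating (present participle of daunt); discouraging; frightening; sapping confidence

  • accessible [ækˈsɛsəbəl]
    adj. easy to approach; understandable; easy to get along with; susceptible;

  • sufficiently [səˈfɪʃəntlɪ]
    adv. enough, adequately; quite, considerably;

  • indistinguishable [ˌɪndɪˈstɪŋɡwɪʃəbəl]
    adj. hard to tell apart, impossible to distinguish; imperceptible; featureless;

  • grant [grænt]
    vt. to confer; to acknowledge; to agree to; to permit;
    n. an allocation of funds; a subsidy; something granted (property, land, exclusive rights, subsidies, funding, etc.);
    vi. to agree;

  • cylindrical [səˈlɪndrɪkəl]
    adj. tube-shaped; shaped like a cylinder; of a cylinder (or drum);

  • track [træk]
    n. path, trail; trace, mark; rail track, audio track; course, line (of policy);
    vt. to follow; to trace; to monitor;
    vi. to move along a track;

  • resident [ˈrɛzɪdənt, -ˌdɛnt]
    adj. settled, permanently stationed;
    n. inhabitant; (hotel) guest; resident physician;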

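  • proliferate [prəˈlɪfəˌret]
    vi. to spread; to increase rapidly; to multiply (of cells);
    vt. to cause to increase rapidly; to cause to spread;

  • reliably [rɪˈlaɪəblɪ]
    adv. dependably, with certainty;

  • intuitive [ɪnˈtu:ɪtɪv]
    adj. easy to grasp directly; based on intuition; known by intuition;

  • abolish [əˈbɑ:lɪʃ]
    vt. to eliminate; to revoke; to abolish, to annul; to do away with;

  • carriage [ˈkærɪdʒ]
    n. a horse-drawn vehicle (the sense used in the article); freight charge; transport, conveyance;

  • ubiquitous [juˈbɪkwɪtəs]
    adj. present everywhere; widespread;

  • convenient [kənˈvinjənt]
    adj. handy, easy to use; [obsolete] suitable;

  • destined [ˈdɛstɪnd]
    adj. fated; predetermined; intended (for a purpose);
    v. past participle of destine: to fate; to predetermine; to designate;

  • account [əˈkaʊnt]
    n. bill, ledger; deposit account; narrative, report; reason;
    vi. to explain; to give rise to; to render an account; (account for) to make up a share or proportion, the sense used in the article;
    vt. to regard as; to consider to be;

  • breakthrough [ˈbrekˌθru]
    n. a breakthrough; penetration; a major technical achievement; burn-through of a furnace lining;

  • tricky [ˈtrɪki]
    adj. crafty; delicate, awkward to handle;

  • trade-off [ˈtredˌɔf, -ˌɑf]
    n. a balance struck between competing aims; an exchange;

  • dictation [dɪkˈteʃən]
    n. command; dictating, taking dictation; a dictation test;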

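  • recognise ['rekəgnaɪz]
    vt. to acknowledge; to identify;

  • manifestation [ˌmænəfɛˈsteʃən]
    n. expression, display; a demonstration (protest);

  • cull [kʌl]
    vt. to select carefully; to pick out, to weed out; to pick, to gather (flowers);
    n. a culling; something picked out;

  • transcription [trænˈskrɪpʃən]
    n. copying out; a copy or transcript; a recording; a translation;

  • robotic [roʊˈbɑ:tɪk]
    adj. of robots; automatic; robot-like; stiff, mechanical;

  • stilted [ˈstɪltɪd]
    adj. (of movement or speech) stiff, unnatural;

  • flourish [ˈflɜ:rɪʃ]
    vi. to wave about; to grow luxuriantly, to prosper; to be active, to thrive;
    vt. to brandish, to wave;
    n. a wave, a brandishing; a decorative touch;

  • coherent [koʊˈhɪrənt]
    adj. connected; consistent; well-ordered; clear and intelligible;

  • elaborate [ɪˈlæbəret]
    vi. to explain in detail; to become complex;
    vt. to work out in detail; to expound fully;
    adj. carefully made; intricate;

  • retrieve [rɪˈtriv]
    vt. to get back; to restore; [computing] to retrieve (data); to regain;
    vi. to fetch game (of a dog);
    n. retrieval; recovery; [computing] retrieval;

  • mortgage [ˈmɔ:rgɪdʒ]
    n. a pledge of property as security; a mortgage deed; a mortgagee's right, a claim;
    vt. to mortgage;

  • engagingly [ɪn'ɡeɪdʒɪŋlɪ]
    adv. attractively; charmingly, appealingly;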

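  • regulator [ˈrɛɡjəˌletɚ]
    n. an official body or person that supervises an industry (the sense used in the article); one who calibrates; [mechanics] a regulating device, calibrator, governor

  • primitive [ˈprɪmɪtɪv]
    adj. original, primeval; at a low level of development; backward;
    n. a primitive person; an early artist

  • calendar [ˈkæləndɚ]
    n. a calendar; a calendar system; a schedule;
    vt. to enter in a schedule; to enter in a list;

  • trigger [ˈtrɪɡɚ]
    n. (of a gun) trigger; an actuating device, a release lever;
    vt. to set off, to cause; to pull the trigger of

  • overheard [ˌoʊvərˈhɜ:rd]
    v. heard by chance (past tense and past participle of overhear);

  • capture [ˈkæptʃɚ]
    vt. to take prisoner; to seize; to win; to catch (attention, imagination, interest);
    n. a catching; occupation; something captured; [computing] capture;

  • analogous [əˈnæləɡəs]
    adj. similar, comparable;

  • refusal [rɪˈfjuzəl]
    n. a refusal; right of first refusal;

  • specify [ˈspɛsəˌfaɪ]
    vt. to designate; to state in detail; to set as a condition; to give a specific character to;
    vi. to state explicitly, to spell out;

  • intrusion [ɪnˈtruʒən]
    n. a breaking in; a disturbance; interference (with something); meddling;

  • security [səˈkjʊrəti]
    n. safety; a guarantee, a pledge; protection; securities (financial);
    adj. of safety, of security, of secrecy;

  • uniquely [jʊ'ni:klɪ]
    adv. distinctively, solely, rarely;

  • dramatic [drəˈmætɪk]
    adj. striking; of drama, theatrical; exciting;

  • irrelevant [ɪˈrɛləvənt]
    adj. unrelated; not pertinent; out of touch with the times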

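  • minor [ˈmaɪnɚ]
    adj. lesser, smaller, minority; under age;
    n. a person under age; a subsidiary subject of study; a small company;
    vi. [chiefly US] to take as a subsidiary subject;

  • advocate [ˈædvəˌket]
    vt. to promote; to champion; to support; to argue for;
    n. a proponent; a (defence) lawyer; a supporter;

  • justified [ˈdʒʌstəˌfaɪd]
    adj. having good reason, reasonable; warranted;
    v. aligned (of text); shown to be right

  • adopt [əˈdɑ:pt]
    vt. to adopt (a child); to take up, to take on; to accept formally; to approve;

  • extend [ɪkˈstɛnd]
    vt. to stretch out; to enlarge; to spread; to lengthen; to offer; to issue (an invitation, a welcome, etc.);
    vi. to stretch; to reach out; to increase;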

  • leap [lip]
    vi. to jump; to act impulsively;
    vt. to jump over; to cause to jump;
    n. a jump, a leap forward; the distance jumped;

Long and difficult sentences:

The Amazon Echo, a voice-driven cylindrical computer that sits on a table top and answers to the name Alexa, can call up music tracks and radio stations, tell jokes, answer trivia questions and control smart appliances; even before Christmas it was already resident in about 4% of American households.

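Overwhelmed? Don't panic. Let's take the sentence apart. What is the subject? The Amazon Echo. After the comma comes the appositive a voice-driven cylindrical computer, which is still describing the same thing. Now look a little more closely at how the inserted material is layered: that sits on a table top and answers to the name Alexa is a relative clause modifying a voice-driven cylindrical computer. It sits on a table and is called Alexa, much as Apple's voice assistant is called Siri and Microsoft's chatbot is called Xiaoice. Inserted material is usually set off by a pair of commas, so we look for the predicate verb after the second comma. There we find the first predicate, and reading on we see that it is one of a string of coordinated predicates. The main clause is therefore the Amazon Echo can call up ..., tell ..., answer ... and control .... After the semicolon comes a second clause, which is not hard: it says that about 4% of American households already have one. Note the distinctive expression resident in (living in), which makes the device sound as if it were alive.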

Simple though it may seem, voice has the power to transform computing, by providing a natural means of interaction.

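simple though it may seem looks odd to many students because the word order is inverted. though introduces the clause, and though it may seem simple is the normal order that the rest of us find easier to follow.
"Although it seems simple": what do we sense here? A concession. And after a concession, ordinary logic expects a contrast. Some students will object that the sentence contains no contrastive word. This brings in another strategy for translating between Chinese and English: supplying information that the source leaves implicit, so that the translation matches the way the target readers express themselves. That may sound convoluted, but it simply means that when you read you should know that a contrast follows a concession, and even if no contrastive word appears, you should add one when converting between languages: 尽管看似简单,但通过提供一种自然的互动方式,声音拥有变革computing的魔力 ("simple though it may seem, yet by providing a natural means of interaction, voice has the power to transform computing"). Doesn't it read more smoothly once 但 (but) is supplied? In translation this technique is called amplification (增译), and you will meet many more cases of it.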


Windows, icons and menus, and then touchscreens, were welcomed as more intuitive ways to deal with computers than entering complex keyboard commands. But being able to talk to computers abolishes the need for the abstraction of a “user interface” at all.

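The first sentence contains a complex comparative structure. It compares Windows, icons and menus, and then touchscreens with entering complex keyboard commands, and judges the former to be more intuitive ways to deal with computers. The sentence after But is simple.
The passage says that, among ways of interacting with computers, first windows, icons and menus and then touchscreens were welcomed because they were more intuitive than entering complex keyboard commands, whereas speaking does away with the need for the abstraction of a “user interface” altogether. A brief aside on the term user interface: among programmers, designers and product managers in the internet industry it is usually called UI, short for user interface. Broadly, it means the hardware or software through which a user interacts with a computer; narrowly, it means the visible appearance of a piece of software and the underlying parts that interact with the user (the parts you click around in when you use an application).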


Just as mobile phones were more than existing phones without wires, and cars were more than carriages without horses, so computers without screens and keyboards have the potential to be more useful, powerful and ubiquitous than people can imagine today.

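The pattern here is Just as ... so. Quick readers will see at once that what follows just as is a set of examples, given to introduce or support the point made after so. In this sentence: just as mobile phones were far more than phones without wires, and cars were far more than carriages without horses. Then so delivers the point: computers without screens and keyboards have the potential to become more useful, more powerful and more ubiquitous than people can imagine today.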


Computers' new ability to recognise almost anyone's speech dependably without training is the latest manifestation of the power of “deep learning”, an artificial-intelligence technique in which a software system is trained using millions of examples, usually culled from the Internet.

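Computers' new ability is a noun phrase at the head of the sentence, serving as the subject, and the infinitive to recognise that follows it is a postmodifier. The main clause is Computers' new ability is the latest manifestation; the prepositional phrases after manifestation are modifiers. What comes after the comma? An appositive to deep learning, in which AI technique is further modified by a relative clause, so the rest is layer upon layer of modification.

That completes the parse. But some students say: I can see what modifies what, yet I still cannot tell what the sentence means; it feels like a jumble. So here is how to make sense of a long sentence with many modifiers. As noted, everything after the comma explains deep learning, so start with the part before the comma: Computers' new ability to recognise almost anyone's speech dependably without training is the latest manifestation of the power of “deep learning”. This half says two things, and you first need to decide which matters more. That must be the information in the main clause: computers' new ability is the latest manifestation of the power of deep learning. Which new ability? Recognising almost anyone's speech without training. Chinese states the less important part first, which gives 无需训练就能够识别几乎任何人的语言,计算机的新能力是深度学习力量的最新体现. Some will say this still sounds odd. It does, because nothing links the two clauses. A Chinese sentence gives the subject first, then the less important information, and saves the punchline for last, so we adjust to 计算机无需训练即能识别几乎任何人的语言 ("computers can recognise almost anyone's speech without training"). Now the key question: how do we link the two clauses? With a reference back: 这一新能力是深度学习力量的最新体现 ("this new ability is the latest manifestation of the power of deep learning").

Now the second half: an artificial-intelligence technique in which a software system is trained using millions of examples, usually culled from the internet. It contains an appositive and a relative clause, and again the link must be made explicit: 深度学习是一种人工智能技术 ("deep learning is an artificial-intelligence technique", which handles the appositive), 这一技术用通常来自互联网的数百万个范例来训练某一系统 ("this technique trains a system on millions of examples, usually drawn from the internet", where 这一技术 carries the link made by the relative clause).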
