Speech and Language Processing Laboratory
- National Institute of Information and Communications Technology
- Edinburgh University
- Nagoya University
- National Institute of Informatics
- Tokyo Metropolitan University
- Techno-Speech, Inc.
From 1 Apr. 2017
Until 31 Mar. 2021
Prof. Keiichi Tokuda
- Prof. Keiichi Tokuda
- Prof. Akinobu Lee
- Assoc. Prof. Yoshihiko Nankaku
- Assoc. Prof. Daisuke Yamamoto
- Assoc. Prof. Shinji Sako
- Specially Appointed Assoc. Prof. Keiichiro Oura
- Specially Appointed Assoc. Prof. Kei Hashimoto
- Prof. Emeritus Tadashi Kitamura
- Prof. Emeritus Naohisa Takahashi
- Prof. for project of NITech Hisashi Kawai (National Institute of Information and Communications Technology)
- Prof. for project of NITech Steve Renals (Edinburgh University)
- Prof. for project of NITech Simon King (Edinburgh University)
- Assoc. Prof. for project of NITech Junichi Yamagishi (National Institute of Informatics)
- Asst. Prof. for project of NITech Shinji Takaki (National Institute of Informatics)
- Prof. for project of NITech Tomoki Toda (Nagoya University)
- Asst. Prof. for project of NITech Akira Tamamori (Nagoya University)
- Asst. Prof. for project of NITech Sayaka Shiota (Tokyo Metropolitan University)
- Asst. Prof. for project of NITech Kazuhiro Nakamura (Techno-Speech, Inc.)
- Assoc. Prof. for project of NITech Heiga Zen (Techno-Speech, Inc.)
Frameworks for providing services that use GPS information, biological data, and other types of information obtained constantly via mobile devices such as smartphones and via small Internet-connected devices (the Internet of Things, or IoT) have been growing rapidly in recent years. At the same time, collecting information from individual users and analyzing it in aggregate as statistical data is making it possible to acquire new information that could not be obtained from stand-alone data. This newfound information is being used to improve existing services and create new ones on a nearly daily basis.
However, although speech is the most basic means by which humans convey information, practically no trials are currently under way on the continuous collection, integration, and use of speech. A major reason for this is user rejection due to privacy concerns. On the other hand, voice-search services such as Google Voice Search from Google Inc. already record speech when the user performs a voice search, with the aim of improving system performance. Consequently, once the idea that "collecting and integrating voice data can lead to better services" circulates among users and the actual convenience of voice-based services overcomes their hesitance, we can expect the constant collection and integration of all kinds of voice data to become socially acceptable. Under these conditions, it is important that we waste no time in considering how to appropriately use continuously collected voice data and resolve privacy issues. The aim here is to turn a massive amount of continuously amassed voice data into a valuable asset for all of mankind.
Humans and machines always share a common sound environment, which makes it possible to extract not just text but various types of information from speech and to analyze that information to identify diverse actions and phenomena of individuals and society. This should lead to the provision of totally new and diverse voice services such as "voice retouching" and "voice telescope" (see attachment, p. 8, "Research Plans and Execution," Showcase). In this research topic, we give the name "Super Auditory Human" to technology that exceeds the normal voice-information processing ability of humans. This technology will enable the optimal use of the diverse types of voice data that are continuously collected and integrated in large quantities in an "ambient sound environment." Our objective in achieving Super Auditory Human technology is to provide extensive support for human-to-human and human-to-machine voice communication and to dramatically enhance voice communication abilities in human intellectual activities.
Address: (NITech) Keiichi Tokuda