Supported by: TTIJ, TTIC, AIP, AIRC, and Osaka University (logos)
Additional cooperation from: ISM and Tokyo Tech (logos)

Fifth International Workshop on Symbolic-Neural Learning (SNL-2021)

June 29-July 2, 2021
Online

Keynote Talks:

  1. June 29 (Tuesday), JST 9:00-10:00

    In So Kweon (Korea Advanced Institute of Science and Technology)

    Towards Diverse and Robust High-level Scene Understanding

    Abstract:
    High-level scene understanding is the task of describing the content of a scene with a natural sentence, and it is a core technique for a wide range of practical vision & language applications, such as language-based image search or helping visually impaired people. One of the fundamental challenges in high-level scene understanding, especially in image captioning, is the low diversity of the captions generated by the models. In this talk, we explore several possible factors that harm the diversity of caption generation, such as data bias and lack of data. For example, we find that any high-level scene understanding model may suffer from dataset bias due to the complexity of the labels, such as combinations of words. Also, constructing human-labeled datasets for high-level scene understanding frameworks is hugely laborious and time-consuming. We then introduce our solutions to each of these issues step by step, including our novel relational captioning task and semi-supervised image captioning framework, comparing our methods with existing approaches. First, we introduce dense relational image captioning, a new image captioning task that generates multiple captions grounded in the relational information between objects in an image. This framework provides a significantly denser, more diverse, rich, and informative image representation. We introduce several applications of our framework, including "caption graph" generation and sentence-based image region-pair retrieval. Next, we propose a novel framework for training an image captioner with unpaired image-caption data and a small amount of paired data. We also devise a new semi-supervised learning approach based on a novel use of the GAN discriminator. We theoretically and empirically show the effectiveness of our method in various challenging image captioning setups, including our scarcely-paired COCO dataset, compared to strong competing methods.

    Bio:
    Professor In So Kweon received the B.S. and M.S. degrees in Mechanical Design and Production Engineering from Seoul National University, Korea, in 1981 and 1983, respectively, and the Ph.D. degree in Robotics from the Robotics Institute at Carnegie Mellon University in 1990. He worked at the Toshiba R&D Center and joined KAIST in 1992. He is a KEPCO Chair Professor in the School of Electrical Engineering and was the director of the National Core Research Center - P3 DigiCar Center at KAIST (2010-2017). His research focuses on computer vision and robotics. He has published 3 research books and more than 500 papers in leading journals and conference proceedings, including 100+ in prestigious venues such as CVPR, ICCV, and ECCV. He is also active in professional service. Currently, he is the President of the Asian Federation of Computer Vision (AFCV). He served on the Editorial Board of the International Journal of Computer Vision for ten years, beginning in 2005. He has also organized 5 international conferences as either a general chair or a program chair, including IEEE-CVF ICCV 2019. He was awarded the Grand Prize for Academic Excellence in celebration of the 50th anniversary of the university's founding in 2021 and the Faculty Research Excellence Award by KAIST in 2016, and was conferred a Prime Minister Award by the Korean Government for his contribution to DRC-HUBO+, which won the DARPA Robotics Challenge Finals in 2015.

  2. June 30 (Wednesday), JST 18:00-19:00

    Jure Leskovec (Stanford University)

    Reasoning with Language and Knowledge Graphs

    Abstract:
    Knowledge can be implicitly encoded in large language models pre-trained on unstructured text or explicitly represented in structured knowledge graphs (KGs), such as Freebase or ConceptNet, where entities are represented as nodes and relations between them as edges. Language models have broad coverage of knowledge, but they do not perform well on structured reasoning tasks. KGs, on the other hand, are better suited for structured reasoning, but may lack coverage and be noisy. In this talk, I will discuss recent advances in combining the strengths of language models and knowledge graphs for common-sense question answering as well as complex logical reasoning over knowledge graphs.

    Bio:
    Jure Leskovec is an Associate Professor of Computer Science at Stanford University, Chief Scientist at Pinterest, and an investigator at the Chan Zuckerberg Biohub. Dr. Leskovec was a co-founder of the machine learning startup Kosei, which was later acquired by Pinterest. His research focuses on machine learning and data mining of large social, information, and biological networks. Computation over massive data is at the heart of his research and has applications in computer science, social sciences, marketing, and biomedicine. This research has won several awards, including a Lagrange Prize, a Microsoft Research Faculty Fellowship, the Alfred P. Sloan Fellowship, and numerous best paper and test-of-time awards. It has also been featured in popular press outlets such as the New York Times and the Wall Street Journal. Leskovec received his bachelor's degree in computer science from the University of Ljubljana, Slovenia, his PhD in machine learning from Carnegie Mellon University, and his postdoctoral training at Cornell University.

  3. July 1 (Thursday), JST 9:00-10:00

    Kyunghyun Cho (New York University)

    Prompt Engineering in GPT-3 - Can we actually do it?

    Abstract:
    Pretrained language models (LMs) perform well on many tasks even when learning from a few examples, but prior work uses many held-out examples to tune various aspects of learning, such as hyperparameters, training objectives, and natural language templates ("prompts"). Here, we evaluate the few-shot ability of LMs when such held-out examples are unavailable, a setting we call true few-shot learning. We test two model selection criteria, cross-validation and minimum description length, for choosing LM prompts and hyperparameters in the true few-shot setting. On average, both marginally outperform random selection and greatly underperform selection based on held-out examples. Moreover, selection criteria often prefer models that perform significantly worse than randomly-selected ones. We find similar results even when taking into account our uncertainty in a model's true performance during selection, as well as when varying the amount of computation and number of examples used for selection. Overall, our findings suggest that prior work significantly overestimated the true few-shot ability of LMs given the difficulty of few-shot model selection.
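
    For illustration, here is a minimal sketch of cross-validation for prompt selection in the true few-shot setting described above: only the few labeled examples themselves are used to choose among candidate prompts. The scorer lm_accuracy is a hypothetical stand-in for querying a real language model, and the templates are illustrative, not those used in the talk.

      import random

      def lm_accuracy(prompt_template, train_examples, eval_examples):
          # Hypothetical stand-in: in practice, format train_examples with the prompt,
          # query an LM, and score its predictions on eval_examples. Here: a random stub.
          return random.random()

      def cross_validated_prompt(prompts, examples, n_folds=4):
          # Choose the prompt with the best average fold score, using only the few-shot
          # examples themselves (no extra held-out set), as in "true few-shot" selection.
          fold_size = max(1, len(examples) // n_folds)
          folds = [examples[i:i + fold_size] for i in range(0, len(examples), fold_size)]
          best_prompt, best_score = None, float("-inf")
          for prompt in prompts:
              scores = []
              for i, held_out in enumerate(folds):
                  train = [ex for j, f in enumerate(folds) if j != i for ex in f]
                  scores.append(lm_accuracy(prompt, train, held_out))
              mean = sum(scores) / len(scores)
              if mean > best_score:
                  best_prompt, best_score = prompt, mean
          return best_prompt

      examples = [("great movie", "positive"), ("terrible plot", "negative"),
                  ("loved it", "positive"), ("waste of time", "negative")]
      prompts = ["Review: {} Sentiment:", "{} All in all, it was", "Text: {} Label:"]
      print(cross_validated_prompt(prompts, examples))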

    Bio:
    Kyunghyun Cho is an associate professor of computer science and data science at New York University and a CIFAR Fellow of Learning in Machines & Brains. He was a research scientist at Facebook AI Research from June 2017 to May 2020 and a postdoctoral fellow at the University of Montreal until summer 2015 under the supervision of Prof. Yoshua Bengio, after receiving MSc and PhD degrees from Aalto University in April 2011 and April 2014, respectively, under the supervision of Prof. Juha Karhunen, Dr. Tapani Raiko and Dr. Alexander Ilin. He tries his best to find a balance among machine learning, natural language processing, and life, but almost always fails to do so.

  4. July 2 (Friday), JST 9:00-10:00

    Abhinav Gupta (Carnegie Mellon University/Facebook AI Research)

    Towards Self-supervised Curious Robots

    Abstract:
    In the last decade, we have made significant advances in the field of computer vision thanks to supervised learning. But this passive supervision of our models has now become our biggest bottleneck. In this talk, I will discuss our efforts towards scaling up and empowering visual and robotic learning via self-supervised learning. First, I will describe how self-supervised learning can be used for passive visual representation learning, covering some of the pitfalls of recent approaches and ways to overcome them. Next, I will discuss how embodiment is crucial for self-supervised learning -- our agents live in the physical world and need the ability to interact with it. Towards this goal, I will present our efforts in large-scale learning of embodied agents in robotics. I will then discuss how we can move from passive supervision to active exploration -- the ability of agents to create their own training data. One critical aspect of human learning is the ability to build and reuse knowledge. Inspired by this, I will finally present an approach that uses knowledge graphs to learn to generalize in task-oriented grasping.

    Bio:
    Abhinav Gupta is an Associate Professor at the Robotics Institute, Carnegie Mellon University and a Research Manager at Facebook AI Research (FAIR). Abhinav's research focuses on scaling up learning by building self-supervised, lifelong and interactive learning systems. Specifically, he is interested in how self-supervised systems can effectively use data to learn visual representations, common sense, and representations for actions in robots. Abhinav is a recipient of several awards including the IAPR 2020 J.K. Aggarwal Prize, the ONR Young Investigator Award, the PAMI Young Researcher Award, a Sloan Research Fellowship, an Okawa Foundation Grant, a Bosch Young Faculty Fellowship, a YPO Fellowship, an IJCAI Early Career Spotlight, the ICRA Best Student Paper Award, and the ECCV Best Paper Runner-up Award. His research has also been featured in Newsweek, BBC, Wall Street Journal, Wired and Slashdot.

Invited Talks:

  1. June 29 (Tuesday), JST 10:00-10:30

    Hirokatsu Kataoka (National Institute of Advanced Industrial Science and Technology)

    Pre-training without Natural Images

    Abstract:
    Is it possible to use convolutional neural networks pre-trained without any natural images to assist natural image understanding? This presentation introduces a novel concept, Formula-driven Supervised Learning. We automatically generate image patterns and their category labels by assigning fractals, which are based on a natural law existing in the background knowledge of the real world. In principle, using automatically generated images instead of natural images in the pre-training phase allows us to generate an infinitely scalable dataset of labeled images. Although the models pre-trained with the proposed Fractal DataBase (FractalDB), a database without natural images, do not necessarily outperform models pre-trained with human-annotated datasets in all settings, we are able to partially surpass the accuracy of ImageNet/Places pre-trained models. The image representation learned with the proposed FractalDB also captures unique features in the visualization of convolutional layers and attentions.
    HP1: https://hirokatsukataoka16.github.io/Pretraining-without-Natural-Images/
    HP2: https://hirokatsukataoka16.github.io/Vision-Transformers-without-Natural-Images/
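
    As a concrete illustration of the formula-driven idea, the sketch below renders a fractal image from an iterated function system (IFS) and treats the parameter set as its category label. The parameters and image size are illustrative assumptions, not the actual FractalDB configuration.

      import random
      import numpy as np

      def render_ifs(maps, n_points=50_000, size=64):
          # Render a binary image of an IFS attractor via the "chaos game"; each map is
          # (a, b, c, d, e, f), applied as (x, y) -> (a*x + b*y + e, c*x + d*y + f).
          x, y = 0.0, 0.0
          pts = []
          for i in range(n_points):
              a, b, c, d, e, f = random.choice(maps)
              x, y = a * x + b * y + e, c * x + d * y + f
              if i > 20:                       # skip burn-in before reaching the attractor
                  pts.append((x, y))
          pts = np.array(pts)
          mins, maxs = pts.min(axis=0), pts.max(axis=0)
          scaled = (pts - mins) / np.maximum(maxs - mins, 1e-8) * (size - 1)
          img = np.zeros((size, size), dtype=np.uint8)
          img[scaled[:, 1].astype(int), scaled[:, 0].astype(int)] = 255
          return img

      # One "category": the Sierpinski triangle IFS. A pre-training dataset would sample
      # many such parameter sets and use each set's index as its class label.
      sierpinski = [(0.5, 0.0, 0.0, 0.5, 0.0, 0.0),
                    (0.5, 0.0, 0.0, 0.5, 0.5, 0.0),
                    (0.5, 0.0, 0.0, 0.5, 0.25, 0.5)]
      print(render_ifs(sierpinski).sum())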

    Bio:
    Hirokatsu Kataoka received his Ph.D. in engineering from Keio University in 2014. He is a Research Scientist at the National Institute of Advanced Industrial Science and Technology (AIST). His research interests include computer vision and pattern recognition, especially large-scale datasets for image and video recognition. He has received the ACCV 2020 Best Paper Honorable Mention Award, the AIST 2019 Best Paper Award, and the ECCV 2016 Workshop Brave New Idea Award.

  2. June 29 (Tuesday), JST 10:30-11:00

    Yasushi Makihara (Osaka University)

    On Gait Relative Attributes

    Abstract:
    Gait is considered one of the behavioral biometric modalities, and it is available even at a distance from a camera without subject cooperation. We can perceive a variety of information from gait: identity, age, gender, emotion, situation (e.g., nervous or relaxed), health status, and aesthetic attributes (e.g., beautiful, graceful, imposing). Of these, human perception-based aesthetic attributes are important, because people who pay attention to their fashion style and body shape may also pay attention to their gaits, i.e., whether their gaits look nice or not. In this talk, I'll give a brief introduction to video-based gait analysis and then move on to our framework for human perception-based gait attribute estimation. Specifically, because sample-wise annotation of aesthetic gait attributes (e.g., five-level evaluations) is generally difficult and tends to be inconsistent, we employ a relative annotation framework in which each annotator is shown a pair of gait videos and gives a relative score (i.e., the first one is better, they are similar, or the second one is better). We then design a Siamese-type deep neural network in which each stream outputs the aesthetic gait attribute, together with a loss function called the signed quadratic contrastive loss suited for learning from relative annotations. Experiments with our own gait relative attribute datasets show that machines may indeed be able to perceive aesthetic gait attributes.
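
    A minimal sketch of this kind of pairwise setup is shown below: a weight-sharing scorer maps each gait feature vector to a scalar attribute score and is trained from relative labels. The loss here is a generic margin ranking loss plus a similarity term, standing in for the signed quadratic contrastive loss of the talk, and the feature dimension and architecture are illustrative assumptions.

      import torch
      import torch.nn as nn

      class GaitScorer(nn.Module):
          # Maps a gait feature vector (e.g., a flattened gait energy image) to a scalar
          # aesthetic score; the same weights are applied to both videos of a pair.
          def __init__(self, in_dim=256):
              super().__init__()
              self.net = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, 1))

          def forward(self, x):
              return self.net(x).squeeze(-1)

      scorer = GaitScorer()
      opt = torch.optim.Adam(scorer.parameters(), lr=1e-3)
      rank_loss = nn.MarginRankingLoss(margin=0.5)

      # Dummy batch of feature pairs with relative labels:
      # +1 = first looks better, -1 = second looks better, 0 = similar.
      xa, xb = torch.randn(8, 256), torch.randn(8, 256)
      label = torch.tensor([1, -1, 1, 0, -1, 1, 0, -1], dtype=torch.float)

      sa, sb = scorer(xa), scorer(xb)
      ordered = label != 0
      loss = rank_loss(sa[ordered], sb[ordered], label[ordered])
      loss = loss + ((sa[~ordered] - sb[~ordered]) ** 2).mean()  # pull "similar" pairs together
      opt.zero_grad()
      loss.backward()
      opt.step()
      print(float(loss))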

    Bio:
    Yasushi Makihara received the B.S., M.S., and Ph.D. degrees in Engineering from Osaka University in 2001, 2002, and 2005, respectively. He was appointed as a specially appointed assistant professor (full-time), an assistant professor, and an associate professor at The Institute of Scientific and Industrial Research, Osaka University, in 2005, 2006, and 2014, respectively. He is currently a professor at the Institute for Advanced Co-Creation Studies, Osaka University. His research interests are computer vision, pattern recognition, and image processing, including gait recognition, pedestrian detection, morphing, and temporal super-resolution. He is a member of IPSJ, IEICE, RSJ, and JSME. He has received several honors and awards, including the IAPR Best Paper Award at the 2nd Int. Workshop on Biometrics and Forensics (IWBF 2014), the Honorable Mention Paper Award at the 9th IAPR Int. Conf. on Biometrics (ICB 2016), Outstanding Reviewer recognition at the 28th British Machine Vision Conf. (BMVC 2017), the 11th IEEE Int. Conf. on Automatic Face and Gesture Recognition (FG 2015), and the 30th IEEE Conf. on Computer Vision and Pattern Recognition (CVPR 2017), and the Commendation for Science and Technology by the Minister of Education, Culture, Sports, Science and Technology, Prizes for Science and Technology, Research Category, in 2014. He has served as an associate editor-in-chief of IEICE Trans. on Information and Systems, an associate editor of IPSJ Transactions on Computer Vision and Applications (CVA), a program co-chair of the 4th Asian Conf. on Pattern Recognition (ACPR 2017), an area chair of ICCV 2019, CVPR 2020, and ECCV 2020, and a reviewer for journals such as T-PAMI, T-IP, T-CSVT, T-IFS, IJCV, and Pattern Recognition, and for international conferences such as CVPR, ICCV, ECCV, ACCV, ICPR, FG, etc.

  3. June 29 (Tuesday), JST 11:15-11:45

    Michael Maire (University of Chicago/adjoint at TTIC)

    Unifying Memory, Perception, and Attention with Convolutional Neural Architectures

    Abstract:
    I will describe our recent work on endowing neural networks with long-term, large-scale memory. Distinct from strategies that connect neural networks to external memory banks via intricately crafted controllers and hand-designed attention mechanisms, our memory is internal, distributed, co-located alongside computation, and implicitly addressed, while being drastically simpler than prior efforts. This simplicity is achieved through an architectural design principle -- multigrid structure and connectivity -- that grants convolutional neural networks an emergent capacity for attentional behavior. Experiments demonstrate these networks are capable of mastering a range of synthetic tasks that require perception alongside sequential reasoning.

    Joint work with Tri Huynh and Matthew R. Walter.

    Bio:
    Michael Maire is an assistant professor in the Department of Computer Science at the University of Chicago. He was previously a research assistant professor at the Toyota Technological Institute at Chicago (TTIC), where he maintains a courtesy appointment. Prior to TTIC, he was a postdoctoral scholar at the California Institute of Technology. He received a PhD in computer science from the University of California, Berkeley in 2009. His research interests span computer vision, with emphasis on perceptual organization and object recognition, and deep learning, with focus on neural network architectures and optimization.

  4. June 30 (Wednesday), JST 19:00-19:30

    Hideki Nakayama (University of Tokyo/AIRC)

    Efficient Training of Neural Module Networks and Applications

    Abstract:
    Although deep neural networks (DNNs) have dramatically advanced modern AI systems, they have many known drawbacks, such as the need for huge datasets and the lack of explainability and fairness. To mitigate these problems, the fusion of DNNs and top-down algorithms (i.e., rule-based modules) has been regarded as a promising approach in recent years. However, such modules often involve discrete operations and are not differentiable, making it non-trivial to integrate them with standard DNNs. In this talk, I will introduce our recent work on training such hybrid networks of deterministic modules and DNNs, known as Neural Module Networks, and their applications to visual reasoning and data augmentation.
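
    The non-differentiability mentioned above is the crux of training such hybrids. As a generic illustration (not necessarily the method used in this talk), the sketch below uses a straight-through estimator: the forward pass makes a hard, discrete module selection, while gradients flow through the soft relaxation.

      import torch

      def straight_through_onehot(logits):
          # Forward: hard one-hot selection; backward: gradient of the softmax relaxation.
          soft = torch.softmax(logits, dim=-1)
          index = soft.argmax(dim=-1, keepdim=True)
          hard = torch.zeros_like(soft).scatter_(-1, index, 1.0)
          return hard + (soft - soft.detach())

      logits = torch.randn(2, 4, requires_grad=True)    # scores over 4 candidate modules
      selection = straight_through_onehot(logits)       # discrete choice, yet differentiable
      module_outputs = torch.randn(2, 4, 8)             # dummy outputs of each module
      mixed = (selection.unsqueeze(-1) * module_outputs).sum(dim=1)
      mixed.sum().backward()
      print(logits.grad)                                # gradients reach the selector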

    Bio:
    Hideki Nakayama received the master's and Ph.D. degrees in information science from the University of Tokyo, Japan, in 2008 and 2011, respectively. From 2012 to 2018, he was an assistant professor at the Graduate School of Information Science and Technology, the University of Tokyo, where he has been an associate professor since April 2018. He is also an affiliated faculty member of the International Research Center for Neurointelligence (IRCN) and a visiting researcher at the National Institute of Advanced Industrial Science and Technology (AIST). His research interests include generic image recognition, natural language processing, and deep learning.

  5. June 30 (Wednesday), JST TBD

    Ikuro Sato (Tokyo Institute of Technology / Denso IT Laboratory)

    Classifier Anonymization Technique for Co-Adaptation Breaking

    Abstract:
    This talk shares recent progress from our group on breaking inter-layer co-adaptation by classifier anonymization [I. Sato+, ICML 2019]. Co-adaptation in a neural network is known to potentially create situations where neurons are tied together in a very specific way to carry information, so that the distribution of the representations becomes excessively complex. In the talk, we introduce our proposed algorithm for circumventing this unwanted co-adaptation and then present our theoretical and empirical studies.

    Bio:
    Ikuro Sato obtained his Ph.D. in physics from the University of Maryland, USA, in 2005. After a postdoctoral period at Lawrence Berkeley National Laboratory, USA, he joined the R&D group of Denso IT Laboratory, Japan. He has been working on computer vision and machine learning for autonomous applications. In 2020, he was cross-appointed to the Tokyo Institute of Technology as a specially appointed associate professor.

  6. June 30 (Wednesday), JST 21:00-21:30

    Steve Hanneke (Toyota Technological Institute at Chicago)

    A Theory of Adversarially Robust Learning

    Abstract:
    It is known that many learning algorithms are unstable, in the sense that even if they are correct on a given test example, an adversary can change the learner's prediction by perturbing the example an imperceptible amount. There has recently been a tremendous amount of effort to design learning algorithms that are robust to such adversarial perturbations. Our work on the subject studies the problem from an abstract theoretical perspective. In particular, we argue that the natural loss-minimization approach, known as "adversarial training", can fail spectacularly, even for very simple concept classes. However, approaching the problem from a different perspective, not relying on uniform convergence of loss estimates, we propose a new learning algorithm that is provably robust to such adversarial attacks. Joint work with Omar Montasser and Nathan Srebro.
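
    To make the setting concrete, the sketch below shows the standard "adversarial training" baseline that the abstract refers to: minimize the loss on worst-case perturbations within a small epsilon ball, here approximated by a single FGSM step. The model and data are illustrative, and this is the approach the talk argues can fail, not the proposed algorithm.

      import torch
      import torch.nn as nn

      model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))
      opt = torch.optim.SGD(model.parameters(), lr=0.1)
      loss_fn = nn.CrossEntropyLoss()
      eps = 0.1                                 # perturbation budget

      x = torch.randn(16, 20)
      y = torch.randint(0, 2, (16,))

      # Inner step: craft an adversarial perturbation inside the epsilon ball (one FGSM step).
      x_adv = x.clone().requires_grad_(True)
      loss_fn(model(x_adv), y).backward()
      with torch.no_grad():
          x_adv = x + eps * x_adv.grad.sign()

      # Outer step: update the model on the perturbed examples.
      opt.zero_grad()
      loss = loss_fn(model(x_adv), y)
      loss.backward()
      opt.step()
      print(float(loss))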

    Bio:
    Steve Hanneke is a Research Assistant Professor at the Toyota Technological Institute at Chicago. His research explores the theory of machine learning, with a focus on reducing the number of training examples sufficient for learning. His work develops new approaches to supervised, semi-supervised, active, and transfer learning, and also revisits the basic probabilistic assumptions at the foundation of learning theory. Steve earned a Bachelor of Science degree in Computer Science from UIUC in 2005 and a Ph.D. in Machine Learning from Carnegie Mellon University in 2009 with a dissertation on the theoretical foundations of active learning.

  7. June 30 (Wednesday), JST 21:30-22:00

    Tadahiro Taniguchi (Ritsumeikan University)

    Symbol Emergence in Robotics: Probabilistic Generative Models for Realizing Real-world Cognition

    Abstract:
    Symbol emergence in robotics aims to develop a robot that can adapt to the real-world environment and human linguistic communication, and acquire language from sensorimotor information alone, i.e., in an unsupervised manner. This line of study is essential not only for creating robots that can collaborate with people through human-robot interaction but also for understanding human cognitive development. This invited lecture introduces recent developments in integrative probabilistic generative models for language learning, e.g., spatial concept formation with simultaneous localization and mapping, and the vision of symbol emergence in robotics. I will also introduce challenges related to the integration of probabilistic generative models and deep learning for language learning by robots.

    Bio:
    Tadahiro Taniguchi received the ME and Ph.D. degrees from Kyoto University in 2003 and 2006, respectively. From April 2008 to March 2010, he was an Assistant Professor at the Department of Human and Computer Intelligence, Ritsumeikan University. From April 2010 to March 2017, he was an Associate Professor in the same department. From September 2015 to September 2016, he was a Visiting Associate Professor at the Department of Electrical and Electronic Engineering, Imperial College London. Since April 2017, he has been a Professor at the Department of Information Science and Engineering, Ritsumeikan University, and a visiting general chief scientist in the Technology Division of Panasonic. He has been engaged in research on AI, symbol emergence in robotics, machine learning, and cognitive science.

  8. July 1 (Thursday), JST 10:00-10:30

    Graham Neubig (Carnegie Mellon University)

    How Can We Know What and When Language Models Know?

    Abstract:
    One recent remarkable finding in natural language processing is that by training a model to simply predict words in a sentence, language models can learn a significant amount of world knowledge that would traditionally be expressed by symbolic knowledge bases. In this presentation, I will present research regarding two questions. First: how can we most effectively elicit this knowledge from language models by designing textual prompts that allow the model to predict particular facts? Second: how can we best know when these predictions are accurate, and when they are no better than a random guess? I will also try to discuss the potential of this rapidly growing research paradigm, and point to some open research questions in the future.
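
    As a small, self-contained illustration of this kind of prompt-based probing (not the specific systems discussed in the talk), the snippet below fills cloze-style templates with a masked language model via the Hugging Face transformers library and reports the model's confidence, which can serve as a rough signal of when to trust an answer. The templates and model choice are illustrative.

      from transformers import pipeline

      fill = pipeline("fill-mask", model="bert-base-uncased")

      prompts = [
          "Dante was born in [MASK].",           # template A
          "The birthplace of Dante is [MASK].",  # template B: same fact, different wording
      ]
      for prompt in prompts:
          candidates = fill(prompt, top_k=3)
          answers = ", ".join(f"{c['token_str']} ({c['score']:.2f})" for c in candidates)
          print(f"{prompt} -> {answers}")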

    Bio:
    Graham Neubig is an associate professor at the Language Technologies Institute of Carnegie Mellon University. His work focuses on natural language processing, specifically multi-lingual models that work in many different languages, and natural language interfaces that allow humans to communicate with computers in their own language. Much of this work relies on machine learning, and he is also active in developing methods and algorithms for machine learning over natural language data. He publishes regularly in the top venues in natural language processing, machine learning, and speech, and his work has won awards at EMNLP 2016, EACL 2017, and NAACL 2019.

  9. July 1 (Thursday), JST 10:30-11:00

    Chi-Chun Lee (National Tsing Hua University)

    Deep Learning Methods for Speech Emotion Recognition

    Abstract:
    In this talk, we will share our recent work on using deep learning methods to address the "low resource" issue in speech emotion recognition (SER). The "low resource" issue centers around two main components: high individual variability and scarcity of emotion labels. The natural emotion-related individual idiosyncrasy makes it practically infeasible to capture all relevant information about a subject when performing SER, and the range of possible emotion states that occur in the real world makes it impractical to collect labeled data for each and every scenario. We will share our series of research efforts in tackling these two problems using various neural learning building blocks, e.g., attention, memory, and graphs, to advance state-of-the-art speech emotion modeling accuracy.
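
    As a toy illustration of one of these building blocks (attention), the sketch below pools frame-level acoustic features into an utterance-level representation for emotion classification. The feature dimensions and four-class label set are illustrative assumptions, not the models presented in the talk.

      import torch
      import torch.nn as nn

      class AttentiveSER(nn.Module):
          def __init__(self, feat_dim=40, hidden=64, n_emotions=4):
              super().__init__()
              self.encoder = nn.GRU(feat_dim, hidden, batch_first=True, bidirectional=True)
              self.attn = nn.Linear(2 * hidden, 1)      # scores each frame
              self.classifier = nn.Linear(2 * hidden, n_emotions)

          def forward(self, frames):                    # frames: (batch, time, feat_dim)
              h, _ = self.encoder(frames)               # (batch, time, 2*hidden)
              weights = torch.softmax(self.attn(h), dim=1)
              utterance = (weights * h).sum(dim=1)      # weighted pooling over time
              return self.classifier(utterance)

      model = AttentiveSER()
      logits = model(torch.randn(2, 300, 40))           # two utterances of 300 frames
      print(logits.shape)                               # -> torch.Size([2, 4])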

    Bio:
    Chi-Chun Lee (Jeremy) is an Associate Professor at the Department of Electrical Engineering, with a joint appointment at the Institute of Communication Engineering, of National Tsing Hua University (NTHU), Taiwan. He received his B.S. and Ph.D. degrees in Electrical Engineering from the University of Southern California (USC), USA, in 2007 and 2012, respectively. His research interests are in speech and language, affective multimedia, health analytics, and behavior computing. He is the recipient of the Foundation of Outstanding Scholar's Young Innovator Award (2020), the CIEE Outstanding Young Electrical Engineer Award (2020), the IICM K. T. Li Young Researcher Award (2020), and the MOST Futuretek Breakthrough Award (2018, 2019). He is an IEEE senior member and a member of Tau Beta Pi and Eta Kappa Nu. He has served as an associate editor for IEEE Transactions on Multimedia and IEEE Transactions on Affective Computing and as a guest editor for the Journal of Computer Speech and Language. With his students, he has received multiple best student paper awards at INTERSPEECH and IEEE EMBC, and has twice won the INTERSPEECH paralinguistic challenges. He is the PI of the MOST AI Innovation Grant and multiple industry projects. His lab's research has been featured in Discovery, Business Today, Technews, Digitimes, and several major news outlets in Taiwan. An extended version of his biography can be found at https://biic.ee.nthu.edu.tw/cclee.php.

  10. July 1 (Thursday), JST 11:15-11:45

    Yutaka Sasaki (Toyota Technological Institute)

    Knowledge Acquisition from Materials Science Literature

    Abstract:
    This talk is about deep learning-based knowledge acquisition from materials science literature. Unlike Bioinformatics, the Materials Informatics field has only a limited variety of annotated corpora. In this respect, we have constructed a new corpus named SC-CoMIcs (SuperConductivity Corpus for Materials Informatics), which is tailored to extracting superconducting-materials information from abstracts. This talk will give an overview of the corpus and present experimental results on Named Entity Recognition, Main Material Identification, and Relation/Event Extraction to assess the effectiveness of the corpus. We also demonstrate that the extracted doping information is consistent with the Hume-Rothery empirical rules, which implies that the corpus can provide a chance to re-discover or propose physical-chemical rules from the literature.

    Bio:
    Yutaka Sasaki received the B.E., M.Eng., and Ph.D. in Engineering from the University of Tsukuba, Japan, in 1986, 1988, and 2000, respectively. In 1988, he joined the NTT Laboratories. In June 2004, he moved to ATR Spoken Language Translation Research Laboratories. From 2006, he was with NaCTeM/University of Manchester, UK. Since 2009, he has been a professor at the Toyota Technological Institute, Nagoya, Japan. He is also an adjoint professor at the Toyota Technological Institute at Chicago. His research interests include Machine Learning-based Knowledge Processing and Natural Language Processing and their applications to Scientific/Engineering Informatics.

  11. July 2 (Friday), JST 10:00-10:30

    Kazuya Ueki (Meisei University)

    A Brief History of Zero-Shot Multi-Modal Image Retrieval

    Abstract:
    Visual-semantic embedding is a very interesting research topic because it is useful for various tasks such as visual question answering (VQA), image-text retrieval, image captioning, and scene graph generation. Therefore, a great number of methods have been proposed in the past few years. In this talk, we will present a brief history of technology trends, focusing on zero-shot image retrieval using sentences as queries.
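
    For context, the core retrieval step behind visual-semantic embedding methods can be sketched as below: embed the query sentence and all images into a shared space and rank images by cosine similarity. The encoders here are hypothetical stand-ins (random vectors), not any specific model from the talk.

      import numpy as np

      rng = np.random.default_rng(0)
      embed_dim = 128

      def encode_text(sentence):
          # Hypothetical stand-in for a trained text encoder.
          local = np.random.default_rng(abs(hash(sentence)) % (2**32))
          return local.standard_normal(embed_dim)

      image_embeddings = rng.standard_normal((1000, embed_dim))  # stand-in image encoder output

      def retrieve(query, image_embs, k=5):
          q = encode_text(query)
          q = q / np.linalg.norm(q)
          imgs = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
          scores = imgs @ q                  # cosine similarity in the joint space
          return np.argsort(-scores)[:k]     # indices of the top-k images

      print(retrieve("a dog catching a frisbee on the beach", image_embeddings))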

    Bio:
    Kazuya Ueki received a B.S. in Information Engineering in 1997 and an M.S. from the Department of Computer and Mathematical Sciences in 1999, both from Tohoku University, Sendai, Japan. In 1999, he joined NEC Soft, Ltd., Tokyo, Japan, where he was mainly engaged in research on face recognition. In 2007, he received a Ph.D. from the Graduate School of Science and Engineering, Waseda University, Tokyo, Japan. In 2013, he became an assistant professor at Waseda University. He is currently an associate professor in the School of Information Science, Meisei University. His current research interests include pattern recognition, video retrieval, character recognition, and semantic segmentation. He works on the video retrieval evaluation benchmark (TRECVID) sponsored by the National Institute of Standards and Technology (NIST) and contributes to the development of video retrieval technology. In 2016 and 2017, his submitted system achieved the highest performance in the TRECVID AVS task for the second consecutive year.

  12. July 2 (Friday), JST 10:30-11:00

    Tomoki Toda (Nagoya University)

    Interactive Voice Conversion for Augmented Speech Production

    Abstract:
    Voice conversion is a technique for modifying speech waveforms to convert non-/para-linguistic information into any form we want while preserving linguistic content. It has improved dramatically thanks to significant progress in deep learning techniques, expanding the possibility of developing various applications beyond traditional speaker conversion. In this talk, I will present interactive voice conversion as one of the new voice conversion paradigms, together with its application to augmented speech production, which makes it possible to produce speech as we want, beyond physical constraints.

    Bio:
    Tomoki Toda is a Professor of the Information Technology Center at Nagoya University, Japan. He received the B.E. degree from Nagoya University, Japan, in 1999, and the D.E. degree from the Nara Institute of Science and Technology, Japan, in 2003. His research interests include statistical approaches to speech, music, and environmental sound processing. He received the IEEE SPS 2009 Young Author Best Paper Award and the 2013 EURASIP-ISCA Best Paper Award (Speech Communication Journal).

  13. July 2 (Friday), JST 11:15-11:45

    Audrey Sedal (Toyota Technological Institute at Chicago)

    Soft Robot Design and Morphological Intelligence

    Abstract:
    Continuous, compliant robots are morphologically intelligent: their inherent compliance helps them deflect around obstacles, handle delicate objects, and match impedance with their environments. Yet, soft robots are difficult to deploy because they do not fit into traditional frameworks for robot design, control and measurement. I will present recent research that aims to describe morphologically intelligent soft robot behavior while connecting it with design and deployment. First, I will discuss modular design of soft robots to perform sequential behavior. Then, I will discuss mechanical memory in soft structures. The work presented here will form the basis for design and control frameworks in soft robotics that create intelligent and useful devices.

    Bio:
    Audrey Sedal is a Research Assistant Professor at the Toyota Technological Institute at Chicago (TTIC). She is interested in designing and controlling continuous, compliant robotic mechanisms for use in manipulation, exploration, and human assistance. She completed her PhD in Mechanical Engineering at the University of Michigan-Ann Arbor in 2020 and her undergraduate studies at MIT in 2015. In 2019, Audrey was named a Rising Star in Mechanical Engineering.