Keynote Speakers (listed in pinyin order by surname)

高新波 (Xinbo Gao)

Xidian University

张正友 (Zhengyou Zhang)

Tencent

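Invited Speakers (listed in pinyin order by surname)

纪荣嵘 (Rongrong Ji)

Xiamen University

Dahua Lin

Chinese University of Hong Kong

Hongdong Li

Australian National University

林宙辰 (Zhouchen Lin)

Peking University

欧阳万里 (Wanli Ouyang)

University of Sydney

吴飞 (Fei Wu)

Zhejiang University

吴建鑫 (Jianxin Wu)

Nanjing University

杨铭 (Ming Yang)

Co-founder and VP of Software, Horizon Robotics

Kevin Zhou

Siemens Healthcare Technology

朱军 (Jun Zhu)

Tsinghua University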

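Invited Annual Progress Report Speakers

李玺 (Xi Li)

Zhejiang University

代季峰 (Jifeng Dai)

Microsoft Research Asia

乔宇 (Yu Qiao)

Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences

张兆翔 (Zhaoxiang Zhang)

Institute of Automation, Chinese Academy of Sciences

付彦伟 (Yanwei Fu)

Fudan University

齐国君 (Guo-Jun Qi)

University of Central Florida

夏桂松 (Gui-Song Xia)

Wuhan University

白翔 (Xiang Bai)

Huazhong University of Science and Technology

程健 (Jian Cheng)

Institute of Automation, Chinese Academy of Sciences

黄圣君 (Sheng-Jun Huang)

Nanjing University of Aeronautics and Astronautics

吴毅红 (Yihong Wu)

Institute of Automation, Chinese Academy of Sciences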

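Tutorial Instructors

冯佳时 (Jiashi Feng)

National University of Singapore

左旺孟 (Wangmeng Zuo)

Harbin Institute of Technology

章国锋 (Guofeng Zhang)

Zhejiang University

沈劭劼 (Shaojie Shen)

Hong Kong University of Science and Technology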

Speaker Introductions

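  高新波 (Xinbo Gao) Xidian University

Title: Heterogeneous Image Synthesis and Recognition

Abstract: Using heterogeneous face image synthesis and recognition as an example, this talk discusses problems and reflections related to human-machine hybrid intelligence and cross-media intelligence. It systematically summarizes our team's work over the past decade on heterogeneous face image recognition, that is, cross-media recognition between sketch portraits and grayscale images. The work includes methods based on heterogeneous image synthesis and direct cross-media recognition methods, as well as methods based on probabilistic graphical models and methods based on deep learning. These methods can be extended to the transformation and recognition of other heterogeneous images, such as near-infrared and visible-light images, low-resolution and high-resolution images, and even CT and MR images. They have broad application prospects in criminal investigation, online pursuit of fugitives, animation design, and medical diagnosis.

Speaker bio: Xinbo Gao, Ph.D., is a professor at Xidian University, where he leads the Pattern Recognition and Intelligent Systems discipline and directs the State Key Laboratory of Integrated Services Networks. He is a Leading Talent in Science and Technology Innovation of the National Ten Thousand Talents Program, a national-level candidate of the New Century Talents Project (新世纪百千万人才工程), a recipient of the National Science Fund for Distinguished Young Scholars, a Changjiang Scholar Distinguished Professor of the Ministry of Education, and the leader of a Key-Area Innovation Team of the Ministry of Science and Technology and of an Innovation Team of the Ministry of Education. He is an IET Fellow, a CIE Fellow, an IEEE Senior Member, an executive council member of the China Society of Image and Graphics, a council member of the China Computer Federation, a standing member of the Rich Media Command Technical Committee of the Chinese Institute of Command and Control, and vice chair of the Young Scientists Club of the Chinese Institute of Electronics. His research and teaching focus on computer vision, machine learning, and related fields. He has received one Second Prize of the National Natural Science Award and three provincial or ministerial First Prizes in science and technology.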

  张正友 (Zhengyou Zhang) Tencent

Title: Emotionally Intelligent Machines

Abstract: Human-computer interaction has evolved from punch cards, through the mainstream mouse, keyboard, and touch input, to the more recent natural user interaction (NUI) with voice and gesture. I believe we are at the cusp of the next generation of human-computer interaction, which is emotionally intelligent. With high-definition cameras and noise-cancelling microphones, together with better algorithms and larger labeled datasets, computers can understand human emotion increasingly well, moving toward seamless human-computer interaction. By looking at people’s facial expressions and listening to their tone of voice, a computer can determine whether they are stressed, happy, interested, or sleepy. Many applications are possible, ranging from education, medical treatment, advertising, and entertainment to security screening. A robotic assistant (physical or virtual) obviously needs to understand its user’s emotional states. In this talk, I will describe our work on emotion and virtual assistants.

Speaker bio: Zhengyou Zhang is an IEEE Fellow and ACM Fellow. He received the B.S. degree from Zhejiang University, Hangzhou, China, in 1985, the M.S. degree from the University of Nancy, Nancy, France, in 1987, and the Ph.D. degree and the Doctorate of Science (Habilitation à diriger des recherches) from the University of Paris XI, Paris, France, in 1990 and 1994, respectively. He was a Senior Research Scientist with INRIA (French National Institute for Research in Computer Science and Control), France, for 10 years. In 1996-1997, he spent a one-year sabbatical as an Invited Researcher with the Advanced Telecommunications Research Institute International (ATR), Kyoto, Japan. He then spent 20 years with Microsoft Research, USA, where he was a Partner Research Manager. He recently joined Tencent.

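  纪荣嵘 (Rongrong Ji) Xiamen University

Title: Compact Visual Big Data Analysis Systems

Abstract: This talk explores compactness in visual big data search and recognition systems. It covers work and results from Prof. Rongrong Ji's research group over the past two years on compact visual feature representation and deep network compression for visual applications on end devices. On compact visual feature representation, it introduces how large-scale unsupervised ranking information is used to learn ranking-sensitive hash codes that preserve the retrieval information of the original high-dimensional feature space. On deep network compression, it introduces task-specific (face and visual scene parsing) cascaded compression models for deep networks (serial low-rank matrix decomposition) and acceleration models (pruning with structured sparsity constraints). The talk also introduces practical applications of this research in visual products at Tencent, Didi, and Huawei.

Speaker bio: Rongrong Ji is a "Minjiang Scholar" Distinguished Professor of Fujian Province and a professor and doctoral supervisor at Xiamen University. He received the National Science Fund for Excellent Young Scholars in 2014 and was selected as a Young Top-Notch Talent of the National Ten Thousand Talents Program in 2016. His main research interests are computer vision and multimedia technology. His work has appeared in more than 90 SCI-indexed journal papers, including nearly 50 papers in ACM and IEEE Transactions, and in more than 40 full papers at CCF rank-A international conferences. His papers have nearly 5,000 Google Scholar citations and more than 1,600 SCI citations, with an h-index of 33, and 12 of them have been selected as ESI highly cited or hot papers. In recent years he has led a Joint Key Project of the National Natural Science Foundation of China, a Strategic Frontier Special Project of the Science and Technology Commission of the Central Military Commission, and projects and sub-projects of the National Key R&D Program. His awards include the 2007 Microsoft Scholar award, the 2011 ACM Multimedia Best Paper Award, the 2012 Harbin Institute of Technology Outstanding Doctoral Dissertation, a 2015 provincial Natural Science Second Prize, and the 2016 First Prize of the Ministry of Education Technological Invention Award. He serves as an associate editor of several international journals, was a General Chair of VALSE 2017, and is a Senior Member of ACM and IEEE.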

  Dahua Lin Chinese University of Hong Kong

Title: Deep Understanding of Structures in the Visual World

Abstract: It goes without saying that the success of deep learning is amazing. The rise of deep learning has not only led to a wave of breakthroughs in traditional AI areas, e.g. speech recognition and computer vision, but has also opened up a number of possibilities that were unimaginable before: AI can now play chess, perform cancer diagnosis, and even drive a car. Despite all these success stories, the “intelligence” of most deep networks remains rather restricted: they are essentially doing A-to-B regression, just doing it particularly well. In the past two years, I have worked with a group of talented students on a series of problems, with the aim of moving beyond these limitations and thus extending the power of deep models to more application domains. Many of our studies revolve around an important theme, namely, learning deep models from structured data. In particular, we develop new modeling frameworks for high-resolution images, event photos, structured scenes, activity videos, movies, relational databases, etc. All such data, despite their different natures, have an important aspect in common: they all contain structures, i.e. components related to each other. Analysis of their inherent structures not only gives us deeper insights into these domains, but also results in more effective models and training strategies (e.g. self-supervised training that does not rely on external supervision). In this talk, I will give a high-level review of our efforts and achievements, and share my thoughts and reflections on the underlying problems.

Speaker bio: Dahua Lin is an Assistant Professor in the Department of Information Engineering, the Chinese University of Hong Kong. He received the B.Eng. degree from the University of Science and Technology of China (USTC) in 2004, the M.Phil. degree from the Chinese University of Hong Kong (CUHK) in 2006, and the Ph.D. degree from the Massachusetts Institute of Technology (MIT) in 2012. Prior to joining CUHK, he served as a Research Assistant Professor at Toyota Technological Institute at Chicago from 2012 to 2014. His research interests cover computer vision, machine learning, and big data analytics. In recent years, he has focused primarily on deep learning and its applications to high-level visual understanding, probabilistic inference, and big data analytics. He has published about sixty papers in top conferences and journals, e.g. ICCV, CVPR, ECCV, NIPS, and T-PAMI. He serves as an Area Chair of ECCV 2018. His seminal work on a new construction of Bayesian nonparametric models won the Best Student Paper Award at NIPS 2010. He also received the Outstanding Reviewer Award at ICCV 2009 and ICCV 2011. He has supervised or co-supervised the CUHK team in international competitions, winning multiple awards in ImageNet 2016, ActivityNet 2016, and ActivityNet 2017.

  Hongdong Li Australian National University

Title: Some Recent Work on Non-Rigid Shape Structure-From-Motion with a Monocular Perspective Camera: Sparse and Dense Solutions

Abstract: In this talk, I will describe some of our recent work on reconstructing the non-rigid 3D shape of an object or a complex scene with a monocular perspective camera. We aim to answer an open question in 3D vision: “Is it possible to recover the 3D shape of a dynamic deformable object with a single moving camera?” Traditional methods for dynamic 3D reconstruction often employ stereo vision, or assume that the scene (with a deformable object) follows a certain simple low-order linear model. Our new work removes such restrictions and shows that, under certain mild assumptions, monocular 3D reconstruction of a dynamic scene is possible. I will explain two approaches: one recovers 3D dynamic human pose using structured movement, and the other is a dense surface reconstruction method for complex dynamic scenes. Both methods achieved superior performance on standard benchmark datasets.

Speaker bio: Dr. Hongdong Li is an Associate Professor/Reader with the Computer Vision Group of the ANU (Australian National University) and a Chief Investigator for the Australian Centre for Robotic Vision (ACRV). His research interests include 3D computer vision, SFM/SLAM for robot navigation and autonomous vehicles, and the application of mathematical optimization in geometric vision. He graduated from Zhejiang University and taught at the same university before joining the ANU as a research fellow in 2004. During 2009-2010 he was a senior researcher with NICTA (Canberra Labs) working on the “Australia Bionic Eyes” project. He was a visiting professor with Carnegie Mellon University in 2017. He has served as an Area Chair for CVPR, ICCV, ECCV, BMVC, and 3DV; as an Associate Editor for IEEE Transactions on PAMI (T-PAMI); and as a Program Co-Chair for ACCV 2018. Jointly with students and co-workers he has won a number of prestigious awards in computer vision, including the CVPR Best Paper Award and a Marr Prize Honorable Mention.

  林宙辰 (Zhouchen Lin) Peking University

Title: A Brief Overview of Practical Optimization Algorithms in the Context of Relaxation

Abstract: Optimization is an indispensable part of machine learning. There are many optimization algorithms, typically introduced independently in textbooks and scattered across a vast literature, which makes it hard for beginners to form a global picture. In this talk, by explaining how to relax some aspects of optimization procedures, I will briefly introduce some practical optimization algorithms in a systematic way.

Speaker bio: Zhouchen LIN received the Ph.D. degree in applied mathematics from Peking University in 2000. He is currently a Professor with the Key Laboratory of Machine Perception, School of Electronics Engineering and Computer Science, Peking University. His research interests include computer vision, image processing, machine learning, pattern recognition, and numerical optimization. He is an area chair of ACCV 2009/2018, CVPR 2014/2016, ICCV 2015, and NIPS 2015, and a senior program committee member of AAAI 2016/2017/2018 and IJCAI 2016/2018. He is an Associate Editor of the IEEE Transactions on Pattern Analysis and Machine Intelligence and the International Journal of Computer Vision. He is a Fellow of the IAPR and the IEEE.

  欧阳万里 (Wanli Ouyang) University of Sydney

Title: Exploring Deep Structures in Computer Vision Tasks

Abstract: Structure in data provides rich information that helps reduce the complexity and improve the effectiveness of a model. This talk introduces recent progress in using deep learning as a tool for modeling the structure in visual data. We show that observations about a problem are useful in designing the structure of deep models and help improve their effectiveness on many vision problems.

Speaker bio: Wanli Ouyang received the PhD degree from the Department of Electronic Engineering, The Chinese University of Hong Kong. He is now a senior lecturer at the University of Sydney. His research interests include image processing, computer vision, and pattern recognition. He is the first author of 7 papers in TPAMI and IJCV, and has published around 40 papers at top-tier conferences such as CVPR, ICCV, and NIPS. The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is one of the most important grand challenges in computer vision, and the team he led ranked No. 1 in ILSVRC 2015 and ILSVRC 2016. He received the best reviewer award of ICCV. He has been a reviewer for many top journals and conferences such as IEEE TPAMI, TIP, IJCV, SIGGRAPH, CVPR, and ICCV. He is a senior member of the IEEE.

  吴飞 (Fei Wu) Zhejiang University

Title: Memory-augmented learning

Abstract: Neural networks with a memory capacity provide a promising approach to media understanding (e.g., Q-A and visual classification). In this talk, I will present how to use the information in external memory to boost media understanding. In general, memory-augmented learning retrieves from external memory the information (e.g., knowledge instances and exemplar data) that is relevant to the input data. Memory-augmented learning is an appropriate method for integrating data-driven learning, knowledge-guided inference, and experience exploration.

Speaker bio: Fei Wu is a professor at the College of Computer Science, Zhejiang University. From October 2009 to August 2010, he was a visiting scholar in Prof. Bin Yu's group at the University of California, Berkeley. He is currently the vice-dean of the College of Computer Science and the director of the Institute of Artificial Intelligence of Zhejiang University. His research interests mainly include artificial intelligence, cross-media computing, and multimedia retrieval. He has won various honors, such as the National Science Fund for Distinguished Young Scholars of China (2016).

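  吴建鑫 (Jianxin Wu) Nanjing University

Title: Deep Learning in Practice: Dissecting the Ox and Blind Men Touching the Elephant

Abstract: Deep learning is currently the most popular approach in computer vision and machine learning, and also the one that works best in practice on many real problems. However, the mechanisms of deep learning, and of convolutional neural networks (CNNs) in particular, are not yet well understood. This talk presents our group's practice in CNN-based deep learning from two angles, named after two Chinese idioms: "dissecting the ox" (庖丁解牛) and "blind men touching the elephant" (盲人摸象). "Dissecting the ox" means exploring each building block of a CNN separately, finding its strengths and weaknesses and improving on it, so as to build a deep understanding of each module. "Blind men touching the elephant" means studying what CNN models pre-trained on ImageNet can do: while the overall mechanism of CNNs remains unclear, we study unsupervised applications of pre-trained models across various vision problems.

Speaker bio: Professor at Nanjing University and Chief Scientist of Minieye (minieye.cc). His research interests are computer vision and machine learning, especially deep learning under constrained resources (computation, storage, energy, data, and annotation). He has been supported by the Thousand Young Talents Program of the Organization Department of the CPC Central Committee and by the NSFC Excellent Young Scientists Fund, and has served as an area chair for ICCV, CVPR, and other conferences.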

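  杨铭 (Ming Yang) Co-founder and VP of Software, Horizon Robotics

Title: Challenges in Accelerating Neural Nets on Silicon for Vision Tasks

Abstract: Deep neural nets have come to dominate a variety of computer vision tasks, yet productizing neural nets in IoT applications requires a cost- and energy-efficient computation platform. This talk covers the challenges in accelerating neural networks on silicon for smart IoT applications, including network design and model compression, joint hardware and software architecture design, and some practical issues in trading off engineering effort. A neural network acceleration co-processor is demonstrated for visual detection and recognition in autonomous driving and video analysis.

Speaker bio: Dr. Ming Yang is a co-founder and VP of Software at Horizon Robotics and was one of the founding members of Facebook AI Research. He was previously a senior researcher at NEC Laboratories America, focusing on computer vision and machine learning, including object tracking, face recognition, large-scale image retrieval, and multimedia content analysis. DeepFace, the deep learning R&D project he was responsible for at Facebook, had a major impact on the industry and was widely covered by media including Science Magazine, MIT Tech Review, and Forbes. He led the NEC-UIUC team in the TRECVID08/09 video surveillance event detection evaluation, achieving the best results, and took part in the NEC team that won first place in the ImageNet2010 large-scale image classification challenge. He has applied for and been granted 15 US patents. He received his bachelor's and master's degrees in engineering from the Department of Electronic Engineering, Tsinghua University, and his Ph.D. from the Department of Electrical Engineering and Computer Science, Northwestern University. He has published more than 20 papers at the top international conferences CVPR and ICCV and 9 papers in the top international journal T-PAMI, with more than 6,700 citations. He has repeatedly served as a program committee member for top international conferences such as CVPR, ICCV, NIPS, and ACMMM, and as a reviewer for top international journals such as T-PAMI, IJCV, and T-IP.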

  Kevin Zhou Principal Key Expert of Image Analysis at Siemens Healthcare Technology

Title: Deep learning and beyond: medical image recognition, segmentation, parsing

Abstract: The "Machine learning + Knowledge" approaches, which combine machine learning with domain knowledge, enable us to achieve state-of-the-art performance on many tasks in medical image recognition, segmentation, and parsing. In this talk, we first present real success stories of such approaches. We then elaborate on deep learning, a special and powerful type of machine learning method, and review its recent advances. We conclude with several recent "DL & Beyond" works.

Speaker bio: Dr. S. Kevin Zhou is currently a Principal Key Expert of Image Analysis at Siemens Healthcare Technology, dedicated to researching and developing innovative solutions for medical and industrial imaging products. His research interests lie in computer vision and machine learning and their applications to medical image recognition and parsing, face recognition and modeling, etc. Dr. Zhou has published over 150 book chapters and peer-reviewed journal and conference papers, has registered over 250 patents and inventions, has written two research monographs, and has edited three books. His two most recent books are entitled "Medical Image Recognition, Segmentation and Parsing: Machine Learning and Multiple Object Approaches, SK Zhou (Ed.)" and "Deep Learning for Medical Image Analysis, SK Zhou, H Greenspan, DG Shen (Eds.)." He has won multiple awards honoring his publications, patents, and products, including the Thomas Alva Edison Patent Award (2013), the R&D 100 Award or Oscar of Invention (2014), Siemens Inventor of the Year (2014), and the UMD ECE Distinguished Alumni Award (2017). He has been an associate editor for the journals IEEE Trans. Medical Imaging and Medical Image Analysis, an area chair for CVPR and MICCAI, and a co-Editor-in-Chief for the WeChat public journal The Vision Seeker, and was elected a fellow of the American Institute for Medical and Biological Engineering (AIMBE).

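  朱军 (Jun Zhu) Tsinghua University

Title: Recent Advances in Deep Generative Models and Probabilistic Programming Libraries

Abstract: Deep generative models are a flexible class of methods that extract latent structure from complex data and generate samples in a "top-down" manner. They are widely used for unsupervised learning, semi-supervised learning, and other tasks. This talk introduces some recent advances, including models and algorithms for semi-supervised learning and the ZhuSuan (珠算) programming library, which supports rapid implementation.

Speaker bio: Jun Zhu is a tenured associate professor at Tsinghua University, an adjunct professor at Carnegie Mellon University, and deputy director of the State Key Laboratory of Intelligent Technology and Systems. He received his bachelor's and Ph.D. degrees in computer science from Tsinghua University between 2001 and 2009, then did postdoctoral research at Carnegie Mellon University, and returned to Tsinghua to teach in 2011. He works mainly on the foundational theory of artificial intelligence, efficient algorithms, and related applications, and has published nearly 100 papers in leading international journals and conferences. He has been invited to serve on the editorial boards of the top AI journals IEEE TPAMI and AI and of Acta Automatica Sinica (《自动化学报》); as a local co-chair of ICML 2014; as an area chair for ICML (2014-2018), NIPS (2013, 2015), UAI (2014-2017), IJCAI (2015, 2017), and AAAI (2016-2018); and as assistant to the director of the Academic Affairs Committee of the China Computer Federation (CCF). His honors include the CCF Outstanding Doctoral Dissertation Award, the CCF Natural Science First Prize, the CCF Young Scientist Award, the National Excellent Young Scientists Fund, the Beijing Outstanding Young Talent Award, and Tsinghua University's First Prize for Outstanding Class Advisors. He was selected as a Young Top-Notch Talent of the National Ten Thousand Talents Program, for "AI’s 10 to Watch" by IEEE Intelligent Systems magazine, and for Tsinghua University's 221 Basic Research Talent Program. Students he advised won first place in all three tasks of the NIPS 2017 adversarial attack and defense competition.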

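  李玺 (Xi Li) Zhejiang University

Title: Person Re-identification

Speaker bio: Professor and doctoral supervisor at Zhejiang University, currently with the Institute of Artificial Intelligence, College of Computer Science, Zhejiang University. He was selected for the fifth cohort of China's national Thousand Young Talents Program and as a second-tier talent of Zhejiang Province's 151 Talent Program. He works mainly on research and development in computer vision, pattern recognition, and machine learning. He has obtained in-depth, systematic results in object tracking, object behavior recognition, image annotation, video retrieval, hashing function learning, and deep feature learning. His research on motion tracking, understanding, and retrieval in video is particularly distinctive and has produced a number of internationally influential results. He has published more than 100 papers in leading international journals and top international conferences. He is an Associate Editor of Neurocomputing and Neural Processing Letters, two well-known international journals in neural computing, and a reviewer and program committee member for several international journals and conferences in computer vision and pattern recognition. He has received two international conference best paper awards (ACCV 2010 and DICTA 2012), an ACML best student paper award, and an ICIP2015 Top 10% paper award, as well as two Beijing Municipal Natural Science and Technology Awards (a first prize and a second prize) and a China Patent Excellence Award.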

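  代季峰 (Jifeng Dai) Microsoft Research Asia

Title: Object Detection and Recognition

Speaker bio: Jifeng Dai received his bachelor's and Ph.D. degrees from the Department of Automation, Tsinghua University, in 2009 and 2014, respectively, and was a visiting student at the University of California, Los Angeles from 2012 to 2013. He is currently a Lead Researcher in the Visual Computing Group at Microsoft Research Asia. His main research areas are object detection, segmentation, and deep learning algorithms. He proposed algorithms including Deformable ConvNets and R-FCN, which have drawn attention in the field. He won first place in the authoritative COCO object recognition competition for two consecutive years. He gave the Tutorial on Instance-level Recognition at ICCV 2017 and served as a Senior Program Committee member (SPC) for AAAI 2018.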

  乔宇 (Yu Qiao) Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences

Title: Deeply Understanding Human Poses and Actions in the Wild

Abstract: Human pose estimation and action recognition are receiving extensive research interest in computer vision due to their wide applications in surveillance, human-computer interfaces, sports video analysis, and content-based video retrieval. The difficulties of pose and action understanding come from background clutter, viewpoint changes, and variations in motion and appearance. Recent studies demonstrate that deep learning approaches significantly improve the performance of pose estimation and action recognition in the wild. This talk will summarize recent progress toward this objective, especially work based on deep learning methods. We will also analyze the current challenges and future directions.

Speaker bio: Yu Qiao is a professor with the Shenzhen Institutes of Advanced Technology (SIAT), Chinese Academy of Sciences, and the deputy director of its multimedia research lab. His research interests include computer vision, deep learning, and robots. He has published more than 150 papers in international journals and conferences, including IEEE T-PAMI, IJCV, IEEE T-IP, IEEE T-SP, CVPR, ICCV, AAAI, and ECCV. He received the Jiaxi Lv Young Researcher Award from the Chinese Academy of Sciences. He is a senior member of the IEEE. He was the first runner-up in scene recognition at the ImageNet Large Scale Visual Recognition Challenge 2015 and the winner in video classification at the ActivityNet Large Scale Activity Recognition Challenge 2016. His group has also achieved top places in a wide range of international challenges, such as ChaLearn, LSUN, and THUMOS.

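  张兆翔 (Zhaoxiang Zhang) Institute of Automation, Chinese Academy of Sciences

Title: Brain-Inspired Visual Computing

Speaker bio: Zhaoxiang Zhang, Ph.D., is a professor and doctoral supervisor at the Institute of Automation, Chinese Academy of Sciences, and a young core member of the CAS Center for Excellence in Brain Science and Intelligence Technology. He is an IEEE Senior Member, a committee member of the China Computer Federation's YOCSEF, a member of the Technical Committee on Computer Vision, a member of the Technical Committee on Pattern Recognition and Artificial Intelligence, and a member of the Pattern Recognition Technical Committee of the Chinese Association for Artificial Intelligence. He graduated from the University of Science and Technology of China in 2004 with a bachelor's degree in Circuits and Systems. In the same year he entered the combined master's and Ph.D. program at the Institute of Automation, Chinese Academy of Sciences, and he received his Ph.D. in engineering in 2009. In 2015 he became a professor at the institute's Research Center for Brain-Inspired Intelligence. Dr. Zhang has long worked on intelligent visual surveillance and has recently focused on visual computing models that combine brain-inspired intelligence with human-like learning mechanisms. He has carried out systematic work on modeling available information and on model-based object recognition. This work has been applied successfully on system platforms serving national public security and smart city supervision needs, with notable social impact and economic benefit. Over the past five years he has published more than 100 papers in mainstream international journals and conferences, including more than 40 SCI-indexed journal papers. He has served as a program committee member for international conferences including ICPR, IJCNN, AVSS, and PCM; as an editorial board member of the SCI journal Neurocomputing and of IEEE Access; as a guest editor of Pattern Recognition Letters; as a young editorial board member of Frontiers of Computer Science; and as a reviewer for more than 20 mainstream journals in the field, including TPAMI, TIP, TCSVT, and PR. He was selected for the Ministry of Education's Program for New Century Excellent Talents, the Beijing Youth Talent Program, and the Microsoft Research Asia StarTrack Program (铸星计划).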

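  付彦伟 (Yanwei Fu) Fudan University

Title: Few-Shot Learning

Speaker bio: Tenure-track junior associate research professor (青年副研究员) at Fudan University. He received his Ph.D. from Queen Mary University of London in 2014, advised by Prof. Tao Xiang and Prof. Shaogang Gong. From December 2014 to July 2016 he was a postdoctoral researcher at Disney Research in the United States. He was selected as a Shanghai Distinguished Professor (Eastern Scholar) in 2017 and was funded by the national Thousand Young Talents Program in 2018. His main research areas include zero-shot and few-shot recognition, lifelong learning algorithms, face recognition and person re-identification, and video emotion analysis. He has 20 papers in top journals and conferences such as IEEE TPAMI and CVPR, as well as 15 Chinese patents and 2 US patents. His papers have been covered by several US science and technology media outlets, such as Science 2.0, PhyORG, Science Newsline Technology, Communications of ACM, Business Standard, and EurekAlert! AAAS.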

  齐国君 (Guo-Jun Qi) University of Central Florida

Title: Generative Adversarial Learning

Speaker bio: Dr. Qi is a faculty member in the Department of Computer Science at the University of Central Florida. His research interests include knowledge discovery, analysis, and aggregation of big data from a variety of modalities and sources, in order to build smart and reliable information and decision-making systems. He aspires to apply his research to practical problems through high-quality data processing and analysis in healthcare, sensor and social networks, financial systems, and so forth. He received a Microsoft Fellowship once and IBM Fellowships twice. His research has been sponsored by grants and projects from government agencies and industry collaborators, including NSF, IARPA, Microsoft, IBM, and Adobe.
Dr. Qi has published more than 100 papers in a broad range of venues, such as Proceedings of IEEE, IEEE T PAMI, IEEE T KDE, IEEE T Image Processing, ACM SIGKDD, WWW, ICML, ACM MM, CVPR, ICDM, SDM, and ICDE. Among them are the best student paper of ICDM 2014, “the best ICDE 2013 paper” selected by IEEE Transactions on Knowledge and Data Engineering, and the best paper (finalist) of ACM Multimedia 2007 (2015).
He has served or will serve as a technical program co-chair for MMM 2016 and ACM Multimedia 2020, and as an area chair (a senior program committee member) for ICCV, ICPR, ACM SIGKDD, ACM CIKM, and ACM Multimedia. He is also serving or has served on the program committees of several academic conferences, including CVPR, ICCV, KDD, WSDM, CIKM, IJCAI, ICMR, ACM Multimedia, ACM/IEEE ASONAM, ICDM, ICIP, and ACL. He is an associate editor for IEEE Transactions on Circuits and Systems for Video Technology (CSVT), as well as a guest/lead editor for the special issues on “Big Media Data: Understanding, Search, and Mining” in IEEE Transactions on Big Data, “Deep Learning for Multimedia Computing” in IEEE Transactions on Multimedia, and “Social Media Mining and Knowledge Discovery” in Multimedia Systems, Springer. He has also been a panelist for the NSF and the United States Department of Energy.

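  夏桂松 (Gui-Song Xia) Wuhan University

Title: Earth Observation and Recognition

Speaker bio: Gui-Song Xia is a professor and doctoral supervisor at Wuhan University. He received his Ph.D. from Telecom ParisTech, France, in 2011, and then did postdoctoral research at the French National Centre for Scientific Research (CNRS). In December 2012 he joined the State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing at Wuhan University. He has long worked on image analysis and understanding and on remote sensing image interpretation, and has published more than 100 papers in international journals and conferences including IJCV, IEEE TIP/TGRS/TMM/JSTARS, PR, CVPR, BMVC, ICIP, and ICPR. He is an Associate Editor of the international journals EURASIP J. on Image and Video Processing and Signal Processing: Image Communication, and a Guest Editor for journals including IEEE Trans. on Big Data and Pattern Recognition Letters. He has been supported by talent programs including the Hubei Natural Science Foundation for Distinguished Young Scholars and Hubei Province's "Chutian Scholar" (楚天学子) program, and received the Second China Association for Science and Technology Outstanding Scientific Paper award.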

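  白翔 (Xiang Bai) Huazhong University of Science and Technology

Title: Scene Text Recognition

Speaker bio: Xiang Bai is a professor and doctoral supervisor at the School of Electronic Information and Communications, Huazhong University of Science and Technology, and deputy director of the National Anti-Counterfeit Engineering Research Center (国家防伪工程中心). He received his bachelor's, master's, and Ph.D. degrees from Huazhong University of Science and Technology. His main research areas are computer vision and pattern recognition, and deep learning. He has obtained a series of important results, especially in shape matching and retrieval, similarity measurement and fusion, and scene OCR, and was named an Elsevier Most Cited Chinese Researcher for 2014-17. His research has been supported by the Microsoft Scholar award and the NSFC Excellent Young Scientists Fund. He is a steering committee member of VALSE and chair of the IEEE Signal Processing Society (SPS) Wuhan Chapter. He previously chaired the VALSE Online Committee (VOOC) and was a General Chair of VALSE 2016.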

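  程健 (Jian Cheng) Institute of Automation, Chinese Academy of Sciences

Title: Annual Progress in Deep Neural Network Acceleration and Compression

Speaker bio: Jian Cheng is a professor at the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, executive vice president of the Nanjing Artificial Intelligence Chip Innovation Institute (南京人工智能芯片创新研究院), and director of the Joint Laboratory of Artificial Intelligence and Advanced Computing. He received his bachelor's and master's degrees from Wuhan University in 1998 and 2001, respectively, and his Ph.D. from the Institute of Automation, Chinese Academy of Sciences, in 2004. From 2004 to 2006 he was a postdoctoral researcher at Nokia Research Center. He has worked at the Institute of Automation since September 2006. His current research covers deep learning, AI chip design, and image and video content analysis. He has published more than 100 papers in these areas and edited two books in English. His awards include the CAS Lu Jiaxi Young Talent Award, the Outstanding Member award of the CAS Youth Innovation Promotion Association, the First Prize in Natural Science of the Chinese Institute of Electronics, and the Second Prize in Natural Science of the Ministry of Education. He is currently on the editorial board of the international journal Pattern Recognition, and has served as chair of ICIMCS 2010, organization chair of HHME 2010, and publication chair of CCPR 2012.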

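  黄圣君 (Sheng-Jun Huang) Nanjing University of Aeronautics and Astronautics

Title: Active Learning

Speaker bio: Sheng-Jun Huang, Ph.D., is an associate professor at Nanjing University of Aeronautics and Astronautics. He received his bachelor's and Ph.D. degrees from the Department of Computer Science and Technology, Nanjing University, in 2008 and 2014, respectively. His main research area is machine learning. He has published more than twenty papers in major international journals such as IEEE TPAMI and TNNLS and at international conferences such as NIPS, KDD, IJCAI, and AAAI. He was selected for the China Association for Science and Technology's Young Elite Scientists Sponsorship Program (青年人才托举工程) and has received honors including the China Computer Federation Outstanding Doctoral Dissertation Award, the KDD’12 Best Poster Award, and the Microsoft Scholar award.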

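  吴毅红 (Yihong Wu) Institute of Automation, Chinese Academy of Sciences

Title: Annual Progress in 3D Computer Vision

Abstract: Producing 3D information from 2D image information is the main subject of 3D computer vision, with broad applications in robotics, AR, and VR. The field has three important components: image matching, camera localization, and 3D reconstruction. This talk reviews progress in these three areas since 2017 and looks ahead to future trends.

Speaker bio: Yihong Wu is a professor and doctoral supervisor at the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences. Research interests include multi-view geometry theory, image matching, camera calibration and localization, SLAM, and 3D reconstruction. Wu received a Ph.D. from the Institute of Systems Science, Chinese Academy of Sciences, in June 2001, then joined the National Laboratory of Pattern Recognition, and was promoted to professor in 2008. Wu was invited to the IRIT laboratory in France for collaborative research in 2005 and 2010, and was invited several times to City University of Hong Kong for collaborative research between 2006 and 2008. Wu has published more than 70 papers in leading international journals and major conferences, including first-author papers in PAMI, IJCV, ICCV, and ECCV, and has applied for or been granted more than 10 invention patents in China and abroad. Wu has served as a PC member or Session/Area Chair for ICCV, CVPR, ACCV, PCM, ICPR, and other conferences, and is currently on the editorial boards of Pattern Recognition, Acta Automatica Sinica (《自动化学报》), the Journal of Computer-Aided Design & Computer Graphics (《计算机辅助设计与图形学学报》), the Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), and Visual Computing for Industry, Biomedicine, and Art. Wu received one Second Prize of the Natural Science Award for Scientific Research in Higher Education Institutions, ranked third among the awardees.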

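  冯佳时 (Jiashi Feng) National University of Singapore

Title: Generative Adversarial Networks

Abstract: Generative adversarial networks (GANs) generate realistic data by having two neural networks learn adversarially against each other. They are now widely used for image and speech data generation, and the adversarial learning framework was named one of the world's ten breakthrough technologies by MIT Technology Review. The first part of this VALSE GAN tutorial introduces the history of GANs, their basic principles, and some future trends. Specifically, it covers the basic GAN framework, the basic principles of GANs, their distinctive advantages as generative models, existing theoretical analyses and guarantees, basic application patterns, and some discussion of future directions.

Speaker bio: Assistant professor at the National University of Singapore. His research interests are in machine learning, including deep learning, robust machine learning, and subspace learning, and their applications in computer vision and big data analysis.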

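  左旺孟 (Wangmeng Zuo) Harbin Institute of Technology

Title: Applications and Extensions of GANs

Abstract: Beyond speech, image, and video generation, generative adversarial networks have in recent years also attracted considerable attention in image translation, image enhancement and restoration, adversarial example learning, domain adaptation, and transfer learning. The second part of this VALSE GAN tutorial first starts from the strengths and characteristics of GANs to analyze and introduce their typical scenarios and application patterns. It then takes an application perspective to introduce and analyze how GANs complement and combine with other learning models. Finally, it briefly discusses the future development and applications of GANs from the perspective of extensions.

Speaker bio: He received his Ph.D. from the School of Computer Science, Harbin Institute of Technology, in 2007, and is currently a professor in the same school. His research interests include image enhancement and restoration, image editing, image classification, object detection, and object tracking, and their applications in computer vision. He is on the editorial boards of IET Biometrics and the Journal of Electronic Imaging. He has published more than 70 papers at top conferences such as CVPR, ICCV, and ECCV and in journals including T-PAMI, IJCV, and IEEE Transactions.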

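  章国锋 (Guofeng Zhang) Zhejiang University

Title: Structure from Motion and Visual SLAM

Abstract: Structure from Motion (SfM) and Visual Simultaneous Localization and Mapping (VSLAM) are fundamental problems in 3D vision and robotics. They make it possible to localize one's own position and orientation in an unknown environment and to build a 3D map of that environment, and they have broad applications. This tutorial first introduces basic concepts and principles such as camera models, two-view geometry, and multi-view geometry. It then presents the frameworks and key modules of current mainstream SfM, visual SLAM, and RGB-D SLAM systems, including feature point tracking, camera pose estimation, bundle adjustment, and loop closure detection. It also covers some typical applications based on SfM and visual SLAM, such as AR on mobile phones.

Speaker bio: Guofeng Zhang, Ph.D., is a professor at the State Key Laboratory of CAD&CG, Zhejiang University, deputy director of the Zhejiang University-SenseTime Joint Lab of 3D Vision, and a doctoral supervisor. He received his bachelor's degree in computer science from Zhejiang University in 2003 and his Ph.D. in computer applications from Zhejiang University in 2009. He works mainly on structure from motion, simultaneous localization and mapping, 3D reconstruction, augmented reality, and video segmentation and editing. He has obtained a series of important results, especially in simultaneous localization and mapping and in 3D reconstruction, and has developed a series of related software packages (ACTS, LS-ACTS, RDSLAM, RKSLAM, and others) that are published online for download (http://www.zjucvg.net). He received the 2010 China Computer Federation Outstanding Doctoral Dissertation Award, the 2011 National Top 100 Outstanding Doctoral Dissertations Award, and a 2011 First Prize for Scientific and Technological Progress in the Ministry of Education's Outstanding Scientific Research Achievement Awards for Higher Education Institutions (ranked fourth among the awardees).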

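  沈劭劼 (Shaojie Shen) Hong Kong University of Science and Technology

Title: Monocular Visual-Inertial SLAM

Abstract: Fusing information from monocular vision and inertial sensors yields high-accuracy localization with metric scale. This localization can be used for autonomous navigation of drones and for motion estimation in AR/VR. This tutorial is mainly based on VINS-Mono, our recently open-sourced monocular visual-inertial SLAM system, and explains visual-inertial fusion in detail. It covers system initialization, inertial sensor pre-integration and bias correction, online calibration of multi-sensor timestamps and extrinsic parameters, rolling shutter compensation, the design of a sliding-window state estimator based on nonlinear optimization, loop closure detection and global localization correction, and various system implementation details.

Speaker bio: He received his bachelor's degree from the Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, in 2009. He received his Ph.D. from the GRASP Laboratory at the University of Pennsylvania in 2014 and joined the Hong Kong University of Science and Technology as an assistant professor the same year. In 2016 he founded the HKUST-DJI Joint Innovation Laboratory and became its director. His research interests are autonomous drone navigation, sensor fusion, state estimation, 3D reconstruction, SLAM, and path planning. He has served on the organizing committees of several international robotics conferences and is currently an associate editor of IEEE Transactions on Robotics. He has published more than 50 papers in high-level conferences and journals such as IJRR and ICRA, received best paper awards at SSRR2015 and SSRR2016, and was a best paper finalist at ICRA2011 and ICRA2017. In 2017 he open-sourced the monocular visual-inertial SLAM system VINS-Mono, which has attracted wide attention in the field.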