The year in AI: 2019 ML/AI advances recap

The year in AI: 2019 ML/AI advances recap

2020-01-26 11:47:14

Source: https://medium.com/@xamat/the-year-in-ai-2019-ml-ai-advances-recap-c6cc1d902d5

It has become somewhat of a tradition for me to do an end-of-year retrospective of advances in AI/ML (see last year’s round up for example), so here we go again! This year started with a big recognition to the impact of Deep Learning when Hinton, Bengio, and Lecun were awarded the Turing award. You might think that after a few years of neck-breaking speed in innovation, this kind of recognition might be signaling that we are getting near some sort of plateau. Well, think again. That is nowhere in sight yet. It is true that hot areas have clearly shifted and while a few years ago image recognition was all the rave this year we have seen more impressive advances in language. I will go into all of this below, but let me start by summarizing the biggest headlines of AI in 2019, in my own very biased opinion:

  • Computers learn to talk (i.e. language models like Bert and specially GPT-2 get scaringly good)
  • AI becoming good at creating synthetic content has some serious consequences
  • The biggest theoretical controversy continues to be how to incorporate innate knowledge or structure into machine learned models. There has been little practical progress towards this end, and little progress towards any other theoretical breakthrough.
  • The revolution may get unsupervised at some point, but for now we can make it self-supervised
  • Computers continue to get better at playing games and can now collaborate in multi-agent escenarios
  • Other areas like Healthcare and Recommender Systems continue to see advances by using Deep Learning, but some of these advances are questioned
  • The war between frameworks continues, with a major TensorFlow release and also big movements on the Pytorch arena.

But, let’s get right into it and dive into each of these fascinating 2019 headlines.

The year of the Language Models

I think it is hard to argue against the fact that this has been the year of Deep Learning and NLP. Or more concretely, the year of language models. Or even more concretely the year of Transformers and GPT-2. Yes, it might be hard to believe, but it has been less than a year since OpenAI first released talked about their GPT-2 language model. That blog post sparked a lot of discussions about AI safety since OpenAI did not feel comfortable releasing the model. Since then, the model was publicly replicated (see here and here), and finally released. However, this has not been the only advance in this space. We have seen Google publish AlBERTXLNET, and Universal Transformers, and also talk about how BERT has been the largest improvement to Google search in years. Besides Google, most of the other big players in the AI space have also published their own language models: SalesforceAmazonMicrosoft, or Facebook seem to all have really bought into the Language Model revolution.

What can these models do in practice? Besides the obvious and scary “generate credible fake tweets”, there are much more constructive ones that we have seen during this past year. For example, Google told us how they were using them not only for search as mentioned above, but also for their Smart compose feature. Facebook learned models to answer questions wholistically and Allen Institute’s Aristo AI passed an eighth-grade science test. In fact, if we look at the SQUAD leaderboard, it seems nowadays anyone can surpass human-level reading comprehension by combining some of these known approaches (see image below).

I do expect to see many more impressive advances in this space in 2020 as it seems we are getting closer and closer to passing the Turing Test and having computers that “can speak human”. That being said, we should also temper our expectations since there have been many papers that have also identified the limitations of the current approaches. To start with, Google’s impressive study sets a good backdrop on the limitations of transfer learning in language. In Limitations of Language Models for generating text or storytellers, the Stanford NLP folks walk us through situations when these language models work, and many others where they don’t. Of course, a key aspect of these limitations is the fact that these models are expected to generalize across a wide range of tasks and even domains. However, we know that, as shown in “To Tune or Not to Tune? Adapting Pretrained Representations to Diverse Tasks”, it usually pays off to fine tune models to specific tasks. Our team at Curai came to that exact conclusion when comparing general language models to those trained on the medical domain in “Domain-Relevant Embeddings for Medical Question Similarity”. So, we are still far from having general-purpose language models that can tell good stories and adapt to different tasks and domains. Finally, I could not finish this paragraph on limitations of language models without mentioning Merity’s great “Stop Thinking With Your Head” where he shows how for many tasks a simple LSTM model can perform almost as well as the most complicated Transformer.

Combining knowledge/structure with deep learning

In 2019 we continued to hear loud voices advocating for AI not to get stuck in a Deep local maxima. According to many, me included, we should be able to combine data-intensive deep learning approaches with more knowledge-intensive methods to add some form of innate structure. While it is true that there is a lot of work to be done in that space, we did see many examples of research combining deep learning and more “traditional” AI.

In “Transferable Multi-Domain State Generator for Task-Oriented Dialogue Systems”, Salesforce presents a state-of-the-art approach for slot-filling task-oriented dialogue systems that combines deep learning with more traditional conversational methods. “Neural Assistant: Joint Action Prediction, Response Generation, and Latent Knowledge Reasoning” is a recent paper from Google that also combines several deep and knowledge-intensive approaches for the same purpose. Wizard of Wikipedia: Knowledge-Powered Conversational agents is Facebook’s response in that same space.

ERNIE: A knowledge graph-enhanced language model” is a novel approach where a language model is trained not only on natural language data, but also a knowledge graph. Interestingly though, it turns out that while language models might benefit from being trained on knowledge graphs, they themselves also encode knowledge and can be used as knowledge bases (see “Language Models as Knowledge Bases?”). Similarly, deep learning models like BERT or ELMO do not only encode knowledge, but also syntax. The Stanford NLP team showed how syntax trees can be directly inferred from such models.

The self-supervised revolution

If I had to name two important fundamental trends behind many of the advances highlighted above those would be: transfer learning, and self-supervision. Transfer learning (the idea that you can train a model on an original dataset and apply the resulting model elsewhere) is a pretty obvious idea behind language models, but also earlier image models trained on Imagenet and the like. The idea of self-supervision might be a bit less obvious. Maybe that is why some are calling it The Quiet Revolution despite Yan LeCun having screamed it out loud for years to anyone who was listening. In any case, self-supervision, the idea that you can train a model on unlabelled data by exploiting the context in the data itself, is catching on. Not only language models like BERT or ALBERT use the idea extensively, but this same notion is being applied to other domains, making it easier to train on large corpuses without needing to spend huge efforts in annotation. For example, self-supervision is being used to improve image classification models. See for example “Self-Supervised Learning of Pretext-Invariant Representations”, “Data-Efficient Image Recognition with Contrastive Predictive Coding”, or the recent “Self-training with Noisy Student improves ImageNet classification”. All of these approaches improve on SOTA supervised methods while using much less labeled data.

A fascinating application of self-supervision that takes the idea a step further is Facebook’s “Unsupervised Question Answering by Cloze Translation” where they split the question answering problem into two steps. The first steps generates synthetic training data with a model that synthesizes fill-in-the-gap questions from documents. The second step uses a traditional Q&A model. This is similar to our “Learning from the experts” where we sidestep the need for costly and noisy labeling of medical data by generating synthetic training data.

Other miscellaneous research advances

The year also came with other advances that don’t neatly fit into the main trends of combining knowledge with deep learning, or self-supervision. What follows are some of my favorite highlights in this miscellaneous category.

In “The Lottery ticket hypothesis” the authors show a fascinating result: due to sheer chance, some subnetworks with many less parameters than the original network have comparable accuracy. For some reason their connections have initial weights that result in a much more effective training. The authors also present an algorithm to identify those “winning tickets”. In the same vein of finding more efficient yet performing models, “EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks” introduces an approach to uniformly scale all dimensions in a CNN.

Rectified Adam is a variation over the well-known Adam optimizer that results in better training and higher accuracy (if you don’t know Adam, you probably should since, according to Chip Huyen, it’s the most commonly asked question during interviews).

In “Classification Accuracy Score for Conditional Generative Models” the authors present a new way to evaluate generative models by training a classifier on synthetic data but predicting labels on real data. While this is not strictly speaking a novel idea (see e.g. “LR-GAN: Layered Recursive Generative Adversarial Networks for Image Generation”), and the paper applies it only on GANs for image, it does show an interesting path for evaluating other generative models in different domains. In “Training on Synthetic Noise Improves Robustness to Natural Noise in Machine Translation” we see how injecting synthetic noise during training can improve the quality of the trained models. I personally believe that the use of synthetic data plus noise is going to bring a lot of advances in AI in the near future. But, maybe I am just biased because our own publication “Learning from the experts: From expert systems to machine-learned diagnosis models” already proposed a combination of these two techniques.

Another important line of research is on how to apply learned models in “the wild” by modeling uncertainty and out-of-distribution modeling. In real-life it is important to understand the uncertainty of model predictions and whether the data point is outside of the distribution on which the model was trained on. A few papers on this space have been presented at NeurIPS 2019 (see e.g. ”Modeling Uncertainty by Learning a Hierarchy of Deep Neural Connections” and “Likelihood Ratios for Out-of-Distribution Detection”. This is also a very important aspect to tackle in healthcare, and it is indeed the focus of our recent paper “Open Set Medical Diagnosis”.

Finally, I should note that there have been a lot of publications in the broad space of what I would call human-AI-Interaction that includes research areas like fairness, bias, or interpretability. Hard to pick the most impactful works in this space, but I will highlight two with almost opposite takeaways. AI2 presented AllenNLP-Interpret, a toolkit for interactive model interpretations and explanations. This work won the EMNLP best demo award. On the other extreme, in “Manipulating and Measuring Model Interpretability” Microsoft researchers surprisingly concluded that model transparency and interpretability not only did not help, but could hamper user ability to detect model mistakes. And one last, and very recent, piece of news on this space of human-AI is Facebook’s announcement of a $1M deep-fake detection challenge. Clearly detecting fake content is going to be a huge deal in the future, and it is good to see that we are already putting efforts into this.

Let’s keep playing

It has been more than 3 years since Alpha Go beat Lee Sedol, but we are still receiving the aftershocks of such a feat with Sedol recently announcing his retirement because of that defeat. And, while it might seem like there is not much more progress to be made in AI for games, computers insist on getting better at more, and more complex games. This year we saw two major feats, with DeepMind reaching human-level performance in Quake III Arena Capture the Flag and wining the Starcraft competition with AlphaStar. Both these advances show us the ability of algorithms not only to master complicated but highly structured games like Go, but also to adapt to more fuzzy strategic goals in which even collaboration is needed.

A final, and pretty recent, advance in this space is Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model where Deepmind again shows like a combination of search and a learned model can be applied to gain superhuman performance not only in a single game, but in a range of games.

Is there room for deep outside of text and image?

Of course the Deep revolution is impacting way beyond text and image. I will focus on the two areas I follow most closely: recommender systems and healthcare. Interestingly, I have seen a similar pattern in both areas this year (warning: you should know that us “scientists” see patterns all around us).

In recommender systems, Deep Learning has been impacting the research community for some time now, probably ever since Youtube published theirfirst paper on using Deep Learning for recommendations in 2016. Maybe unsurprisingly, most of the results related to deep learning continue to come from industry continues. My former team at Netflix has definitely jumped on the DL train, and they have been speaking publicly about different deep learning enhancements to the Netflix recommender system. See this recent tutorial by Anoop Deoras on using Deep Learning for recommendations. Even Facebook, who are honestly not very active in the recommendation arena, made quite a splash this year by open-sourcing a Deep Learning recsys model/framework. But, not everything is shiny and bright on the deep side of the recsys street. As a matter of fact, the best paper award at the Recsys conference went to a paper that questions most of the recent advances in using deep learning approaches and shows how simpler methods obtain similar to better results.

In healthcare, the deep learning revolution has already been here for some time too. There have been many research papers in this intersection. So much, that Stanford/Google felt like we are at a point when we can even publish “A guide to deep learning in healthcare”. Truth be told, the most interesting/credible applications of deep learning to healthcare are actually still when applied to either images or text (see some of our own examples in “Domain-Relevant Embeddings for Medical Question Similarity” or “Prototypical Clustering Networks for Dermatological Disease Diagnosis”). However, when applied to more complex data like Electronic Health Records (EHR) we show that much simpler models perform just as well as deep neural networks (see our upcoming “The accuracy vs. coverage trade-off in patient-facing diagnosis models”).

The Framework/Platform war

Unsurprisingly, the “AI framework war” that I already mentioned in last year’s round up has not cooled down. The two main contenders continue to be Google’s TensorFlow and Facebook’s Pytorch. Who will win the war remains to be seen, but according to some data, Pytorch continues to win the research battle, while TensorFlow dominates in production-ready systems.

This last year, TensorFlow released the highly-anticipated TF 2.0, its main highlights being tight integration with Keras, default eager execution mode, and more Pythonic function execution. In other words, TF is trying to become more usable and friendly… more Pytorch-like. TensorFlow also introduced its Swift library, which immediately attracted many programmers. Even fast.ai announced that it would embrace Swift, and maybe question the use of Pytorch over time, which would be a huge deal. On the other hand, Pytorch has continued to evolve. Probably the most notable news late this year was another popular library Chainer merging into Pytorch.

There was a lot of movement outside of the two big players too. Microsoft recently announced a really interesting serving infrastructure, which provides highly efficient production-ready serving of models trained with any framework supporting the ONNX standard. This can be a huge win for Pytorch since its serving infrastructure is lagging behind TensorFlow’s for sure. On the NLP-frameworks arena, I have to obviously mention the great work by the folks at Huggingface who seem to release the code to any NLP advance before the paper even hits arxiv. A huge accomplishment especially given that all big players are interested in this space. Even Uber published their own framework for dialogue system research.

What to expect in 2020

Thanks for making it this far! I know this is a long post. Plus, I am always much better at explaining the past than predicting the future. So, I won’t keep you here much longer. I don’t have risky predictions for what 2020 will bring, but I am sure of a few things:

  1. There will be more advances in NLP, some of which will be categorized as breakthroughs
  2. AI will get better at faking all kinds of content, and we will see efforts on how to avoid the possible negative side effects
  3. Aspects such as uncertainty modeling, out-of-distribution modeling, metalearning, or interpretability will continue to be top of mind for many.
  4. AI will continue to hugely impact broad application areas such as Healthcare
  5. We won’t get see self-driving Teslas on the roads
  6. We won’t solve AGI

Hope you enjoyed the post, and looking forward to your feedback and comments!

Thanks to Anitha Kannan for feedback on an early version of this post.

原文地址:https://www.cnblogs.com/wangxiaocvpr/p/12234000.html

时间: 2024-10-03 20:00:31

The year in AI: 2019 ML/AI advances recap的相关文章

2019北京AI机器人展

2019北京AI机器人展2019北京AI机器人展会将于2019年5月16日在北京朝阳区中国国际展览中心(老馆)盛大开幕! 前景分析 当前,我国机器人市场进入高速发展期,2017年市场规模约62.8亿美元,2012-2017年平均增长率达到28%. 中国互联网络信息中心31日发布的报告显示,截至2018年6月,全球人工智能企业总数达到2542家,中国人工智能企业数量居第二位,仅次于美国. 该中心当日发布第41次<中国互联网络发展状况统计报告>.据统计报告,截至2018年6月,全球人工智能企业总数

让人工智能与产业紧密融合 2019人工智能AI大会

让人工智能与产业紧密融合 2019人工智能AI大会业内专家认为,我国人工智能产业应用总体上处于起步阶段.当前,新一代人工智能已经从最初的算法驱动逐渐向数据.算法和算力的复合驱动转变,其中,基于数据的应用驱动作用日益显著.以医疗领域为例,志诺维思基因科技创始人.人工智能专家凌少平介绍,我国智慧医疗近几年发展比较快,一个重要原因是有比较丰富的电子病历.医疗影像.病理图像等数据,基于这些数据,科研人员能够通过标注来训练人工智能模型.地平线创始人兼首席执行官.人工智能专家余凯说,受益于行业丰富的需求,我

(hdu step 5.1.1)A Bug&#39;s Life((ai,bi)表示ai、bi不在同一堆中,有若干对数据,判断是否有bug)

题目: A Bug's Life Time Limit: 15000/5000 MS (Java/Others) Memory Limit: 32768/32768 K (Java/Others) Total Submission(s): 723 Accepted Submission(s): 277   Problem Description Background Professor Hopper is researching the sexual behavior of a rare spe

安全进化的终极猜想— 以“AI之盾”对抗“AI之矛”

今年的"3.15晚会"给我最大的震撼就是AI机器人带来的负面问题,40亿骚扰电话播出,每天播出5000个,一年拨出40亿个,设备日活跃量5000多台,在使用设备数量超过3万台,6亿多用户隐私泄露....... 一.AI本无罪,怀璧其罪 AI机器人替代人工,永远保持礼貌从不生气且不知疲倦的声音,当之无愧成为骚扰电话的主力军.AI机器人替代人工拨打骚扰电话的方案已形成"AI拨打电话+AI躲避监管+AI收集个人隐私"的完整黑色产业链.当人们用AI的能力辅助做安全.实现工作

AI,DM,ML,PR的区别与联系

数据挖掘和机器学习的区别和联系,周志华有一篇很好的论述<机器学习与数据挖掘>可以帮助大家理解.数据挖掘受到很多学科领域的影响,其中数据库.机器学习.统计学无疑影响最大.简言之,对数据挖掘而言,数据库提供数据管理技术,机器学习和统计学提供数据分析技术.由于统计学往往醉心于理论的优美而忽视实际的效用,因此,统计学界提供的很多技术通常都要在机器学习界进一步研究,变成有效的机器学习算法之后才能再进入数据挖掘领域.从这个意义上说,统计学主要是通过机器学习来对数据挖掘发挥影响,而机器学习和数据库则是数据挖

认识:人工智能AI 机器学习 ML 深度学习DL

人工智能 人工智能(Artificial Intelligence),英文缩写为AI.它是研究.开发用于模拟.延伸和扩展人的智能的理论.方法.技术及应用系统的一门新的技术科学. 人工智能是对人的意识.思维的信息过程的模拟.人工智能不是人的智能,但能像人那样思考.也可能超过人的智能. 人工智能的定义可以分为两部分,即"人工"和"智能". 机器学习 1.    什么是机器学习 根据等人事件中判断人是否迟到了解什么是机器学习,具体参见地址:http://www.cnblo

2019中国AI大会.北京人工智能科技展官网

2019中国北京人工智能科技展作为"科博会"品牌展之一将于2019年5月16日在中国国际展览中心(老国展)盛大开幕! 为什么选择2019北京人工智能科技展? 1.对接"中国制造2025"实力打造全球智能制造中心. 以北京为龙头的华北地区作为全球重要的智能制造产业地之一,拥有雄厚的行业资源,展会在此将为参与者提供更多高价值的合作商机. 2.中国科技产业展,观众与采购商资源实现各行业全覆盖. 亮相中国有影响力.最专业的科技产业展览会,汇聚全球知名供应商及高购买力企业,一

2019年AI智能会淘汰一批电销行业,你还不知道吗

2019年会有一大批电销行业被淘汰,你了解吗? 智能语音机器人对于电销行业来说已经不再陌生,从观望到使用过程不需要太久,只要你足够了解可要赶紧用起来,不然行业竞争很快会出现淘汰赛.电销机器人系统相比使用人工具有哪些方面的优势,今天我们来好好说一说: 智能语音系统1.智能语音机器人并不存在离职问题,大家订购机器人一年,我们的电话机器人就为大家兢兢业业的工作一年,还能够实行续约. 2.智能语音机器人多少钱?智能语音机器人使用价值根据年份算,一台机器人一年的花销是一万二,不需要保险.不需要奖金更不需要

Knowledge-Augmented Language Model and its Application to Unsupervised Named-Entity Recognition(Facebook AI 2019) 文献综述

1.摘要: 传统的语言模型无法为文本中的实体名称进行有效建模,主要是因为这些命名实体不会在文本中频繁出现.近期研究发现相同的实体类型(如人名.地名)其上下文之间具有一定共性,同时训练语言模型时可以借助外部知识图谱.本文主要研究继续探索利用知识图谱来强化语言模型的训练,训练目标是优化语言模型的困惑度,模型全称Knowledge-Augmented Language Model(KALM).在这个过程并不依赖其他如命名实体识别工具.为了提升语言模型的效果,本模型在训练过程中学习到实体类型信息,然后模