
  • KONSTANTINOS KATSIAPIS: Hello, everyone.

  • Good morning.

  • I'm Gus Katsiapis.

  • And I'm a principal engineer in TFX.

  • ANUSHA RAMESH: Hi, everyone.

  • I'm Anusha.

  • I'm a product manager in TFX.

  • KONSTANTINOS KATSIAPIS: Today, we'll

  • talk to you about our end-to-end ML

  • platform, TensorFlow Extended, otherwise known as TFX,

  • on behalf of the TFX team.

  • So the discipline of software engineering

  • has evolved over the last five-plus decades

  • to a good level of maturity.

  • If you think about it, this is both a blessing and a necessity

  • because our lives usually depend on it.

  • At the same time, the popularity of ML

  • has been increasing rapidly over the last two-plus decades.

  • And over the last decade or so, it's

  • been used very much, very actively

  • both in experimentation and production settings.

  • It is no longer uncommon for ML to power

  • widely-used applications that we use every day.

  • So much like it was the case for software engineering,

  • the wide use of ML technology necessitates the evolution

  • of the discipline from ML coding to ML engineering.

  • As most of you know, to do ML in production,

  • you need a lot more than just a trainer.

  • For example, the trainer code in an ML production system

  • is usually 5% to 10% of the entirety of the code.

  • And similarly, the amount of time

  • that engineers spend on the trainer

  • is often dwarfed by the amount of time engineers

  • spend in preparing the data, ensuring it's of good quality,

  • ensuring it's unbiased, et cetera.

  • At the same time, research eventually

  • makes its way into production.

  • And ideally, one wouldn't need to change stacks

  • in order to evolve an idea and put it into a product.

  • So I think what is needed here is flexibility, and robustness,

  • and a consistent system that allows

  • you to apply ML in a product.

  • And remember that the ML code itself

  • is a tiny piece of the entire puzzle.

  • ANUSHA RAMESH: Now, here is a concrete example

  • of the difference between ML coding and ML engineering.

  • As you can see in this use case, it took about three weeks

  • to build a model.

  • It's been about a year, and it's still not deployed in production.

  • Similar stories used to be common at Google as well,

  • but we made things noticeably easier over the past decade

  • by building ML platforms like TFX.

  • Now, ML platforms at Google are not a new thing.

  • We've been building Google-scale machine learning

  • platforms for quite a while now.

  • Sibyl existed as a precursor to TFX.

  • It started about 12 years ago.

  • A lot of the design, code, and best practices

  • that we gained through Sibyl have been incorporated

  • into the design of TFX.

  • Now, while TFX shares several core principles with Sibyl,

  • it also augments it along several important dimensions.

  • This made TFX the most widely used end-to-end ML

  • platform at Alphabet, while being available

  • on premises and on GCP.

  • The vision of TFX is to provide an end-to-end ML

  • platform for everyone.

  • By providing this ML platform, our goal

  • is to ensure that we can proliferate

  • the use of ML engineering, thus improving

  • ML-powered applications.

  • But let's discuss what it means to be an ML platform

  • and what the various parts are that are required

  • to help us realize this vision.

  • KONSTANTINOS KATSIAPIS: So today, we're

  • going to tell you a little bit more

  • about how we enabled global-scale ML

  • engineering at Google from best practices

  • and libraries all the way to a full-fledged end-to-end ML

  • platform.

  • So let's start from the beginning.

  • Machine learning is hard.

  • Doing it well is harder.

  • And doing it in production and powering applications

  • is actually even harder.

  • We want to help others avoid the many, many pitfalls that we

  • have encountered in the past.

  • And to that end, we actually publish papers, blog posts,

  • and other material that capture a lot of our learnings

  • and our best practices.

  • So here are but a few examples of our publications.

  • They capture collective lessons learned over more than a decade

  • of applied ML at Google.

  • And several of them, like the "Rules of Machine Learning,"

  • are quite comprehensive.

  • We won't have time to go into them today

  • as part of this talk obviously, but we

  • encourage you to take a look when you get a chance.

  • ANUSHA RAMESH: While best practices are great,

  • communication of best practices alone would not be sufficient.

  • This does not scale because it does not get applied in code.

  • So we want to capture our learnings

  • and best practices in code.

  • We want to enable our users to reuse these best practices

  • and at the same time, give them the ability to pick and choose.

  • To that end, we offer standard and data-parallel

  • libraries.

  • Now, here are a few examples of libraries

  • that we offer for different phases of machine

  • learning to our developers.

  • As you can see, we offer libraries for almost every step

  • of your ML workflow, starting from data validation

  • to feature transformations to analyzing

  • the quality of a model, all the way

  • to serving it in production.

  • We also make transfer learning easy by providing TensorFlow

  • Hub.

  • ML Metadata is a library for recording and retrieving

  • metadata for ML workflows.

  • Now, the best part about these libraries

  • is that they are highly modular, which

  • makes it easy to plug into your existing ML infrastructure.

  • KONSTANTINOS KATSIAPIS: We have found that libraries are not

  • enough within Alphabet, and we expect the same elsewhere.

  • Not all users need or want the full flexibility.

  • Some of them might actually be confused by it.

  • And many users prefer out-of-the-box solutions.

  • So what we do is manage the release of our libraries.

  • We ensure they're nicely packaged and optimized,

  • but importantly, we also offer higher-level APIs.

  • And those come frequently in the form

  • of binaries or components-- or containers, sorry.

  • ANUSHA RAMESH: Libraries and binaries

  • provide a lot of flexibility to our users,

  • but this is not sufficient for ML workflows.

  • ML workflows typically involve inspecting and manipulating

  • several types of artifacts.

  • So we provide components which interact

  • with well-defined and strongly-typed artifact APIs.

  • The components also understand the context and environment

  • in which they operate and can be interconnected

  • with one another.

  • We also provide UI components for visualization

  • of said artifacts.

  • That brings us to a new functionality we're

  • launching in TensorFlow World.

  • You can run any TFX component in a notebook.

  • As you can see here, you can run TFX components cell by cell.

  • This example showcases a couple of components.

  • The first one is ExampleGen. ExampleGen ingests data

  • into a TFX pipeline.

  • And this is typically the first component that you use.

  • The second one is StatisticsGen, which

  • computes statistics for visualization and example

  • validation.

  • So when you run a component like StatisticsGen in a notebook,

  • you can visualize something like this,

  • which showcases stats on your data

  • and it helps you detect anomalies.
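
To make this concrete, here is a minimal sketch (not from the talk) of running these two components interactively in a notebook. It assumes a TFX installation and a hypothetical directory of CSV files; exact import paths and argument names differ slightly between TFX releases.

```python
# Minimal sketch: running TFX components cell by cell in a notebook.
# The data directory is a placeholder; API details vary across TFX versions.
from tfx.components import CsvExampleGen, StatisticsGen
from tfx.orchestration.experimental.interactive.interactive_context import InteractiveContext

context = InteractiveContext()  # keeps metadata for this notebook session

# Ingest data into the pipeline (typically the first component).
example_gen = CsvExampleGen(input_base='path/to/csv_data')
context.run(example_gen)

# Compute statistics over the ingested examples.
statistics_gen = StatisticsGen(examples=example_gen.outputs['examples'])
context.run(statistics_gen)

# Visualize the statistics inline to inspect distributions and spot anomalies.
context.show(statistics_gen.outputs['statistics'])
```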

  • The benefit of running TFX components in a notebook

  • is twofold.

  • First, it makes it easy for users to onboard onto TFX.

  • It helps you understand the various components of TFX,

  • and how you connect them, and the order in which you can go.

  • It also helps with debugging the various steps of your ML

  • workflow as you go through the notebook.

  • KONSTANTINOS KATSIAPIS: Through our experience though,

  • we've learned that components aren't actually

  • sufficient for production ML.

  • Manually orchestrating components

  • can become cumbersome and importantly error prone.

  • And then also understanding the lineage of all the artifacts

  • that are produced by those components-- produced

  • or consumed by those components--

  • is often fundamental both from a debugging perspective,

  • but many times from a compliance perspective as well.

  • As such, we offer ways of creating task-driven pipelines

  • of components.

  • We allow you to stitch components together

  • in a task-driven fashion.

  • But we have also found that data scale and advanced use

  • cases also necessitate that this pipeline actually

  • be reactive to the environment, right?

  • So we found that over time, we need something

  • more like data-driven components.

  • Now, the interesting part here is that the components we offer

  • are the same components that could operate

  • both in a task-driven mode and in a data-driven mode,

  • thereby enabling more flexibility.

  • And the most important part is that the artifact lineage

  • is tracked throughout this ML pipeline,

  • whether it's task- or data-driven, which

  • helps experimentation, debugging, and compliance.

  • So here's putting it all together.

  • Here is kind of a canonical production end-to-end ML

  • pipeline.

  • It starts with ExampleGeneration and

  • StatisticsGeneration to ensure the data is

  • of good quality, proceeds with transformations

  • to augment the data in ways that make it easier to fit

  • the model, and then trains the model.

  • After we train the model, we ensure

  • that it's of good quality.

  • And only after we're sure it meets the quality bar

  • that we're comfortable with do we actually

  • push to one of the serving systems of choice,

  • whether that's a server or a mobile application via TF Lite.

  • Note that the pipeline topology here

  • is fully customizable, right?

  • So you can actually move things around as you please.

  • And importantly, if one of the out-of-the-box components we

  • offer doesn't work for you, you can create a custom component

  • with custom business logic.

  • And all of this is under a single ML pipeline.
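
As a rough illustration only, here is a hedged sketch of what wiring such a pipeline together can look like in code. The paths, the trainer module file, and the choice of the Beam orchestrator are assumptions, and component arguments vary somewhat across TFX releases.

```python
# Hedged sketch of a small TFX pipeline: ingest -> statistics -> schema -> train -> push.
from tfx.components import CsvExampleGen, StatisticsGen, SchemaGen, Trainer, Pusher
from tfx.orchestration import metadata, pipeline
from tfx.orchestration.beam.beam_dag_runner import BeamDagRunner
from tfx.proto import pusher_pb2, trainer_pb2

example_gen = CsvExampleGen(input_base='path/to/data')                  # placeholder data dir
statistics_gen = StatisticsGen(examples=example_gen.outputs['examples'])
schema_gen = SchemaGen(statistics=statistics_gen.outputs['statistics'])
trainer = Trainer(
    module_file='trainer_module.py',                                     # your training code
    examples=example_gen.outputs['examples'],
    schema=schema_gen.outputs['schema'],
    train_args=trainer_pb2.TrainArgs(num_steps=1000),
    eval_args=trainer_pb2.EvalArgs(num_steps=100))
pusher = Pusher(
    model=trainer.outputs['model'],
    push_destination=pusher_pb2.PushDestination(
        filesystem=pusher_pb2.PushDestination.Filesystem(
            base_directory='path/to/serving_model')))                    # export for serving

BeamDagRunner().run(pipeline.Pipeline(
    pipeline_name='demo_pipeline',
    pipeline_root='path/to/pipeline_root',
    components=[example_gen, statistics_gen, schema_gen, trainer, pusher],
    metadata_connection_config=metadata.sqlite_metadata_connection_config('metadata.db')))
```

The same component list can be handed to other orchestrators (for example, Kubeflow Pipelines or Airflow runners) without changing the components themselves.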

  • Now, what does it mean to be an end-to-end ML platform?

  • So I think there are some key properties to it.

  • And one is seamless integration.

  • We want to make sure that all the components

  • within the pipeline actually seamlessly interoperate

  • with each other.

  • And we have actually found that within Google, the value

  • added for our users gets larger as they move higher

  • up the stack-- you know, as they move higher

  • from the libraries going further up to components and further up

  • into the pipeline itself.

  • This is because operating at a higher level of abstraction

  • allows us to give better robustness and supportability.

  • Another important aspect of an ML platform

  • is its interoperability with the environment it operates in.

  • So each of those platforms might be

  • employed in different environments-- you know,

  • some on premises, some on GCP, et cetera.

  • And we need to make sure that we interact with the ecosystem

  • that you operate in.

  • So TFX actually works with other fundamental parts

  • of the ML ecosystem, like Kubeflow Pipelines,

  • Apache Beam, Apache Spark, Flink, Airflow, et cetera.

  • This interoperability also gives us

  • something else that's very important here--

  • the flexibility, right?

  • So we allow customization of components and extension points

  • within the ML platform, so that

  • if something doesn't work out of the box for you,

  • you can customize it to your business needs.

  • TFX is by no means a perfect platform,

  • but we strive to collect feedback and improve it,

  • so please give it to us.

  • ANUSHA RAMESH: Internally, the TFX platform powers

  • several Alphabet companies.

  • Within Google, it powers several of our most important products

  • that you're probably familiar with.

  • Also, TFX integrates with the Cloud AI Platform, ML

  • Engine, and Dataflow products, thus

  • helping you realize your ML needs robustly on GCP.

  • TFX also powers several of the Cloud AutoML solutions

  • that automate and simplify ML for you, so check them out.

  • To the external world, TFX is available

  • as an end-to-end solution.

  • Our friends at Twitter, who spoke at the keynote yesterday,

  • have already published

  • a fascinating blog post on how

  • they are ranking tweets on their home timeline using TensorFlow.

  • They are using TensorFlow Model Analysis and TensorFlow Hub

  • for sharing word embeddings.

  • They evaluated several other technologies and frameworks

  • and decided to go ahead with TensorFlow

  • ecosystem for their production requirements.

  • Similar to Twitter, we also have several other partners

  • who are using TFX.

  • I hope you will join us right after this talk

  • to hear from Spotify on how they are

  • using TFX for their production workflow needs.

  • We also have another detailed talk later today

  • called "TFX, Production ML Pipelines with TensorFlow."

  • So we have two great talks-- one by Spotify,

  • the other one a detailed talk on TFX.

  • If you're interested in learning more, check these two talks.

  • Visit our web page tensorflow.org/tfx to get

  • started.

  • Thank you.

  • [APPLAUSE]

  • TONY JEBARA: Very excited to be here.

  • So my name is Tony Jebara.

  • Today, I'm going to be talking to you about Spotify,

  • where I work today, and how we've

  • basically taken personalization and moved it onto TensorFlow.

  • I'm the VP of engineering and also

  • the head of machine learning.

  • And I'm going to describe our experience

  • moving onto TensorFlow and to the Google Cloud Platform

  • and Kubeflow, which has been really an amazing experience

  • for us and really has opened up a whole new world

  • of possibilities.

  • So just a quick note, as Ben was saying, before I started

  • at Spotify, I was at Netflix.

  • And just like today, where I'm going to talk about Spotify's home

  • page, at Netflix I was also working

  • on personalization algorithms and the home screen of Netflix

  • as well.

  • So you may be thinking, oh, that sounds like a similar job.

  • They both have entertainment, and streaming,

  • and home screens, and personalization,

  • but there are fundamental differences.

  • And I learned about those fundamental differences

  • recently.

  • I joined a couple of months ago, but the biggest

  • fundamental difference to me is it's a difference in volume

  • and scale.

  • And I'll show you what I mean in just a second.

  • So if you look at movies versus music or TV

  • shows versus podcasts, you'll see

  • that there's a very different magnitude of scale.

  • So on the movie side, there's about 158 million Netflix

  • users.

  • On the music side, there's about 230 million Spotify users.

  • That's also a different scale.

  • Also the content really is a massively different scale

  • problem.

  • There's only about 5,000 movies and TV shows

  • on the Netflix service.

  • Whereas on Spotify, we've got about 50 million tracks

  • and almost half a million podcasts.

  • So if you think about the amount of data and content

  • you need to index, that's a huge scale difference.

  • There's also content duration.

  • Once you make a recommendation off the home screen

  • on, let's say, Netflix, the user is

  • going to consume that recommendation for 30 minutes

  • for a TV show, maybe several seasons sometimes, two hours

  • for a movie.

  • Only 3 and 1/2 minutes of consumption per track,

  • let's say, on Spotify.

  • And they don't replay as often on, let's say, movies,

  • but you'll replay songs very often.

  • So it's really a very different world of speed and scale.

  • And we're getting a lot more granular data about the users.

  • Every 3 and 1/2 minutes, they're changing tracks, listening

  • to something else, engaging differently with the service,

  • and they're touching 50 million-plus pieces of content.

  • That's really very granular data.

  • And that's one of the reasons why

  • we had to move to something like TensorFlow

  • to really be able to scale and do something that's high speed

  • and in fact, real time.

  • So this is our Spotify home.

  • How many people here use Spotify?

  • All right, so about half of you.

  • I'm not trying to sell Spotify on anyone.

  • I'm just trying to say that many of you

  • are familiar with this screen.

  • This is the home page.

  • So this is basically driven by machine learning.

  • And every month, hundreds of millions of users

  • will see this home screen.

  • And every day, tens of millions of users

  • will see this home screen.

  • And this is where you get to explore what we have to offer.

  • It's a two-dimensional grid.

  • Every image here is what we call a card.

  • And the cards are organized into rows we call shelves.

  • And what we like to do is move these cards and shelves around

  • from a massive library of possible choices

  • and place the best ones for you at the top of your screen.

  • And so when we open up Spotify, we have a user profile.

  • The home algorithms will score all possible cards

  • and all possible shelves and pack your screen

  • with the best possible cards and shelves combination for you.

  • And we're doing this in real time based off

  • of your choices of music, your willingness

  • to accept the recommendation, how long you

  • play different tracks, how long you

  • listen to different podcasts.

  • And we have dozens and dozens of features

  • that are updating in real time.

  • And every time you go back to the home page,

  • it'll be refreshed with the ideal cards

  • and shelves for you.

  • And so we like to say there isn't a Spotify home

  • page or a Spotify experience.

  • Really, there are 230 million Spotifys-- one for each user.

  • So how do we do this and how did we do this in the past?

  • Well, up until our migration to GCP, TensorFlow, and Kubeflow,

  • we wrote a lot of custom libraries and APIs in order

  • to drive the machine learning algorithms

  • behind this personalization effort.

  • So the specific machine learning algorithm

  • is a multi-armed bandit.

  • Many of you have heard about that.

  • It's trying to balance exploration and exploitation,

  • trying to learn which cards and shelves are good for you

  • and score them, but also trying out some new cards and shelves

  • that it might not know if they're kind

  • of hidden gems for you or not.

  • And we have to employ counterfactual training,

  • and log propensities, and log small amounts

  • of randomization in order to train these systems

  • while avoiding large-scale A/B tests and large-scale randomization.
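
As a toy illustration of that idea (not Spotify's actual system), here is a tiny epsilon-greedy sketch that scores candidate shelves, occasionally explores, and logs the propensity of the chosen action so the logs can later be used for counterfactual, off-policy training. All names and numbers are hypothetical.

```python
# Toy epsilon-greedy bandit with propensity logging (illustrative only).
import random

EPSILON = 0.05  # small amount of randomization, as described in the talk

def choose_shelf(scores):
    """scores: dict mapping shelf_id -> model score for this user."""
    best = max(scores, key=scores.get)
    n = len(scores)
    if random.random() < EPSILON:
        chosen = random.choice(list(scores))   # explore: try a possibly hidden gem
    else:
        chosen = best                           # exploit: show the best-scoring shelf
    # Propensity = probability this policy had of picking `chosen`.
    propensity = (1 - EPSILON) + EPSILON / n if chosen == best else EPSILON / n
    log_entry = {'shelf': chosen, 'propensity': propensity, 'scores': scores}
    return chosen, log_entry  # log_entry feeds counterfactual / off-policy training

shelf, log = choose_shelf({'new_releases': 0.7, 'daily_mix': 0.9, 'podcasts': 0.4})
```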

  • Before we moved to TensorFlow, this

  • was all done in custom, let's say, APIs and data libraries.

  • And that had a lot of challenges.

  • So we'd always have to go back and rewrite code.

  • And if we wanted to compare different choices of the model

  • underneath the multi-armed bandit,

  • like logistic regression versus trees versus deep neural nets,

  • that involved tons of custom code rewriting.

  • And so that would make the system

  • really brittle, hard to innovate and iterate on.

  • And then when you finally pick something

  • you want to roll out, when you roll it out,

  • you're also worried that it may fail because

  • of all this custom stitching.

  • So then we moved over to the TensorFlow ecosystem.

  • And we said, hey, let's move on to techniques like TensorFlow

  • Estimators and TensorFlow Data Validation

  • to avoid having to do all this custom work.

  • And so for TensorFlow Estimator, what we can do

  • is now build machine learning pipelines

  • where we get to try a variety of models

  • and train and evaluate them very quickly--

  • some things like logistic regression,

  • boosted trees, and deep models--

  • and in a much faster kind of iterative process.
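
To show why that comparison is quick with Estimators, here is a hedged, self-contained sketch: the three model families share the same input function and feature columns, so swapping between them is essentially a one-line change. The features, data, and step counts are placeholders.

```python
# Hedged sketch: comparing model families behind the same Estimator interface.
import tensorflow as tf

# Two toy features, bucketized so all three model families accept them.
x1 = tf.feature_column.bucketized_column(
    tf.feature_column.numeric_column('x1'), boundaries=[0.3, 0.6])
x2 = tf.feature_column.bucketized_column(
    tf.feature_column.numeric_column('x2'), boundaries=[0.3, 0.6])
feature_columns = [x1, x2]

def input_fn():
    # Tiny in-memory dataset for illustration only.
    features = {'x1': [0.1, 0.9, 0.4, 0.8], 'x2': [1.0, 0.2, 0.5, 0.3]}
    labels = [0, 1, 0, 1]
    return tf.data.Dataset.from_tensor_slices((features, labels)).batch(2).repeat()

models = {
    'linear': tf.estimator.LinearClassifier(feature_columns),
    'boosted_trees': tf.estimator.BoostedTreesClassifier(feature_columns,
                                                         n_batches_per_layer=1),
    'dnn': tf.estimator.DNNClassifier(feature_columns=feature_columns,
                                      hidden_units=[16, 8]),
}
for name, model in models.items():
    model.train(input_fn=input_fn, max_steps=10)
    print(name, model.evaluate(input_fn=input_fn, steps=5))
```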

  • And then migrating to Kubeflow as well

  • was super valuable because that helped us manage the workload,

  • accelerate the pace of experimentation, and roll out.

  • And so this has been super fast for automatically

  • retraining, and scaling, and speeding up our machine

  • learning training algorithms.

  • Another thing that we really rely on heavily

  • is TensorFlow Data Validation, which

  • is another part of the TFX offering.

  • One key thing we have to do is find bugs

  • in our data pipelines and our machine

  • learning pipelines while we're developing them,

  • and evaluating them, and rolling them out.

  • For example, we want to catch data issues

  • as quickly as possible.

  • And so one thing we can do with TFDV

  • is quickly find out if there's some missing data or data

  • inconsistencies in our pipelines.
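
The core TFDV workflow looks roughly like this. This is a hedged sketch assuming CSV training and evaluation files; the file paths are placeholders.

```python
# Hedged sketch of the TensorFlow Data Validation workflow.
import tensorflow_data_validation as tfdv

# Compute statistics over training and evaluation data (paths are placeholders).
train_stats = tfdv.generate_statistics_from_csv('train.csv')
eval_stats = tfdv.generate_statistics_from_csv('eval.csv')

# Visualize the two sets of statistics side by side to spot missing or skewed features.
tfdv.visualize_statistics(lhs_statistics=train_stats, rhs_statistics=eval_stats)

# Infer a schema from the training data and check the eval data against it.
schema = tfdv.infer_schema(train_stats)
anomalies = tfdv.validate_statistics(eval_stats, schema)
tfdv.display_anomalies(anomalies)
```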

  • And we have this dashboard that quickly

  • plots the distribution of any feature,

  • and the counts of different data sets, and so on, and also

  • kind of more granular things like how much

  • is the user spending on the service,

  • what are their preferences, and so on, looking

  • at those distributions.

  • And we caught a bug like this one

  • on the left, which basically was showing us

  • that in our training data, the premium tier data

  • samples were missing from our training pipelines.

  • And then in the validation, the free shuffle tier data set

  • samples were missing from our evaluation pipeline.

  • So this is horrible from a machine learning perspective,

  • but we caught it quickly.

  • We're able to now trigger alarms, and alerts,

  • and have dashboards, and look at these distributions daily,

  • so the machine learning engineers

  • don't have to worry about the data

  • pipelines into their system.

  • So now, we have the Spotify paved path,

  • which is a machine learning infrastructure based off

  • of Google Cloud, Kubeflow, and TensorFlow.

  • And it has achieved significant lifts off

  • of baseline systems and popularity-based methods.

  • And now, we're just scratching the surface.

  • We want to do many more sophisticated machine

  • learning types of explorations.

  • And we really view this as an investment.

  • It's an investment in machine learning engineers

  • and their productivity.

  • We don't want machine learning engineers to spend tons of time

  • fixing custom infrastructure, and catching kind

  • of silly bugs, and updating libraries, and having to learn

  • bespoke types of platforms.

  • Instead, we want to have them go on

  • to a great kind of lingua franca platform like GCP, Kubeflow,

  • and TensorFlow and really think about

  • machine learning, and the user experience,

  • and building better entertainment for the world.

  • And that's what we want to enable,

  • not necessarily building custom, let's

  • say, machine learning infrastructure.

  • And so if you're excited about working in a great platform

  • that's got kind of a great future ahead of it,

  • like TFX, and Google Cloud, and Kubeflow,

  • but also working on really deep problems around entertainment

  • and what makes people excited and engaged with a service,

  • and music, and audio, and podcasts,

  • then you can get this best of both worlds.

  • We're hiring.

  • Please look at these links and come work with us.

  • Thank you so much.

  • [APPLAUSE]

  • MIKE LIANG: Good morning, everyone.

  • My name is Mike.

  • I'm one of the product managers on the TensorFlow team.

  • And today, I'd like to share with you something

  • about TensorFlow Hub.

  • So we've seen some amazing breakthroughs

  • on what machine learning can do over the past few years.

  • And throughout this conference, you've

  • heard a lot about the services and tools that

  • have been built on top of them.

  • Machines are becoming capable of doing

  • a myriad of amazing things from vision

  • to speech to natural language processing.

  • And with TensorFlow, machine learning experts and data

  • scientists are able to combine data, and algorithms,

  • and computational power together to train machine learning

  • models that are very proficient at a variety of tasks.

  • But if your focus was to solve business problems

  • or build new applications, how can

  • you quickly use machine learning in your solutions?

  • Well, this is where TensorFlow Hub comes in.

  • TensorFlow Hub is a repository of pretrained ready-to-use

  • models to help you solve novel business problems.

  • It has a comprehensive collection of models

  • from across the TensorFlow ecosystem.

  • And you can find state-of-the-art research

  • models here in TensorFlow Hub.

  • Many of the models here also can be composed into new models

  • and retrained using transfer learning.

  • And recently, we've added a lot of new models

  • that you can deploy straight to production from cloud

  • to the edge through TensorFlow Lite or TensorFlow.js.

  • And we're getting many contributions

  • from the community as well.

  • TensorFlow Hub's rich repository of models

  • covers a wide range of machine learning problems.

  • For example, in image-related tasks,

  • we have a variety of models for object detection, image

  • classification, automatic image augmentation,

  • and some new things like image generation and style transfer.

  • In text-related tasks, we have some of the state-of-the-art

  • models out there, like BERT and ALBERT,

  • and universal sentence encoders.

  • And you've heard about some of the things

  • that machines can deal with with BERT just yesterday.

  • These encoders can support a wide range of natural language

  • understanding tasks, such as question answering, text

  • classification, or sentiment analysis.

  • And there are also video-related models too.

  • So if you want to do gesture recognitions,

  • you can use some of the models there or even video generation.

  • And we've recently actually just completely upgraded

  • our front-end interface so that it's a lot easier to use.

  • So many of these models can be easily found

  • or searched going to TensorFlow Hub.

  • We've invested a lot of energy in making these models

  • in TensorFlow Hub easily reusable

  • or composable into new models, where you can actually

  • bring your own data and through transfer learning,

  • improve the power of those models.

  • With one line of code, you can bring these models right

  • into TensorFlow 2.

  • And using the high-level Keras APIs or the low-level APIs,

  • you can actually go and retrain these models.
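
That one-line reuse looks roughly like this. This is a hedged sketch using a feature-vector model from tfhub.dev inside a Keras model; the specific module handle and the five-class head are just illustrative choices.

```python
# Hedged sketch: reusing a TF Hub module as a Keras layer and retraining on your data.
import tensorflow as tf
import tensorflow_hub as hub

feature_extractor = hub.KerasLayer(
    "https://tfhub.dev/google/imagenet/mobilenet_v2_100_224/feature_vector/4",
    input_shape=(224, 224, 3),
    trainable=False)  # set True to fine-tune the pretrained weights

model = tf.keras.Sequential([
    feature_extractor,
    tf.keras.layers.Dense(5, activation='softmax')  # e.g. 5 custom classes
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# model.fit(train_images, train_labels, epochs=5)  # train on your own data
```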

  • And all these models can also be deployed straight into machine

  • learning pipelines, like TFX, as you've

  • heard about earlier today.

  • Recently, we've added support for models

  • that are ready to deploy.

  • These pretrained models have been

  • prepared for a wide range of environments

  • across the TensorFlow ecosystem.

  • So if you want to work in a web or a node-based environment,

  • you can deploy them into TensorFlow.js

  • or if you are working with mobile [INAUDIBLE] devices,

  • you can employ some of these models through TensorFlow Lite.

  • In TensorFlow Hub, you can also discover ready-to-use models

  • for Coral edge TPU devices.

  • And we recently started adding these.

  • These devices combine TensorFlow Lite models

  • with really efficient accelerators.

  • That allows companies to create products

  • that can run inference right on the edge.

  • And you can learn more about that at coral.ai.

  • So here's an example of how you can use TensorFlow Hub to do

  • fast, artistic style transfer that

  • can work on an arbitrary painting

  • style or generative models.

  • So let's say you had an image of a beautiful yellow Labrador,

  • and you wanted to see what that would

  • look like in the style of Kandinsky.

  • Well, with one line of code, you can

  • load one of these pretrained style transfer models

  • from the Magenta team at Google, and then you

  • can just apply it to your content and style image

  • and you can get a new stylized image.

  • And you can learn more about some simple tutorials

  • like that in this link below.
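
A hedged sketch of that workflow, using the publicly listed Magenta arbitrary-image-stylization model on tfhub.dev; the image paths are placeholders and the loading code is simplified.

```python
# Hedged sketch: arbitrary style transfer with a pretrained TF Hub model.
import tensorflow as tf
import tensorflow_hub as hub

def load_image(path):
    # Decode an image to float32 in [0, 1] and add a batch dimension.
    img = tf.io.decode_image(tf.io.read_file(path), channels=3, dtype=tf.float32)
    return img[tf.newaxis, ...]

content = load_image('labrador.jpg')    # placeholder content image
style = load_image('kandinsky.jpg')     # placeholder style image

hub_model = hub.load('https://tfhub.dev/google/magenta/arbitrary-image-stylization-v1-256/2')
stylized = hub_model(tf.constant(content), tf.constant(style))[0]
tf.keras.preprocessing.image.save_img('stylized.png', stylized[0])
```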

  • Or let's say you wanted to train a new text

  • classifier, such as predicting whether a movie review had

  • a positive or negative rating.

  • Well, training a text embedding layer

  • may take a lot of time and data to make that work well,

  • but with TensorFlow Hub, you can pull

  • a number of pretrained text models

  • with just one line of code.

  • And then you can incorporate it into TensorFlow 2.

  • And using standard APIs like Keras,

  • you can retrain it on your new data set just like that.
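
A hedged sketch of that pattern: pull a pretrained sentence-embedding module from TF Hub and retrain a small classifier on top with Keras. The module handle and the two-example dataset are placeholders; in practice you would use something like the IMDB reviews dataset.

```python
# Hedged sketch: a sentiment-style text classifier built on a TF Hub text embedding.
import tensorflow as tf
import tensorflow_hub as hub

embedding = hub.KerasLayer("https://tfhub.dev/google/tf2-preview/nnlm-en-dim50/1",
                           input_shape=[], dtype=tf.string, trainable=True)

model = tf.keras.Sequential([
    embedding,                                      # pretrained text embedding layer
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')  # positive vs. negative review
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Placeholder data for illustration only.
reviews = tf.constant(["loved this movie", "utterly boring and too long"])
labels = tf.constant([1, 0])
model.fit(reviews, labels, epochs=3)
```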

  • We've also integrated an interactive model visualizer

  • in beta for some of the models.

  • And this allows you to immediately preview

  • what the model would do and run that model within the web page

  • or on a mobile app, like a Playground app.

  • For example, here is a model from the Danish Mycological

  • Society for identifying a wide range of fungi

  • as part of the Svampeatlas project.

  • You can directly drag an image onto the site

  • and the model will run it in real time

  • and show you the results, such as what

  • mushrooms were in that image.

  • And then you can click on it to go and get more information.

  • Many of the TensorFlow Hub models

  • also have Colab links, so you can play with these models

  • with the code right inside the browser

  • and powered by the Google infrastructure with Colab.

  • In fact, the Google machine learning fairness team also

  • has built some Colab notebooks that

  • can pull text embeddings and other embeddings

  • straight into their platform so that you can assess

  • whether there are potential biases for a standard set

  • of tasks.

  • And you can come by our demo booth

  • if you want to learn more about that.

  • TensorFlow Hub is also powered by the community.

  • When we launched TensorFlow Hub last year,

  • we were sharing some of the state-of-the-art models from

  • DeepMind and Google.

  • But now, a wide range of publishers

  • are beginning to share their models

  • from a diverse set of areas, such as Microsoft AI for Earth,

  • the Met, or NVIDIA.

  • And these models can be used for many different tasks,

  • such as from studying wildlife populations

  • through these camera traps or for automatic visual defect

  • detections in industries.

  • And Crowdsource by Google is also

  • generating a wide range of data through the Open Images

  • Extended data sets.

  • And with that, we can get an even richer set

  • of ready-to-use models across many different specific data

  • sets.

  • So with hundreds of models that are

  • pretrained and ready to use, you can

  • use TensorFlow Hub to immediately begin

  • using machine learning to solve some business problems.

  • So I hope that you can come by our demo booth

  • or go to tfhub.dev.

  • And I'll see you there.

  • Thank you.

  • [APPLAUSE]

  • UJVAL KAPASI: So the TensorFlow team with TF 2

  • has solved a hard problem, which is

  • to make it easy for you to express your ideas

  • and debug them in TensorFlow.

  • This is a big step, but there are additional challenges

  • in order for you to obtain the best results for your research

  • or your product designs.

  • And I'd like to talk about how NVIDIA is solving

  • three of these challenges.

  • The first is simple acceleration.

  • The second is scaling to large clusters.

  • And finally, providing code for every step of the deep learning

  • workflow.

  • One of the ingredients of the recent success of deep learning

  • has been the use of GPUs for providing

  • the necessary raw compute horsepower.

  • This compute is like oxygen for new ideas and applications

  • in the field of AI.

  • So we designed and shipped Tensor Cores in our Volta

  • and Turing GPUs in order to provide an order of magnitude

  • more compute capability

  • than was previously available.

  • And we built libraries, such as cuDNN,

  • to ensure that all the important math functions inside of TF

  • can run on top of Tensor Cores.

  • And we update these regularly as new algorithms are invented.

  • We worked with Google to provide a simple API

  • so that, from your TensorFlow script, you can

  • easily activate these routines in these libraries

  • and train with mixed precision on top of Tensor Cores

  • and get speed-ups for your training

  • (for instance, 2x to 3x faster in the examples here),

  • which helps you iterate faster on your research,

  • and also, maybe within a fixed budget of time,

  • get better results.
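
One way to turn this on from a TensorFlow script is the Keras mixed-precision policy. This is a hedged sketch; the exact API location has moved between TensorFlow releases, so check your version, and the model here is just a placeholder.

```python
# Hedged sketch: enabling mixed precision (float16 compute, float32 variables) in Keras.
import tensorflow as tf

tf.keras.mixed_precision.set_global_policy('mixed_float16')  # newer TF 2.x
# On older releases the equivalent call is:
# tf.keras.mixed_precision.experimental.set_policy('mixed_float16')

model = tf.keras.Sequential([
    tf.keras.layers.Dense(1024, activation='relu', input_shape=(784,)),
    # Keep the final softmax in float32 for numeric stability.
    tf.keras.layers.Dense(10, activation='softmax', dtype='float32'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
# Matmuls and convolutions now run in float16 and can use Tensor Cores on Volta/Turing GPUs.
```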

  • Once you have a trained model, we

  • provide a simple API inside of TensorFlow

  • to activate TensorRT so you can get

  • drastically lower latency for serving your predictions, which

  • lets you deploy perhaps more sophisticated models

  • or pipelines than you would be able to otherwise.
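
A hedged sketch of that TensorRT integration (TF-TRT) applied to a SavedModel; the paths are placeholders and the converter class shown follows the TF 2.x API.

```python
# Hedged sketch: optimizing a SavedModel with TensorRT via TF-TRT (TF 2.x API).
from tensorflow.python.compiler.tensorrt import trt_convert as trt

converter = trt.TrtGraphConverterV2(
    input_saved_model_dir='path/to/saved_model')   # placeholder path
converter.convert()                                 # rewrite supported subgraphs to TensorRT ops
converter.save('path/to/trt_saved_model')           # serve this model for lower latency
```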

  • But optimizing the performance of a single GPU is not enough.

  • And let me give you an example.

  • So Google, last year, released a model called BERT.

  • As Jeff Dean explained yesterday,

  • this model blew away the accuracy

  • on a variety of language tasks compared to any approach

  • or model previous to it.

  • But on a single GPU, it takes months to train.

  • Even on a server with eight GPUs,

  • it takes more than a week.

  • But if you can train with 32 servers, or 256 GPUs,

  • training can complete with TensorFlow in mere hours.

  • However, training at these large scales

  • introduces and poses several new challenges

  • at every level of the system.

  • If you don't properly codesign the hardware and software

  • and precisely tune them, then as you add more compute,

  • you will not get a commensurate increase in performance.

  • And I think NVIDIA is actually uniquely suited

  • to solve some of these challenges

  • because we're building hardware from the level of the GPU

  • to servers to supercomputers, and we're

  • working on challenges at every level on hardware

  • design, software design, system design, and at

  • the boundaries of these.

  • You know, the combination of a bunch of our work on this

  • is the DGX SuperPOD.

  • And to put its capabilities sort of in visceral terms,

  • a team at NVIDIA recently was able, on the DGX SuperPOD

  • and as part of Project Megatron, to train the largest language

  • model ever, more than 8 billion parameters,

  • 24 times larger than BERT.

  • Another contribution that NVIDIA is making

  • and what we're working on is providing reliable code

  • that anyone from individuals to enterprises

  • can build on top of.

  • NVIDIA is doing the hard work of optimizing, documenting,

  • qualifying, packaging, publishing, maintaining code

  • for a variety of models and use cases

  • for every step of the deep learning workflow from research

  • to production.

  • And we're curating this code and making

  • it available to everyone, both at ngc.nvidia.com,

  • but also other places where developers might frequent,

  • such as GitHub and TF Hub, which you just heard about as well.

  • So I hope that in the short time,

  • I was able to convey some of the problems

  • that NVIDIA is working on, the challenges we're working on,

  • and how we're making available to the TensorFlow

  • community, along with Google, simple APIs

  • for acceleration, solving scaling challenges,

  • building DGX SuperPODs,

  • and curating code that anyone can

  • build on top of for the entire deep learning workflow.

  • Thank you for your time.

  • I hope you enjoy the rest of the conference.

  • ANNA ROTH: So the world is full of experts,

  • like pathologists who can diagnose diseases, construction

  • workers who know that if a certain tube is more than 40%

  • obstructed, you have to turn that machine off

  • like right now, people who work in support and know how to,

  • like, kind of triage tickets.

  • And one of the exciting things about kind

  • of the past few years is that it's become increasingly easy

  • for people who want to take some thing that they know how to do

  • and teach it to a machine.

  • I think the big dream is that anybody could

  • be able to go and do that.

  • It's what I spent my time on in the past few years.

  • I've worked on the team that launched Cognitive Services.

  • And I spent the past few years working on customvision.ai.

  • It's a tool for building image classifiers and object

  • detectors.

  • But it really has never been easier to build machine

  • learning models, like the tooling is really good.

  • We're all here at TensorFlow World.

  • Computational techniques have gotten faster,

  • transfer learning easier to use.

  • You have access to compute in the cloud.

  • And then educational materials have, like, never

  • been better, right?

  • One of my hobbies is to go and, like,

  • browse the fast.ai forums just to see

  • what learners are building.

  • And it's completely inspiring.

  • That being said, it's actually still

  • really hard to build a machine learning model.

  • In particular, it's hard to build

  • robust production-ready models.

  • So I've worked with hundreds-- actually,

  • by this point, thousands of customers,

  • who are trying to automate some particular task.

  • And a lot of these projects fail.

  • You know, it's really easy to build your first model.

  • And sometimes, it's actually kind of a trick, right?

  • Like, you can get something astonishingly good

  • in a couple of minutes.

  • You get some data off the web, like model.fit,

  • and like a few minutes later, I have

  • a model that does something and it's kind of uncanny.

  • But getting that to be robust enough

  • to use kind of in a real environment

  • is actually really tough.

  • So the first problem people run into,

  • it's actually hard to transfer your knowledge to a machine.

  • So like this might seem trite, but when people first train

  • object detectors, actually a lot of people

  • don't put bounding boxes around every single object.

  • Like, the model doesn't work.

  • Or they get stuck on the kind of parsimoniousness.

  • So for example, I had one guy in Seattle.

  • People like the Seahawks.

  • He wanted to train a Seahawks detector.

  • He puts bounding boxes around a bunch of football players

  • and discovers that he's actually really kind of built

  • a football person detector, as opposed to a Seahawks detector.

  • Like, he's really upset when he kind of uploads

  • an image from another team

  • because the model didn't have that semantic knowledge

  • that the user had.

  • And so, like, you know, this is stuff

  • you can document away, right?

  • Like, you can kind of learn this in your first hour or so,

  • but it speaks to the unnaturalness of the way

  • in which we train models today.

  • Like when you teach something to a computer,

  • you're having to kind of give it data that represents

  • in some way a distribution.

  • That's not how you and I would normally teach something.

  • And it really kind of trips people up a lot.

  • But sure, so you grok that.

  • You figure it out.

  • You figure out, all right, the problem is building a data set.

  • That's really hard to do too.

  • And so I want to walk through one kind of hypothetical case.

  • So I get a customer.

  • And what they really wanted to do

  • was recognize when people would upload to their online photo

  • store something that might be, like,

  • personally-identifiable information.

  • So for example, if you'd uploaded

  • a photo of a credit card or a photo of your passport.

  • So to start this off, they scrape some web data, right?

  • You just, like, go.

  • You use kind of like a search API

  • and you get a bunch of images of credit cards off the web.

  • You do evaluations.

  • All right, it looks like we're going to have

  • maybe a 1% false positive rate.

  • Well, that's not good.

  • I got a million user images I want to run this on.

  • Suddenly, I have 10,000 sort of potential false positives.

  • So then they build the model anyway.

  • Let's see how it goes.

  • And when they try it out on real user data,

  • it turns out that the actual false positive rate,

  • as you might expect, is much, much, much higher.

  • All right, so now, the user has to take another round.

  • So now, let's add some negative classes, right?

  • We want to be able to kind of make

  • examples of other kinds of documents,

  • sort of non-credit card things, et cetera, et cetera.

  • But it's still OK, right?

  • We're on day one or day two of the project,

  • like this still feels good.

  • You know, we're able to kind of make progress.

  • It's a little more tedious.

  • Second round-- I think you guys kind of know

  • where this is going.

  • It doesn't work.

  • Still an unacceptably high number of negative examples

  • are coming up-- way too many false positives.

  • So now, we kind of go into kind of stage three

  • of the experience of trying to build a usable model, which

  • is, all right, let's collect some more data

  • and let's go kind of label some more data.

  • It starts to get really expensive, right?

  • Now, something that I thought was

  • going to take me a day in the first round,

  • I'm on like day seven of getting a bunch of labelers,

  • trying to get MTurk to work, and labeling kind

  • of very large amounts of data.

  • It turns out the model still doesn't work.

  • So the good news was at this point,

  • somebody said, all right, well, let's

  • try one of these kind of interpretability techniques,

  • [INAUDIBLE] saliency visualization.

  • And it turns out, the problem was thumbs.

  • So when you are using kind of-- when people take photos

  • on their phone of something like a document,

  • they're usually holding it, which

  • is not what you see in web-scraped images for example,

  • but it's kind of what you tend to do.

  • So it turned out that they had basically

  • built a classifier that recognized

  • are you holding something and is your thumb in the picture?

  • Well, that was not the goal, but OK.

  • But this isn't just kind of a one-off problem.

  • It happens all the time.

  • So for example, there's that really famous Nature paper

  • from 2017 where they were doing, like, dermatology images.

  • And they kind of discover, all right, well,

  • having a ruler in an image of a mole

  • is actually a very good signal that that might be cancerous.

  • You might think we learned from that.

  • Except just a couple weeks ago, I think,

  • Walker, et al published another paper

  • where they said having surgical markings in an image,

  • so having marked up things around a mole, also

  • tended to trip up the classifier because, not surprisingly,

  • people don't tend to--

  • the training data didn't have any marked up skin

  • for people that didn't have cancerous moles.

  • And a lot of people, I think, particularly these people

  • who are sometimes on our team, look at that

  • and say it's user error, it's human error.

  • They weren't building the right distribution of data.

  • That's like extremely hard to do, even for experts.

  • And even harder to do for somebody who's just getting

  • started.

  • Because reality, real world environments

  • are incredibly complex.

  • This is where projects die.

  • Out of domain problems, which most problems people

  • want to actually do something in a real world environment,

  • whether it's a camera, a microphone, a website, where

  • user inputs are unconstrained, are

  • incredibly challenging to build good data for.

  • One of my favorite examples, I had a customer

  • who had built a system, [INAUDIBLE] camera, an IoT

  • camera.

  • And one day it hails.

  • And it turns out, it just hadn't hailed in this town before.

  • Model fails.

  • You can't expect people to have had data for hail.

  • Luckily, they had a system of multiple sensors,

  • they had other kinds of validation,

  • a human in the loop.

  • It all worked out.

  • But this thing is really challenging to do, rare events.

  • If I want to recognize explosions,

  • how much data am I going to have from explosions?

  • Or we had a customer who was doing hand tracking.

  • It turned out, the model failed the first time somebody

  • with a hand tattoo used it.

  • There aren't that many people with hand tattoos.

  • But you still want your model to work in that case.

  • Look, there's a lot of techniques for being

  • able to do this better.

  • But I think it's worth recognizing that it's actually

  • really hard to build a model; that's an important problem.

  • Once you build a model, you've got to figure out

  • if it's going to work.

  • A lot of the great work here is happening in the fairness

  • and bias literature.

  • But there is an overall impact for any customer or any person

  • who's trying to build a high quality model.

  • One of the big problems is that aggregate statistics

  • hide failure conditions.

  • You might make this beautiful PR curve.

  • Even the slices that you have look really great.

  • And then it turns out that you don't actually

  • have a data set with all the features in your model.

  • So let's say you're doing speech,

  • you may not have actually created a data set that says,

  • OK, well, this is a woman, a woman with an accent,

  • or a child with an accent.

  • All these subclasses become extremely important.

  • And it becomes very expensive and difficult to actually go

  • and figure out where your model is failing.

  • There are a lot of techniques for this:

  • sampling techniques, pairing uninterpretable models

  • with interpretable models, things that you can do.

  • But it's super challenging for a beginner

  • to figure out what their problems might be,

  • and even for experts.

  • You see these problems come up in real-world systems

  • all the time.

  • Finally, when you have a model it

  • can be tough to actually figure out what to do with it.

  • Most of the programs that you use

  • don't have probabilistic outputs in the real world.

  • What does it mean for something to be 70% likely

  • or to have seven or eight trained models in a row?

  • It might be more obvious for you.

  • But for an end user, it can actually be hard to figure out

  • what actions you should take.

  • Look, nothing I've said today, I think,

  • is particularly novel for the folks in this room.

  • You've gone through all of these challenges before.

  • You've built a model, you've built

  • a data set, you've probably built it 18 times,

  • finally gotten it to work.

  • I had a boss who used to say that problems are inspiring.

  • And for me, there isn't a problem

  • that is more inspiring in figuring out how can we

  • help anybody who wants to automate some problem be

  • able to do so and be able to train a machine

  • and have a robust production ready model.

  • I can't think of a more fun problem.

  • I can't think of a more fun problem

  • to work on with everybody in this room.

  • Thanks.

  • [APPLAUSE]

  • SARAH SIRAJUDDIN: Welcome, everyone.

  • I'm Sarah.

  • I'm the engineering lead for TensorFlow Lite.

  • And I'm really happy to be here talking to you

  • about on device machine learning.

  • JARED DUKE: And I'm Jared, tech lead on TensorFlow Lite.

  • And I'm reasonably excited to share with you our progress

  • and all the latest updates.

  • SARAH SIRAJUDDIN: So first of all, what is TensorFlow Lite?

  • TensorFlow Lite is our production ready framework

  • for deploying machine learning on mobile and embedded devices.

  • It is cross-platform, so it can be used for deployment

  • on Android, iOS, and Linux-based embedded systems, as well

  • as several other platforms.

  • Let's talk about the need for TensorFlow Lite

  • and why we build an on device machine learning solution.

  • Simply put, there is now a huge demand

  • for doing machine learning on the edge.

  • And it is driven by a need for building user experiences

  • which require low latency.

  • Further factors are poor network connectivity

  • and the need for user privacy preserving features.

  • All of these are easier done when

  • you're doing machine learning directly on the device.

  • And that's why we released TensorFlow Lite late in 2017.

  • This shows our journey since then.

  • We've made a ton of improvements across the board in terms

  • of the ops that we support, performance, usability,

  • tools which allow you to optimize your models,

  • the number of languages we support in our API,

  • as well as the number of platform

  • TensorFlow Lite runs on.

  • TensorFlow Lite is deployed on more than three billion devices

  • globally.

  • Many of Google's own largest apps are using it,

  • as are apps from several other external companies.

  • This is a sampling of apps which use TensorFlow Lite.

  • Google Photos, Gboard, YouTube, Assistant, as well as

  • leading companies like Hike, Uber, and more.

  • So what is TensorFlow Lite being used for?

  • We find that our developers use it

  • for popular use cases around text, image, and speech.

  • But we are also seeing lots of emerging and new use

  • cases come up in the areas of audio and content generation.

  • This was a quick introduction about TensorFlow Lite.

  • In the rest of this talk we are going

  • to be focusing on sharing our latest

  • updates and the highlights.

  • For more details, please check out the TensorFlow Lite talk

  • later in the day.

  • Today I'm really excited to announce

  • a suite of tools which will make it really easy for developers

  • to get started with TensorFlow Lite.

  • First up, we're introducing a new support library.

  • This makes it really easy to preprocess and transform

  • your data to make it ready for inferencing with a machine

  • learning model.

  • So let's look at an example.

  • These are the steps that a developer typically

  • goes through to use a model in their app

  • once they have converted it to the TensorFlow Lite model

  • format.

  • Let's say they're doing image classification.

  • So then they will likely need to write code which

  • looks something like this.

  • As you can see, it is a lot of code for loading, transforming,

  • and using the data.

  • With the new support library, the previous wall

  • of code that I showed can be reduced significantly to this.

  • Just a single line of code is needed

  • for each of loading, transforming, and using

  • the resultant classifications.

  • Next up, we're introducing model metadata.

  • Now model authors can provide a metadata spec

  • when they are creating and converting models.

  • And this makes it easier for users of the model

  • to understand what the model does

  • and to use it in production.

  • Let's look at an example again.

  • The metadata descriptor here provides additional information

  • about what the model does, the expected format of the inputs,

  • and what is the meaning of the outputs.

  • Third, we've made our model repository much richer.

  • We've added several new models across

  • several different domains.

  • All of them are pre-converted into the TensorFlow Lite

  • model format, so you can download them and use them

  • right away.

  • Having a repository of ready to use models

  • is great for getting started and trying them out.

  • However, most of our developers will

  • need to customize these models in some way, which

  • is why we are releasing a set of APIs with which you can use

  • your own data to retrain these models

  • and then use them in your app.

  • We've heard from our developers that we

  • need to provide better and more tutorials and examples.

  • So we're releasing today several full examples which show

  • not only how to use a model but how you

  • would write an end-to-end app.

  • And these examples have been written

  • for several platforms, Android, iOS, Raspberry Pi and even

  • Edge TPU.

  • And lastly, I'm super happy to announce that we have just

  • launched a brand new course on how to use

  • TensorFlow Lite on Udacity.

  • All of these are live right now.

  • Please check them out and give us feedback.

  • And this brings me to another announcement

  • that I'm very excited about.

  • We have worked with the researchers at Google Brain

  • to bring mobile BERT to developers

  • through TensorFlow Lite.

  • BERT is a method of pre-training language representations, which

  • gets really fantastic results on a wide variety

  • of natural language processing tasks.

  • Google itself uses BERT extensively to understand

  • natural text on the web.

  • But it is having a transformational impact

  • broadly across the industry.

  • The model that we are releasing is up to 4.4 times faster

  • than standard BERT, while being four times smaller with no loss

  • in accuracy.

  • The model is less than 100 megabytes in size.

  • So it's usable even on lower-end phones.

  • It's available on our site, ready for use right now.

  • We're really excited about the new use

  • cases this model will unlock.

  • And to show you all how cool this technology really

  • is, we have a demo coming up of mobile BERT running

  • live on a phone.

  • I'll invite Jared to show you.

  • JARED DUKE: Thanks, Sarah.

  • As we've heard, BERT can be used for a number

  • of language related tasks.

  • But today I want to demonstrate it for question answering.

  • That is, given some body of text and a question

  • about its content, BERT can find the answer

  • to the question in the text.

  • So let's take it for a spin.

  • We have an app here which has a number of preselected Wikipedia

  • snippets.

  • And again, the model was not trained on any

  • of the text in these snippets.

  • I'm a space geek, so let's dig into the Apollo program.

  • All right.

  • Let's start with an easy question.

  • [BEEPING]

  • What did Kennedy want to achieve with the Apollo program?

  • COMPUTER GENERATED WOMAN'S VOICE:

  • Landing a man on the moon and returning him safely

  • to the Earth.

  • JARED DUKE: OK.

  • But everybody knows that.

  • Let's try a harder one.

  • [BEEPING]

  • Which program came after Mercury but before Apollo?

  • COMPUTER GENERATED WOMAN'S VOICE: Project Gemini.

  • JARED DUKE: Not bad.

  • Hmm.

  • All right, BERT, you think you're so smart,

  • [BEEPING]

  • Where are all the aliens?

  • COMPUTER GENERATED WOMAN'S VOICE: Moon.

  • JARED DUKE: There it is.

  • [LAUGHTER]

  • Mystery solved.

  • Now all jokes aside, you may not have

  • noticed that this phone is running in airplane mode.

  • There's no connection to the server.

  • So everything from speech recognition

  • to the BERT model to text to speech

  • was all running on device using ML.

  • Pretty neat.

  • [APPLAUSE]

  • Now I'd like to talk about some improvements and investments

  • we've been making in the TensorFlow Lite

  • ecosystem focused on improving your model deployment.

  • Let's start with performance.

  • A key goal of TensorFlow Lite is to make your models run

  • as fast as possible across mobile and Edge CPUs, GPUs,

  • DSPs, and NPUs.

  • We've made many investments across all of these fronts.

  • We've made significant CPU improvements.

  • We've added OpenCL support to improve GPU acceleration.

  • And we've updated our support for all of the Android Q NN API

  • ops and features.

  • Our previously announced Qualcomm DSP delegate,

  • targeting mid- and low-tier devices,

  • will be available for use in the coming weeks.

  • We've also made some improvements

  • in our performance and benchmark tooling

  • to better assist both model and app

  • developers in identifying the optimal deployment

  • configuration.

  • To highlight some of these improvements, let's

  • take a look at our performance just six months ago at Google

  • I/O using MobileNet for classification inference

  • and compare that with the performance of today.

  • This represents a massive reduction in latency.

  • And you can expect this across a wide range

  • of models and devices, both low end and high end.

  • Just pull the latest version of TensorFlow Lite into your app

  • and you can see these improvements today.

  • Digging a little bit more into these numbers,

  • floating point CPU execution is our default path.

  • It represents a solid baseline.

  • Enabling quantization, now easier

  • with post-training quantization, provides three times faster

  • inference.

  • And enabling GPU execution provides yet more of a speedup,

  • six times faster than our CPU baseline.

  • And finally, for absolute peak performance,

  • we have the Pixel 4 neural core accessible via the NNAPI

  • TensorFlow Lite delegate.

  • This kind of specialized accelerator,

  • available in more and more of the latest devices,

  • unlocks capabilities and use cases

  • that just a short time ago were thought impossible

  • on mobile devices.

  • But we haven't stopped there.

  • Seamless and more robust model conversion

  • has been a major priority for the team.

  • And we'd like to give an update on a completely new TensorFlow

  • Lite model conversion pipeline.

  • This new converter was built from the ground up

  • to provide more intuitive error messages when conversion fails,

  • add support for control flow, and for more advanced models,

  • like BERT, Deep Speech v2, Mask R-CNN, and more.

  • We're excited to announce that the new converter is

  • available in beta, and will be available more generally soon.

  • We also want to make it easy for any app developer

  • to use TensorFlow Lite.

  • And to that end, we've released a number of new first class

  • language bindings, including Swift, Objective-C,

  • C# for Unity, and more.

  • This complements our existing set of bindings in C++, Java,

  • and Python.

  • And thanks to community efforts, we've

  • seen the creation of additional bindings

  • in Rust, Go, and even Dart.

  • As an open source project, we welcome and encourage

  • these kinds of contributions.

  • Our model optimization toolkit remains the one-stop shop

  • for compressing and optimizing your model.

  • There will be a talk later today with more details.

  • Check out that talk.

  • We've come a long way, but we have many planned improvements.

  • Our roadmap includes expanding the set of supported models,

  • further improvements in performance,

  • as well as some more advanced features, like on device

  • personalization and training.

  • Please check out our roadmap on tensorflow.org

  • and give us feedback.

  • Again, we're an open source project

  • and we want to remain transparent

  • about our priorities and where we're headed.

  • I want to talk now about our efforts

  • in enabling ML not just on billions of phones

  • but on the hundreds of billions of embedded devices

  • and microcontrollers that exist and are

  • used in production globally.

  • TensorFlow Lite for microcontrollers

  • is that effort.

  • It uses the same model format, the same conversion pipeline,

  • and largely the same kernel library as TensorFlow Lite.

  • So what are these microcontrollers?

  • These are the small, low power all-in-one computers

  • that power everyday devices all around us,

  • from microwaves and smoke detectors to sensors and toys.

  • They can cost as little as $0.10 each.

  • And with TensorFlow, it's possible to use them

  • for machine learning.

  • Arm, an industry leader in the embedded market,

  • has adopted TensorFlow as their official solution

  • for AI on Arm microcontrollers.

  • And together, we've made optimizations

  • that significantly improve performance

  • on this embedded Arm hardware.

  • We've also partnered with Arduino,

  • and just launched the official Arduino TensorFlow library.

  • This makes it possible for you to get started

  • doing speech detection on Arduino hardware in just

  • under five minutes.

  • And now we'd like to demonstrate TensorFlow

  • Lite for microcontrollers running in production.

  • Today, if a motor breaks down, it

  • can cause expensive downtime and maintenance costs.

  • But using TensorFlow, it's possible

  • to simply and affordably detect these problems before failure,

  • dramatically reducing these costs.

  • Mark Stubbs, co-founder of Shoreline IoT,

  • will now give us a demo of how they're using TensorFlow

  • to address this problem.

  • They've developed a sensor that can be attached

  • to a motor just like a sticker.

  • It uses a low power, always on TensorFlow model

  • to detect motor anomalies.

  • And with this model, their device

  • can run for up to five years on a single small battery,

  • using just 45 microamps with its Ambiq Cortex-M4 CPU.

  • Here we have a motor that will simulate an anomaly.

  • As the RPMs increase, it'll start to vibrate and shake.

  • And the TensorFlow model should detect this as a fault

  • and indicate so with a red LED.

  • All right, Mark, let's start the motor.

  • [HIGH-PITCHED MOTOR HUMMING]

  • Here we have a normal state.

  • And you can see this, it's being detected with the green LED.

  • Everything's fine.

  • Let's crank it up.

  • [MOTOR WHIRRING]

  • OK.

  • It's starting to vibrate, it's oscillating.

  • I'm getting a little nervous and frankly, a little sweaty.

  • Red light.

  • Boom.

  • OK.

  • The TensorFlow model detected the anomaly.

  • We could shut it down.

  • Halloween disaster averted.

  • Thank you, Mark.

  • [APPLAUSE]

  • SARAH SIRAJUDDIN: That's all we have, folks.

  • Please try out TensorFlow Lite if you haven't already.

  • And once again, we're very thankful for the contributions

  • that we get from our community.

  • JARED DUKE: We also have a longer talk later today.

  • We have a demo booth.

  • Please come by and chat with us.

  • Thank you.

  • [APPLAUSE]

  • SANDEEP GUPTA: My name is Sandeep Gupta.

  • I am the product manager for TensorFlow.js.

  • I'm here to talk to you about machine learning in JavaScript.

  • So you might be saying to yourself that I'm not

  • a JavaScript developer, I use Python

  • for machine learning, so why should I care?

  • I'm here to show you that machine learning in JavaScript

  • enables some amazing and useful applications,

  • and might be the right solution for your next ML problem.

  • So let's start by taking a look at a few examples.

  • Earlier this year, Google released the first

  • ever AI inspired Doodle, what you see on the top left.

  • This was on the occasion of Johann Sebastian Bach's birth

  • anniversary.

  • And users were able to synthesize a Bach-style harmony

  • by running a machine learning model in the browser

  • by just clicking on a few notes.

  • Just in about three days, more than 50 million users

  • created these harmonies, and they saved them and shared them

  • with their friends.

  • Another team at Google has been creating these fun experiences.

  • One of these is called shadow art, where

  • users are shown a symbol of a figure,

  • and you use your hand shadow to try to match that figure.

  • And that character comes to life.

  • Other teams are building amazing accessibility applications,

  • making web interfaces more accessible.

  • On the bottom left, you see something called Creatability,

  • where a person is trying to control a keyboard simply

  • by moving their head.

  • And then on the bottom right is an application

  • called Teachable Machine, which is a fun and interactive way

  • of training and customizing a machine learning model directly

  • in a browser.

  • So all of these awesome applications

  • have been made possible by TensorFlow.js.

  • TensorFlow.js is our open source library

  • for doing machine learning in JavaScript.

  • You can use it in the browser, or you can use

  • it server-side with Node.js.

  • So why might you consider using TensorFlow.js?

  • There are three ways you would use this.

  • One is you can run any of the pre-existing pre-trained models

  • and deploy them and run them using TensorFlow.js.

  • You could use one of the models that we have packaged for you,

  • or you can use any of your TensorFlow saved models

  • and deploy them in the web or in other JavaScript platforms.

  • You can retrain these models and customize them

  • on your own data, again, using TensorFlow.js.

  • And lastly, if you're a JavaScript developer wanting

  • to write all your machine learning directly

  • in JavaScript, you can use the low level ops API

  • and from scratch build a new model using this library.
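
  • As a rough sketch of that last path, here is a minimal model defined and trained from scratch with the TensorFlow.js Layers API, on purely synthetic data:

```typescript
import * as tf from '@tensorflow/tfjs';

async function trainFromScratch() {
  // A tiny two-layer network defined entirely in JavaScript.
  const model = tf.sequential();
  model.add(tf.layers.dense({units: 16, activation: 'relu', inputShape: [4]}));
  model.add(tf.layers.dense({units: 1}));
  model.compile({optimizer: 'adam', loss: 'meanSquaredError'});

  // Synthetic data, just to show the fit/predict flow.
  const xs = tf.randomNormal([64, 4]);
  const ys = tf.randomNormal([64, 1]);
  await model.fit(xs, ys, {epochs: 5});

  (model.predict(tf.randomNormal([1, 4])) as tf.Tensor).print();
}

trainFromScratch();
```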

  • So let's see why this might be useful.

  • First, it makes machine learning really, really accessible

  • to a web developer and a JavaScript developer.

  • With just a few lines of code, you

  • can bring the power of machine learning

  • in your web application.

  • So let's take a look at this example.

  • Here we have two lines of code with which we are just

  • sourcing our library from our hosted scripts,

  • and we are loading a pre-trained model.

  • In this case, the body-pix model,

  • which is a model that can be used to segment

  • people in videos and images.

  • So just with these two lines, you

  • have the library and the model embedded in your application.

  • Now we choose an image.

  • We create an instance of the model.

  • And then we call the model's estimate person segmentation

  • method, passing it the image.

  • And you get back an array, an object

  • which contains the pixel mask of where the person is

  • present in this image.

  • And there are other methods that can subdivide this

  • into various body parts.

  • And there are other rendering utilities.

  • So just with about five lines of code,

  • your web application has all the power of this powerful machine

  • learning model.
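
  • As a hedged sketch of the flow just described, using the @tensorflow-models/body-pix package from npm instead of the hosted script tags (the segmentation method has been renamed across package versions, so check the current docs):

```typescript
import * as bodyPix from '@tensorflow-models/body-pix';

async function segmentPeople(image: HTMLImageElement) {
  // Load the pre-trained BodyPix model from its hosted weights.
  const net = await bodyPix.load();

  // Ask for a per-pixel person mask; early releases call this
  // estimatePersonSegmentation(), newer ones segmentPerson().
  const segmentation = await net.estimatePersonSegmentation(image);

  // segmentation.data holds 1 where a person is present and 0 elsewhere.
  console.log(segmentation.width, segmentation.height, segmentation.data);
  return segmentation;
}
```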

  • The library can be used both client-side and server-side.

  • Using it client-side in browser has lots of advantages.

  • You get the amazing interactivity and reach

  • of browser as a platform.

  • Your application immediately reaches all your users

  • who have nothing to install on their end.

  • By simply sharing the URL of your application

  • they are up and running.

  • You get the benefit of interactivity

  • of browser as a platform with easy access

  • to webcam, and microphone, and all of the sensors

  • that are attached to the browser.

  • Another really important point is

  • that because these are running client-side,

  • user data stays client-side.

  • So this has strong implications for privacy

  • sensitive applications.

  • And lastly, we support GPU acceleration through WebGL.

  • So you get great performance out of the box.
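
  • As a small illustration of that client-side access, a sketch that captures webcam frames as tensors (assuming the tf.data.webcam helper and an existing video element) might look like this:

```typescript
import * as tf from '@tensorflow/tfjs';

// Capture a single webcam frame as a tensor; no pixels ever leave the browser.
async function captureFrame(videoElement: HTMLVideoElement): Promise<tf.Tensor4D> {
  const webcam = await tf.data.webcam(videoElement);
  const frame = await webcam.capture();            // [height, width, 3] tensor
  const batched = frame.expandDims(0).toFloat();   // add a batch dimension for a model
  frame.dispose();                                 // free memory we no longer need
  return batched as tf.Tensor4D;
}
```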

  • On the server side, TensorFlow.js supports Node.

  • Lots of enterprises use Node for their back-end operations

  • and for a ton of their data processing.

  • Now you can use TensorFlow directly with Node

  • by importing any TensorFlow saved model

  • and running it through TensorFlow.js Node.

  • Node also has an enormous NPM package ecosystem.

  • So you can benefit from that, and plug into the NPM

  • repository collection.

  • And for enterprises, where your entire back-end stack

  • is in Node, you can now bring all of the ML into Node

  • and maintain a single stack.
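
  • A minimal Node-side sketch, assuming the @tensorflow/tfjs-node package, a SavedModel at a hypothetical local path, and an illustrative input shape:

```typescript
import * as tf from '@tensorflow/tfjs-node';

async function runSavedModel() {
  // Load a standard TensorFlow SavedModel directly, no conversion step needed.
  const model = await tf.node.loadSavedModel('./my_saved_model');

  // Run inference; replace the shape with whatever your model expects.
  const input = tf.zeros([1, 224, 224, 3]);
  const output = model.predict(input) as tf.Tensor;
  output.print();
}

runSavedModel();
```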

  • A natural question to ask is, how fast is it?

  • We have done some performance benchmarking.

  • And I'm showing here some results

  • from MobileNet inference time.

  • On the left, you see results on mobile devices

  • running client-side.

  • And on state of the art mobile phones,

  • you get really good performance with about 20

  • milliseconds inference time, which

  • means that you can run real time applications

  • at about 50 frames per second.

  • Android performance has some room for improvement.

  • Our team is heavily focused on addressing that.

  • On the server side, because we bind to TensorFlow's native C

  • library, we have performance parity

  • with Python TensorFlow, both on CPU as well as on GPU.
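
  • For a rough feel on your own hardware, a minimal browser-side timing sketch with the packaged MobileNet model (numbers vary widely by device and backend) might look like this:

```typescript
import * as mobilenet from '@tensorflow-models/mobilenet';

async function timeInference(image: HTMLImageElement) {
  const model = await mobilenet.load();

  // Warm-up run so one-time shader compilation isn't counted.
  await model.classify(image);

  const start = performance.now();
  const predictions = await model.classify(image);
  console.log(`Inference took ${(performance.now() - start).toFixed(1)} ms`, predictions);
}
```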

  • So in order to make it easy for you to get started,

  • we have prepackaged a collection of models, pre-trained models,

  • for most of the common ML tasks.

  • These include things like image classification,

  • object detection, human pose and gesture detection,

  • speech commands models for recognizing spoken words,

  • and a bunch of text classification

  • models for things like sentiment and toxicity.

  • You can use these models with very easy wrapped high level

  • APIs from our hosted scripts, or you can NPM install them.

  • And then you can use these pre-trained models

  • and build your applications for a variety of use cases.

  • These include AR, VR type of applications.

  • These include gesture-based interactions

  • that help improve accessibility of your applications,

  • detecting user sentiment and moderating content,

  • conversational agents, chat bots,

  • as well as a lot of things around front end web page

  • optimization.
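
  • As one concrete sketch, the packaged toxicity model can back a simple moderation flow; this assumes the @tensorflow-models/toxicity package, with 0.9 as an illustrative confidence threshold:

```typescript
import * as toxicity from '@tensorflow-models/toxicity';

async function moderate(comments: string[]) {
  // Load the pre-trained classifier; labels above the 0.9 threshold count as matches.
  const model = await toxicity.load(0.9, []);

  // Returns one entry per label (toxicity, insult, threat, ...) with per-sentence matches.
  const predictions = await model.classify(comments);
  console.log(JSON.stringify(predictions, null, 2));
}

moderate(['you are great', 'you are terrible']);
```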

  • These pre-trained models are a great way to get started,

  • and they are good for many problems.

  • However, often, you have the need

  • to customize these models for your own use.

  • And here, again, the power of TensorFlow.js

  • with the interactivity of the web comes in handy.

  • I want to show you this application

  • called a Teachable Machine, which

  • is a really nice way of customizing a model in just

  • a matter of minutes.

  • I am going to test both the demo gods as well as the time buzzer

  • gods here and try to show this live.

  • What you're seeing here is--

  • this is the Teachable Machine web

  • page, which has the MobileNet model already loaded.

  • We are going to be training three classes.

  • These are these green, purple, and orange classes.

  • We will output words.

  • So let's say we will do rock for green, paper for purple,

  • and scissors for red.

  • We're going to record some images.

  • So let's record some images for rock.

  • I'm going to click this button here.

  • COMPUTER GENERATED MAN'S VOICE: Rock.

  • SANDEEP GUPTA: And I'm going to record some images for paper.

  • COMPUTER GENERATED MAN'S VOICE: Pa-- rock.

  • SANDEEP GUPTA: And I'm going to record

  • some images for scissors.

  • COMPUTER GENERATED MAN'S VOICE: Paper.

  • SANDEEP GUPTA: OK.

  • So there--

  • COMPUTER GENERATED MAN'S VOICE: Scissors.

  • SANDEEP GUPTA: We have customized our model

  • with just about 50 images recorded for each class.

  • Let's see how it works.

  • COMPUTER GENERATED MAN'S VOICE: Rock.

  • Paper.

  • Rock.

  • Paper.

  • Rock.

  • Scissors.

  • Paper.

  • Rock.

  • Scissors.

  • SANDEEP GUPTA: So there you go.

  • In just a matter of--

  • [APPLAUSE]

  • Pretty neat.

  • It's really powerful to customize models

  • like these super interactively with your own data.
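
  • One common way to build this kind of interactive customization yourself is to pair MobileNet embeddings with a k-nearest-neighbors classifier; a minimal sketch, assuming the @tensorflow-models/mobilenet and @tensorflow-models/knn-classifier packages:

```typescript
import * as mobilenet from '@tensorflow-models/mobilenet';
import * as knnClassifier from '@tensorflow-models/knn-classifier';

async function teachableSketch(
  examples: Array<[HTMLImageElement, string]>,
  query: HTMLImageElement
) {
  const net = await mobilenet.load();
  const classifier = knnClassifier.create();

  // infer(image, true) returns MobileNet's internal embedding instead of class
  // scores; the KNN classifier stores one embedding per recorded example.
  for (const [image, label] of examples) {
    classifier.addExample(net.infer(image, true), label);
  }

  // Classify a new image against the handful of recorded examples.
  const result = await classifier.predictClass(net.infer(query, true));
  console.log('Predicted:', result.label, result.confidences);
}
```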

  • What if you want to train on your data at a somewhat larger

  • scale?

  • So here, AutoML comes in really handy.

  • AutoML is a GCP cloud based service,

  • which lets you bring your data to the cloud

  • and train a custom, really high performing model

  • specific to your application.

  • Today, we are really excited to announce that we now support

  • TensorFlow.js for AutoML.

  • Meaning that you can use AutoML to train your model.

  • And then with one click, you can export

  • a model that's ready to be deployed in your JavaScript

  • application.
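
  • Loading such an exported model in the browser might look roughly like this sketch, which assumes the @tensorflow/tfjs-automl helper package and a hypothetical URL for the exported model.json:

```typescript
import * as automl from '@tensorflow/tfjs-automl';

async function classifyWithAutoML(image: HTMLImageElement) {
  // Hypothetical location of the files exported from AutoML with one click.
  const modelUrl = 'https://example.com/exported-model/model.json';
  const model = await automl.loadImageClassification(modelUrl);

  // Returns [{label, prob}, ...] for the image.
  const predictions = await model.classify(image);
  console.log(predictions);
}
```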

  • One of our early testers, the CVP Corporation,

  • which is building image

  • classification applications for the mining industry,

  • was able to use this feature.

  • In just about five node-hours of training,

  • they improved their model accuracy

  • from 91% with their manually trained model to 99%,

  • and got a much smaller and faster model.

  • And then they instantly deployed it in a progressive web

  • application for use in the field.

  • So in addition to models, one of the big focus areas for us

  • has been support for a variety of platforms.

  • And because JavaScript is a versatile language which

  • runs on a large bunch of platforms,

  • TensorFlow.js can be used on all these different platforms.

  • And today, again, we are really happy to announce

  • that we now support integration with React Native.

  • So if you are a React Native developer building

  • cross-platform Native applications,

  • you can use TensorFlow.js directly from within React

  • Native and you get all the power of WebGL acceleration.
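
  • A minimal setup sketch, assuming the @tensorflow/tfjs-react-native adapter package inside an existing React Native app:

```typescript
import * as tf from '@tensorflow/tfjs';
// Side-effect import that registers the React Native (WebGL-backed) platform.
import '@tensorflow/tfjs-react-native';

export async function initTf() {
  // Wait for the backend to finish initializing before creating any tensors.
  await tf.ready();
  console.log('TensorFlow.js ready, backend:', tf.getBackend());
}
```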

  • We've looked at the capabilities of the library.

  • Let's look at a couple of use cases.

  • Modiface is an AR technology company based out of Canada.

  • They have used TensorFlow.js to build

  • this mobile application that runs on the WeChat mini program

  • environment.

  • They did this for L'Oreal, where it lets users try out

  • these beauty products instantly running

  • in these instant messaging applications.

  • They had some strict criteria about model size and frame rate

  • performance.

  • And they were able to achieve all of those targets

  • with TensorFlow.js running natively deployed

  • on these mobile devices.

  • In order to showcase the limits of what's possible with this,

  • our team has built a fun game and an application

  • to show how you can take a state of the art model, a very

  • high resolution model that can do face tracking,

  • and we have built this lip syncing game.

  • So here what you will see is that a user

  • is trying to lip sync to a song and a machine learning

  • model is trying to identify the lips

  • and trying to match it to how well you are doing lip syncing.

  • And then because it's in JavaScript, it's in the web,

  • we have added some visualization effects

  • and some other AR, VR effects.

  • So let's take a look.

  • [MUSIC PLAYING]

  • SPEAKER 1: (SINGING) Hey, Hey.

  • Give me one more minute.

  • I would.

  • Hey, Hey.

  • Give me one more, one more, one more.

  • Hey, Hey.

  • Give me one more minute.

  • I would.

  • Hey, Hey.

  • Make it last for...Ohh, ohh.

  • Hey, Hey.

  • Give me one more minute.

  • I would.

  • Hey, Hey.

  • Give me one more, one more, one more.

  • SANDEEP GUPTA: OK.

  • It's pretty cool.

  • The creator of this demo is here with us.

  • He's at the TensorFlow.js demo station.

  • Please stop by there, and you can

  • try playing around with this.

  • In the real world, we are beginning

  • to see more and more enterprises

  • using TensorFlow.js in novel ways.

  • Uber is using it for a lot of their internal ML

  • tasks, visualization, and computation

  • directly in the browser.

  • And a research group in IBM is using it

  • for in-the-field mobile classification

  • of disease-carrying snails which spread

  • certain communicable diseases.

  • So lastly, I want to thank our community.

  • The popularity and growth of this library is in large part

  • due to the amazing community of our users and contributors.

  • And thus, we are really excited to see that a lot of developers

  • are building amazing extensions and libraries on top

  • of TensorFlow.js to extend its functionality.

  • This was just a quick introduction to TensorFlow.js.

  • I hope I've been able to show you

  • that if you have a web or a Node ML use case,

  • TensorFlow.js is the right solution for your needs.

  • Do check out our more detailed talk later this afternoon,

  • where our team will dive deeper into the library.

  • And there are some amazing talks from our users showcasing some

  • fantastic applications. tensorflow.org/js is your one

  • source for a lot more information, more examples,

  • getting started content, models, et cetera.

  • You can get everything you need to get started.

  • So with that, I would like to turn it over to Joseph Paul

  • Cohen, who's from Mila Medical.

  • And he will share with us an amazing use

  • case of how their team is using TensorFlow.js.

  • Thank you very much.

  • [APPLAUSE]

  • JOSEPH PAUL COHEN: Thanks.

  • Great.

  • I am very excited to be here today.

  • So what I want to talk about is a chest X-ray radiology

  • tool in the browser.

  • Let's look at the classic or traditional diagnostic

  • pipeline.

  • There is a certain area where web based tools

  • are used by physicians to aid them

  • in a diagnostic decision, such as kidney donor

  • risk or cardiovascular risk.

  • These tools are already web based.

  • With the advances of deep learning,

  • we now can do radiology tasks such as chest X-ray

  • diagnostics, and now put them in the browser.

  • Can you imagine such use cases where this is useful?

  • In an emergency room, where you have a time-limited human.

  • In a rural hospital where radiologists are not

  • available or very far away.

  • The ability for a non-expert to triage cases

  • for an expert, saving time and money.

  • And where we'd like to go is towards rare diseases.

  • But we're a little data starved in this area

  • to be able to do that.

  • This project has been called "nice" by Yann LeCun.

  • What we need to do to achieve this

  • is run a state of the art chest X-ray diagnostic DenseNet

  • in a browser.

  • One benefit is preserving the privacy of the data,

  • while at the same time allowing

  • us to scale to millions of users with zero computational

  • cost on our side.

  • How do we achieve this?

  • With TensorFlow.js, which gives us a one-second feed-forward pass

  • through this DenseNet model with a 12-second initial load time.

  • We also need to deal with processing

  • out-of-distribution samples, where

  • we don't want to process images of cats or images

  • that are not properly formatted X-rays.

  • To do this, we're going to use an autoencoder with a SSIM

  • score, and we're going to look at the reconstruction.

  • And then finally, we need to compute gradients

  • in the browser to show a saliency map of why

  • we made such a prediction.

  • So we could ship two models, one computing the feed forward

  • and the other one computing the gradient.

  • Or we can use TensorFlow.js to compute the actual gradient

  • graph and then compute it right in the browser,

  • given whatever model we have already shipped.

  • So this makes development really easy.

  • And it's also pretty fast.
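
  • As a rough sketch of that idea (not the exact model or pre-processing this tool uses), an input-gradient saliency map can be computed in the browser with tf.grad along these lines:

```typescript
import * as tf from '@tensorflow/tfjs';

// Given a loaded classifier and a batched image tensor [1, h, w, c], compute
// how strongly each input pixel influences the score of one target class.
function saliencyMap(model: tf.LayersModel, image: tf.Tensor4D, classIndex: number): tf.Tensor {
  const classScore = (x: tf.Tensor) =>
    (model.predict(x) as tf.Tensor2D).gather([classIndex], 1).sum();

  // tf.grad differentiates the scalar class score with respect to the input pixels.
  const gradFn = tf.grad(classScore);

  // Reduce over the channel axis so we get one saliency value per pixel.
  return tf.tidy(() => gradFn(image).abs().max(3).squeeze());
}
```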

  • Thank you.

  • [APPLAUSE]

  • TATIANA SHPEISMAN: Hi.

  • I'm Tatiana.

  • I'm going to talk today about MLIR.

  • Before we talk about the MLIR, let's start from the basics.

  • We are here because artificial intelligence is

  • experiencing tremendous growth.

  • All three components, algorithms, data,

  • and compute have come together to change the world.

  • Compute is really, really important

  • because that's what enables machine learning

  • researchers to build better algorithms to build new models.

  • And you can see the models are becoming much, much more

  • complex.

  • To train a model today, we need several orders of magnitude

  • more compute capability than we needed several years ago.

  • How do we build hardware which makes that possible?

  • For those of you who are versed in hardware details,

  • Moore's law is ending.

  • This is also the end of Dennard scaling.

  • We can no longer simply say that the next CPU is going to run

  • at a higher frequency,

  • and that that alone will power machine learning.

  • What is happening in the industry

  • is the explosion of custom hardware.

  • And there is a lot of innovation,

  • which is driving this compute which

  • makes artificial intelligence possible.

  • So if we look at what is happening,

  • you look in your pocket.

  • You probably have a cell phone.

  • Inside that cell phone, most likely

  • there is a little chip which makes artificial intelligence

  • possible.

  • And it's not just one chip.

  • There is CPU, there is GPU, there is DSP,

  • there is neural processing unit.

  • All of that is sitting inside a little phone

  • and seamlessly working together to make

  • great user experience possible.

  • In the data center, we see the explosion

  • of specialized hardware also.

  • Habana, specialized accelerators in CPUs,

  • in GPUs, many different chips.

  • We have TPUs.

  • All of this is powering the tremendous growth

  • of specialized compute in data centers.

  • Once you have more specialized accelerators,

  • that brings more complexity.

  • And as we all know, hardware doesn't work by itself.

  • It is powered by software.

  • And so there is also a tremendous growth

  • in software ecosystems for machine learning.

  • In addition to TensorFlow, there are

  • many other different frameworks which are

  • trying to solve this problem.

  • And actually, we've got a problem

  • with the explosive growth of hardware and software.

  • CHRIS LATTNER: So the big problem here

  • is that none of this scales.

  • Too much hardware, too much complexity, too much software,

  • too many different systems that are not working together.

  • And what's the fundamental problem?

  • The fundamental problem is that we as a technology

  • industry across the board are re-inventing

  • the same kinds of tools, the same kinds of technologies,

  • and we're not working together.

  • And this is why you see the consequences of this.

  • You see systems that don't interoperate because they're

  • built by different people on different teams that

  • solve different problems.

  • Vendor X is working on their chip,

  • which makes perfect sense.

  • It doesn't really integrate with all the different software.

  • And likewise, for the software people

  • that can't know or work with all the hardware people.

  • This is why you see things like you bring up your model,

  • you try to get it to work on a new piece of hardware

  • and it doesn't work right the first time.

  • You see this in the cracks that form between these systems,

  • and that manifests as usability problems, or performance

  • problems, or debugability problems.

  • And as a user, this is not something

  • you should have to deal with.

  • So what do we want?

  • What we'd really love to do is take this big problem, which

  • has many different pieces, and make

  • it simpler by getting people to work together.

  • And so we've thought a lot about this.

  • And the way we think that we can move the world forward

  • is not by saying that there is one right way to do things.

  • I don't think that works in a field that

  • is growing as explosively as machine learning.

  • Instead, what we think the right way to do this is,

  • is to introduce building blocks.

  • And instead of standardizing the user experience

  • or standardizing the one right way to do machine learning,

  • we think that we as a technology industry

  • can standardize some of the underlying building blocks that

  • go into these tools, that can go into the compiler

  • for a specific chip, that can go into a translator that

  • works between one system or the other.

  • And if we build building blocks, we know

  • and we can think about what we want from them.

  • We want, of course, the best in class graph technology.

  • That's a given.

  • We want the best compiler technology.

  • Compilers are really important.

  • We want to solve not just training but also inference,

  • mobile, and servers, and including all permutations.

  • So training on the edge, super important,

  • growing in popularity.

  • We don't want this to be a new kind of technology island

  • solution.

  • We want this to be part of a continuous ecosystem that

  • spans the whole problem.

  • And so this is what MLIR is all about.

  • MLIR is a new system that we at Google have been building,

  • that we are bringing to the industry

  • to help solve some of these common problems that

  • manifests in different ways.

  • One of the things that we're really excited about

  • is that MLIR is not just a Google technology.

  • We are collaborating extensively with hardware makers

  • across the industry.

  • We're seeing a lot of excitement and a lot of adoption

  • by people that are building the world's

  • biggest and most popular hardware across the world.

  • But what is MLIR?

  • MLIR is a compiler infrastructure.

  • And if you're not familiar with compilers,

  • what that really means is

  • that it provides the bottom-level, low-level

  • technology that underpins building individual tools

  • and individual systems that then get used to help with graphs

  • and help with chips, and things like that.

  • And so how does this work?

  • What MLIR provides, if you look at it in contrast

  • to other systems, is that it is not,

  • again, a one size fits none kind of a solution.

  • It is trying to be the technology

  • that powers these systems.

  • Like we said before, it of course,

  • contains state-of-the-art compiler technology.

  • Within Google alone, we

  • have dozens of years of compiler experience on the team.

  • But we probably have hundreds of years

  • of compiler experience across the industry

  • all collaborating together on this common platform.

  • It is designed to be modular and extensible because requirements

  • continue to change in our field.

  • It's not designed to tell you the right way to do things

  • as a system integrator.

  • It's designed to provide tools so that you

  • can solve your problems.

  • If you dive into the compiler, there's

  • a whole bunch of different pieces.

  • And so there are things like low level graph transformation

  • systems.

  • There are things for code generation

  • so that if you're building a chip

  • you can handle picking the right kernel.

  • But the point of this is that MLIR does not force

  • you to use one common pipeline.

  • It turns out that, while compilers for code generation

  • are really great, so are handwritten kernels.

  • And if you have handwritten kernels that

  • are tuned and optimized for your application, of course,

  • they should slot into the same framework,

  • should work with existing run times.

  • And we really see MLIR as providing

  • useful value that then can be used to solve problems.

  • It's not trying to force everything into one box.

  • So you may be wondering, though, for you, if you're not

  • a compiler person or a system integrator or a chip person,

  • what does this mean to you?

  • So let's talk about what it means for TensorFlow.

  • TATIANA SHPEISMAN: What it means for TensorFlow

  • is it allows us to build a better

  • system because integrating TensorFlow with the myriad

  • of specialized hardware is really a hard problem.

  • And with MLIR, we can build a unified infrastructure layer,

  • which will make it much simpler for TensorFlow

  • to seamlessly work with any hardware chip which comes out.

  • For you as a Python developer, it simply

  • means better development experiences.

  • A lot of things that today might not be working

  • as smoothly as we would like them to

  • can be resolved by MLIR.

  • This is just one example.

  • You write a model.

  • You try to run it through the TensorFlow Lite converter.

  • You get an error.

  • You have no clue what it is.

  • And now we see issues on GitHub and try to help you.

  • With MLIR, you will get an error message that says,

  • this is the line of Python code which caused the problem.

  • You can look at it and fix the problem yourself.

  • And just to summarize, the reason we are building MLIR

  • is because we want to move faster

  • and we want the industry to move faster with us.

  • One of the keys to making the industry work well together

  • is neutral governance.

  • And that's why we submitted MLIR as a project to LLVM.

  • Now it is part of the LLVM project.

  • The code is moving there soon.

  • This is very important because LLVM

  • has a 20-year history of neutral governance

  • and building the infrastructure which is

  • used by everybody in the world.

  • And this is just the beginning.

  • Please stay tuned.

  • We are building a global community around MLIR.

  • Once we are done, ML will be better for everybody.

  • And we will see much faster advance

  • of artificial intelligence in the world.

  • ANKUR NARANG: I'm Ankur.

  • I work at Hike.

  • And I lead AI innovations over there

  • in various areas, which I'm going to talk today.

  • Formerly, I worked with IBM Research

  • in New Delhi, and also some research labs here

  • in Menlo Park.

  • Here are various use cases that we address using AI.

  • The fundamental one being Hike as a platform for messaging.

  • And now we are driving a new social future.

  • We are looking at a more visual way of expressing interactions

  • between the users.

  • So instead of typing messages in a laborious way,

  • if one could get recommended stickers which

  • could express the same thing in a more efficient,

  • more expressive fashion, then

  • it would be a more interesting and engaging conversation.

  • So the first use case is essentially

  • around multilingual sticker recommendations,

  • where basically, we address around eight to nine languages

  • currently in India.

  • And as we expand internationally,

  • we will be supporting more languages.

  • So we want to go hyperlocal as well as hyperpersonal.

  • From a hyperlocal perspective, we

  • want to address the needs of a person

  • from his or her own personal language perspective.

  • When you type, you would automatically

  • get stickers recommended in the corresponding native language

  • of the person.

  • The second one is friend recommendation

  • using social network analysis and deep

  • learning, where we use graph embeddings

  • and deep learning to recommend friends.

  • The next one essentially is around fraud analytics.

  • We have lots of click farms, where

  • people try to misuse the rewards that

  • are given on the platform in a B2C setting.

  • And therefore, you need interesting,

  • deep learning techniques and anomaly detection

  • to address known knowns, known unknowns,

  • and the unknown unknowns.

  • Another one essentially is around campaign tuning,

  • hyperpersonalization, and optimization

  • to be able to address the needs of every user

  • and make the experience engaging and extremely interactive.

  • And finally, we have interesting sticker processing

  • using vision models and graphics, which will be

  • coming soon in later releases.

  • Going further, we have a strong AI research focus.

  • So we are passionate about research.

  • We have multiple publications in ECIR this year, IJCAI demo.

  • And we have an ArXiv publication.

  • And we have [INAUDIBLE] to areas not directly related

  • to messaging.

  • But we had an ICML workshop paper, as well.

  • Fundamentally, the kind of problems we address

  • need to look at extensions and basically address

  • the limitations of supervised learning problems.

  • We need to address cases where there's a long tail of data,

  • with very few labels available, and where it is

  • very costly to get those labels.

  • And the same problems occur in NLP, Vision, reinforcement

  • learning, and stuff like that.

  • We are looking at meta learning formulations

  • to address this stuff.

  • At Hike, we are looking at 4 billion events per day

  • across millions of users.

  • We collect terabytes of data, essentially using Google

  • Cloud with various tools,

  • including Kubeflow, BigQuery, Dataproc, and Dataflow.

  • We use it for some of the use cases

  • which I mentioned earlier.

  • Essentially, I will look into one particular use

  • case right now.

  • It is on stickers.

  • Stickers, as I mentioned, are powerful expressions

  • of emotions and context, with various visual expressions

  • over there.

  • The key challenge over there is discovery.

  • If you have tens of thousands of stickers now going

  • into millions and further into billions of stickers,

  • how do you discover these stickers

  • and be able to exchange at real time

  • with a few milliseconds of latency

  • while you are typing, personalized to your interests?

  • What we want to solve essentially

  • is: given a chat context with time, event of the day, situation,

  • recent messages, gender, and language,

  • we want to predict which sticker is

  • most relevant to it.

  • Building this, essentially, one needs

  • to look at all the different ways a particular text is

  • typed.

  • One needs to aggregate, essentially,

  • the semantically similar phrases to have the right encoding

  • across and between these various languages,

  • so that it does not

  • affect the typing experience.

  • And we need to deliver in the limited memory of the device

  • as well as a few milliseconds of response time.

  • So here in [INAUDIBLE] is a sticker recommendation flow.

  • Where basically, given a chat context

  • and what the user is currently typing,

  • we use a message model, which predicts using a classification

  • model.

  • It predicts the message, and those messages

  • are mapped to the corresponding stickers.

  • For prediction, essentially, we use a combination

  • of TensorFlow running on the server and TensorFlow

  • Lite running on the device.

  • And in the combination, we want to deliver, basically,

  • a few milliseconds of latency for getting the accurate

  • stickers recommended.

  • And here we use a combination of a neural network and a Trie.

  • Obviously, we quantized the neural network

  • on the device using TensorFlow Lite.

  • And we are able to get the desired amount of performance.

  • The stickers essentially, come--

  • so once the messages are predicted,

  • the stickers are naturally mapped

  • based on the tags of the stickers on what intent

  • they are meant to deliver.

  • And correspondingly to the message

  • predicted, those stickers are delivered to the user.

  • This is a complete flow.

  • Basically, given a chat context, one

  • predicts the message that the person is trying to express.

  • Then one adds the user context from a hyperpersonalization

  • perspective, consider sticker preferences, age, gender,

  • and then goes to the relevant stickers.

  • For the stickers, we basically score them

  • using reinforcement learning algorithms.

  • Simple ones to begin with, then more complex ones going forward,

  • so that the right kind of stickers are served as the way people

  • behave on the platform changes,

  • and the corresponding stickers also adapt to it in real time.

  • Thank you.

  • [APPLAUSE]
