[ MUSIC ]
[ APPLAUSE ]
KARASICK: You all know...now know why Dr. Bengio did not need much of an introduction.
Thank you.
Let me also introduce Fei-Fei Li from Stanford.
She's the head of the AI lab at Stanford.
I know her as the ImageNet lady.
And I guess we'll talk a little bit about ImageNet challenges as we go through the discussion.
And John Smith is an IBM Fellow.
Long history of research in visual information retrieval, deep learning.
And I'm going to be a little selfish.
So, I'm a trained computer scientist, but I'm not an AI person
by training; lots of other things.
So, I thought this was a real opportunity to educate me.
So, I'm going to ask questions that I am curious about in order to get the discussion started.
And you know, the first thing I wanted to ask the three of you is what I would call kind of a hype-correction sort of question.
Computer science has a kind of a shiny object property that we're all pretty familiar with.
And every so often, a group or individual comes up with a breakthrough.
And everybody kind of runs over and the teeter-totter tips to the side.
And so, I always worry, whenever I see that, it's oh, God, what aren't people working on?
So, I guess I was interested, Yoshua, your talk really got to this.
It was a discussion about what we know how to do and what we're trying to figure out how to do.
And I guess I'd like to ask the three of you, maybe starting with Fei-Fei,
what do we think the limits of deep learning technologies are?
What won't it be able to do?
You weren't here at the beginning, but Terry Sejnowski,
in the previous session, said don't trust anybody.
So, what do you think is sort of beyond this kind of technology?
LI: Okay. So, thank you, first of all, for the invitation.
And great talk, Yoshua.
I think Yoshua already mentioned a lot of this.
So, first of all, deep learning is a dynamic changing area of research.
So, it's very hard to pinpoint what exactly is deep learning.
In computer vision, when a lot of people talk about deep learning
and the success of deep learning, what they really refer to is a specific convolutional neural network model
and architecture, trained with supervision on big data,
meaning ImageNet mostly, that does object recognition.
And that is a very narrow definition of deep learning.
So, when you ask the limitation of deep learning, one way to answer,
there's no limitation if deep learning keeps evolving.
That's a little bit of an irresponsible answer, I recognize, and just to be brief,
but I also want to echo Yoshua: in my opinion, the quest towards AI, especially
in my own area of computer vision, goes from perception to cognition to reasoning.
And we have just begun to get a grasp of that whole path.
We're doing very well with perception, thanks to data and these high capacity models,
but beyond the basic building blocks of perception such as speech
and object recognition, the next thing is really a slew of cognitive tasks
that we're not totally getting our hands on yet.
We begin to see question answering, or QA.
We begin to see image captions with grounding.
We're beginning, just beginning to see these budding areas of research
and down the road, how do we reason?
How do we reason in novel situations?
How do we learn to learn?
How do we incorporate intentions, predictions, emotions?
So, all those are still on the horizon.
KARASICK: Yoshua.
BENGIO: So, I will repeat what Terry said.
Like, until we have a mathematical proof, we don't know that it isn't possible.
That being said, for sure, if you look at the current technology, there are challenges.
I don't know if there are impossibilities, but there are clearly challenges.
One of the challenges I've worked on for more
than two decades is the challenge of long-term dependencies.
So, as soon as you start dealing with sequences, there are optimization challenges
and that makes it hard to learn, to train those neural nets to do their job.
Even for simple tasks.
And we've been studying this problem for 20 years.
We're making incredible progress, but it's still an obstacle.
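To make the long-term dependency problem concrete, here is a minimal numerical sketch (not from the panel; the random weights and tanh units are illustrative assumptions): the gradient passed back through a vanilla recurrent net tends to shrink, or with larger weights explode, geometrically with the number of time steps, which is what makes long-range credit assignment hard.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden = 50
W = rng.normal(scale=1.0 / np.sqrt(hidden), size=(hidden, hidden))  # recurrent weights
h = rng.normal(size=hidden)          # hidden state
grad = np.ones(hidden)               # stand-in for dLoss/dh at the final time step

for t in range(1, 101):
    h = np.tanh(W @ h)
    # Backprop through one step multiplies by the Jacobian transpose:
    # dLoss/dh_{t-1} = W^T diag(1 - h_t^2) dLoss/dh_t
    grad = W.T @ ((1.0 - h ** 2) * grad)
    if t % 25 == 0:
        print(f"{t:3d} steps back, gradient norm = {np.linalg.norm(grad):.3e}")
```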
And there are other challenges, like I mentioned some of the challenges come up in inference,
in [INAUDIBLE] learning that seem intractable.
But of course, at the same time, we know brains do a pretty good job of these tasks.
So, there's got to be some approximate methods, and we have already some,
that are doing very well on these things.
And that's what the research is really about.
KARASICK: John?
SMITH: Yes.
I think this is a good question to ask, is there hype and if there's hype, why is there hype?
I think the one thing that's clear is there's a lot of attention being given to deep learning.
But to some extent, it is warranted because performance is there.
And it's hard to argue against performance.
So, for many years, my colleagues here, Professor Bengio has worked on neural nets,
and it actually took, I think, the timing of many things at once,
with Fei-Fei's work on ImageNet sort of coming at the right time with computation
that actually let people realize that there's a class of problems
that were previously very difficult,
like classifying 1,000 object categories, are now essentially solvable.
So, I think what we're seeing, some of these tasks
which we thought were very difficult are now solved.
So, ImageNet, you know, 1,000 categories, is essentially solved.
Some other data sets, like Labeled Faces in the Wild,
which is face recognition, essentially solved.
So, I think it's hard to argue against that kind of performance.
And I think the question for us now is, what else should we do?
So, it is a shiny object.
But there's a lot more out there, at least in the vision sense.
I think we know very little about what the world looks like
or how to teach a computer what the world looks like.
But I think we're in a very good time now that we have this shiny object and we can think
about scaling it to a much larger set of tasks.
KARASICK: Thanks.
One of the...this is a good segue to something else I wanted to talk about.
One of the things that causes me to have one of the most fun jobs on the planet,
which is managing a bunch of researchers and developers building up Watson, is bridging
between, frankly...what did you say?...
constantly changing technologies and sort of the pragmatics of those pesky customers
who want to use the system to do things.
One of the biggest challenges we have is this whole area we talk
about called real world evidence.
And it really is a discussion about reasoning in a particular domain.
So, if you are going to put a system like Watson in front of an oncologist, and we have,
and they're going to ask questions, they're going to get answers.
The first thing they're going to want to know is why.
Why did the linguistic inference engine decide that this particular passage, phrase, document,
was a better answer to the question than that one.
And I also get this when I ask my team about how much fun it is to debug these things;
your [hat on a hat] example, or whatever that was, is maybe a good illustration
of some of the challenges of really trying
to get underneath how these things work fundamentally.
So, how about this notion of why as opposed to what these things do?
Anybody? BENGIO: It's interesting you ask this question.
It's a very common question.
What if we had a human in front of us doing the job?
Sometimes a human is able to explain their choice.
And sometimes they're not really able to explain their choice.
And the way we trust that person is mostly because he does the right thing most of the time
or we have some reasons to believe that.
So, I think there will be progress in our technical abilities to figure out the why,
why is it taking those decisions, but it's always going
to be an approximation to the real thing.
The real thing is very complicated.
You have these millions of computations taking place.
The reason why it's making this decision is hidden in those millions of computations.
And it's going to be true essentially of any complex enough system.
So, the why is going to be an approximation, but still,
sometimes it can give you the cues that you need to figure it out.
But ultimately we can't really have a completely clear picture of why it's doing it.
One thing I want to add is I think there's going to be progress in that direction
as we advance on the natural language side.
For example, think of the example I gave with the images and the sentences.
So, maybe you can think of it this way: the task was not
to actually describe the image but to do something with it.
But now you can ask the computer about what it sees in the image,
even though that was not the task, to get a sense of, you know,
why it's getting things wrong and even ask where it was seeing these things.
So, we can design the system so that we can have some answers,
and the machine can actually talk back in English about what's going on inside.
LI: And just to add, I think, you know, in most of our research,
interpretability is what you call "why."
And a lot of us are making effort into that.
In addition to the image captioning work that both of our labs have worked on, in terms
of not only generating the sentence but grounding the words back
into the spatial region where the words make sense.
For example, we're recently working on videos, using a lot
of the attention-based LSTM models, and there we're looking
at how we can actually explain, using some of these attention models,
where actions are taking place in the spatiotemporal segments of a long video.
So, all these attempts are trying to understand the why question
or at least make the model interpretable.
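As a rough sketch of the soft-attention idea Li describes (the shapes and features here are hypothetical stand-ins, not her lab's actual model), the attention weights that feed the prediction double as an interpretable signal of where and when the model is looking:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(1)
T, d = 20, 128                       # 20 temporal segments of a video, 128-d features
segments = rng.normal(size=(T, d))   # stand-in for per-segment CNN features
query = rng.normal(size=d)           # stand-in for the LSTM's current state

scores = segments @ query            # one relevance score per segment
alpha = softmax(scores)              # attention weights over time, sum to 1
context = alpha @ segments           # weighted summary fed on to the classifier

# Interpretability for free: the largest weights point at when the action happens.
print("most attended segments:", np.argsort(alpha)[::-1][:3])
```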
SMITH: Yes, I think the question of why is actually a very important one in applications.
I think particularly as we look to apply deep learning techniques in industry problems.
I'll give one example.
So, one of the areas that we're applying deep learning techniques is
around melanoma detection.
So, looking at skin cancer, looking at skin lesion images
and essentially training the computer based on those types of lesions.
And what we know is possible, actually...the essential value proposition
of deep learning is that we can learn a representation from those images,
from those pixels, that can be very effective for then building discrimination and so on.
So, we can actually get the systems to be accurate using deep learning techniques.
But these representations are not easy for humans to understand.
They're actually very different
from how clinicians would look at the features of those images.
So, around melanoma, around skin lesions in particular,
doctors are trained to look at sort of ABCDE.
Asymmetry, border, color, diameter, evolution, those kinds of things.
And so when our system is making some decisions about these images,
it's not conveying that information in ABCDE.
So, it actually can get to a better result in the end,
but it's not something that's easily consumable by that clinician,
ultimately who needs to make the decision.
So, I think we have to...we do have to think about how we're going to design these systems
to convey not only final classifications, but a set of information,
a set of features in some cases that make sense to those humans who need...
BENGIO: You could just train it to also output...
SMITH: You can do that, yes, absolutely.
Yes. Right.
So, I think there are things that can be done, but the applications may give these requirements
and it may influence how we use deep learning.
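One hedged sketch of that suggestion in the melanoma setting (the architecture and the ABCDE targets here are hypothetical, not IBM's actual system): share a learned representation, but add a second head trained to predict clinician-style attributes, so the network reports something consumable alongside its diagnosis.

```python
import torch
import torch.nn as nn

class MelanomaNet(nn.Module):
    def __init__(self, dim=512):
        super().__init__()
        # Stand-in for a CNN feature extractor over the lesion image.
        self.backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, dim), nn.ReLU())
        self.diagnosis = nn.Linear(dim, 2)   # benign vs. malignant
        self.abcde = nn.Linear(dim, 5)       # asymmetry, border, color, diameter, evolution

    def forward(self, x):
        h = self.backbone(x)
        return self.diagnosis(h), torch.sigmoid(self.abcde(h))

model = MelanomaNet()
images = torch.randn(8, 3, 64, 64)           # dummy batch of lesion images
labels = torch.randint(0, 2, (8,))           # dummy diagnoses
attrs = torch.rand(8, 5)                     # dummy ABCDE annotations in [0, 1]

logits, attr_pred = model(images)
loss = nn.functional.cross_entropy(logits, labels) \
     + nn.functional.binary_cross_entropy(attr_pred, attrs)
loss.backward()   # both heads shape the shared representation
```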
KARASICK: Yes.
I think there's going to be kind of a long interesting discussion as you look at the use
of these algorithms in regulated settings, how to characterize them in such a way
that the regulators are happy campers, whatever the technical term is.
So, let's continue on this discussion around, if you like, domains.
One of the things that I've seen about systems like this, you know,
the notion of what's an application is a function of who you are.
So, an application of deep learning, talk about image, speech,
question and answer, natural language processing.
When you climb up into a, if you like, an industrial domain, the things that people
who give IBM money understand, banks, governments, insurance companies,
now increasingly folks in the healthcare industry, there's really a lot of very,
very deep domain knowledge that we have used to train systems like Watson.
One of the things that's both a blessing and a curse
with deep learning is this...you get taken away from some of the more traditional things
like feature engineering that we've all seen.
But on the other hand, the feature engineering
that you see really embeds deep understanding and knowledge of the domain.
So, to me, and I'm pretty simple-minded about this stuff, we are going to have
to see how these two different worlds come together so that we can mix understanding
and knowledge and reasoning in a domain with the kinds of things that we're beginning to see,
you know, starting with classification and lifting up on deep learning.
So, research in this area?
What are people...
BENGIO: So, first of all, if you have features that you believe are good,
there's nothing that prevents you from using them as extra input.
KARASICK: Absolutely.
BENGIO: You can use the raw thing.
You can use your features.
You can use both.
That's perfectly fine.
But you have to sometimes think of it, where are you going to put them in the system.
But typically, there's nothing that prevents you from using them.
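A minimal sketch of that point (the network and feature dimensions are hypothetical): hand-engineered features, say SIFT- or HOG-style summaries, can simply be concatenated with the representation learned from raw pixels before the final classifier.

```python
import torch
import torch.nn as nn

class HybridNet(nn.Module):
    def __init__(self, n_engineered=32, n_classes=10):
        super().__init__()
        self.conv = nn.Sequential(             # learned features from raw pixels
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.head = nn.Linear(16 + n_engineered, n_classes)

    def forward(self, image, engineered):
        h = self.conv(image)                   # (batch, 16)
        return self.head(torch.cat([h, engineered], dim=1))

model = HybridNet()
out = model(torch.randn(4, 3, 32, 32), torch.randn(4, 32))  # raw images + feature vectors
print(out.shape)   # torch.Size([4, 10])
```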
Also, researchers working on deep learning have been very creative in ways
of incorporating prior knowledge.
So, in computer vision, they could tell you
about the different approaches that people have used.
There are lots of things we know about images that we can use, essentially,
to provide more data, more examples.
Like transformations of images.
And of course, the architectures we're using, like the convolutional nets,
themselves incorporate prior knowledge.
And we can play with that if we have other kinds of knowledge,
we can sometimes change the architecture accordingly.
And one of the most powerful ways in which we can incorporate prior knowledge is
that we have these intermediate representations and we can preassign meaning
to some of these representations.
You could say, well, okay, so that part of the representation is supposed
to capture this aspect of the problem and that part is supposed to capture this aspect
of the problem, and we're going to structure the architecture so that it really takes advantage
of this interpretation we're giving.
Even though we don't tell it precisely what the output should be for these hidden units,
we can wire the network in such a way that it takes advantage of this a priori notion.
And there's a lot of work in that kind of...and many other creative ways
to put in...just...it's just different from the ways it's been done before.
KARASICK: Absolutely.
BENGIO: But, there are many things that can be done to put in [INAUDIBLE].
In fact, a lot of the papers in machine learning are about exactly doing that.
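For instance, those label-preserving transformations of images can be spelled out as a standard augmentation pipeline; this sketch assumes 32x32 inputs and uses torchvision's transforms as one common way to do it.

```python
import torchvision.transforms as T

# Each transform encodes something we know about images a priori:
augment = T.Compose([
    T.RandomHorizontalFlip(),         # a mirrored cat is still a cat
    T.RandomCrop(32, padding=4),      # small translations do not change the label
    T.ColorJitter(brightness=0.2),    # mild lighting changes preserve identity
    T.ToTensor(),
])
# Applied per PIL image at training time, e.g. x = augment(pil_image),
# giving the network a fresh random variant of every example, every epoch.
```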
LI: Yes, so, I also want to add that, first of all, knowledge,
this word knowledge, doesn't come in just one level.
If you think about Marr's description of the visual system, you know, there are many layers.
I think there is a common misperception, and it's a sexy, easy story to tell,
but it's kind of misguided, which is that a lot of people think that before the resurgence
of deep learning and convolutional neural networks,
the entire field of computer vision was a bunch of us engineering features.
And it's true that feature engineering was a big chunk of computer vision research,
and that some of you might know those famous features called SIFT or HOG.
But really, as a field, not only were we looking at features,
we were also looking at other forms of knowledge.
For example, camera models.
To this day, the knowledge about perspective, about transformations, is important.
And as you look at other aspects of knowledge, there are relationships,
there are physical properties of the world, there are interactions, there are materials,
there are affordances, and there's a lot of structure.
So, I agree with Yoshua: a lot of our current research, even in the past,
but now with the power of deep learning, is thinking about how we continue
to make models expressive, or capable of encoding interesting knowledge structures
or acquiring knowledge structures to serve the end task.
And one of the things I'm excited about is this blending
of more expressive generative models, or knowledge-based or relationship-based encoding,
with the power of deep learning and so on.
So, it's a really rich research area.
KARASICK: Thank you.
John? SMITH: Yes, so I would certainly agree with Yoshua and Fei-Fei.
I do think that deep learning has brought some degree
of soul searching by computer vision scientists.
I think particularly, you know, because feature engineering, actually, that was the chase.
That was the race.
Until 2012, you know, it was who has the best feature.
And SIFT was getting a huge amount of attention.
So, I think the field was really hyper focused on creating the next best feature.
But, I think now we know with deep learning, the computer can come up with features better
than the best computer vision scientists.
And that's even on problems like ImageNet.
And yet there are many even more complex problems out there that deal with video
and motion and dynamic scenes where I think it's even harder for humans
to know what is the right feature in those situations.
I think there's even more potential here for deep learning to win out in the end.
But I think knowledge is actually coming into multiple places.
I think Yoshua talked about sort of transfer learning, you know, types of uses;
so when you train from data, those sets
of representations often...they can become knowledge
that gets reused in other applications.
So, there are lots of examples of learning on ImageNet,
then taking those features essentially and going and doing a completely different task
and actually doing very well on that task.
So, I think there's a lot of knowledge there in these networks that get trained.
And that can be stored and shared and reused and so on.
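A minimal sketch of that reuse (torchvision's ImageNet-pretrained ResNet-18 serves here as one common source of such features, using the `weights=` API of recent torchvision versions; the 5-class target task is hypothetical): freeze the pretrained features and train only a small new head.

```python
import torch
import torch.nn as nn
from torchvision import models

backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
for p in backbone.parameters():
    p.requires_grad = False                    # keep the ImageNet "knowledge" fixed
backbone.fc = nn.Linear(backbone.fc.in_features, 5)   # new head for a 5-class task

optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
x = torch.randn(2, 3, 224, 224)                # dummy batch from the new domain
logits = backbone(x)                           # features transfer; only the head trains
```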
And I think there's still a lot that has to get imposed from the outside,
around the semantic structures.
So, we seem to have a very good start on objects and how they should be organized.
There's work happening around places.
But there are other such categories, related to people and actions and activities and events
and in domains, you know, food and fashion, and how to organize all of that, I think,
takes some human effort; but if we do that, then the computer can come and take over.
KARASICK: Okay.
One more question from me and then we'll open it up to the audience.
I would have...actually, I guess you've all talked about challenges.
So, John, with your background in image processing, and, Fei-Fei, with your work
on ImageNet, the same thing; Yoshua, you talked about the challenges recently.
Are we doing enough of those?
BENGIO: No.
KARASICK: So, throw one out.
BENGIO: So, so, I believe that scientific and technical progress has been much slower
than it could have been if we had been less obsessed by the short term objectives.
Since I started my [INAUDIBLE] research, I've seen this in my own work:
I've been spending too much time going for the low-hanging fruit, and the big ideas
that require more thinking and more time have sort of paid the price for that.
And I think as a community of researchers and people who explore that field,
I think we are still way too much into the short-term questions
and not enough spending time on understanding.
And when we understand something, we get much more powerful.
We can build better bridges, better AI, if we understand what we do.
Of course, understanding doesn't give you immediate benefits,
but it's really an important investment.
Of course you have to do both, right?
There's a right balance.
But I think especially with what's happening right now in deep learning, people are jumping
on it immediately to build products and that's useful but we shouldn't forget
that the science needs to continue because these challenges are still there.
KARASICK: So, Fei-Fei, image net has been probably a pretty interesting journey.
I mean, if you think back to where you started and where it is now, you know,
John says, I think rightly, that for a thousand categories it's essentially solved.
So, you're thinking about what next.
But, you know, do you think this really helped push the state of the art forward quickly?
I know it's been kind of entertaining for spectators like me, you know,
notwithstanding companies cheating.
It's always been interesting to read about.
But where is it going?
LI: All right.
So, I agree with Yoshua.
I think that there's still a huge place for doing the right challenges.
Standing where we are in that continuum,
not for the sake of immediate products,
but really for pushing scientific advances to the next stage.
You know, I remember in 2007, when my student, now a professor at Michigan, [INAUDIBLE],
and I started ImageNet with Professor Kai Li at Princeton,
we were asking ourselves the question, why do we want to do this?
Is this really, you know, the bigger the better?
Because there was Caltech 101.
And I really remember that we had this conversation and convinced ourselves
that we were going to challenge ourselves and the field with a data set of that size,
one that would have to push for new machine learning.
That was what our goal was.
I even told my student: my prediction is that the first users,
the successful users of ImageNet, will be coming from the machine learning community rather
than our own core computer vision community.
So, following that same kind of hypothesis, I'm putting a lot of thought right now again
about what we, my own lab as well as our field can do with the next set of challenges.
And there's some really budding interesting challenges coming up.
I really, really like the [Allen] Institute's challenge.
That's very NLP-oriented: you know,
they're going through the different grades of school.
I think right now it's eighth grade science,
an eighth grade science exam challenge, where you get the Q
and A going, the reasoning going.
I think we need more of that.
In my own lab, probably within a month or two,
we'll roll out...I would not call it ImageNet 2.0.
We'll roll out a different data set, called the Visual Genome data set, which is really going
after deeper knowledge structure in the image world, focusing on grounding and relationships,
and we hope that the next set of challenges surrounding pictures focuses
more on relations, affordances, attributes,
that kind of challenge, rather than categorization.
And one more thing, which probably John can comment on, is another area of challenge
that I hope some people can take on the task of putting together: dynamic scenes.
I think there is a huge space to be explored in videos, both the Internet kind of videos
as well as videos that are more situational, like robotic videos.
So, I believe that there's a huge opportunity still for different challenges.
KARASICK: So, John, I mean, you know, I work for a company where throwing challenges in front
of researchers is part of our DNA.
So, you must think about this a lot.
SMITH: Absolutely.
I think we're talking about two things here.
One is evaluations and the other is challenges.
I think with evaluations, you know,
we've seen that they're absolutely essential to making progress in this area.
Everything from what [TREC] has done at [NIST], you know, for information retrieval
and language translation and speech transcription
and image and video retrieval and so on.
These things have been the way to sort of gauge and measure progress
as these technologies develop, so we need those.
But, yes, I think we need also, you know, big ideas around grand challenges.
I think video is still ripe here.
I think we need a lot more progress on that.
And certainly I see opportunities to put things together, like question answering on video
and captioning of video and all of this.
I think we're starting to see that.
I think multi-modal, multi-media data is actually going to be very important.
So, how do we combine the audio and the language and what we see
and all of these modalities together with sensors.
It could be inertial sensors on a mobile device to give an understanding of the world.
And I think where it's all heading is actually just that,
which is, how can we really completely understand the world?
So, not just know what objects may be where or what landmarks but really what happens where.
I mean, just really being able to sort of capture, sense, and understand
and model what's going on in our daily lives.
So, lots of challenges; you know, I think the real challenge is getting enough time
and people and effort.
KARASICK: It's not boring.
SMITH: Yes.
KARASICK: So, with that, I'd like to throw it open.
I don't know where the microphones are.
Where are they?
They're coming.
In the front here.
>> Thank you.
Before asking my question, I want to echo one thing you said,
that the low-hanging fruit sometimes is the one that's starting to rot,
so you should not all be content with just helping yourselves to that.
My question is, I noticed something that I think is a good thing
and my question is whether the three of you agree or disagree or what,
is that the original...I don't know, purity or narrowness depending upon your point of view
of neural nets is starting to fade and you're incorporating all kind of ideas,
I think good ideas from other areas; for example, in your image caption description,
you get two different kinds of networks, a convolutional network
and another one that's more like an HMM or CRF type structure around the output side.
And now you're talking about multimedia, where you have different kinds of inputs,
and so you have to treat them in somewhat different ways.
So, do you think that these sorts of more hybrid approaches
that give power also have a drawback, in that you have to train each part somewhat separately,
or do you think that this is a great idea because it reflects maybe the fact
that different parts of a brain don't necessarily work in exactly the same way?
BENGIO: So, first of all, we can train these things together.
They don't have to be trained separately.
Second, they're called recurrent nets and they're very different from CRFs and HMMs.
Now, for the bigger question you're asking, of course it is important to play
with these architectures and combine the appropriate types of pieces for the task.
If you're dealing with images, you don't necessarily want the same kind
of computations as when you're dealing with words.
Now, it's all implemented with neurons.
It's just that they're wired in different ways and in the case of images,
we use these translation-invariance properties of images
that save us training data and training time.
And we are adding more components;
I mentioned the memory component, and you heard a little bit about it.
That's something that's also present in the brain.
We have a lot of these things in the brain, and there are many more things in the brain.
As we were saying, it's incredibly complex and has many different kinds of components.
At the same time, what's interesting is that all these components that we're currently using
in deep learning use the same learning principle, and this is probably true
to some extent, at least in cortex: that there's sort of one general recipe.
This is a hypothesis; we don't know.
And then that's really interesting because that means we can reuse
that principle for many types of jobs.
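A toy sketch of that joint wiring (the sizes and vocabulary are hypothetical; this is a generic encoder-decoder captioner, not any panelist's specific model): a convolutional encoder and a recurrent decoder are wired differently but share one loss and one backward pass.

```python
import torch
import torch.nn as nn

class Captioner(nn.Module):
    def __init__(self, vocab=1000, dim=256):
        super().__init__()
        self.cnn = nn.Sequential(              # convolutional piece: exploits translation invariance
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, dim))
        self.embed = nn.Embedding(vocab, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)   # recurrent piece: handles the word sequence
        self.out = nn.Linear(dim, vocab)

    def forward(self, image, tokens):
        h0 = self.cnn(image).unsqueeze(0)      # image summary seeds the recurrent state
        seq, _ = self.rnn(self.embed(tokens), h0)
        return self.out(seq)                   # next-word logits at each step

model = Captioner()
images, tokens = torch.randn(2, 3, 64, 64), torch.randint(0, 1000, (2, 7))
logits = model(images, tokens)
loss = nn.functional.cross_entropy(logits.reshape(-1, 1000),
                                   torch.randint(0, 1000, (2, 7)).reshape(-1))
loss.backward()                                # one gradient flows through both pieces jointly
```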
LI: I don't think I can add much.
I totally agree with you, Yoshua.
And, you know, if you think about the evolution of the brain,
the evolution is not that pristine, right?
Like, nature just patches parts together, in a way.
So, I think there is some, you know, just some fact of life there, that these kinds of models,
whether it's the biological model here
or the actual computer algorithm, grow in a dynamic way.
SMITH: Yes, I think this is in part where we're heading with Watson Developer Cloud, which is
to essentially put out all of these individual trainable components that address everything
from language to speech to vision to question answering.
And then these become the building blocks for creating potentially much more complex systems
and solutions, adapting to industry problems and so on.
So, I think we definitely see all of these pieces coming together.
KARASICK: Next question.
Who's got a microphone.
Anybody? Go ahead.
>> Yes. [INAUDIBLE] theoretical challenges that lie in truly understanding these networks.
I was wondering if the [INAUDIBLE] can be expanded upon, as well as what kinds
of new mathematics are going to be required to really solve these problems.
KARASICK: I didn't quite catch that.
LI: Can you repeat the question?
>> I didn't hear everything.
>> Yes. So, [INAUDIBLE] to truly understand,
we need to have a [INAUDIBLE] foundational mathematical understanding of the networks,
what kinds of new mathematics are [INAUDIBLE]
and [INAUDIBLE] specifically state some outstanding theoretical challenges.
BENGIO: I think the theoretical understanding of deep learning is really something interesting
and because of the success of deep learning, there are more and more mathematicians,
applied math people mostly, getting into this field.
And it's great.
There's something interesting with neural net research in general,
which is that the theory tends to come after the discoveries,
because these systems are very complex.
And mostly, we move forward using our intuitions and we play with things and they work
and then we start asking why and maybe a different set
of people are able to answer these questions.
Right now, we're using the same old mathematics.
Although there are people who are answering some of the questions with aspects of math
that I know much less, like topology and things like that.
So, there's a lot of potential for understanding,
there's a lot of complex questions that we don't understand.
But there's been a lot of progress recently in theory.
So, we understand a lot better the problem of local [INAUDIBLE].
We understand a lot better the expressive power of deep nets, and we understand a lot better some
of the probabilistic properties of different algorithms.
So, by far, you know, we haven't answered all the questions.
But there's a lot of work and a lot of progress.
KARASICK: So, none of you mind being characterized as experimentalists.
It's something I've observed too.
There's a lot of intuition and then...
>> Yes. KARASICK: ...
[INAUDIBLE] exactly?
BENGIO: Yes.
Intuition is crucial in this because, right now, you know, mathematical analysis is limited
in what it can predict.
So, a lot of it is intuitions and experiments to validate hypotheses.
So, there's a lot of the scientific cycle of trying to propose hypotheses,
explain what we observe, and then testing
with experiments rather than proving things mathematically.
LI: And just to add, at Stanford you see this, because first it was the people
in the CS department making these algorithms, and now there's a lot more interaction
between statisticians and the machine learning community here at Stanford.
Some of our youngest faculty hires are actually paying the most attention from the side
of the statistics department, or even applied math, looking
at potential theories behind these algorithms.
So, I predict we'll see more statisticians coming to explain some of what we do here.
KARASICK: Next question.
Over here.
>> [INAUDIBLE] from UC San Diego.
So, neural networks as universal function approximators have been known theoretically
for several decades, perhaps even almost half a century by some accounts,
and even deep architectures were explored many, many decades ago.
And so, I guess, what I'm trying to say is it seems like the idea
of supervised learning theoretically being solved has been well known for a long time,
and now, thanks to computing power and lots and lots of data, labeled data,
we're seeing amazing performance, right?
But I'm not sure whether there are similar guarantees in the unsupervised learning realm, right?
And I think that unsupervised learning is especially powerful in the case
of what we talked about earlier, this 90 percent of dark data.
Some...I guess we don't have that...question we don't have...has to be supervised learning.
So, is there any theory for the direction we need to go for unsupervised learning?
BENGIO: There is.
There is. So, first of all, regarding the universal approximation properties,
we have similar theorems for a number of [INAUDIBLE].
So, we can prove...so I worked on universal approximation theorems
for Boltzmann machines and DBNs and RBMs.
And you can reuse results from the supervised learning framework
for other unsupervised learning methods like [INAUDIBLE] encoders and things like that.
So, that question is pretty much settled.
That being said, there are other theoretical questions to which we don't have answers
in unsupervised learning that have to do with these intractabilities
that are inherent in solving this problem.
So, there's a lot more to be done there for sure.
LI: And just to add to that, there is some evidence from nature that this might also exist.
I mean, the paper by Olshausen and Field in 1996, you know, the unsupervised sparse coding model
for V1 receptive fields, is kind of very nice evidence showing that these kinds of features
emerge from just unsupervised training on the natural statistics of images.
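A rough sketch of that kind of experiment (random noise stands in for real natural-image patches here, so the learned dictionary will be uninteresting; on real patches the atoms come out oriented and localized, Gabor-like; sklearn's dictionary learning is used as one convenient stand-in for the original sparse coding procedure):

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)
patches = rng.normal(size=(5000, 16 * 16))    # stand-in for 16x16 natural-image patches
patches -= patches.mean(axis=0)               # center the data

# Sparse coding: learn a dictionary such that each patch is a sparse
# combination of its atoms (roughly the Olshausen & Field objective).
learner = MiniBatchDictionaryLearning(n_components=64, alpha=1.0, random_state=0)
learner.fit(patches)
filters = learner.components_.reshape(64, 16, 16)  # atoms play the role of receptive fields
print(filters.shape)                               # (64, 16, 16)
```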
KARASICK: Do we have time for one more, or are we done?
One more, okay.
Make it good.
Last question.
In the back.
>> Make it good.
I don't know if it's a good question.
I'm mostly an outsider.
I am...I heard a lot of machine learning applications to classification [INAUDIBLE]
and whatnot, but I'm wondering if deep networks are a good match for modeling complex systems.
For example, to predict the reading of a sensor in a complex piece of machinery,
or to predict the trajectory of something moving in a complex situation.
So, continuous problems rather than discrete problems.
KARASICK: Hell, yes.
Anybody? BENGIO: I could say something about this.
There's actually substantial work being done on continuous-valued signals,
such as handwriting, the trajectories of handwriting, such as [INAUDIBLE] signals.
And now, there's a lot of work as well in control.
So, I don't see any [INAUDIBLE] reason why these kinds of applications would not be feasible.
There's lots of evidence that, you know, some work may be needed.
But I don't see any problem in particular.
SMITH: Yes.
And we're definitely seeing that; also, looking at objects
and their trajectories in video is one example.
I think there's a lot of work around manifolds and real-valued data
and so on where there are applications.
So, I don't know that there's any fundamental limitation there.
KARASICK: Thank you.
So, everybody please join me in thanking the panel.
[APPLAUSE]