Behind some of the coolest premium effects in Hollywood content is the invisible aid of AI, artificial intelligence.
It is just blowing the doors wide open on opportunities for new ways to tell stories.
This is a good technology to hang our hat on because it is getting so much better every single year.
Machine learning is being baked into workflows, helping create previously unimaginable moments, from big blockbusters to nonfiction TV. I think where AI really is impactful is getting it to do things that human beings can't do, including raising the dead. As if, you know, you have Andy Warhol standing in the studio right in front of you, and you looked at him and said, I want you to say it like this.
I wasn't very close to anyone, although I guess I wanted to be. Let's examine a few specific use cases of how AI is changing Hollywood's creative workflow.
The entertainment industry was spawned by new technology.
So it makes sense that from talkies to television to digital video, Hollywood has a history of leveraging new tech, especially in the world of visual effects.
When I saw Jurassic Park, that was the moment that I realized that computer graphics would change the face of storytelling forever.
In the last 25 years that I've been working in film, we've been conquering various challenges: doing digital water for the first time in Titanic, doing digital faces for the first time in a movie like Benjamin Button. And now the state of the art is machine learning, AI applications like the kind Matt's company MARZ develops in house. You can throw it, you know, an infinite amount of data, and it will find the patterns in that data naturally.
Thanks to thirsty streaming services, Hollywood is scrambling to feed demand for premium content rich in visual effects. Budgets and time are not growing in a way that corresponds to those rising quality expectations.
It's outpacing the number of artists that are available to do the work, and that's where AI comes in, tackling time-consuming, uncreative tasks like denoising, rotoscoping, and motion capture tracking marker removal.
This was our first time ever trying AI in a production. We had a lot of footage just by virtue of being on the project and doing 400 shots for Marvel.
When we received the footage, which we call the plates: in order to manipulate Paul Bettany's face, there needed to be tracking markers during principal photography. We looked at it and we said, okay, well, it's going to take roughly ten days per shot to replace or partially replace Vision's head, and a shot is typically defined as about five seconds of footage.
The tracking marker removal itself was about 1/10 of that.
So on a ten-day shot, one day was simply removing tracking markers. We developed a neural net where we are able to identify the dots on the face. The artificial intelligence averaged out the skin texture around the dot, removed the dot, and then infilled with the average of the texture surrounding it.
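To make that detect-average-infill idea concrete, here is a minimal sketch using classical tools (OpenCV's inpainting) rather than MARZ's actual proprietary network; the file names and thresholds are illustrative assumptions.

```python
# Hypothetical sketch of the marker-removal idea using classical tools
# (OpenCV inpainting), not MARZ's actual neural network.
import cv2
import numpy as np

def remove_tracking_dots(frame: np.ndarray) -> np.ndarray:
    """Detect small dark dots and fill them from surrounding skin texture."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Dots are small, locally dark blobs; comparing each pixel against a
    # heavily blurred copy of the frame flags them.
    blurred = cv2.medianBlur(gray, 21)
    mask = (blurred.astype(np.int16) - gray > 40).astype(np.uint8) * 255
    # Dilate slightly so the fill covers each dot's soft edge.
    mask = cv2.dilate(mask, np.ones((5, 5), np.uint8))
    # Telea inpainting propagates neighbouring texture into the masked dots:
    # the "average of the texture surrounding it" described above.
    return cv2.inpaint(frame, mask, inpaintRadius=5, flags=cv2.INPAINT_TELEA)

frame = cv2.imread("plate_frame.png")  # hypothetical plate frame
cv2.imwrite("clean_frame.png", remove_tracking_dots(frame))
```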
And Marvel loved it because it sped up production. They saved money, exactly what we wanted these solutions to do.
Where the solution was faltering was whenever there was motion blur.
When Paul Bettany moves his head very quickly to the right or to the left, there are moments where those dots will reappear, partially because in the data set itself we didn't have enough motion blur data.
Another example would be whenever the character turned his head so that his eyes were out of the screen; you would see those dots reappear as well.
The AI recognition is using the eyes as a kind of crucial landmark to identify the face. And so if I turn my head this way and you can't see my eyes, well, the AI can't identify that as a face anymore.
You can fix those things with more data. The more data you feed these things, typically the better, right?
There wasn't a lot of clean data available in our next AI use case: the star of the film had been dead for 25 years, yet the director wanted more than 30 pages of dialogue read by iconic artist Andy Warhol himself.
So what do you do?
You could hire, like, a voice actor to do a great impersonation, but we found with these voices you kind of wanted to retain that humanness that Andy had himself. You can get fairly close with a voice actor, but you really can't get it.
And that's where AI technology really helps. Generative audio is the ability for an artificial agent to reproduce a particular voice, but also reproduce the style, the delivery, the tone of a real human being, and do it in real time.
Welcome to Resemble, a generative audio engine.
When the team initially reached out to us, they proposed what they were going to do.
We asked them, like, okay, well, what kind of data are we working with?
And they sent us these audio files, recordings over a telephone.
They're all from the late seventies, mid seventies.
The thing about machine learning is that bad data hurts a lot more than good data helps.
So I remember looking at the data we had available and thinking, this is going to be really, really difficult to get right with three minutes of data. We're being asked to produce six episodes' worth of content with three minutes of his voice.
So with three minutes, he hasn't said every word that's out there. So we're able to extrapolate to other phonetics and to other words, and our algorithm is able to figure out how Andy would say those words.
That's where neural networks are really powerful. They basically take that speech data, break it down, and understand hundreds and thousands of different features from it.
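As a rough illustration of what "breaking speech down into features" can look like in practice, this sketch computes log-mel spectrogram frames with librosa, a standard low-level representation that voice models learn from. The clip name is hypothetical, and this is an illustration of the concept, not Resemble's actual pipeline.

```python
# Toy illustration: decompose speech into mel-spectrogram frames, the kind
# of low-level representation a voice model learns from.
import librosa

y, sr = librosa.load("warhol_phone_call.wav", sr=22050)  # hypothetical clip
# 80 mel bands per ~12 ms hop: even a few minutes of audio yields tens of
# thousands of feature vectors describing the voice's characteristics.
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=80, hop_length=256)
log_mel = librosa.power_to_db(mel)
print(log_mel.shape)  # (80, n_frames)
```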
Once we have that voice that sounds like Andy from those three minutes of data, then it's all about delivery; it's all about performance.
I went down to the office because they're making a robot of me. And Andy's voice, it's highly irregular, and that's where the idea of style transfer really came in.
So style transfer is this ability for our algorithm to take as input a voice, someone else's speech.
I wasn't very close to anyone, although I guess I wanted to be. We're able to say that line, and then our algorithms are able to extract certain features out of that delivery and apply them to Andy's synthetic, or target, voice.
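Here is a small sketch, under stated assumptions, of the delivery features such a style-transfer step might extract from a reference reading: a pitch contour and a frame-level energy curve, computed with librosa. The file name is hypothetical, and this illustrates the concept rather than Resemble's implementation.

```python
# Hedged sketch of the prosody-extraction side of style transfer: pull a
# pitch contour and an energy curve from a reference delivery, the kinds
# of features that could be re-applied to a target voice.
import librosa
import numpy as np

ref, sr = librosa.load("voice_actor_line.wav", sr=22050)  # hypothetical file

# Fundamental-frequency contour: the "melody" of the delivery.
f0, voiced_flag, _ = librosa.pyin(
    ref, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C6")
)

# Frame-level energy: where the emphasis lands.
rms = librosa.feature.rms(y=ref)[0]

# A style-transfer model would condition the target-voice synthesizer on
# contours like these instead of generating its own flat delivery.
print(np.nanmean(f0), rms.argmax())
```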
The first one was automatically generated.
No touch ups.
I wasn't very close to anyone although I guess I wanted to be.
The second one was, like, touched up by adding a pause.
I wasn't very close to anyone although I guess I wanted to be.
And then the third one was basically adding the final touch, where it's like, okay, you know what? I really want to place an emphasis on this particular syllable. So, yeah, let's get a voice actor to do that part, to actually place that emphasis on the right words, the right syllable.
And then the third output has those features extracted from that voice-over actor and applied to Andy's voice.
I wasn't very close to anyone although I guess I wanted to be.
You have definitely heard AI voices being used in the past for touch-ups for a line here or there. This is probably the first major project that's using it so extensively.
Most of the effects are still a very manual process. Characters can be extremely challenging: creatures, things like fur and hair. Those things can be extremely challenging and time consuming.
One notable example of where the technology is headed is the scenes involving advanced 3D VFX in Avengers: Endgame, where Josh Brolin plays Thanos.
We capture tons and tons of data in this laboratory setting with Josh, and then we use that data to train neural networks inside of a computer to learn how Josh's face moves.
They'll say lines, they'll look left, look right. They'll go through silly expressions, and we capture an immense amount of detail in that laboratory setting.
Then they can go to a movie set and act like they normally would act.
They don't have to wear any special equipment.
Sometimes they wear a head camera but it's really lightweight stuff.
Very unobtrusive and allows the actors to act like they're in a normal movie.
Then later, when the animators go to animate the digital character, they kind of tell the computer what expression they want the actor to be in, and the computer takes what it knows based on this really dense set of data and uses it to plus up, to enhance what the visual effects animator has done and make it look completely real.
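A minimal sketch, under assumed simplifications, of the learning step in a pipeline like this: a small PyTorch network that maps tracked 2D facial landmarks to digital-character rig controls such as blendshape weights. The dimensions, names, and random stand-in data are all hypothetical illustrations, not the studio's actual system.

```python
# Hypothetical sketch: learn a mapping from tracked facial landmarks to
# digital-character rig controls (blendshape weights). Illustrative only.
import torch
import torch.nn as nn

N_LANDMARKS = 150    # tracked points on the actor's face (assumed)
N_BLENDSHAPES = 60   # rig controls on the digital character (assumed)

model = nn.Sequential(
    nn.Linear(N_LANDMARKS * 2, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, N_BLENDSHAPES),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

# landmarks: lab-captured 2D points; targets: matching rig values.
landmarks = torch.randn(1024, N_LANDMARKS * 2)  # stand-in for real capture
targets = torch.rand(1024, N_BLENDSHAPES)

for epoch in range(10):
    optimizer.zero_grad()
    loss = loss_fn(model(landmarks), targets)
    loss.backward()
    optimizer.step()
```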
So there will come a time in the future, maybe it's 10 years, maybe it's 15 years, but you will see networks that are going to be able to do really creative stuff.
Again, that's not to suggest that you remove talented artists from the equation, but I mean, that's the bet that we're taking as a business. Is AI gonna take over my job?
What I see happening right now is actually quite the opposite: it is creating new opportunities for us to spend the time on doing things that are creatively meaningful rather than spending lots of time doing menial tasks. We're actually able to focus on the creative things, and we have more time for iteration.
We can experiment more creatively to find the best looking result.
I think that the more that AI can do the menial stuff for us, the more we're going to find ourselves being creatively fulfilled.
Again, the argument for us is, like, really creating content that isn't humanly possible.
So, you know, we're not interested in, like, creating an ad spot that a real voice actor would do, because in all honesty, that real voice actor would do way better than the AI technology would, and would be way faster, if you're just delivering a particular sentence or a particular line.
The technology to do deepfakes is so prevalent. You can get apps on your phone now that pretty much can do a rudimentary deepfake. It's going to be interesting in the future: are we going to have to put limits on this technology?
How do we really verify what's authentic and what isn't? There are sort of social repercussions for it as well that I think we don't quite understand yet.
I absolutely believe that this technology could be misused.
Our number one priority is to make everyone feel comfortable with what we're doing.
I think it comes down to educating the general population, eventually making them understand that they should think through whatever they're looking at, whatever they're reading, and now whatever they're hearing. We feel we're directionally correct in our bet that this is a good technology to hang our hat on, because it is getting so much better every single year, and we don't want to miss what we see as, like, a once-in-a-lifetime opportunity here.