So yes, today I'm going to talk about eBPF. As Sven said, I work at Aqua Security, where we build tools to help enterprises secure their cloud-native deployments, and eBPF is one of the technologies that we're starting to leverage, in particular with a project called Tracee.
Now, I have done a talk about eBPF before, but today I'm going to do something new: I'm writing my code, or at least my user space code, in Go. It's the first time I've done the talk using Go, so that's the new twist that we're going to tackle today.
So, before I get to the point where I start using Go and writing some code, I should probably start by talking a little bit about what eBPF is.
The key thing is that it lets you run custom programs of your choice in the kernel. It's a Linux kernel technology that lets you, on the fly, add and change programs that run in response to events, and you can change them dynamically without having to reboot the machine, so it's much, much more powerful than writing a kernel module.
We've seen eBPF become a really hot technology over the last few years. Because it's so powerful and because you can hook it into so many different events, we're seeing it used in lots of observability tools, and we're starting to see it used for security as well.
Today, I'm going to talk more about how it works and hopefully give you enough grounding that you can go away and start writing your own eBPF code.
Now, we're going to run some code in the kernel, but as developers we're usually writing applications in user space, so how do we communicate between user space and the kernel?
The answer is system calls.
System calls provide the interface between user space and the kernel.
So, I can make a pretty good guess that there would be a system call related to eBPF, and there certainly is.
If we look up the man page for bpf, we'll find lots of really helpful information about what BPF is and how we use it.
So, first of all, BPF stands for Berkeley Packet Filter.
I'm going to use the terms eBPF and BPF pretty interchangeably.
Historically, it was about filtering network packets, running custom code when a network packet arrived.
That's been extended.
You can now run your BPF programs in response to lots and lots of different types of events, not just the arrival of a network packet.
So, whether it's classic BPF or eBPF doesn't really matter these days.
But in both cases, you're running code in the kernel, and you really, really don't want the kernel to crash or hang.
So, when we want to run some eBPF code, it goes through a verification step to make sure that it's safe to run.
That's something we'll talk a little bit more about later.
So, we have some eBPF code that's going to run in the kernel.
We have user space application code, which is where we, as developers, are normally used to writing code.
And there's a system call interface between the two.
And I think looking at the system calls can actually be quite helpful for understanding what's really happening when we talk about inserting code into the kernel.
So, I'm going to start by using an example.
I'm going to use bpftrace.
This is quite a widely used tool for running BPF scripts on a system.
And so, I'm going to show this partly as an example of the kind of things you can do with BPF, and partly so that we can examine the system calls that happen when we run it.
So, let's explore some bpftrace and the system calls it uses.
So, I have an example that I'm going to use.
It doesn't really matter that much what my example is.
But in this case, I'm setting up a script that's going to run on a tracepoint.
It runs when we hit sys_enter.
The sys_enter tracepoint actually gets triggered for every single system call.
So, I'm going to run this script every time any process on my virtual machine makes a system call.
What that script actually does is set up a counter for the number of times each different command makes a system call.
So, if I try to run that, well, you have to be a privileged user to do it.
That kind of makes sense.
You don't want every unprivileged user running code in your kernel.
So, I can use sudo.
I'll just run that for a few seconds, and then I'll interrupt it.
And it's going to show us, for each different command that's running on my machine, the number of system calls.
I mean, it's quite powerful that we can get to such fine granularity as counting every single system call that's happening on my machine.
So, I'm going to run the same thing again.
But this time, let's have a look at the bpf() system calls that are being made.
So, I do that with strace.
I'm just going to look for bpf() system calls and quit it after a few seconds.
And the thing that I want to show you is these bpf() system calls that are being made.
So, we see a few BPF_MAP_CREATE, we see BPF_MAP_UPDATE_ELEM, and we see BPF_PROG_LOAD.
So, that indicates something about a couple of concepts we need to know about.
Those are BPF programs and BPF maps.
So, we use the same call, but with a different parameter, to manipulate programs and maps.
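As a rough illustration of that "same call, different parameter" idea, here is a minimal Go sketch. The constant values mirror the first entries of `enum bpf_cmd` in the kernel's `linux/bpf.h`; the type `bpfCmd` and the function `target` are hypothetical names made up for this sketch, and nothing here actually calls into the kernel.

```go
package main

import "fmt"

// bpfCmd mirrors the first few values of enum bpf_cmd in linux/bpf.h.
// It is the first argument passed to the bpf() system call.
// (The Go names are illustrative only, not part of any library.)
type bpfCmd int

const (
	mapCreate     bpfCmd = iota // BPF_MAP_CREATE
	mapLookupElem               // BPF_MAP_LOOKUP_ELEM
	mapUpdateElem               // BPF_MAP_UPDATE_ELEM
	mapDeleteElem               // BPF_MAP_DELETE_ELEM
	mapGetNextKey               // BPF_MAP_GET_NEXT_KEY
	progLoad                    // BPF_PROG_LOAD
)

// target reports whether a command manipulates a map or a program.
func target(cmd bpfCmd) string {
	switch cmd {
	case mapCreate, mapLookupElem, mapUpdateElem, mapDeleteElem, mapGetNextKey:
		return "map"
	case progLoad:
		return "program"
	default:
		return "unknown"
	}
}

func main() {
	// The three commands that showed up in the strace output.
	for _, cmd := range []bpfCmd{mapCreate, mapUpdateElem, progLoad} {
		fmt.Printf("bpf(cmd=%d, ...) operates on a %s\n", cmd, target(cmd))
	}
}
```

The point is only that one system call multiplexes over a command number; a real program would normally leave this to a library rather than issue the call by hand.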
So, starting with the programs, what are those programs?
Well, I mean, they're programs.
They run on the CPU.
They're essentially machine code instructions.
But for BPF, we're restricted in what we can run because of that requirement for BPF code to be safe.
It mustn't crash, it mustn't loop.
So, we typically write our BPF programs in C, a restricted subset of C in which we don't use any loops.
We always have to check that a pointer is not null before we dereference it.
And then we use the Clang compiler to convert it into an eBPF object, a set of bytecode instructions that are going to get run inside a BPF virtual machine inside the kernel.
So, we're going to write the kernel code in C.
We get some helper functions that give us some useful contextual information.
So, for example, we can print debugging messages with a helper function.
We can get information about the currently running command.
That's how bpftrace knows which command is running, as we saw in the previous example.
There are lots, I guess a few dozen, of these helper functions that can help us with contextual information.
The other thing I talked about, or rather the other thing we saw in our system calls, was maps.
And maps are really how we get information between our eBPF program running in the kernel and user space.
We'll come back to maps in a bit more detail shortly.
And then the last conceptual thing we really need to know about is the fact that these programs are triggered by an event happening.
We saw an example where we run a program in response to system calls, in response to hitting the hook called sys_enter.
There are tons of these hooks already predefined: essentially every function entry and exit, every system call, every tracepoint in the kernel, every time a network packet arrives.
All of these are possible points where you can trigger an eBPF program.
And we also have the terms kprobe and uprobe.
A kprobe is the entry to a kernel function.
A kretprobe is the exit from a kernel function, and correspondingly uprobes and uretprobes do the same for user space.
The combination of all these different types of events means we can run eBPF code in response to pretty much anything that's happening on your Linux machine.
So how do we attach the program to an event?
And again, I think it's a little bit helpful to look at the system calls that are happening.
We're going to use not just the bpf() call, but also a couple more system calls, starting with perf_event_open, which sets up a tracepoint.
So the program load gives us a file descriptor that I've called x here.
The tracepoint comes back as y, and then there's an ioctl call that associates the tracepoint with the program that should be triggered.
So again we can take a look at that in our bpftrace example.
Again, I'll just trace out those additional system calls too, perf_event, oops, perf_event_open, and let's see.
Again, it doesn't really matter too much exactly what's happening.
I just really want to show you this BPF_PROG_LOAD call that comes back with a file descriptor of nine.
There should be a perf_event_open.
Yeah, this perf_event_open here, which comes back with a file descriptor of eight.
And then here is the ioctl that associates eight and nine and says: this is the BPF program that I want you to run when we hit that tracepoint.
So loading the program and associating the program with the tracepoint is something we're going to have to do from user space.
Okay, so if we want to write hello world in eBPF, what do we need to do?
What do we need to have in place?
We know we're going to have to write some C code that's going to run in the kernel and that's going to get compiled by Clang.
And we're going to have to write some user space code that gets the tracing, the hello world message, from the kernel and displays it.
And, at least in theory, we can write that in any language of our choice.
Most of us don't typically interact with system calls very often when we're writing user space applications.
There's usually some level of abstraction.
And in fact, many of us don't know that system calls exist.
We don't have to deal with them on a day-to-day basis.
For BPF, we would want a library that gives us a higher level of abstraction over those bpf() system calls and things like the perf_event_open that we just saw.
And the library that I'm going to use today is called libbpfgo.
And we actually wrote this as part of a tool called Tracee, an eBPF security event detection tool we're working on.
And we've isolated the libbpf wrapper.
So there's a C library called libbpf, which is a wrapper for the system calls.
And libbpfgo is a pretty thin Go wrapper, giving us Go bindings around that libbpf interface.
So we're going to write some Go code that uses libbpfgo.
And we're also going to write some C code, which we're going to compile into an eBPF object using the Clang compiler.
And then the Go code is going to read that object file, get the contents out, and insert it into the kernel.
So we have an object file that has the eBPF code and the definition of any maps.
We'll talk about maps a bit more later.
We have our user space code that's driving our system calls and has the logic around what programs we want to run and what we want to attach them to.
When the user space code calls that BPF program load, it sends the program to the kernel.
The kernel will verify it, to make sure that it's safe to run.
And if it is, it will start running it inside this BPF virtual machine.
So we're going to build two objects.
We've got two different compilation steps.
We've got to use the Go compiler, go build, to create a Go executable.
And we're going to use Clang to build the BPF object file.
All right.
I think we have enough to actually start writing some code.
So let's go to my editor.
And this is my Makefile.
It's pretty much exactly what I just showed you on the slide.
So we have a go build step and a Clang step for building the eBPF object.
And let's start with the C code.
So I'm going to write a function called hello.
It takes a context pointer; they all do.
I'm going to just do some hello world tracing.
So let's say hello, GoTopia.
And we'll return zero as the exit code.
The other thing I have to do is define an object code section.
This essentially tells the object loader what kind of BPF program this is going to be.
This is a level of detail we don't need to worry about too much today.
But you can use different helper functions and do different things depending on the type of program you're running and the type of event you're attaching it to.
So I'm going to attach to a kprobe, the entry point to a function in the kernel.
And I'm actually going to run this whenever the system call execve gets triggered.
So that's my C code.
And let's compile that.
So I'm just going to run make on my BPF object file target to start with.
And that should give me an object file that I can look at.
And there are a couple of interesting things to look at in this object file.
So, first of all, it's little-endian.
I will need that in a moment.
And this object file is compiled to run in a Linux BPF virtual machine.
We can see here the section declaration for the fact that it's running as a kprobe on sys_execve.
And here is the function name.
So I might say eBPF program, but a program is really a function.
So that information from the ELF file is what is going to help the Go code know how to insert it into the kernel.
So let's write some Go code.
I already have a reference to libbpfgo here.
And I have a convenience function called must that I'm going to use to trap any errors and panic if we see any.
Hopefully we won't hit that.
Don't do that in production.
Really bad idea.
But it will be fine for demo purposes.
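The talk doesn't show the body of must, so the following is only a guess at what such a helper might look like: a minimal sketch that panics on any non-nil error. The helper `panics` and the variable `errDemo` are hypothetical, added here just to show the behaviour without crashing the program.

```go
package main

import (
	"errors"
	"fmt"
)

// errDemo is a made-up error used only to exercise must in this sketch.
var errDemo = errors.New("load failed")

// must panics if err is non-nil. Acceptable in a demo, where crashing
// loudly is the simplest way to notice a problem; not for production.
func must(err error) {
	if err != nil {
		panic(err)
	}
}

// panics runs f and reports whether it panicked, recovering if so.
func panics(f func()) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	f()
	return false
}

func main() {
	must(nil) // no error, so execution carries on
	fmt.Println("no error, still running")

	crashed := panics(func() { must(errDemo) })
	fmt.Println("panicked on error:", crashed)
}
```

In real code you would return the error to the caller and let it decide what to do, which is why the talk flags this pattern as demo-only.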
So the first thing I'm going to do is open this file.
I'm going to call NewModuleFromFile, reading from that object file that we just built.
We want to catch any errors.
And I'm going to use a defer.
I don't know if we have any Go programmers here.
If you're not familiar with Go, this defer keyword may be new to you.
It basically makes sure that, on the exit from whatever function we're in, this code runs.
In this case, I want to make sure that we tidy up and close our file at the exit from this function.
And just for fun, I'm going to print "cleaning up" here, so that when we're exiting we'll know because that statement has been printed.
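To see the defer behaviour in isolation, here is a small sketch that uses only the standard library. The function `run` and its step names are invented for illustration; the real code defers closing the BPF module rather than appending to a slice.

```go
package main

import "fmt"

// run records the order in which things happen. The deferred function is
// registered first, but it only executes when run returns.
func run() (steps []string) {
	defer func() {
		steps = append(steps, "cleaning up")
	}()

	steps = append(steps, "module opened")
	steps = append(steps, "doing work")
	return steps
}

func main() {
	for _, s := range run() {
		fmt.Println(s)
	}
}
```

Because `steps` is a named result, the deferred function can still append to it after the return statement has run, so "cleaning up" comes out last.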
So I've opened my object file.
And I now need to load that into the kernel.
And that has to succeed.
Now I want to get the hello program.
And I want to attach it to a kprobe.
So first of all, I need to get the program.
There's a nice function to get that program.
And we know it's called hello.
And we need to attach that to a kprobe.
So I'm attaching it, yes, p is my program.
And I'm attaching it to the function call that relates to execve.
Now, on this particular kernel, the function name is this.
And this could return me an error.
So I need to catch that error.
So I've got my object opened.
I've got the program from inside that object.
And I have associated it with the execve system call.
The C code is writing some tracing information whenever it sees that system call.
And I need to do something in user space to print it out.
And there is a convenient function.
There we go.
TracePrint.
Now, this will basically block and print out whatever it receives from the debug tracing.
So I think we should be able to make this and run it.
I have to run it as a privileged user.
And hooray!
We have the equivalent of hello world.
Every time execve runs on this machine, we're getting the trace written out.
Now, something didn't happen.
And that something is, we never saw the cleaning up line that I put in.
So remember, I've got this here.
And that's because when I interrupted the program, well, it was just interrupted and it stopped.
In fact, it was blocked somewhere in here in TracePrint and got interrupted.
If I want to clean up properly, I'm gonna have to catch that interrupt.
Which I can do quite conveniently in Go.
And this is also gonna illustrate Go channels, which we're gonna use a bit in a moment.
So I'm gonna make a channel.
This is a really nice feature of Go.
And this channel receives one item at a time, and those items are signals from the operating system.
And I want to be notified whenever there is an interrupt signal.
That says: if someone triggers an interrupt, send a message on this signal channel.
Actually, send the interrupt itself into the signal channel.
And I'm gonna block on that here.
This essentially says: wait until you get an event on that channel, and then throw it away.
And the last thing I need to do is send this blocking function off into its own goroutine.
This is how Go handles concurrency.
This basically means it's doing its own thing concurrently, a bit like in another thread.
So this won't block anymore.
So I haven't done anything very different in terms of the program.
But it should now run a bit more cleanly, in that if I hit Ctrl-C, we now see our cleaning up message.
And we know that things like my deferred function will be executed.
Because there's nothing interrupting it before it gets to complete the function.
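The pattern just described can be sketched with the standard library alone. This is an approximation, not the code from the talk: `waitAndCleanUp`, `inject` and `simulate` are hypothetical names, and the goroutine here is only a stand-in for the blocking trace reader. So that the sketch terminates by itself, main injects an interrupt into the channel instead of waiting for a real Ctrl-C.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
)

// waitAndCleanUp blocks until something arrives on sig and returns the
// messages it would have printed. The deferred steps run on return.
func waitAndCleanUp(sig <-chan os.Signal) (log []string) {
	defer func() { log = append(log, "cleaning up") }()

	// Stand-in for the blocking trace reader: it lives in its own
	// goroutine, so it no longer blocks this function.
	stop := make(chan struct{})
	defer close(stop)
	go func() { <-stop }()

	<-sig // wait for the signal, then throw it away
	log = append(log, "interrupted")
	return log
}

// inject places an interrupt on the channel without blocking.
// Demo-only: in real use the operating system delivers the signal.
func inject(sig chan<- os.Signal) {
	select {
	case sig <- os.Interrupt:
	default:
	}
}

// simulate runs the whole flow without involving the operating system.
func simulate() []string {
	sig := make(chan os.Signal, 1)
	inject(sig)
	return waitAndCleanUp(sig)
}

func main() {
	sig := make(chan os.Signal, 1)
	signal.Notify(sig, os.Interrupt) // ask for interrupts on this channel
	inject(sig)                      // remove this line to wait for a real Ctrl-C
	for _, line := range waitAndCleanUp(sig) {
		fmt.Println(line)
	}
}
```

The channel is buffered with room for one signal because `signal.Notify` does not block when sending; an unbuffered channel could miss the interrupt if nothing is receiving at that instant.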
All right.
So that's hello world.
But it's not terribly useful.
In particular, this bpf_trace_printk function is writing data to one well-known pipe location on the machine.
If I ran any number of eBPF programs and they all called bpf_trace_printk, they'd all be writing to the same pipe.
Which is not very useful in the real world.
So we're gonna have to go back and think a bit more about maps.
So as I mentioned before, maps are the way we can share data between the kernel code and whatever's happening in user space.
There are lots of different types of map.
I'm gonna use a thing called a perf event array.
And this is nice partly because we can write an arbitrary blob of data.
So any kind of data we want to write, we can write it into this perf event buffer.
And on the user space side, there's a perf buffer implementation that can receive these data blobs on a Go channel.
So it's a very idiomatic way of receiving data from the eBPF code.
So let's use bpf_perf_event_output.
So we're gonna do that here.
bpf_perf_event_output.
Get rid of my tracing call.
And what does this require?
This requires the context.
We need a map.
I'll define it in a second, but we'll call it GoTopia.
I have to pass a flag that indicates it's the current CPU.
And I'm gonna pass some data.
I have to say how big the data is.
So let's make things easy.
Let's pass some data.
We'll just make up a value and pass it.
So whenever execve gets called, we're gonna pass this value into the perf buffer.
It just remains for me to define the perf buffer here.
A perf output.
And that is called GoTopia.
And if I make that object and have a quick look at it again.
Oh, readelf.
And this time we can see that, in addition to the function name, we've also got an object defined called GoTopia.
That's the definition of the map that has to exist in the object file.
So the kernel side is writing this data into my perf buffer.
And I need to read it from the Go side.
So I'm not gonna be using TracePrint anymore.
But I am going to be using a perf buffer.
So we'll call it pb.
And we initialize a perf buffer.
And it's called GoTopia.
And I need to pass in a channel for the events that we're gonna receive.
I'll define that in a second.
I'm going to ignore any lost events.
I'll use a page size that I know works.
That has to succeed.
And we have to start, oops, pb, not ps, the perf buffer.
And when we get to cleaning up, we're going to stop it.
I need to define this events channel.
So we'll make a channel.
And the type of data that we get from here is a slice of bytes.
So that's what we need to receive.
And we'll just give it some arbitrary length.
So that's the perf buffer set up.
We now need to receive these events.
I'm gonna do it in a goroutine again.
Which I'll write inline.
And we're going to loop reading information out of this channel.
So every time data arrives on that channel, it will get assigned to data.
And let's print it out.
Got something.
Now, we know that it's an unsigned 64-bit integer.
So I can convert my slice of bytes.
It's little-endian.
We saw that before.
Uint64 of data.
So let's see if that builds.
It does.
And hopefully this time, every time execve gets called by any process on the machine, it's sending us a 64-bit integer.
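The conversion step can be tried out without any eBPF at all. The sketch below uses the standard library's `encoding/binary` package; the function `decodeValue`, the length check and the sample payload are assumptions added for illustration, since the talk only shows the conversion itself.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeValue interprets the first eight bytes of an event as a
// little-endian unsigned 64-bit integer. The length check is an addition
// for safety: Uint64 panics if given fewer than eight bytes.
func decodeValue(data []byte) (uint64, error) {
	if len(data) < 8 {
		return 0, fmt.Errorf("event too short: %d bytes", len(data))
	}
	return binary.LittleEndian.Uint64(data), nil
}

func main() {
	// Hypothetical payload: the value 0x1234 as it would be laid out in
	// memory on a little-endian machine, least significant byte first.
	event := []byte{0x34, 0x12, 0, 0, 0, 0, 0, 0}

	v, err := decodeValue(event)
	if err != nil {
		panic(err)
	}
	fmt.Printf("got value 0x%x\n", v)
}
```

This is why the byte order reported for the object file matters: the user space side has to decode the bytes with the same endianness the kernel side used when writing them.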
So we're using this perf buffer, but we're not doing anything very useful with it.
It's just some number that I've decided to pass.
How about we get some information about the current context, like what's the name of the command?
The current command that triggered this syscall, the execve syscall.
So, instead of passing numeric data, let's make this into some characters.
Again, a bit of an arbitrary length.
The size of that data.
And we write it into the perf buffer in exactly the same way.
And I just need to change this so that it converts my series of bytes into a string.
So now I should be able to make that.
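A plain `string(data)` conversion works, but a fixed-size character buffer coming from C is usually padded with NUL bytes after the name. The sketch below cuts the slice at the first NUL before converting; that trimming is an addition not shown in the talk, and `commToString` and the 16-byte buffer size are assumptions chosen for illustration.

```go
package main

import (
	"bytes"
	"fmt"
)

// commToString converts a fixed-size, NUL-padded byte buffer into a Go
// string, keeping only the bytes before the first NUL (if there is one).
func commToString(data []byte) string {
	if i := bytes.IndexByte(data, 0); i >= 0 {
		data = data[:i]
	}
	return string(data)
}

func main() {
	// Hypothetical event: a 16-byte buffer holding a command name,
	// with the remaining bytes left as zero.
	event := make([]byte, 16)
	copy(event, "bash")

	fmt.Printf("%q\n", commToString(event))
}
```

Without the trim, the resulting string would carry the trailing zero bytes along with it, which matters as soon as the name is used as a map key or compared with another string.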
So this time, every time execve gets called, we should see the name of the command.
And yeah, we can see, I happen to have Kubernetes and Docker running on this virtual machine.
So they're spawning quite a few new processes.
So this is starting to get pretty close to the example we saw with bpftrace.
So let's do a little bit more of what bpftrace did.
If you remember that example, well, it was keeping a counter for each different command.
So I think that would be pretty easy to implement in Go.
So we're going to have a map from the command name, which is a string, to the counter.
I'll make that some arbitrary size.
And instead of printing out the name of the command, we can just increment the counter for that name.
And when we finish, we will loop over that map and print out the results.
So for the key and the value from my counter, we'll print out the name and the value.
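The counting logic can be sketched on its own, with a plain channel of byte slices standing in for the perf buffer. Everything here is illustrative: `countCommands`, the channel sizes and the sample command names are made up, and the sorted output is a choice made so that the printed order is stable.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// countCommands drains the events channel and counts how many times each
// command name appears. It returns once the channel has been closed.
func countCommands(events <-chan []byte) map[string]int {
	counter := make(map[string]int, 350) // arbitrary initial capacity
	for data := range events {
		name := string(bytes.TrimRight(data, "\x00")) // drop NUL padding
		counter[name]++
	}
	return counter
}

func main() {
	// Stand-in for the perf buffer: a few made-up events.
	events := make(chan []byte, 300)
	for _, name := range []string{"bash", "ls", "bash", "runc", "bash"} {
		events <- []byte(name)
	}
	close(events)

	counter := countCommands(events)

	// Go map iteration order is random, so sort the names before printing.
	names := make([]string, 0, len(counter))
	for name := range counter {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%s: %d\n", name, counter[name])
	}
}
```

In the live program the counting runs in a goroutine while main waits for the interrupt, so a more careful version would make sure the goroutine has finished before the map is read.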
Excuse me.
So have I missed anything there?
I think that's okay.
So this is still associated with execve.
Let's see if it works.
So we'll just run that for a couple of seconds.
When we interrupt it, it should print out the counters.
The last thing we need to do is change the attachment point.
So currently we're attached to a kprobe at the entry of execve.
I'm going to change it so that it's associated with a tracepoint for sys_enter.
And as I mentioned before, sys_enter gets hit every time any system call is invoked.
So I need to change that in two places.
I need to change the section declaration here.
So this becomes raw_tracepoint/sys_enter.
And I need to change where we attached it here.
A raw tracepoint called sys_enter.
And with a bit of luck, this is going to recreate that bpftrace script.
So we'll just run it for a few seconds.
And then when we interrupt it, we should see the counters.
And those counters should tell us how many system calls have been invoked by each of those different commands.
So we've recreated that bpftrace command.
We've done it in just under 60 lines of Go code and a handful of lines of C.
I hope that's given you an illustration of the kind of things you could do.
Now, obviously, you can attach to many different tracepoints.
I've only scratched the surface of the things that you can do with BPF helper functions and the whole range of contextual information you could then observe, manipulate and pass up to user space.
I've had to gloss over lots and lots of details in the interest of time.
But the code that I've written is available on GitHub.
And I'm really hoping that you have some questions for me.
Thank you.