I built an AI server for my daughters. Well, first, it was more for me. I wanted to run all of my AI locally. And I'm not just talking command line with Ollama. No, no, no. We have a GUI, a beautiful chat interface. And this thing's feature-filled. It's got RBAC, chat histories, multiple models. We can even add Stable Diffusion. And I was able to add this to my notes application, Obsidian, and have my chat interface right there. I'm gonna show you how to do this.

Now, you don't need something crazy like Terry. That's what I named my AI server. It can be something as simple as this, this laptop. I'll actually demo the entire setup on this laptop. So luckily, the computer you're using right now, the one you're watching this video on, will probably work. And seriously, you're gonna love this. It's customizable. It's wicked fast, like way faster than anything else I've used. Isn't that amazing? And again, it's local. It's private. I control it, which is important because I'm giving it to my daughters. I want them to be able to use AI to help with school, but I don't want them to cheat or do anything else weird. But because I have control, I can put in special model files that restrict what they can do, what they can ask. And I'll show you how to do that. So here we go. We're about to dive in.

But first, let me have you meet Terry. Now, Terry has a lot of muscle. So for the case, I needed something big. I got the Lian Li O11 Dynamic EVO XL. It's a full tower E-ATX case. Perfect to hold my ASUS ProArt X670E-Creator motherboard. This thing's also a beast. I'll put it in the description so you can look at it. Now, I also gave Terry a big brain. He's got the AMD Ryzen 9 7950X. That's 4.2 gigahertz and 16 cores. For memory, I went a little crazy. I've got 128 gigabytes of the G.Skill Trident Z5 Neo. It's DDR5-6000 and way overkill for what I'm doing. I think. I got a Lian Li water cooler for the CPU.
I'm not sure if I'm saying Lian Li right. I don't know. Correct me in the comments. You always do. And then for the stuff AI loves, I got two 4090s. They're the MSI Suprim, and they're liquid cooled so they could fit on my motherboard. 24 gigabytes of memory each, giving me plenty of muscle for my AI models. For storage, we've got two Samsung 990 Pros, two terabytes each, which you can't see because they're behind stuff. And also a Corsair AX1600i power supply, 1600 watts to power the entire build. Terry is ready.

Now, I'm surprised to say my system actually posted on the first attempt, which is amazing. But what's not amazing is the fact that Ubuntu would not install. I tried for hours, actually for a whole day. And I almost gave up and installed Windows, but I said, no, Chuck, you're installing Linux. So I tried something new, something I've never messed with before. It's called Pop!_OS by System76. This thing is awesome. It worked the first time. It even had a special image with NVIDIA drivers built in. It just stinking worked. So I sipped some coffee, didn't question the magic, and moved on. Now, if you do want to build something similar, I've got all the links below.

But anyways, let's talk about how to build your very own local AI server. First, what do you need? Really, all you'll need is a computer. That's it. It can be any computer running Windows, Mac, or Linux. And if you have a GPU, you'll have a much better time. Now, again, I have to emphasize this. You won't need something as beefy as Terry, but the more powerful your computer is, the better time you'll have. Don't come at me with a Chromebook, please.

Now, step one: Ollama. This is the foundation for all of our AI stuff and what we'll use to run AI models. So we'll head on over to ollama.ai and click on download. And they've got a flavor for every OS. I love that. Now, if you're on Mac, just download it right now and run it. If you're on Windows, they do have a preview version, but I don't want you to do that.
Instead, I want you to try the Linux version. We can install it with one command. And yes, you can run Linux on Windows with WSL. Let's get that going real quick. First thing I'll do is go to the start bar and search for terminal. I launched my terminal. Now, this first bit is for Windows folks only. Linux people, just hang on for a moment. We've got to get WSL installed, or the Windows Subsystem for Linux. It's only one command: wsl --install. And that's it, actually. Hit enter, and it's gonna start doing some stuff. When it's done, we'll set up a username and password. I got a new keyboard, by the way. Do you hear that? Link below, it's my favorite keyboard in the entire world. Now, some of you may have to reboot, that's fine. Just pause the video and come back. Mine is ready to go though, and we're rocking Ubuntu 22.04, which is still amazing to me, that we're running Linux on Windows. That's just magic, right?

Now, we're about to install Ollama, but before we do that, you've got to do some best practice stuff, like updating our packages. So we'll do a sudo apt update, and then we'll do a sudo apt upgrade -y to apply all those updates. And actually, while it's updating, can I tell you something about our sponsor? IT Pro by ACI Learning. Now, in this video, we're gonna be doing lots of heavy Linux things. I'm gonna walk you through it. I'm gonna hold your hand, and you may not really understand what's happening. That's where IT Pro comes in. If you want to learn Linux or really anything in IT, they are your go-to. That's what I use to learn new stuff. So if you want to learn Linux to get better at this stuff, or you want to start making this whole hobby thing your career, actually learn some skills, get some certifications, get your A+, get your CCNA, get your AWS certifications, your Azure certifications, and go down this crazy IT path, which is incredible, and it's the whole reason I make this channel and make these videos, check out IT Pro.
They've got IT training that won't put you to sleep. They have labs, they have practice exams, and if you use my code NetworkChuck right now, you'll get 30% off forever. So go learn some Linux, and thank you to IT Pro for sponsoring this video and making things like this possible.

And speaking of, my updates are done. And by the way, I will have a guide for this entire thing, every step, all the commands. You can find it at the free NetworkChuck Academy membership. Click the link below to join and get some other cool stuff as well. I can't wait to see you there. Now we can install Ollama with one command. And again, all commands are below. I'm just gonna paste this in. A nice little curl command, little magic stuff, and this, I love how easy this is, watch. You just sit there and let it happen. Do you not feel like a wizard when you're installing stuff like this? And the fact that you're installing AI right now, come on. Now notice one thing real quick. Ollama automatically found out that I have an NVIDIA GPU, and it's like, awesome, you're gonna have a great time. If it didn't see that and you do have a GPU, you may have to install some NVIDIA CUDA drivers. I'll put a link for that below, but not everyone will have to do that. And if you're rocking a Mac with an M1 through M3 chip, you're gonna have a good time too. They will use the embedded GPU.

Now at this point, our Mac users, our Linux users, and our Windows users are all converged. We're on the same path. Welcome, we can hold hands and sing. It's getting weird. Anyways, first we have to test a few things to make sure Ollama is working. And for that, we're gonna open our web browser. I know, it's kind of weird. Just stick with me. I'm gonna launch Chrome here, and here in my address bar, I'm gonna type in localhost, which is looking right here at my computer, and port 11434. Hit enter, and if you see this message right here, you're good to go, and you're about to find this out.
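The install and the sanity check condense to a couple of commands. A minimal sketch; the script URL is the one Ollama's download page hands out, so verify it there before piping it to your shell:

```shell
# Install Ollama on Linux/WSL; the script auto-detects NVIDIA GPUs.
curl -fsSL https://ollama.com/install.sh | sh
# Ollama's API listens on port 11434; the root path replies with
# the plain-text message "Ollama is running".
OLLAMA_URL="http://localhost:11434"
curl -s "$OLLAMA_URL/" || echo "Ollama not reachable yet"
```

The same check works from a browser, which is what the video does next.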
Port 11434 is what Ollama's API service is running on, and it's how our other stuff is gonna interact with it. It's so powerful, just check this out. I'm so excited to show you this. Now, before we move on, let's go ahead and add an AI model to Ollama. And we can do that right now with ollama pull, and we'll pull down Llama 2, a very popular one. Hit enter, and it's ready. Now, let's test it out real quick. We'll do ollama run llama2. And if this is your first time doing this, this is kind of magic. We're about to interact with a ChatGPT-like AI right here. No internet required, it's all just happening in that five gigabyte file. Tell me about the solar eclipse. Boom. And you can actually Ctrl+C that to stop it.

Now, I wanna show you this. I'm gonna open up a new window. This is actually an awesome command. And with this wsl command, I'm just connecting to the same instance again in a new window. I'm gonna type in watch -n 0.5 nvidia-smi. This is going to watch the performance of my GPU right here in the terminal and keep refreshing. So keep an eye on this right here as I chat with Llama 2. Llama 2, give me a list of all Adam Sandler movies. And look at that GPU go, ah, it's so fun.

Now, can I show you what Terry does real quick? I gotta show you Terry. Terry has two GPUs. Here they are right here. And Ollama can actually use both of them at the same time. Check this out, it's so cool. List all the Samuel L. Jackson movies. And look at that. Isn't that amazing? And look how fast it went. That's ridiculous. This is just the beginning. So anyways, I had to show you Terry. So now we have Ollama installed. That's just our base. Remember, I'm gonna say bye, so /bye to end that session.

Step two is all about the web UI. And this thing is amazing. It's called Open WebUI. And it's actually one of many web UIs you can get for Ollama. But I think Open WebUI is the best. Now, Open WebUI will be run inside a Docker container. So you will need Docker installed.
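The pull-and-chat steps above, as commands. A sketch; the interactive and long-running pieces are shown as comments since they hold the terminal:

```shell
# Pull the model (a one-time download of a few gigabytes):
MODEL=llama2
ollama pull "$MODEL" || echo "pull skipped: is Ollama installed?"
# Chat interactively; type /bye to end the session:
#   ollama run llama2
# In a second terminal, watch GPU utilization refresh every half second:
#   watch -n 0.5 nvidia-smi
```

The second-terminal trick works anywhere; on WSL, just launch another `wsl` window to land in the same instance.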
And we'll do that right now. So we'll just copy and paste the commands from the NetworkChuck Academy. This is also available on Docker's website. First step is updating our repositories and getting Docker's GPG key. And then with one command, we'll install Docker and all its goodies. Ready, set, go. Yes, let's do it. And now with Docker installed, we'll use it to deploy our Open WebUI container. It'll be one command. You can simply copy and paste this. This docker run command is going to pull this image from Open WebUI to run this container. It's looking at your local computer for the Ollama base URL, because it's going to integrate with and use Ollama. And it's going to be using the host network adapter to make things nice and easy. Keep in mind, this will use port 8080 on whatever system you're using. Now all we have to do is hit enter after we add sudo at the beginning, sudo docker run, and let it do its thing. Let's verify it real quick. We'll do a little sudo docker ps. We can see that it is indeed running. And now let's go log in. It's kind of exciting.

Okay, let's go to our web browser and we'll simply type in localhost:8080. Whoa, okay, it's really zoomed in. I'm not sure why. You shouldn't do that. Now, the first time you run it, you'll want to click on sign up right here at the bottom and just put your stuff in. This login info is only pertinent to this instance, this local instance. We'll create the account and we're logged in. Now, just so you know, the first account you log in with or sign up with will automatically become an admin account. So right now, you as a first-time user logging in, you get the power. But look at this. How amazing is this? Let's play with it. So the first thing we have to do is select the model. I'll click that dropdown and we should have one, llama2. Awesome. And that's how we know our connection is working too. I'll go and select that.
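The container deployment described above can be sketched as one command. This is my reconstruction of the video's setup, with the image path Open WebUI publishes on GitHub Container Registry; check their README for the current flags:

```shell
# Run Open WebUI against the local Ollama API. Host networking keeps
# 127.0.0.1:11434 reachable from inside the container; the UI is on :8080.
OLLAMA_BASE_URL="http://127.0.0.1:11434"
sudo docker run -d --network=host \
  -v open-webui:/app/backend/data \
  -e OLLAMA_BASE_URL="$OLLAMA_BASE_URL" \
  --name open-webui --restart always \
  ghcr.io/open-webui/open-webui:main || echo "docker run skipped"
sudo docker ps || true   # verify the container is up
```

The named volume (`open-webui`) is what keeps your accounts, chats, and settings across container restarts.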
And by the way, another way to check your connection is by going to your little icon down here at the bottom left, clicking on settings, and then connections. And you can see our Ollama base URL is right here, if you ever have to change that for whatever reason. Now with llama2 selected, we can just start chatting. And just like that, we have our own little ChatGPT that's completely local. And this sucker is beautiful and extremely powerful.

Now, first things first, we can download more models. We can go out to Ollama and see what they have available. Click on their models to see their list of models. CodeGemma is a big one. Let's try that. So to add CodeGemma, our second model, we'll go back to our command line here and type in ollama pull codegemma. Cool, it's done. Once that's pulled, we can go up here and just change our model by clicking on the little dropdown icon at the top. Yep, there's CodeGemma. We can switch. And actually, I've never done this before, so I have no idea what's gonna happen. I'm gonna click on my original model, llama2. You can actually add another model to this conversation. Now we have two here. What's gonna happen? So CodeGemma is answering it first. I'm actually not sure what that does. Maybe you guys can try it out and tell me. I'm gonna move on though.

Now some of the crazy stuff. You can see right here, it's almost more featured than ChatGPT in some ways. You've got a bunch of options for editing your responses, copying, liking and disliking them to help it learn. You can also have it read things out to you, continue a response, regenerate a response, or even just add stuff with your own voice. I can also go down here, and this is crazy, I can mention another model, and it's gonna respond to this and think about it. Did you see that? I just had my other model talk to my current one. Like, that's just weird, right? Let's try to make them have a conversation. Like, they're gonna have a conversation. What are they gonna talk about?
Let's bring back in llama2 to ask the question. This is hilarious. I love this so much. Okay, anyways, I could spend all day doing this. We can also, with this plus sign, upload files. This includes a lot of things. Let's try, do I have any documents here? I'll just copy and paste the contents of an article. Save that, and that'll be our file. Summarize this. You can see our GPU being used over here. I love that so much. We're running locally. Cool. We can also add pictures for multimodal models. I'm not sure CodeGemma can do that. Let's try it out real quick. So CodeGemma can't do it, but there is a multimodal model called LLaVA. Let's pull that down real quick. With LLaVA pulled, let's go to our browser here once more. We'll refresh it, change our model to LLaVA, add the image. That's really scary. That's pretty cool. Now, here in a moment, I will show you how we can generate images right here in this web interface by using Stable Diffusion. But first, let's play around a bit more.

And actually, the first place I wanna go to is the admin panel for you, the admin. We have one user, and if we click on the top right, we have admin settings. Here's where a ton of power comes in. First, we can restrict people from signing up. We can say enabled or disabled. Now, right now, by default, it's enabled. That's perfect. And when they try to sign up initially, they'll be a pending user until they're approved. Let me show you. So now, real quick, if you wanna have someone else use this server on your laptop or computer or whatever it is, they can access it from anywhere as long as they have your IP address. So let me do a new user signup real quick just to show you. I'll open an incognito window. Create account. And look, it's saying, hey, you gotta wait. Your guy has to approve you. And if we go here and refresh our page, on the dashboard, there is Bernard Hackwell. We can say, you know what? He's a user. Or click it again, he's an admin. No, no, he's not. He's gonna be a user.
And if we check again, boom, we have access. Now, what's really cool is if I go to admin settings and I go to users, I can say, hey, you know what? Don't allow chat deletion, which is good if I'm trying to monitor what my daughters are up to in their chats. I can also whitelist models. Like, you know what? They're only allowed to use llama2, and that's it. So when I get back to Bernard Hackwell's session over here, I should only have access to llama2. It's pretty sick. And it becomes even better when you can make your own models that are restricted.

We're gonna mosey on over to the section called model files, right up here. And we'll click on create a model file. Now, you can also go to the community and see what people have created. That's pretty cool. I'm gonna show you what I've done for my daughter, Chloe, to prevent her from cheating. She named her assistant Debra. And here's the content. I'm gonna paste it in right now. The main thing is up here where it says FROM, and you choose your model, so FROM llama2. And then you have your system prompt, which is gonna be between three double quotes. And I've got all this, telling it what it can and can't do, what Chloe's allowed to ask. And it ends down here with three double quotes. You can do a few more things. I'm just gonna say, as an assistant: education. Save and create. Then I'll go over to my settings once more and make sure that for the users, this model is whitelisted. I'll add one more, Debra. Notice she has an option now. And if Bernard tries to use Debra and says, Debra, write a paper for me on the Civil War, it immediately shuts him down, saying, hey, that's cheating.

Now, llama2, the model we're using, it's okay. There's a better one called Mixtral. Let me show you on Terry. I'll use Debra, or Deb, and say, write me a paper on Benjamin Franklin. And notice how it didn't write it for me, but it says it's gonna guide me. And that's what I told it to do, to be a guide.
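Open WebUI's model files follow the same shape as Ollama's Modelfile format (a FROM line plus a triple-quoted SYSTEM prompt), so the same guardrail can be sketched from the command line. The prompt text below is my hypothetical stand-in, not the actual Debra prompt from the video:

```shell
# A restricted tutor as an Ollama Modelfile (hypothetical system prompt):
cat > Modelfile <<'EOF'
FROM llama2
SYSTEM """
You are Debra, a study guide for a student. Explain concepts and
point to sources, but never write essays, papers, or homework
answers on the student's behalf. Refuse anything that is cheating.
"""
EOF
# Build the custom model, then chat with it:
#   ollama create debra -f Modelfile
#   ollama run debra
```

Because the restriction lives in the system prompt of its own model, whitelisting just that model for a user locks them into the guardrails.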
I tried to push it and it said no. So that's pretty cool. You can customize these prompts, put in some guardrails for people that don't need full access to this kind of stuff. Right now, I think it's awesome. Now, Open WebUI does have a few more bells and whistles, but I wanna move on to getting Stable Diffusion set up, because this thing is so cool and powerful.

Step three: Stable Diffusion. I didn't think that image generation locally would be as fun or as powerful as ChatGPT, but it's more. Like, it's crazy. You gotta see it. Now, we'll be installing Stable Diffusion with a UI called Automatic1111. So let's knock it out. Now, before we install it, we've got some prereqs. And one of them is an amazing tool I've been using a lot called pyenv, which helps us manage our Python versions and switch between them, which is normally such a pain. Anyways, the first thing we gotta do is make sure we have a bunch of prereqs installed. Go ahead and copy and paste this from the NetworkChuck Academy. Let it do its thing for a bit. And with the prereqs installed, we'll copy and paste this command, a curl command that'll automatically do everything for us. I love it. Run that. And then right here, it tells us we need to add all this, or just run this command to put this in our .bashrc file, so we can actually use the pyenv command. I'll just copy this, paste it, and we'll type in source .bashrc to refresh our terminal. And let's see if pyenv works. pyenv, we'll do a -h to see if it's up and running. Perfect. Now let's make sure we have a version of Python installed that will work for most of our stuff. We'll do pyenv install 3.10. This will, of course, install the latest release of Python 3.10. Excellent, Python 3.10 is installed. We'll make it our global Python by typing in pyenv global 3.10. Perfect. And now we're gonna install Automatic1111. The first thing we'll do is make a new directory, mkdir for make directory. We'll call it stablediff.
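The pyenv steps above, condensed into a sketch. The bootstrap URL is the one pyenv's README publishes, so treat it as an assumption and check there first; the slow steps are left as comments:

```shell
# Install pyenv via its bootstrap script (safe to re-run):
curl -fsSL https://pyenv.run | bash || true
# Append the init lines the installer prints to your ~/.bashrc, then:
#   source ~/.bashrc
#   pyenv install 3.10     # pulls the latest 3.10.x release
#   pyenv global 3.10
# And a working directory for Stable Diffusion:
mkdir -p stablediff
```

Pinning the global interpreter to 3.10 matters here because Automatic1111's launcher expects a Python version in that range.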
Then we'll jump in there, cd stablediff. And then we'll use this wget command to wget this bash script. We'll type in ls to make sure it's there. There it is. Let's go ahead and make that sucker executable by typing in chmod. We'll do a +x and then webui.sh. Now it's executable. Now we can run it: ./webui.sh. Ready, set, go. This is gonna do a lot of stuff. It's gonna install everything you need for the web UI. It's gonna install PyTorch and download Stable Diffusion. It's awesome. Again, a little coffee break. Okay, that took a minute. A long time. I hope you got plenty of coffee. Now it might not seem like it's ready, but it actually is running. And you'll see the URL pop up, like, around here. It's kind of messed up. But it's running on port 7860. Let's try it out. And this is gonna, this is fun. Oh my gosh. So localhost:7860. What you're seeing here is hard to explain. Let me just show you. And let's generate. Okay, it got confused. Let me take away the Oompa Loompas part. But this isn't being sped up. This is how fast this is. No, that's a little terrible. What do you say we make it look a little bit better? Okay, that's terrifying. But just one of the many things you can do with your own AI. Now you can actually download other models. Let me show you what it looks like on Terry, and my new editor, Mike, telling me to do this. That's weird. Let's make it take more time. But look how fast this is. Like, it's happening in real time as I'm talking to you right now. But if you've ever made images with GPT-4, it just takes forever. But I just love the fact that this is running on my own hardware. And it's kind of powerful. Let me know in the comments below which is your favorite image. Actually, post on Twitter and tag me. This is awesome. Now this won't be a deep dive on Stable Diffusion. I barely know what I'm doing. But let me show you real quick how you can easily integrate Automatic1111. Did I do enough ones? I'm not sure.
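The download-and-run steps above, as a sketch. The webui.sh URL is the one the stable-diffusion-webui repo documents for this install path; verify it against the project's install docs before fetching:

```shell
# Fetch the Automatic1111 launcher and make it executable:
wget -q https://raw.githubusercontent.com/AUTOMATIC1111/stable-diffusion-webui/master/webui.sh || true
chmod +x webui.sh 2>/dev/null || true
WEBUI_PORT=7860   # first run installs PyTorch, downloads Stable Diffusion,
                  # then serves the UI on http://localhost:7860
#   ./webui.sh
# To let Open WebUI reach its API later, relaunch with two extra switches:
#   ./webui.sh --listen --api
```

The `--listen` flag binds beyond loopback and `--api` exposes the generation endpoints, which is exactly what the Open WebUI integration needs in the next step.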
And their Stable Diffusion, inside Open WebUI. So it's just right here. Back at Open WebUI, if we go down to our little settings here and go to settings, you'll see an option for images. Here we can put our Automatic1111 base URL, which will simply be http://127.0.0.1, which is the same as saying localhost, and the port, was it 7806? 7860? 7860, I think that's what it is. We'll hit the refresh option over here to make sure it works. And actually, no, it didn't. And here's why. There's one more thing you gotta know. Here we have Automatic1111 running in our terminal. When I Ctrl+C, it's gonna stop running. In order to make it work with Open WebUI, we gotta use two switches. So let's go ahead and run our script one more time, webui.sh, and we'll do --listen and --api. Once we see the URL come up, okay, cool, it's running. We can go back over here and say, why don't you try that again, buddy? Perfect. And then over here we have image generation, experimental. They're still trying it out. We'll say on, and we'll say save. So now if we go to any prompt, let's do a new chat. We'll chat with llama2. I'll say, describe a man in a dog suit. This is for a Stable Diffusion prompt. A bit wordy for my taste, but then notice we have a new icon. This is so neat. Boom, an image icon. And all we have to do is click on that to generate an image based on that prompt. I clicked on it, it's doing it. And there it is, right in line. That is so cool. And that's really terrifying. I love this, it's so fun.

Now this video is getting way too long, but there are still two more things I wanna show you. I'm gonna do that really quickly right now. The first one is, it's just magic. Check it out. There's another option here inside Open WebUI, a little section right here called documents. Here we can simply just add a document. I'll add that one from before. It's there, available for us. And now when we have a new chat, I'll chat with CodeGemma.
All I have to do is do a hashtag and say, let's talk about this. And say, give me five bullet points about this. Cool. Give me three social media posts. Okay, CodeGemma, let me try it again. What just happened? Yeah, let's do a new prompt. Oh, there we go. And I'm just scratching the surface.

Now, the second thing I wanna show you, the last thing. I am a huge Obsidian nerd. It's my notes application. It's what I use for everything. It's been very recent. I haven't made a video about it, but I plan to. But one of the cool things about this very local, private note-taking application is that you can add your own local GPT to it, like what we just deployed. Check this out. I'm gonna go to settings. I'll go to community plugins. I'll browse for one. I'm gonna search for one called BMO, BMO Chatbot. I'm gonna install that, enable it. And then I'm gonna go to the settings of BMO Chatbot. And right here, I can have an Ollama connection, which is gonna connect to, let's say, Terry. So I'll connect him to Terry. And I'll choose my model. I'll use llama2, why not? And now, right here in my note, I can have a chatbot come right over here to the side and say, like, hey, how's it going? And I can do things like look at the help file, see what I can use here. Ooh, turn on reference. So I'm gonna say reference on. It's now gonna reference the current note I'm in. Tell me about the system prompt. Yep, there it is. And it's actually going through and telling me about the note I'm in. So I have a chatbot right there, always available for me to ask questions about what I'm doing. I can even go in here, highlight this, and do a little prompt like generate. It's generating right now. And it's generating some stuff for me. I'm gonna undo that. Let me do another note. So I wanna tell a story about a man in a dog suit. I'll quickly talk to my chatbot and start to do some stuff. I mean, that's pretty crazy.
And this, I think, for me is just scratching the surface of running local, private AI in your home on your own hardware. This is seriously so powerful, and I can't wait to do more stuff with this. Now, I would love to hear what you've done with your own projects. If you attempted this, if you have this running in your lab, let me know in the comments below. Also, do you know of any other cool projects I can try, that I can make a video about? I'd love to hear that. I think AI is just the coolest thing. But also, privacy is a big concern for me. So to be able to run AI locally and play with it this way is just the best thing ever. Anyways, that's all I got. If you wanna continue the conversation and talk more about this, please check out our Discord community. The best way to join that is through our NetworkChuck Academy membership, the free one. And if you do wanna join the paid version, we do have some extra stuff for you there too, and it'll help support what we do here. But I'd love to hang out with you and talk more. That's all I got. I'll catch you guys next time.
host ALL your AI locally (posted on 2024/10/01)