Self-Hosted AI Image Generation – InvokeAI
#SelfHosted #Image #Generation #InvokeAI
“Craft Computing”
Thanks to Vultr for sponsoring today’s video. Visit to start your free trial, and receive $300 credit for signing up!
Grab yourself a Pint Glass or Bottle Opener at
AI image generation… we’ve seen it used for everything from memes to…
Try using AMD's Vega GPUs. HBCC allows you to use RAM as VRAM. Using Optane PMem, or just four Optane NVMe SSDs in a pool as swap space, can get you enough VRAM to run Llama 405B, the full 810 GB of it.
I'd recommend a Vega 20 GPU like the MI60 if you are just using PMem or RAM, because PCIe 4.0 x16 is the bottleneck. If you are using a Vega 64/56, or the Radeon VII (which is also Vega 20 but for some reason doesn't run at PCIe 4.0), then Optane NVMe is just fine.
It's stupid and totally insane to run such a big model, but token quality matters more than tokens per second, so I made the jankiest setup ever just to say I could.
If I could run it on a Raspberry Pi, I would.
As an artist I protest the use of AI. It takes work away from artists and musicians.
I am assuming you performed the image creation on that monster of a machine of yours. What would be really interesting is to have a watt meter running and take some readings of how many watts a particular image creation and/or edit consumed. I've read you can create images on a small rig, but it may take hours. You have a fast rig, so watt consumption could be a way of assessing, and possibly estimating, how long something less powerful would take compared to your fast rig.
It would be interesting to see how all this AI works on a mid-range gaming machine, the type of machine most people probably already have at home. Like a getting-started style video for a new creator 🙂
Invoke is a great tool, but pretty much all roads lead you to ComfyUI; hitting the token limit vastly affects the output of the generation.
Invoke has major limits, with no support for increased batch sizes.
With ComfyUI you can take inpainting to the max with plugins such as Krita AI Diffusion, which is substantially more customizable than Adobe's.
With new models like Flux coming out that require 32 GB of VRAM, I fear that most people will no longer be able to host these models locally.
Another cool thing you could check out would be Stable Audio Open, which is trained on sounds from Freesound; it allows you to prompt the model to generate SFX.
I personally feel the backdrops of the server in your garage or the wall of trinkets are what add charm and personality to the videos. The "oh, I remember that location from that one video" moment would be lost with AI-generated backdrops.
Otherwise, you do you
I understand the appeal of generating backgrounds and music. I've tried these systems, but haven't used anything in my videos thus far. I'm tempted to actually use it as well.
Video creators are already not immune. Some of the transcripts from my larger YouTube channel are in EleutherAI's 825 GiB "The Pile" dataset, which I never agreed to. According to news articles I saw in July, it's being used by companies like Apple, Nvidia, and Salesforce.
At this point I don't think there is any going back, so I'd rather see it advance quickly to reach real AGI, because then social and economic change must be made. Still, it is frustrating to see creative works targeted first just because they're the easiest type of AI to build. In my opinion the current systems are more like an advanced form of web search than an actual intelligence.
So so cool… This was an awesome ep. Thanks heaps…
I was messing around with something similar to this a couple of months back. I liked some of the results I got, but I still needed to touch them up in Affinity. One thing I really noticed is how it struggles when you add multiple descriptors, especially with colour. To keep things fantasy themed, a prompt like "A very pale male Elf, with long silver curly hair, golden catlike eyes, and wearing a tattered blue robe, is standing in a vacant vandalised alley, under the midnight sky" just wouldn't work; but if I gave that to an artist, they'd try to picture what I was explaining, focusing on those little details (and adding their own flair).
I'm interested. I particularly liked how you didn't try to dress things up and showed the actual state of the tech for folks with real situations.
Adding something like this to my home lab is something I've wanted to do for a while. Two questions:
I have a Dell T440 running Proxmox with a lot of moderate-speed cores and more than enough RAM, but my Nvidia P1000 is passed through to a Plex VM. Can that be shared easily enough between VMs, or is a GPU even essential to doing this?
What's the training process like? My sailing club has thousands of photos that I'd want to train it on, possibly to use for marketing content. Can you train it to know the difference between two different but similar classes of boats? Formula 18 vs Hobie 16, if you feel like looking up what those are.
As a creator, I am very interested (especially in Unreal Engine for set design).
By nanoseconds, the average time for "I'm Jeff" is steadily increasing. If Jeff were immortal and continues Craft Computing into the far future, the intro will eventually take longer than the lifespan of a star. The very last sound ever heard before the heat death of the universe could become the nearly-endless, final F of our reality. Only then will Jeff finally be able to rest.
…and why do you ignore the negative prompt? Besides, fixing images is easy: just use the same seed and adjust the prompt, and you'll be good. Append whatever you don't want to see to the negative prompt, e.g. deformed face, disfigured, etc.
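For anyone wanting to try that workflow outside InvokeAI, here is a minimal sketch using the Hugging Face diffusers library; the model ID, prompts, and settings are placeholders, not anything from the video. Keeping the generator seed fixed means the composition stays the same while you iterate on the prompt and negative prompt.

```python
# Minimal sketch: fixed seed plus a negative prompt with diffusers.
# Model ID, prompts, and parameters below are placeholders.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "path/or/hub-id-of-your-checkpoint",  # substitute your own model
    torch_dtype=torch.float16,
).to("cuda")

seed = 1234  # keep this constant so only the prompt changes between runs
generator = torch.Generator(device="cuda").manual_seed(seed)

image = pipe(
    prompt="a knight on horseback, oil painting",
    negative_prompt="deformed face, extra limbs, blurry",  # what you do NOT want
    generator=generator,
    num_inference_steps=30,
    guidance_scale=7.5,
).images[0]
image.save("knight_seed1234.png")
```

Re-running with the same seed but a tweaked negative prompt should change only the details you called out, not the whole composition.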
I need Jeff to ask for pictures of Spiderman with that AD voice
YES!!! I am currently setting up my own production system to generate all kinds of art and stuff.
Kudos to you Jeff. Also just moved to PDX. So much better than I thought it would be. Now our heat wave is over
No one has a problem with AI until it starts to impact their own situation. When image generators appeared, a lot of musicians were really happy to use them for cover art for their music and had no problem with the fact that artists' work had been used without permission to train those generators. But later on, when music generators appeared, the musicians were really angry that their music had been used to train those music machines. The same is true of writers who were happy to use AI art for their book covers, but not so happy to discover that their books have been used to train AIs to write books.
Content creators on YouTube are now starting to get angry that their content has been used to train video-generation AIs, and will probably get more angry when they find themselves competing with AI-generated YouTube channels where all of the content, including the people, is totally AI generated.
AI could have been a positive force in our society by empowering the ordinary person, but you don't empower people by taking their work and using it without permission or payment in order to create machines that replace them, which is what is happening now. In the end, the only people who are going to be "empowered" by AI will be those who own and control it; almost everyone else will be rendered powerless and poor by this technology.
Orwell's 1984 will look positively benign compared to the dystopian reality that an AI-powered surveillance society will represent. Not only will AI likely take your job, it will watch you 24/7 to make sure you adapt nicely and quietly to your new status as an unemployed nonentity.
I thought this was a very interesting video. As someone who enjoys the physical hardware and the building of computers, I think the software side is a little outside my skill range and interest range. That said, as far as using whatever program you want to create the art and produce the videos you want to produce, I support it. Appreciate your content, and I enjoyed this video.
It would be scary fun to mess with voice-to-text, then translation, then back to voice, so you could reach a non-English audience more easily 😀
Since I only listen to your vids, it doesn’t matter to me either way…although I would watch a de-aged Jeff do the beer reviews just to see how YouTube and viewers not in the know would react.
Buzzword Buzzword Buzzword Buzzword Buzzword Buzzword Buzzword Buzzword Buzzword Buzzword Buzzword Buzzword Buzzword Buzzword Buzzword Buzzword Buzzword Buzzword Buzzword Buzzword Buzzword Buzzword Buzzword Buzzword Buzzword Buzzword
There, I fixed it for you.
Don't trust the "Real Ale Wanker". He will be drinking AI ale soon. Mark my words!
invoke ai "is compatible with"……
Wait… you can't use your own safetensors model files? I'm definitely using my SD3. I'm using it with Automatic1111's GUI or with ComfyUI at the moment, but I'll check out the InvokeAI UI too.
Pretty… "basic" tutorial for Stable Diffusion, but this will definitely get you going.
I like that you do tutorials on these things and go into detail. It really helps me get started and learn on my own.
I'm curious. Have you been playing around with LoRAs to see if they can give you some cleaner results?
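For readers who haven't used one, here is a rough, hedged illustration of what applying a LoRA looks like with diffusers (not InvokeAI's own interface); the checkpoint path, LoRA file name, and scale value are placeholders, not specific recommendations.

```python
# Hypothetical example: layering a LoRA over a base Stable Diffusion checkpoint
# with Hugging Face diffusers. Paths and file names below are placeholders.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "path/to/base-checkpoint",  # placeholder base model
    torch_dtype=torch.float16,
).to("cuda")

# Apply LoRA weights trained for a particular style or subject
pipe.load_lora_weights("path/to/lora-folder", weight_name="style_lora.safetensors")

image = pipe(
    prompt="portrait of an elf in a tattered blue robe, detailed, clean lines",
    negative_prompt="deformed face, extra fingers, blurry",
    cross_attention_kwargs={"scale": 0.8},  # LoRA influence: 0 = off, 1 = full strength
    num_inference_steps=30,
).images[0]
image.save("lora_test.png")
```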
please get a green screen
This is really great.
finally self hosted furry porn
JUST STAY YOU. I know all the BS is a fad right now, but what doesn't change is just being YOU.
I was playing with Automatic1111 for Stable Diffusion as a result of Wendell's video, so I'll give this a shot as well. Why not? 🙂
use a layered png file for your backgrounds
My concern is what was used to train the models made available for InvokeAI. There was little discussion of it here, and a brief Google search failed to turn anything up.
Ubuntu 22.04 is actually still under standard support in the Ubuntu LTS lineup until 2026, so any of my fellow Linux bros would be wrong to come at you 🙂
That said, if it works with Python 3.11 it would probably work with 3.12, but they probably haven't tested it; 3.12 is still relatively new.
What’s happening with the new studio project?
Michael Jackson on a horse?
Perfect use case
convincing? i think not
I want to be able to have the original voice of the actor from a foreign film read the sub title track in my language. Any software to do this?
26:28 Jeff has the "Disney Volume at home" too (Disney has a large-screen room they use as a "better green screen" in a similar way; they call it "the Volume").
Sadly, artificial intelligence is still more intelligence than most C-levels have had access to before…
15:43 As also noticed by other YouTubers who have dabbled in AI images (Shadiversity among others, on his secondary channel SHAD AI), the AI's training on swords and sword fighting is very bad, so to get anything resembling a decent sword and a correct sword-fighting pose you have to feed the AI sketches to use as a template.
The same goes for guns or any other specialist tool.
Build The Volume in your garage?
Python is garbage, but you can run multiple versions concurrently on any version of Ubuntu using virtual environments, which can be created and managed with Anaconda / Conda / Miniconda.
Don't ever be tempted to hack a different version of Python into Ubuntu: some internal Ubuntu tooling requires Python, and because Python is garbage, it requires exactly the version that ships with Ubuntu.
Gotta be thankful for self-hosted AI, we finally have an excu… AHEM, a good reason to get all those old compute cards without display outputs.
Also, Deschutes gateway, ouch… right in the beer belly.
sudo add-apt-repository ppa:deadsnakes/ppa -y for other versions of python
Sponsor spots are normally what I skip over, but I did appreciate the Spaceballs reference "ludicrous intelligence". Well done!!
So, horse tells: the saddle is not period-correct and has weird extra stuff and metal stirrups; the left stirrup is too far forward; the horse's neck is too thick; the front right leg is too skinny; there's something wrong with the mane but I can't tell what from your video; the bridle and bit are modern as well.