I saw this post today on Reddit and was curious to see if views are similar here as they are there.
- What are the best benefits of self-hosting?
- What do you wish you would have known as a beginner starting out?
- What resources do you know of to help a non-computer-scientist/engineer get started in self-hosting?
The big thing for #2 would be to seperate out what you actually need vs what people keep recommending.
General guidance is useful, but there’s a lot of ‘You need ZFS!’ and ‘You should use K8s!’ and ‘Use X software!’
My life got immensely easier when I figured out I did not need any features ZFS brought to the table, and I did not need any of the features K8s brought to the table, and that less is absolutely more. I ended up doing MergerFS with a proper offsite backup method because, well, it’s shockingly low-complexity.
And I ended up doing Docker with a bunch of compose files and bind mounts, because it’s shockingly low-complexity. And it’s just running on Debian, instead of some OS that has a couple of layers of additional software to make things “easier” because, again, it’s low-complexity.
I can re-deploy the entire stack on new hardware in about ~10 minutes (I’ve tested this a few times just to make sure my backup scripts work), and there’s basically zero vendor tie-in or dependencies that you’d have to get working first since it’s just a pile of tarballs and packages from the distro’s package manager on, well, ANY distro.
deleted by creator
IMO a homelab for learning and a server that you’re self-hosting services on really aren’t the same thing and maybe shouldn’t be treated that way, if you can swing it.
I’d rather my password manager or jellyfin or my peertube instance or whatever not be relying on a tech stack I don’t entirely understand and might not be able to easily fix if it breaks.
I guess a lot of it is new to doing this vs greybeard split, since the longer I’ve done sysadmin work the less I care about the cool new thing and have a preference for the old, stable, documented, bugfixed, supported, and with a clear roadmap software.
I should probably get a job doing sysadmin work for a bank, lmao.
deleted by creator
This is going to be a bit of my grumpy-greybeard, but again: if you’re learning, then something like Docker and docker-compose is much simpler and less prone to fuckups than a bunch of K8s.
If you don’t know ANYTHING about what you’re doing, starting with the simplest tools and then deciding if you want to learn the more complicated ones is probably a less insane path than jumping right into the configuration-as-code DevOps pipeline.
And, at that point, you should have your “production” and “testing” environments set up in such a way they won’t eat each other when you do an oops.
deleted by creator
btrfs with its send/receive (incremental fs-level backups) is already stable enough for mostly everything (just has some issues with raid 5/6), and is much more performant than zfs. And it is also in the linux kernel tree (quite hugely useful). Of course, if more zfs-like functionality is what you look for.
“Already stable enough”
- no it isn’t.
- if fucking should be, it’s been around 15 years!
My only experience with btrfs was when trying out Opensuse Tumbleweed. Within a couple days my home partition was busted, next time it was another partition. No idea if the problems could be fixed as these were fairly new installations to give Opensuse a try and I couldn’t be bothered to fix a system that’s troubling me from the very beginning.
Between all the options that just work ™, btrfs is the one I’ve learned to stay away from.
EDIT: that was four or five years ago
Honestly it’s not; BTRFS has been in my ‘that’s neat, but it’s still got a non-zero chance of deciding to light everything on fire because it’s bored’ list for, uh, a decade now?
The NAS build is old enough to more or less predate BTRFS being usable (closing in on a decade since I did the initial OS install, jeez) and none of the features matter for what I’m storing: if every drive in my NAS died today, I’d be very annoyed for a couple of hours during the rebuild, and would lose terrabytes of linux ISOs that I can just download again, if I wanted to use Jellyfin to install them a 2nd time. (Any data I care about is pulled offsite at least once a day, so I’ve got pretty comprehensive backups minus the ISOs.)
I know EXT4 and mergerfs and snapraid are not cool, or have shiny features, but I’ve also had zero problems with them over the last decade, even between Ubuntu upgrades (16.04, 18.04, 20.04, 22.04) and hardware platform upgrades (6600k, 8700k, 10950k) and the entire replacement of all the system drives (hdd -> ssd -> nvme) and the expansion of and replacement of dead HDDs, of varying sizes (4tb drives to 8tb drives to 16tb drives to some 20tb drives).
It all just… worked, and at no point was I concerned about the filesystem not working if I replaced or upgraded or changed something, which is not something ZFS or BTRFS would have guaranteed during that same time window.
IMHO 99% of the time btrfs features are used as a band-aid for things that would be much better done otherwise. Generally by using a stable distro and a decent backup solution (like Debian + Borg). And you get to use a truly stable, proven, boring fs ike ext4 or xfs.
Stable yes, but no protection from bitrot, and the journal of ext4 is the band aid, instead of a cow fs like zfs or btrfs.
You can protect important data with backups, which you should do anyway, and in practice I feel like the added complexity of BTRFS and ZFS is not worth the COW.
BTRFS is cool but they tried to cram way too much too fast into it and it added a ton of complexity and it’s still not 100% done after all these years. A COW mode for ext4 would have been adopted much faster.
Can you elaborate on how your backup script re-deploys on new hardware? Sounds very nice to have.
elaborate
It’s a really simple script.
Everything is deployed with a docker compose, and all the docker volume data are bind mounts and, for example, a Jellyfin install would have everything in /stacks/jellyfin.
The backup script makes a tarball of each service individually (and stops the stack if there’s anything in there doing database things or anything else that might end up being inconsistent by just archiving the filesystem), and uploads them to a S3 storage provider AND burns them to a BluRay.
The recovery script does the opposite: it downloads and unarchives the data.
As long as you’re on Linux and have Docker, it should just magically work.
I see! Thanks, will try to back up my docker compose services this way.
If you write the script yourself, just make sure you test it a couple of times, and preferably with different datasets from different runs.
I found some edgecase stuff that would have prevented a restore even after I had tested it successfully (some permission issues due to changes in containers and whatnot were resulting in less than the expected data being archived and restored) a couple of times.
I wish I knew not to trust closed source self-hosted applications, such as Plex. Would have saved a lot of time and money.
Can you elaborate?
Plex is a great example here. I’ve been Hetzner customer for many many years, and bought a lifetime license to Plex. Only to receive few months later a notification from Plex that I am no longer allowed to self-host Plex for myself(and only myself) at Hetzner and that they will block all access to my self-hosted Plex instance. I tried to ask for leniency or a refund, but that was wasted effort as well.
In short, I was caught on a crossfire when for-profit company tried to please hollywood by attempting to reduce piracy, so they could get new VC funding.
…
I am now a happy Jellyfin user and warmly recommend all Plex users to try it, the Jellyfin community is awesome!
(Use your favourite search engine to look up “Hetzner Plex ban” for more details)
Yes, correct.
I apologize if someone misunderstood my reply, Plex was the bad actor here.
Are you still on Hetzner? How’s their customer support in general?
Still with Hetzner yeah. Haven’t had to deal with Hetzner customer support in the recent years at all, but they have been great in the past.
It is much easier to buy one “hefty” physical machine and run ProxMox with virtual machines for servers than it is to run multiple Raspberry Pis. After living that life for years, I’m a ProxMox shill now. Backups are important (read the other comments), and ProxMox makes backup/restore easy. Because eventually you will fuck a server up beyond repair, you will lose data, and you will feel terrible about it. Learn from my mistakes.
My reason for self hosting is being in control of my shit, and not the cloud provider.
I run jellyfin, soulseek, freshRSS, audiobookshelf and nextcloud. All of that on a pi 4 with an SSD attached and then accessible via wireguard. Also that sad is accessible as nfs share.
As I had already known Linux very well before I’ve started my own cloud, I didn’t really had to learn much.
The biggest resource I could recommend is that GitHub repository where a huge amount of awesomely selfhosted solutions are linked.
Yes that one, thanks.
2.What do you wish you would have known as a beginner starting out?
Caddy. Once you try Caddy there’s no turning back to Nginx or Apache.
That’s what everyone thinks for a while, and then they go back to Nginx.
As someone who just learned about Caddy, could you elaborate?
You usually want less integration, not more. Simple self-contained things. Nginx is good at that. That’s also why you don’t want to use Nginx Proxy Manager or Certbot’s Nginx integration etc. It first looks like they make it easier, but there is too much hidden complexity under the hood.
Also, sooner or later you will run into some software that you would really like to try, which is only documented for Nginx and uses some sort of image caching or so, that is hard to replicate with Caddy etc.
Certbot’s Nginx integration etc
How do you obtain ssl certs then?
I switched to Dehydrated (with dns-01 challenge), but Certbot itself is fine, the problem is the Nginx integration that tries to automatically change your Nginx config files.
Do you manually update the config every 3 months?
??? The location and the file name of the certificates don’t change, so why would I have to do that?
On the contrary, before I disabled the certbot’s Nginx integration, every three months certbot would “manage” to break my Nginx and I had to manually repair it.
I think we are not talking about the same thing. I mean the Certbot extension that automatically modifies the Nginx config files. A telltale sign are usually the comments "#managed by certbot” that it likes to leave behind all over your config files.
Not sure I agree about proxy manager. Anything you need to access is in the gui. You can easily add advanced configs to the entries. Been using it for 5 or so years, and there hasnt been anything I was missing from using straight nginx before that.
The benefit of using config files is easy version management via git.
Makes it easy to rebuild from scratch and easy to rollback a change that breaks somethingFair enough. I manage the same by backing up the vm its on.
I’m currently in the process of separating the certificate renewal service from the reverse proxy completely.
But if you’re just starting out Nginx Proxy Manager makes it so easy.
Out of curiosity, what’s the benefit of splitting those?
It lets you change reverse proxy or run a website with TLS completely independently of the certbot. The certbot deals with obtaining certs and leaves them in a dir, and the proxies or webservers just take them from that dir. If the proxy container breaks the certbot still does its thing etc.
It also makes it easier to do stuff like run different proxies in paralel for different things, chain proxies (for instance if you need to use a VPS because you can’t forward ports) and so on.
But it’s all for advanced setups, for basic stuff I’d still go with NPM.
Cool makes sense, thanks for the reply! And yeah, I don’t think I’m quite there yet.
Silly question how safe are caddy plugins? (especially dns challenge modules like cloudflare, duckdns, etc).
https://github.com/caddy-dns https://github.com/caddy-dns/cloudflare/tree/master
Not sure if those plugins are covered by caddy’s security disclosure policy
I maintain the DNS plugin for Vultr and I can say that it’s “safe”, but if you’re worried you should check their source code.
I believe it’s easier to have a vulnerability in the external provider’s API (for example, caddy-dns/vultr uses govultr) than Caddy. But I wouldn’t take things for granted if I was skeptical about these plugins.
Apparently traefik might be better if you run docker compose and such, as it does auto-discovery, which reduces the amount of manual configuration required.
I’ve been meaning to try Caddy, but I just can’t even imagine something simpler than NginxProxyManager.
I’ll parrot the top reply from Reddit on that one: to me, self hosting starts as a learning journey. There’s no right or wrong way, if anything I intentionally do whacky weird things to test the limits of my knowledge. The mistakes and troubles are when you learn. You don’t really understand the significance of good backups until you had to restore from them.
Even in production, it differs wildly. I have customers whom I set up a bare metal Ubuntu in some datacenter for cheap, they’ve been running on that setup for 10 years. Small mom and pop shop, they will never need a whole cluster of machines. Then at my day job we’re looking at things like Kubernetes and very heavyweight stacks because we handle a lot of traffic.
Some people self-host a PiHole on a Raspberry Pi and that’s all they need. Some people have entire NAS setups with smart TVs accessing their Plex/Jellyfin servers for the whole extended family. I host my own emails, which is a pain in the ass to get working reliably and clean your IP reputation.
I guess the only thing you should know is, you need some time to commit to maintaining your stuff if you don’t want it to break or get breached (if exposed to the Internet), and a willingness to learn because self hosting isn’t a turnkey experience. It can be a turnkey installation but when your SD card/drives fails you’re still on your own to troubleshoot and fix it. You don’t set a NextCloud server to replace Google Drive with the expectation that you shove the server in a closet forever. Owning your infrastructure and data comes at a small but very important upkeep time investment.
-
- Learning. If you ever found yourself tired of learning new things, your life is basically done.
- Cost. You already have an internet connection at home. It’s practically a necessity these days. The connection is likely fast enough for most things. Renting even the most piddly of VPS is wildly expensive. Just throw a spare machine at it and go wild.
- Freedom. Your own data is constantly being collected, regurgitated, and sold back to you. More people need to care about this incessant invasion of our lives.
-
- Backups. 3 copies, on different forms of storage, in multiple PHYSICALLY distinct locations. Just when you have that teeny little imp in the back of your mind say “hmm, I should probably back up soon” – stop everything you’re doing and run a backup.
- Test your recovery! Backups are only good if you can recover from them. Many have lost data because they failed to ever fail-test their backups.
-
- Google. Legitimately the best skill you can ever attain is simply being able to search effectively and be able to learn jargon quickly. Once you have the lingo down, searches become clearer, quicker, more precise.
-
- less is more, it’s fine to sunset stuff you don’t use enough to afford them using cpu cycles, memory and power
- search warrants are a real thing and you should not trust others to use your infrastructure responsibly because you will be the one paying for it if they don’t.
Is there a story attached to no. 2?
Well, turns out that when you host a private service that allows others to share files, they might share files that they are not allowed to share. But in return your door gets kicked in in the morning and suddenly no one wants to take credit for the actual upload anymore.
Yeah… Becoming a public-facing file host for others to use seem rather irresponsible.
If/when a user’s given a means of uploading files to my server, there’s no method/permissions for them to share those files with others; it’s really just for them to send files to me. (Filebrowser is pretty good for that)
That and almost nothing is public access; auth or gtfo.
For 2.: use dns-01 challenge to generate wildcard SSL certs. Saves so much time and nerves.
My open-source, zero dependency JS library for requesting and generating certs with dns01: https://github.com/clshortfuse/acmejs
I only coded for name.com but it is compatible with anything really. Also can run in the browser, which could be useful in a pinch.
Podman quadlets have been a blessing. They basically let you manage containers as if they were simple services. You just plop a container unit file in
/etc/containers/systemd/
, daemon-reload and presto, you’ve got a service that other containers or services can depend on.Is containers here used in the same context as docker? I’m not familiar with podman.
I would’ve wished
- don’t rush things into production.
- dont offer a service to a friend without really knowing and having the experience to keep it up when needed.
- dont make it your life. The services are there to help you, not to be your life.
- use docker. Podman is not yet ready for mainstream, in my experience. When the services move to podman officially it’s time to move. Just because jellyfin offers official documentation for it, doesn’t mean it’ll work with podman (my experience)
- just test all services with the base docker install. If something isn’t working, there may be a bug or two. Report if it is a bug. Hunt a bug down if you can. maybe it’s just something that isn’t documented (well enough) for a beginner.
- start on your own machine before getting a server. A pi is enough for lightweight stuff but probably not for a fast and smooth experience with e.g. nextcloud.
- backup.
- search for help. If not available in a forum. ask for help. Dont waste many many hours if something isnt working. But research it first and read the documentation.
Podman is not yet ready for mainstream, in my experience
My experience varies wildly from yours, so please don’t take this bit as gospel.
Have yet to find a container that doesn’t work perfectly well in podman. The options may not be the same. Most issues I’ve found with running containers boil down to things that would be equally a problem in docker. A sample:
- “rootless” containers are hard to configure. It can almost always be fixed with “–privileged” or some combination of permission flags. This would be equally true for docker; the only meaningful difference is podman tries to push everything into rootless. You don’t have to.
- network filesystems cause headaches, especially smbfs + sqlite app. I’ve had to use NFS or ext4 inside a network-mounted image for some apps. This problem is identical for docker.
- container networking–for specific cases–needs to managed carefully. These cases are identical for docker.
And that’s it. I generally run things once from the podman command line, then use podlet to create a quadlet out of that configuration, something you can’t do with docker. If you are having any trouble with running containers under podman, try the --privileged shortcut, see that it works, and then double back if you think you really need rootless.
-
data stays local for the most part. Every file you send to the cloud becomes property of the cloud. Yeah, you get access, but so does the hosting provider, their 3rd party resources, and typical government compliances. Hard drives are cheap and fast enough.
-
not quite answering this right, but I very much enjoy learning and evolving. But technology changes and sometimes implementing new software like caddy/traefik on existing setups is a PITA! I suppose if I went back in time, I would tell myself to do it the hard way and save a headache later. I wouldn’t have listened to me though.
-
Portainer is so nice, but has quirks. It’s no replacement for the command line, but wow, does it save time. The console is nerdy, but when time is on the line, find a good GUI.
-
For me #2 would be “you have ADHD and won’t be able to be medicated so just don’t”
I’ve mentioned elsewhere my server upgrade project took longer than expected.
Just last night I threw it all into the trash because I just can’t anymore
NixOS is awesome!
although maybe not for beginners. for beginners use docker compose and do backups however you like
Not as good as Ansible although they are different tools
That sex isn’t love.
And love isn’t sex