xcjs

xcjs@programming.dev · 2 months ago

It would be extremely barebones, but you can do something like this with Pandoc.

xcjs@programming.dev · 3 months ago

That I agree with. Microsoft drafted the recommendation to use .local for local networks, and Apple ignored it or co-opted it for mDNS.

xcjs@programming.dev · 3 months ago

Macs aren’t the only things that use mDNS, either. I wrote a host-monitoring solution that uses it.

xcjs@programming.dev · 3 months ago

Yeah, that’s why I started using .lan.

xcjs@programming.dev · 3 months ago

I was using .local, but it ran into too many conflicts with an mDNS service I host, and vice versa. I switched to .lan, and I’m certainly not going to switch to .internal unless another conflict surfaces.

I’ve also developed a host-monitoring solution that uses mDNS, so I’m not about to break my own software. 😅

xcjs@programming.dev · edit-2 3 months ago

Coincidentally, I just found this other thread that mentions EasyEffects: https://programming.dev/post/17612973

You might be able to use a virtual device to get it working for your use case.

xcjs@programming.dev · 5 months ago

It depends on the model you run. Mistral, Gemma, or Phi are great for a majority of devices, even with CPU or integrated graphics inference.

xcjs@programming.dev · 6 months ago

We all mess up! I hope that helps - let me know if you see improvements!

xcjs@programming.dev · edit-2 6 months ago

I think there was a special process to get Nvidia working in WSL. Let me check… (I’m running natively on Linux, so my experience doing it with WSL is limited.)

https://docs.nvidia.com/cuda/wsl-user-guide/index.html - I’m sure you’ve followed this already, but according to this guide, you don’t want to install the Nvidia Linux drivers inside WSL; you only want to install the cuda-toolkit metapackage. I’d follow the instructions from that link closely.

You may also run into performance issues within WSL due to the virtual machine overhead.

xcjs@programming.dev · 6 months ago

Good luck! I’m definitely willing to spend a few minutes offering advice or double-checking some configuration settings if things go awry again. Let me know how things go. :-)

xcjs@programming.dev · edit-2 6 months ago

It should be split between VRAM and regular RAM, at least if it’s a GGUF model. Maybe it’s not, and that’s what’s wrong?

xcjs@programming.dev · 6 months ago

Ok, so using my “older” 2070 Super, I was able to get a response from a 70B parameter model in 9-12 minutes. (Llama 3 in this case.)

I’m fairly certain that you’re using your CPU or having another issue. Would you like to try to debug your configuration together?

xcjs@programming.dev · 6 months ago

Unfortunately, I don’t expect it to remain free forever.

xcjs@programming.dev · 6 months ago

No offense intended, but are you sure it’s using your GPU? Twenty minutes is about how long my CPU-locked instance takes to run some 70B parameter models.

On my RTX 3060, I generally get responses in seconds.
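
One rough way to check is to poll nvidia-smi while the model is generating and see whether the GPU shows any load. This is only a sketch: the function names and thresholds are my own assumptions, it assumes nvidia-smi is on your PATH, and some GPUs report [N/A] for these fields, which this parser doesn't handle.

```python
import subprocess

# utilization.gpu and memory.used are fields of nvidia-smi's --query-gpu option.
QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_gpu_stats(output):
    """Turn nvidia-smi CSV lines into (utilization %, memory MiB) tuples, one per GPU."""
    stats = []
    for line in output.strip().splitlines():
        util, mem = (field.strip() for field in line.split(","))
        stats.append((int(util), int(mem)))
    return stats


def gpu_looks_busy(stats, min_util=10, min_mem_mib=1024):
    """Heuristic with arbitrary thresholds: some GPU has both load and memory in use."""
    return any(util >= min_util and mem >= min_mem_mib for util, mem in stats)


def read_gpu_stats():
    """Run nvidia-smi once and parse its output (raises if the tool is missing or fails)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_stats(result.stdout)
```

Call gpu_looks_busy(read_gpu_stats()) a few times during a long response. If it never returns True, inference is probably running on the CPU.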

xcjs@programming.dev · 8 months ago

My go-to solution for this is the Android FolderSync app with an SFTP connection.

xcjs@programming.dev · 11 months ago

Of course!

xcjs@programming.dev · edit-2 11 months ago

The Docker client communicates with the Docker daemon over a UNIX socket (by default, /var/run/docker.sock). If you mount that socket in a container with a Docker client, it can communicate with the host’s Docker instance.

It’s entirely optional.
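
As a minimal sketch of what that mount looks like, here’s a small Python helper that builds the docker run command. The helper name is made up for illustration, and I’m assuming the default socket path on a Linux host and the official docker:cli image as the container with a Docker client.

```python
import shlex

DOCKER_SOCKET = "/var/run/docker.sock"  # assumed default socket path on Linux hosts


def docker_in_container_cmd(inner_cmd, image="docker:cli"):
    """Build a `docker run` command whose container can reach the host's Docker daemon."""
    return [
        "docker", "run", "--rm",
        # Bind-mount the host's socket at the same path inside the container.
        "-v", f"{DOCKER_SOCKET}:{DOCKER_SOCKET}",
        image,
        *inner_cmd,
    ]


cmd = docker_in_container_cmd(["docker", "ps"])
print(shlex.join(cmd))  # shell-ready form of the command
```

Running the printed command should list the host’s containers from inside the container, since both clients end up talking to the same daemon.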

xcjs@programming.dev · 11 months ago

There’s a container web UI called Portainer, but I’ve never used it. It may be what you’re looking for.

I also use a container called Watchtower to automatically update my services. Granted, there’s some risk there, but I wrote a script for backup snapshots in case I need to revert, and Docker makes that easy with image tags.

There’s another container called Autoheal that will restart containers with failed healthchecks. (Not every container has a built-in healthcheck, but they’re easy to add with a custom Dockerfile or a docker-compose file.)
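
For a rough idea of what a healthcheck involves, the same settings can also be passed as docker run flags, which is what this sketch builds (not the Dockerfile or docker-compose route). The helper name, the example/web image, and the curl check are all hypothetical, and the check assumes curl exists inside the image.

```python
def healthcheck_flags(test_cmd, interval="30s", timeout="5s", retries=3):
    """Build the `docker run` flags that define a container healthcheck."""
    return [
        "--health-cmd", test_cmd,          # command run inside the container
        "--health-interval", interval,     # time between checks
        "--health-timeout", timeout,       # how long one check may take
        "--health-retries", str(retries),  # consecutive failures before "unhealthy"
    ]


# Hypothetical service: the image name and URL are placeholders.
cmd = [
    "docker", "run", "-d", "--name", "web",
    *healthcheck_flags("curl -f http://localhost/ || exit 1"),
    "example/web:latest",
]
```

A Dockerfile HEALTHCHECK instruction or a healthcheck section in a docker-compose file takes the same kind of values.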

xcjs@programming.dev · 11 months ago

It’s really not! I migrated rapidly from orchestrating services with Vagrant and virtual machines to Docker just because of how much more efficient it is.

Granted, it’s a different tool to learn and takes time, but I feel like the tradeoff was well worth it in my case.

I also further orchestrate my containers using Ansible, but that’s not entirely necessary for everyone.

xcjs@programming.dev · edit-2 11 months ago

You can tinker with the image in a variety of ways, but make sure to preserve your state outside the container in some way:

Extend the image you want to use with a custom Dockerfile
Execute an interactive shell session in a running container, for example docker exec -it containerName /bin/bash
Replace or expose filesystem resources using host or volume mounts

Yes, you can set a variety of resource constraints, including but not limited to processor and memory utilization.
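
As a small sketch of the processor and memory limits, this builds a docker run command using the --cpus and --memory flags. The helper, the container name, and the image are placeholders I made up for illustration.

```python
def resource_flags(cpus=None, memory=None):
    """Build `docker run` flags that cap CPU and memory for a container."""
    flags = []
    if cpus is not None:
        flags += ["--cpus", str(cpus)]   # e.g. 1.5 = one and a half CPUs
    if memory is not None:
        flags += ["--memory", memory]    # e.g. "512m" or "2g"
    return flags


# Hypothetical container limited to 1.5 CPUs and 512 MB of RAM.
cmd = [
    "docker", "run", "-d", "--name", "myservice",
    *resource_flags(cpus=1.5, memory="512m"),
    "example/service:latest",
]
```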

There’s no reason to “freeze” a container, but if your state is in a host or volume mount, you can destroy the container, migrate your data, and recreate it with a docker run command or docker-compose file. Different terminology and concept, but same result.
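
Roughly, the destroy-and-recreate cycle could look like the sketch below. All of the names are hypothetical, it assumes the state lives in a named volume, and the data migration step itself is left out because it depends on your setup.

```python
def recreate_commands(name, image, volume, mount_point):
    """List the commands that replace a container while its volume (the state) survives."""
    return [
        ["docker", "stop", name],
        ["docker", "rm", name],  # removes the container, not the named volume
        # Start a fresh container attached to the same volume.
        ["docker", "run", "-d", "--name", name, "-v", f"{volume}:{mount_point}", image],
    ]


# Hypothetical service upgraded to a new image tag, with its data kept in "service-data".
steps = recreate_commands("myservice", "example/service:1.2", "service-data", "/data")
for step in steps:
    print(" ".join(step))
```

Each entry can be passed to subprocess.run as-is once you’re happy with the names.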

It may be worth it if you want to free up overhead used by virtual machines on your host, store your state more centrally, and/or represent your infrastructure as a docker-compose file or set of docker-compose files.