I wish I could. I have an RTX 3060 12GB, I run mostly llama3.1 8B versions in fp8, at 30-35 tokens/s.
Sure! It can be a bit of a steep learning curve at times, but there are heaps of resources online, and LLMs can also be useful, even if it's just to point you in the direction of further reading. Regardless, you can reach out to me or other great folks from the [email protected] or similar AI, ML or related communities!
Enjoy :)
For RAG, there are some tools available in open-webui, which are documented here: https://docs.openwebui.com/tutorials/features/rag They have plans for how to expand and improve it, which they describe here: https://docs.openwebui.com/roadmap#information-retrieval-rag-
For fine-tuning, I think this is (at least for now) out of scope; they focus on inference. I think the direction is to eventually help you create and manage your own data from your LLM usage in Open-WebUI, but actually fine-tuning is not possible (yet) using either ollama or open-webui.
I have not used the RAG function yet, but besides following the setup instructions, your experience with RAG may also be limited by which embedding model you use. You may have to go looking for a good model (ideally one that is small and efficient enough to re-scan your documents quickly, yet powerful enough to generate meaningful embeddings). Also, in case you didn’t know, embeddings are specific to the model that generated them, so if you change embedding models you’ll have to rescan your whole document library.
Edit: RAG seems a bit limited by the supported file types. You can see them here: https://github.com/open-webui/open-webui/blob/2fa94956f4e500bf5c42263124c758d8613ee05e/backend/apps/rag/main.py#L328 It seems not to support Word documents or PDFs, so it’s mostly incompatible with documents that have advanced formatting and are WYSIWYG.
The interface called open-webui can run in a container, but ollama runs as a service on your system, from my understanding.
The models are local and, by default, only answer queries; it all happens on the system without any additional tools. Now, if you want to give them internet access, you can: it is an option you have to set up, and open-webui makes that possible, though I have not tried it myself. I have just seen the option.
I have never heard of any LLM that can “answer base queries offline before contacting their provider for support”. It’s almost impossible for the LLM to do that by itself without you setting things up for it that way.
What’s great is that with ollama and open-webui, you can just as easily run it all on one computer locally, using the open-webui pip package, or on a remote server, using the container version of open-webui.
I’ve run both, and the webui is really well done. It offers a number of advanced options, like the system prompt, but also memory features, documents for RAG, and even a built-in Python IDE for when you want to execute Python functions. You can even enable web browsing for your model.
I’m personally very pleased with open-webui and ollama, and they work wonders together. Highly recommend them! And the latest llama3.1 (in 8 and 70B variants) and llama3.2 (in 1 and 3B variants) work very well, even on CPU only for the latter! Give it a shot, it is so easy to set up :)
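For anyone curious, this is roughly how I set it up (commands from memory, so double-check against the official docs; the model name and ports are just what I used and may differ on your setup):

```shell
# Pull a small model that runs fine on CPU
ollama pull llama3.2:3b

# Option 1: run the UI locally via pip
pip install open-webui
open-webui serve          # then browse to http://localhost:8080

# Option 2: run the UI as a container, talking to ollama on the host
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
```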
They don’t have to have a backdoor. They are most likely in possession of a master key to decrypt your data:
The Framework laptop, a modular laptop, now has a RISC-V mainboard to be used in their computers. Framework prides itself on being a good open-source steward; you can read more about the mainboard and buy it (when it becomes available) here:
https://frame.work/products/deep-computing-risc-v-mainboard
https://frame.work/blog/introducing-a-new-risc-v-mainboard-from-deepcomputing
It may have been the case in the past, but I’ve used both the GTX 680 and RTX 3060 on Fedora with no issue whatsoever. I have been using the NVIDIA proprietary drivers and they work well.
In Japanese, for those who want the original experience:
I found the story difficult to follow, and it plays a significant role in the game and is likely to influence your decisions. What I wish I had done, and what I recommend, is to pay close attention to it from the beginning: know who’s who, who’s battling whom, and why. Consider taking notes haha
Edit: the story and the game are fantastic; I hope you enjoy it like I did. I recently finished the game and started on the expansions.
Google, Microsoft, OpenAI, Anthropic and co should help fight this fight, their tools are the problem here.
Indeed! So there must be more to this. How does it actually work?
Many such lawsuits have ended in settlements out of court, so I’m guessing many of these legal claims have not been validated or invalidated in court yet. This can be good or bad, of course. But now, if this guy goes to court, I’m actually concerned: it may give Nintendo’s legal arguments an unchallenged path, and if the court decides he’s guilty, there will be precedent of these claims having been vetted in court. Would that not be worse for anyone who wants to challenge Nintendo’s legal claims in the future?
Thank you for your help.
I decided to give dracut a shot, see how far I could get.
I created a directory /usr/lib/dracut/modules.d/99usb-mount in which I created two scripts:
A first module, /usr/lib/dracut/modules.d/99usb-mount/module-setup.sh, executable:
#!/bin/bash

check() {
    return 0
}

depends() {
    echo "crypt"
    return 0
}

install() {
    inst_hook pre-mount 90 "$moddir/usb-mount.sh"
}
And a second script, /usr/lib/dracut/modules.d/99usb-mount/usb-mount.sh, also executable:
#!/bin/bash

LUKS_PARTITION=/dev/sda3
USB_NKL=/dev/disk/by-uuid/<MY-UUID>
USB_MOUNT_DIR=/mnt/my-usb/
KEY_FILENAME=mykey.key

# Wait for the USB to be detected and available
for i in {1..10}; do
    if [ -b "${USB_NKL}" ]; then
        break
    fi
    sleep 1
done

# Mount the USB stick
mount "${USB_NKL}" "${USB_MOUNT_DIR}"

# Check if the mount was successful
if [ $? -ne 0 ]; then
    echo "Failed to mount USB stick"
    exit 1
fi

# Unlock the LUKS partition using the keyfile
if [ -e "${USB_MOUNT_DIR}/${KEY_FILENAME}" ]; then
    cryptsetup luksOpen "${LUKS_PARTITION}" cryptroot --key-file "${USB_MOUNT_DIR}/${KEY_FILENAME}"
else
    echo "Keyfile not found!"
    echo "Failed to unlock LUKS partition"
    exit 1
fi
I then fixed some dependencies and got around to installing device-mapper (providing dmsetup), required by dm, required by crypt, required by my scripts.
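For the record, on Alpine this came down to (assuming the standard repositories; verify the package name with apk search if needed):

```shell
# dmsetup is needed by dracut's dm module, which the crypt module depends on
apk add device-mapper
```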
Then I ran dracut -f, which didn’t seem to have any issue and includes my module:
[...]
dracut[I]: *** Including module: usb-mount ***
[...]
dracut[E]: ldconfig exited ungracefully
[...]
dracut[I]: *** Creating initramfs image file '/boot/initramfs-6.6.54-0-lts.img' done ***
Not sure if this ldconfig error should be of any concern? The end image seems to have been created successfully.
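To double-check what actually landed in the image without rebooting, dracut’s lsinitrd tool can list the initramfs contents (assuming it is shipped with Alpine’s dracut package; the image path is from my setup above):

```shell
# List the contents of the freshly built image and look for the hook
lsinitrd /boot/initramfs-6.6.54-0-lts.img | grep usb-mount
# Expect a line like: lib/dracut/hooks/pre-mount/90-usb-mount.sh
```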
When I check the verbose output, I see my module being included:
dracut[D]: -rwxr-xr-x 0/0 747 2024-10-07 22:30:00 lib/dracut/hooks/pre-mount/90-usb-mount.sh
However, it is numbered 90 here when above I had placed the module in 99; no idea what that’s about? (edit: actually I wrote 90 in the inst_hook call in module-setup.sh, so this is normal I guess).
Then I rebooted with my key and the prompt for my password to unlock my LUKS partition still appeared.
In the kernel messages I see my USB stick being detected (perhaps not mounted?) prior to the password prompt, so I’m not sure what’s going on. Do you see any issue with my attempt? Or would you have any suggestions for debugging this further? I’m a bit lost as to how to diagnose the issue.
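One avenue I haven’t tried yet is dracut’s breakpoints (assuming standard dracut behavior; untested on my Alpine setup): booting with rd.break=pre-mount should drop me into an emergency shell right before my hook runs, so I can poke around manually:

```shell
# Add to the kernel command line: rd.break=pre-mount
# Then, in dracut's emergency shell:
blkid                                            # is the USB stick's filesystem visible?
ls -l /dev/disk/by-uuid/                         # does the by-uuid symlink exist yet?
sh /lib/dracut/hooks/pre-mount/90-usb-mount.sh   # run the hook by hand and watch for errors
```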
Here are the kernel messages regarding the usb detection and a few seconds later, me unlocking the LUKS partition:
[ 1.748076] usb 1-1: new high-speed USB device number 2 using xhci_hcd # usb 1-1 / sdb is my USB key. It seems to have been detected but not mounted?
[ 2.068060] device-mapper: uevent: version 1.0.3
[ 2.068190] device-mapper: ioctl: 4.48.0-ioctl (2023-03-01) initialised: [email protected]
[ 2.078157] Key type encrypted registered
[ 2.153792] usb 1-1: New USB device found, idVendor=067b, idProduct=2517, bcdDevice= 1.00
[ 2.153799] usb 1-1: New USB device strings: Mfr=1, Product=4, SerialNumber=6
[ 2.153801] usb 1-1: Product: ClipDrive
[ 2.153803] usb 1-1: Manufacturer: BUFFALO
[ 2.153805] usb 1-1: SerialNumber: A9200502030000221
[ 2.155494] usb-storage 1-1:1.0: USB Mass Storage device detected
[ 2.157341] scsi host3: usb-storage 1-1:1.0
[ 2.159772] usbcore: registered new interface driver uas
[ 3.221531] scsi 3:0:0:0: Direct-Access BUFFALO ClipDrive 1.00 PQ: 0 ANSI: 0 CCS
[ 3.224250] sd 3:0:0:0: [sdb] 507904 512-byte logical blocks: (260 MB/248 MiB)
[ 3.227885] sd 3:0:0:0: [sdb] Write Protect is off
[ 3.227899] sd 3:0:0:0: [sdb] Mode Sense: 23 00 00 00
[ 3.231635] sd 3:0:0:0: [sdb] No Caching mode page found
[ 3.231645] sd 3:0:0:0: [sdb] Assuming drive cache: write through
[ 3.247551] sd 3:0:0:0: [sdb] Attached SCSI removable disk
[ 6.323670] EXT4-fs (dm-0): orphan cleanup on readonly fs # the 3 seconds gap is me unlocking the LUKS using the keyboard
[ 6.323954] EXT4-fs (dm-0): mounted filesystem 33a8b408-37ff-4b8a-98bf-bba8b6f00604 ro with ordered data mode. Quota mode: none.
[ 6.324134] Mounting root: ok.
It seems you might be right. There is so little documentation for initramfs in Alpine Linux (the wiki page is very barebones), but I did manage to find this open issue:
https://gitlab.alpinelinux.org/alpine/mkinitfs/-/issues/18
So I guess this confirms that it is not yet possible.
Could you expand on your suggestion about customizing the init script? Where is this file located, and would you have some pointers on how to get started customizing it for my use case?
This is great! With my GOG games, since they provide offline installers, I had wondered what the best way would be to manage and distribute them for myself, and this really hits the spot! Very glad to be able to fully self-manage my games in this way. Thanks!
This is about Alpine Linux, as I wrote in the title and twice in the post.
I have no idea if ollama can handle multi-GPU. The 70B model in its q2_k quantized form already requires 26GB of memory, so you would need at least that much just to fit it entirely on GPU, which is the best-case scenario, and even then I can’t say at what speed it would run.
I know some people with apple silicon who have enough memory to run the 70B model and for them it runs fast enough to be usable. You may be able to find more info about it online.