Local LLM
Fleeting
Models that I tested
hf.co/bartowski/Llama-3.2-3B-Instruct-GGUF:Q4_K_L
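Assuming Ollama is installed, a reference like this can be run directly; the :Q4_K_L suffix selects a specific quant file from the repo (see the Custom Quantization notes below):
ollama run hf.co/bartowski/Llama-3.2-3B-Instruct-GGUF:Q4_K_L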
simonw/llm
- External reference: https://github.com/simonw/llm
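A minimal sketch of driving a local model from the llm CLI, assuming the llm-ollama plugin and a model already pulled into Ollama (the plugin choice and the model reference below are assumptions I have not verified here):
pip install llm
llm install llm-ollama
llm models
llm -m hf.co/bartowski/Llama-3.2-3B-Instruct-GGUF:Q4_K_L "Ten facts about llamas"
llm models should list whatever models the plugin exposes, and -m picks one of those names for the prompt.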
ollama
Use Ollama with any GGUF model on the Hugging Face Hub.
- External reference: https://huggingface.co/docs/hub/en/ollama
The run command has the following format:
ollama run hf.co/{username}/{repository}
Please note that you can use both hf.co and huggingface.co as the domain name.
Here are some models you can try:
ollama run hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF
ollama run hf.co/mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated-GGUF
ollama run hf.co/arcee-ai/SuperNova-Medius-GGUF
ollama run hf.co/bartowski/Humanish-LLama3-8B-Instruct-GGUF
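Assuming standard Ollama behavior, the same references should also work with ollama pull if you only want to download a model ahead of time without starting a chat session:
ollama pull hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF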
Custom Quantization
By default, the Q4_K_M quantization scheme is used when it is present inside the model repo. If not, a reasonable quant type present in the repo is picked instead.
To select a different scheme:
- From the Files and versions tab on a model page, open the GGUF viewer on a particular GGUF file.
- Choose ollama from the Use this model dropdown.
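There also appears to be a direct route from the command line: append the quant tag to the reference after a colon, which matches the tagged model noted at the top of this page (exact tag names depend on which GGUF files the repo contains):
ollama run hf.co/{username}/{repository}:{quantization}
ollama run hf.co/bartowski/Llama-3.2-3B-Instruct-GGUF:Q4_K_L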