Local Llm

Fleeting

models that I tested

hf.co/bartowski/Llama-3.2-3B-Instruct-GGUF:Q4_K_L

simonw/llm

ollama
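
A hedged sketch of combining the two tools mentioned above: with the llm-ollama plugin installed, simonw/llm should be able to drive a model that has already been pulled into ollama. The plugin name and the exact model identifier (taken from the model tested above) are assumptions to verify against `ollama list`:

    # install the ollama plugin for the llm CLI (assumption: plugin is named llm-ollama)
    llm install llm-ollama
    # use the locally pulled model by the name ollama reports for it
    llm -m hf.co/bartowski/Llama-3.2-3B-Instruct-GGUF:Q4_K_L 'Say hello'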

use Ollama with any GGUF Model on Hugging Face Hub

  • External reference: https://huggingface.co/docs/hub/en/ollama

    The command follows this format:

    ollama run hf.co/{username}/{repository}

    Please note that you can use both hf.co and huggingface.co as the domain name.

    Here are some models you can try:

    ollama run hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF

    ollama run hf.co/mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated-GGUF

    ollama run hf.co/arcee-ai/SuperNova-Medius-GGUF

    ollama run hf.co/bartowski/Humanish-LLama3-8B-Instruct-GGUF

    Custom Quantization

    By default, the Q4_K_M quantization scheme is used when it is present inside the model repo. If not, a reasonable quant type present inside the repo is picked.

    To select a different scheme, simply:

    From the Files and versions tab on a model page, open the GGUF viewer on a particular GGUF file, then choose ollama from the Use this model dropdown.
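
    Equivalently, the quantization can be given directly on the command line by appending it as a tag after a colon, as in the model tested at the top of this note:

    ollama run hf.co/bartowski/Llama-3.2-3B-Instruct-GGUF:Q4_K_L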

