All your AI models in one API—LLM, TTS, image gen, instantly.
`nomic-embed-text` is a large-context-length text encoder that surpasses OpenAI's `text-embedding-ada-002` and `text-embedding-3-small` on short- and long-context tasks.
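A minimal sketch of generating an embedding, assuming the `ollama` Python client against a local server (the exact client and endpoint for this API may differ):

```python
import ollama

# Request a single dense vector for a passage.
resp = ollama.embeddings(
    model="nomic-embed-text",
    prompt="The quick brown fox jumps over the lazy dog.",
)
vector = resp["embedding"]
print(len(vector))  # dimensionality of the returned embedding
```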
As of March 2024, this model achieves SOTA performance for BERT-large-sized models on the MTEB benchmark. It outperforms commercial models like OpenAI's `text-embedding-3-large` and matches the performance of models 20x its size. `mxbai-embed-large` was trained with no overlap with the MTEB data, which indicates that the model generalizes well across several domains, tasks, and text lengths.
`BGE-M3` is based on the XLM-RoBERTa architecture and is distinguished by its versatility in Multi-Functionality, Multi-Linguality, and Multi-Granularity:

* **Multi-Functionality**: It can simultaneously perform the three common retrieval functionalities of embedding models: dense retrieval, multi-vector retrieval, and sparse retrieval.
* **Multi-Linguality**: It supports more than 100 working languages.
* **Multi-Granularity**: It can process inputs of different granularities, spanning from short sentences to long documents of up to 8192 tokens.
`snowflake-arctic-embed` is a suite of text embedding models focused on high-quality retrieval performance. The models are built on existing open-source text representation models, such as `bert-base-uncased`, and trained in a multi-stage pipeline to optimize their retrieval performance. The suite is available in five parameter sizes:

* `snowflake-arctic-embed:335m` (default)
* `snowflake-arctic-embed:137m`
* `snowflake-arctic-embed:110m`
* `snowflake-arctic-embed:33m`
* `snowflake-arctic-embed:22m`
The model is intended to be used as a sentence and short-paragraph encoder. Given an input text, it outputs a vector that captures its semantic information. The sentence vector may be used for information retrieval, clustering, or sentence-similarity tasks.
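Since these sentence vectors are meant for similarity and retrieval, here is a hedged sketch of comparing two sentences by cosine similarity; the model tag is a stand-in for this encoder's actual name, and the `ollama` client is an assumption:

```python
import math
import ollama

MODEL = "all-minilm"  # stand-in tag; substitute this encoder's actual name

def embed(text: str) -> list[float]:
    # One dense vector per input text.
    return ollama.embeddings(model=MODEL, prompt=text)["embedding"]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Semantically close sentences should score near 1.0.
print(cosine(embed("A cat sits on the mat."), embed("A kitten rests on a rug.")))
```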
Snowflake's frontier embedding model. Arctic Embed 2.0 adds multilingual support without sacrificing English performance or scalability.
Gemma is a lightweight family of models from Google built on Gemini technology. The Gemma 3 models are multimodal, processing text and images, and feature a 128K context window with support for over 140 languages. Available in 1B, 4B, 12B, and 27B parameter sizes, they excel at tasks like question answering, summarization, and reasoning, while their compact design allows deployment on resource-limited devices.
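As a sketch of a multimodal request, assuming the `ollama` Python client and an illustrative local image path:

```python
import ollama

# Send text plus an image to a multimodal Gemma 3 variant (4B and up).
resp = ollama.chat(
    model="gemma3:4b",
    messages=[{
        "role": "user",
        "content": "Describe what is happening in this picture.",
        "images": ["photo.png"],  # illustrative path, not part of the model
    }],
)
print(resp["message"]["content"])
```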
The Llama 3.2-Vision collection comprises instruction-tuned multimodal large language models (LLMs) for image reasoning in 11B and 90B sizes (text + images in / text out). The Llama 3.2-Vision instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image. The models outperform many of the available open-source and closed multimodal models on common industry benchmarks.

**Supported Languages**: For text-only tasks, English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai are officially supported. Llama 3.2 has been trained on a broader collection of languages than these eight. Note that for image+text applications, English is the only supported language.
`MiniCPM-V 2.6` is the latest and most capable model in the MiniCPM-V series. The model is built on SigLip-400M and Qwen2-7B with a total of 8B parameters. It exhibits a significant performance improvement over MiniCPM-Llama3-V 2.5 and introduces new features for multi-image and video understanding. Notable features of MiniCPM-V 2.6 include:

* **🔥 Leading Performance**: MiniCPM-V 2.6 achieves an average score of 65.2 on the latest version of OpenCompass, a comprehensive evaluation over 8 popular benchmarks. With only 8B parameters, it surpasses widely used proprietary models like GPT-4o mini, GPT-4V, Gemini 1.5 Pro, and Claude 3.5 Sonnet for single-image understanding.
* **🖼️ Multi-Image Understanding and In-Context Learning**: MiniCPM-V 2.6 can also perform conversation and reasoning over multiple images. It achieves state-of-the-art performance on popular multi-image benchmarks such as Mantis-Eval, BLINK, Mathverse mv, and Sciverse mv, and also shows promising in-context learning capability.
* **💪 Strong OCR Capability**: MiniCPM-V 2.6 can process images with any aspect ratio and up to 1.8 million pixels (e.g., 1344x1344). It achieves state-of-the-art performance on OCRBench, surpassing proprietary models such as GPT-4o, GPT-4V, and Gemini 1.5 Pro. Based on the latest RLAIF-V and VisCPM techniques, it features trustworthy behaviors, with significantly lower hallucination rates than GPT-4o and GPT-4V on Object HalBench, and supports multilingual capabilities in English, Chinese, German, French, Italian, Korean, etc.
* **🚀 Superior Efficiency**: In addition to its friendly size, MiniCPM-V 2.6 shows state-of-the-art token density (i.e., number of pixels encoded into each visual token). It produces only 640 tokens when processing a 1.8M-pixel image, 75% fewer than most models, which directly improves inference speed, first-token latency, memory usage, and power consumption.
`llava-llama-3-8b-v1_1` is a LLaVA model fine-tuned from `meta-llama/Meta-Llama-3-8B-Instruct` and `CLIP-ViT-Large-patch14-336` with ShareGPT4V-PT and InternVL-SFT by XTuner.
`Moondream` is an open-source visual language model that understands images using simple text prompts. It's fast, wildly capable, and just 1GB in size.

* **Vision AI at Warp Speed**: Forget everything you thought you needed to know about computer vision. With Moondream, there's no training, no ground-truth data, and no heavy infrastructure. Just a model, a prompt, and a whole world of visual understanding.
* **Ridiculously lightweight**: Under 2B parameters. Quantized to 4-bit. Just 1GB. Moondream runs anywhere, from edge devices to your laptop.
* **Actually affordable**: Run it locally for free. Or use our cloud API to process a high volume of images quickly and cheaply. Free tier included.
* **Simple by design**: Choose a capability. Write a prompt. Get results. That's it. Moondream is designed for developers who don't want to babysit models.
* **Versatile as hell**: Go beyond basic visual Q&A. Moondream can caption, detect objects, locate things, read documents, follow gaze, and more.
* **Tried, tested, trusted**: 6M+ downloads. 8K+ GitHub stars. Used across industries, from healthcare to robotics to mobile apps.
`granite-vision-3.2-2b` is a compact and efficient vision-language model, specifically designed for visual document understanding, enabling automated content extraction from tables, charts, infographics, plots, diagrams, and more. The model was trained on a meticulously curated instruction-following dataset, comprising diverse public datasets and synthetic datasets tailored to support a wide range of document understanding and general image tasks. It was trained by fine-tuning a Granite large language model with both image and text modalities.
**Mistral-Small-3.1-24B-Instruct-2503**: Building upon Mistral Small 3 (2501), Mistral Small 3.1 (2503) adds state-of-the-art vision understanding and enhances long-context capabilities up to 128k tokens without compromising text performance. With 24 billion parameters, this model achieves top-tier capabilities in both text and vision tasks. It is an instruction-finetuned version of `Mistral-Small-3.1-24B-Base-2503`. Mistral Small 3.1 can be deployed locally and is exceptionally "knowledge-dense," fitting within a single RTX 4090 or a 32GB-RAM MacBook once quantized. It is ideal for:

* Fast-response conversational agents.
* Low-latency function calling.
* Subject-matter experts via fine-tuning.
* Local inference for hobbyists and organizations handling sensitive data.
* Programming and math reasoning.
* Long-document understanding.
* Visual understanding.

For enterprises requiring specialized capabilities (increased context, specific modalities, domain-specific knowledge, etc.), we will release commercial models beyond what Mistral AI contributes to the community. Learn more about Mistral Small 3.1 in our [blog post](https://mistral.ai/news/mistral-small-3-1).

**Key Features**

* **Vision**: Vision capabilities enable the model to analyze images and provide insights based on visual content in addition to text.
* **Multilingual**: Supports dozens of languages, including English, French, German, Greek, Hindi, Indonesian, Italian, Japanese, Korean, Malay, Nepali, Polish, Portuguese, Romanian, Russian, Serbian, Spanish, Swedish, Turkish, Ukrainian, Vietnamese, Arabic, Bengali, Chinese, and Farsi.
* **Agent-Centric**: Offers best-in-class agentic capabilities with native function calling and JSON output (see the sketch after this list).
* **Advanced Reasoning**: State-of-the-art conversational and reasoning capabilities.
* **Apache 2.0 License**: Open license allowing usage and modification for both commercial and non-commercial purposes.
* **Context Window**: A 128k context window.
* **System Prompt**: Maintains strong adherence and support for system prompts.
* **Tokenizer**: Utilizes a Tekken tokenizer with a 131k vocabulary size.
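Given the native function-calling support, a minimal tool-use sketch; the `ollama` client, the model tag, and the `get_weather` tool are all assumptions for illustration:

```python
import ollama

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool, defined only for this example
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = ollama.chat(
    model="mistral-small3.1",  # assumed tag; check your library's listing
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)

# Instead of prose, the model should reply with a structured tool call.
for call in resp["message"]["tool_calls"] or []:
    print(call["function"]["name"], call["function"]["arguments"])
```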
The Cogito v1 Preview LLMs are instruction-tuned generative models (text in/text out). All models are released under an open license for commercial use.

* Cogito models are hybrid reasoning models: each model can answer directly (like a standard LLM) or self-reflect before answering (like a reasoning model).
* The LLMs are trained using Iterated Distillation and Amplification (IDA), a scalable and efficient alignment strategy for superintelligence using iterative self-improvement.
* The models have been optimized for coding, STEM, instruction following, and general helpfulness, and have significantly higher multilingual, coding, and tool-calling capabilities than size-equivalent counterparts.
* In both standard and reasoning modes, Cogito v1-preview models outperform their size-equivalent counterparts on common industry benchmarks.
* Each model is trained in over 30 languages and supports a context length of 128k.
`DeepSeek R1 Distill Qwen 1.5B` is a distilled large language model based on `Qwen 2.5 Math 1.5B`, using outputs from DeepSeek R1. It is a very small and efficient model that outperforms GPT-4o-0513 on math benchmarks. Benchmark results include:

* AIME 2024 pass@1: 28.9
* AIME 2024 cons@64: 52.7
* MATH-500 pass@1: 83.9

The model leverages fine-tuning on DeepSeek R1's outputs, enabling performance comparable to larger frontier models.
`DeepSeek R1 Distill Qwen 14B` is a distilled large language model based on Qwen 2.5 14B, using outputs from DeepSeek R1. It outperforms OpenAI's o1-mini across various benchmarks, achieving new state-of-the-art results for dense models. Benchmark results include:

* AIME 2024 pass@1: 69.7
* MATH-500 pass@1: 93.9
* CodeForces rating: 1481

The model leverages fine-tuning on DeepSeek R1's outputs, enabling performance comparable to larger frontier models.
`DeepSeek R1 Distill Qwen 32B` is a distilled large language model based on Qwen 2.5 32B, using outputs from DeepSeek R1. It outperforms OpenAI's o1-mini across various benchmarks, achieving new state-of-the-art results for dense models. Benchmark results include:

* AIME 2024 pass@1: 72.6
* MATH-500 pass@1: 94.3
* CodeForces rating: 1691

The model leverages fine-tuning on DeepSeek R1's outputs, enabling performance comparable to larger frontier models.
`DeepSeek R1 Distill Llama 70B` is a distilled large language model based on `Llama-3.3-70B-Instruct`, using outputs from DeepSeek R1. The model combines advanced distillation techniques to achieve high performance across multiple benchmarks, including:

* AIME 2024 pass@1: 70.0
* MATH-500 pass@1: 94.5
* CodeForces rating: 1633

The model leverages fine-tuning on DeepSeek R1's outputs, enabling performance comparable to larger frontier models.
DeepSeek's first generation of reasoning models, with performance comparable to OpenAI o1, including six dense models distilled from DeepSeek-R1 based on Llama and Qwen.
`DeepSeek R1 Distill Llama 8B` is a distilled large language model based on `Llama-3.1-8B-Instruct`, using outputs from DeepSeek R1. The model combines advanced distillation techniques to achieve high performance across multiple benchmarks, including:

* AIME 2024 pass@1: 50.4
* MATH-500 pass@1: 89.1
* CodeForces rating: 1205

The model leverages fine-tuning on DeepSeek R1's outputs, enabling performance comparable to larger frontier models.
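These distills emit their chain of thought between `<think>` tags before the final answer; a small sketch separating the two, assuming the `ollama` client and the `deepseek-r1:8b` tag for this variant:

```python
import re
import ollama

resp = ollama.generate(
    model="deepseek-r1:8b",
    prompt="What is 17 * 24? Think step by step.",
)
text = resp["response"]

# R1-style models wrap their reasoning in <think>...</think> before answering.
match = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
reasoning = match.group(1).strip() if match else ""
answer = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()
print("reasoning (truncated):", reasoning[:80])
print("answer:", answer)
```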
The Meta Llama 3.1 collection of multilingual large language models (LLMs) comprises pretrained and instruction-tuned generative models in 8B, 70B, and 405B sizes (text in/text out). The Llama 3.1 instruction-tuned text-only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed chat models on common industry benchmarks.
The Meta Llama 3.2 collection of multilingual large language models (LLMs) comprises pretrained and instruction-tuned generative models in 1B and 3B sizes (text in/text out). The Llama 3.2 instruction-tuned text-only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks. They outperform many of the available open-source and closed chat models on common industry benchmarks.
The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction-tuned generative model with 70B parameters (text in/text out). The Llama 3.3 instruction-tuned text-only model is optimized for multilingual dialogue use cases and outperforms many of the available open-source and closed chat models on common industry benchmarks. **Supported languages**: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
Mistral is a 7B-parameter model distributed under the Apache license. It is available in both instruct (instruction-following) and text-completion variants. The Mistral AI team has noted that Mistral 7B:

* Outperforms Llama 2 13B on all benchmarks
* Outperforms Llama 1 34B on many benchmarks
* Approaches CodeLlama 7B performance on code, while remaining good at English tasks
Mistral NeMo is a 12B model built in collaboration with NVIDIA. Mistral NeMo offers a large context window of up to 128k tokens. Its reasoning, world knowledge, and coding accuracy are state-of-the-art in its size category. Because it relies on a standard architecture, Mistral NeMo is easy to use and a drop-in replacement in any system using Mistral 7B.
`phi-4` is a state-of-the-art open model built upon a blend of synthetic datasets, data from filtered public-domain websites, and acquired academic books and Q&A datasets. The goal of this approach was to ensure that small, capable models were trained with data focused on high quality and advanced reasoning. `phi-4` underwent a rigorous enhancement and alignment process, incorporating both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures.
`Phi-4-mini-instruct` is a lightweight open model built upon synthetic data and filtered publicly available websites, with a focus on high-quality, reasoning-dense data. The model belongs to the Phi-4 model family and supports a 128K-token context length. It underwent an enhancement process, incorporating both supervised fine-tuning and direct preference optimization to support precise instruction adherence and robust safety measures.
`Qwen2.5` is the latest series of Qwen large language models. For Qwen2.5, a range of base language models and instruction-tuned models are released, with sizes ranging from 0.5 to 72 billion parameters. Qwen2.5 introduces the following improvements over Qwen2:

* It possesses significantly more knowledge and has greatly enhanced capabilities in coding and mathematics, due to specialized expert models in these domains.
* It demonstrates significant advancements in instruction following, long-text generation (over 8K tokens), understanding structured data (e.g., tables), and generating structured outputs, especially in JSON format (see the sketch below). It is also more resilient to diverse system prompts, improving role-play and condition-setting for chatbots.
* It supports long contexts of up to 128K tokens and can generate up to 8K tokens.
* It offers multilingual support for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more.
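Given the emphasis on JSON generation, a minimal structured-output sketch using JSON mode; the `ollama` client and the `qwen2.5:7b` tag are assumptions:

```python
import json
import ollama

resp = ollama.chat(
    model="qwen2.5:7b",
    messages=[{
        "role": "user",
        "content": "List three EU capitals as JSON shaped like "
                   '{"capitals": [{"city": "...", "country": "..."}]}',
    }],
    format="json",  # constrain the reply to valid JSON
)
data = json.loads(resp["message"]["content"])
print(data["capitals"])
```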
`QwQ` is the reasoning model of the Qwen series. Compared with conventional instruction-tuned models, QwQ, which is capable of thinking and reasoning, can achieve significantly enhanced performance on downstream tasks, especially hard problems. `QwQ-32B` is the medium-sized reasoning model, capable of competitive performance against state-of-the-art reasoning models such as DeepSeek-R1 and o1-mini.
Mistral AI is contributing Mathstral to the science community to bolster efforts in advanced mathematical problems requiring complex, multi-step logical reasoning. The Mathstral release is part of their broader effort to support academic projects—it was produced in the context of Mistral AI’s collaboration with Project Numina. Akin to Isaac Newton in his time, Mathstral stands on the shoulders of Mistral 7B and specializes in STEM subjects. It achieves state-of-the-art reasoning capacities in its size category across various industry-standard benchmarks.
Over the past year, we have dedicated significant effort to researching and enhancing the reasoning capabilities of large language models, with a particular focus on their ability to solve arithmetic and mathematical problems. Today, we are delighted to introduce a series of math-specific large language models of our Qwen2 series: `Qwen2-Math` and `Qwen2-Math-Instruct-1.5B/7B/72B`. Qwen2-Math is a series of specialized math language models built upon the Qwen2 LLMs that significantly surpasses the mathematical capabilities of open-source models, and even closed-source models (e.g., GPT-4o). We hope that Qwen2-Math can contribute to the community's efforts to solve complex mathematical problems.
🚀 Democratizing Reinforcement Learning for LLMs 🌟

`DeepScaleR-1.5B-Preview` is a language model fine-tuned from `DeepSeek-R1-Distilled-Qwen-1.5B` using distributed reinforcement learning (RL) to scale up to long context lengths. The model achieves 43.1% Pass@1 accuracy on AIME 2024, representing a 15% improvement over the base model (28.8%) and surpassing OpenAI's o1-preview performance with just 1.5B parameters.
* **Powerful**: `Qwen2.5-Coder-32B-Instruct` has become the current SOTA open-source code model, matching the coding capabilities of GPT-4o. While demonstrating strong and comprehensive coding abilities, it also possesses good general and mathematical skills.
* **Diverse**: Building on the two previously open-sourced sizes (1.5B and 7B), this release brings four more model sizes: 0.5B, 3B, 14B, and 32B. Qwen2.5-Coder now covers six mainstream model sizes to meet the needs of different developers.
* **Practical**: We explore the practicality of Qwen2.5-Coder in two scenarios, code assistants and Artifacts, with examples showcasing its potential applications in real-world scenarios.
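A simple code-assistant sketch; the `ollama` client and the `qwen2.5-coder:7b` tag are assumptions, so pick the size that fits your hardware:

```python
import ollama

resp = ollama.chat(
    model="qwen2.5-coder:7b",
    messages=[{
        "role": "user",
        "content": "Write a Python function that checks whether a string is a palindrome.",
    }],
)
print(resp["message"]["content"])
```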
`DeepCoder-14B-Preview` is a code reasoning LLM fine-tuned from `DeepSeek-R1-Distilled-Qwen-14B` using distributed reinforcement learning (RL) to scale up to long context lengths. The model achieves 60.6% Pass@1 accuracy on LiveCodeBench v5 (8/1/24-2/1/25), representing an 8% improvement over the base model (53%) and achieving performance similar to OpenAI's o3-mini with just 14B parameters.
`CodeGemma` is a collection of powerful, lightweight models that can perform a variety of coding tasks like fill-in-the-middle code completion, code generation, natural language understanding, mathematical reasoning, and instruction following.
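A fill-in-the-middle sketch using CodeGemma's published FIM control tokens; the raw completion call via the `ollama` client and the `codegemma:2b` tag are assumptions:

```python
import ollama

prefix = "def mean(xs):\n    return "
suffix = " / len(xs)\n"

# CodeGemma's FIM format: give the prefix and suffix, then ask for the middle.
prompt = f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"

resp = ollama.generate(model="codegemma:2b", prompt=prompt, raw=True)
print(resp["response"])  # expected to be something like: sum(xs)
```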
`Deepseek Coder` is composed of a series of code language models, each trained from scratch on 2T tokens, with a composition of 87% code and 13% natural language in both English and Chinese. We provide various sizes of the code model, ranging from 1.3B to 33B. Each model is pre-trained on a project-level code corpus using a 16K window size and an extra fill-in-the-blank task to support project-level code completion and infilling. For coding capabilities, Deepseek Coder achieves state-of-the-art performance among open-source code models on multiple programming languages and various benchmarks.

* **Massive Training Data**: Trained from scratch on 2T tokens, including 87% code and 13% linguistic data in both English and Chinese.
* **Highly Flexible & Scalable**: Offered in model sizes of 1.3B, 5.7B, 6.7B, and 33B, enabling users to choose the setup most suitable for their requirements.
* **Superior Model Performance**: State-of-the-art performance among publicly available code models on the HumanEval, MultiPL-E, MBPP, DS-1000, and APPS benchmarks.
* **Advanced Code Completion Capabilities**: A 16K window size and a fill-in-the-blank task, supporting project-level code completion and infilling.
We present `DeepSeek-Coder-V2`, an open-source Mixture-of-Experts (MoE) code language model that achieves performance comparable to GPT-4 Turbo in code-specific tasks. Specifically, DeepSeek-Coder-V2 is further pre-trained from an intermediate checkpoint of DeepSeek-V2 with an additional 6 trillion tokens. Through this continued pre-training, DeepSeek-Coder-V2 substantially enhances the coding and mathematical reasoning capabilities of DeepSeek-V2, while maintaining comparable performance in general language tasks. Compared to DeepSeek-Coder-33B, DeepSeek-Coder-V2 demonstrates significant advancements in various aspects of code-related tasks, as well as reasoning and general capabilities. Additionally, DeepSeek-Coder-V2 expands its support for programming languages from 86 to 338, while extending the context length from 16K to 128K.
`Kokoro` is an open-weight TTS model with 82 million parameters. Despite its lightweight architecture, it delivers comparable quality to larger models while being significantly faster and more cost-efficient. With Apache-licensed weights, Kokoro can be deployed anywhere from production environments to personal projects.
The IBM Granite 3.3 8B model is an 8-billion-parameter instruction-tuned LLM with a 128K token context window, optimized for reasoning, instruction following, fill-in-the-middle code completion, and structured reasoning.
`Qwen2.5-VL` is the new flagship vision-language model series from Qwen, representing a significant leap from the previous `Qwen2-VL`.

**Key Features**:

* **Understand Things Visually**: `Qwen2.5-VL` is proficient in recognizing common objects (flowers, birds, fish, insects) and excels at analyzing texts, charts, icons, graphics, and layouts within images.
* **Agentic Capabilities**: Acts as a visual agent that can reason and dynamically direct tools, enabling computer and phone use.
* **Visual Localization**: Accurately localizes objects in an image by generating bounding boxes or points, providing stable JSON outputs for coordinates and attributes.
* **Structured Outputs**: Supports structured outputs for data like scans of invoices, forms, and tables, beneficial for finance, commerce, etc.

**Performance**: The flagship model, `Qwen2.5-VL-72B-Instruct`, achieves competitive performance across benchmarks. Smaller models like `Qwen2.5-VL-7B-Instruct` outperform `GPT-4o-mini` in several tasks. The `Qwen2.5-VL-3B` model, designed for edge AI, even surpasses the 7B model of the previous `Qwen2-VL` version. (Note: Requires Ollama 0.7.0 or later.)
`Qwen2.5-VL` is the new flagship vision-language model series from Qwen, representing a significant leap from the previous `Qwen2-VL`.

**Key Features**:

* **Understand Things Visually**: `Qwen2.5-VL` is proficient in recognizing common objects and excels at analyzing texts, charts, icons, graphics, and layouts within images.
* **Agentic Capabilities**: Acts as a visual agent that can reason and dynamically direct tools.
* **Visual Localization**: Accurately localizes objects, providing stable JSON outputs for coordinates and attributes.
* **Structured Outputs**: Supports structured outputs for data like scans of invoices, forms, and tables.

**Performance**: `Qwen2.5-VL-7B-Instruct` outperforms `GPT-4o-mini` in a number of tasks. It is part of a series whose flagship, `Qwen2.5-VL-72B-Instruct`, achieves competitive performance across many benchmarks. (Note: Requires Ollama 0.7.0 or later.)
`Qwen2.5-VL` is the new flagship vision-language model series from Qwen. This 32B variant offers a powerful balance of capability and resource requirements within the `Qwen2.5-VL` family.

**Key Features**:

* **Advanced Visual Understanding**: Proficient in recognizing diverse objects and analyzing complex visual content, including text, charts, and layouts.
* **Strong Agentic Capabilities**: Functions as a visual agent, capable of reasoning and directing tools for tasks like computer and phone interaction.
* **Precise Visual Localization**: Accurately localizes objects using bounding boxes or points, with stable JSON output for coordinates.
* **Structured Data Extraction**: Efficiently generates structured outputs from visual data such as invoices and forms.

**Performance**: As part of the `Qwen2.5-VL` series, the 32B model benefits from the architectural improvements that allow the flagship `Qwen2.5-VL-72B-Instruct` to achieve SOTA-competitive results. It offers a step up in performance from the smaller variants for more demanding tasks. (Note: Requires Ollama 0.7.0 or later.)
`Qwen2.5-VL-72B-Instruct` is the flagship vision-language model from Qwen, showcasing top-tier performance and a comprehensive feature set.

**Key Features**:

* **Superior Visual Understanding**: Excels at recognizing a wide array of objects and analyzing intricate visual details in texts, charts, icons, graphics, and layouts.
* **Highly Agentic**: Functions effectively as a visual agent, demonstrating strong reasoning and tool utilization for complex interactions like computer and phone operation.
* **Accurate Visual Localization**: Precisely identifies and localizes objects, generating bounding boxes or points with stable JSON outputs for coordinates and attributes (see the sketch after this entry).
* **Robust Structured Output Generation**: Adept at extracting and structuring information from visual documents like invoices, forms, and tables, ideal for applications in finance and commerce.

**Performance**: `Qwen2.5-VL-72B-Instruct` achieves competitive performance in a series of benchmarks covering diverse domains and tasks, including college-level problems, math, document understanding, general question answering, and visual agent capabilities. It demonstrates significant advantages in understanding documents and diagrams and can operate as a visual agent without task-specific fine-tuning. (Note: Requires Ollama 0.7.0 or later.)
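A hedged localization sketch, asking for bounding boxes as JSON from an illustrative image; the `ollama` client and the `qwen2.5vl:7b` tag are assumptions (Ollama 0.7.0+ per the note above):

```python
import json
import ollama

resp = ollama.chat(
    model="qwen2.5vl:7b",
    messages=[{
        "role": "user",
        "content": "Locate every person in the image. Reply as JSON: "
                   '[{"label": "...", "bbox_2d": [x1, y1, x2, y2]}]',
        "images": ["street.jpg"],  # illustrative path
    }],
    format="json",  # keep the coordinate output machine-readable
)
print(json.loads(resp["message"]["content"]))
```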
`Phi-4 Reasoning` and `Phi-4 Reasoning Plus` are 14-billion-parameter models from Microsoft, designed to rival much larger models on complex reasoning tasks.

* **`Phi-4 Reasoning`**: Trained via supervised fine-tuning (SFT) of `Phi-4` on carefully curated reasoning demonstrations, including from OpenAI's `o3-mini`. This highlights how meticulous data curation and high-quality synthetic datasets enable smaller models to compete with larger counterparts.
* **`Phi-4 Reasoning Plus`**: Builds upon `Phi-4 Reasoning` and is further trained with reinforcement learning (RL) to deliver higher accuracy.

**Performance**: These models consistently outperform the base `Phi-4` model by significant margins on representative reasoning benchmarks (mathematical and scientific reasoning). They exceed `DeepSeek-R1 Distill Llama 70B` (5x larger) and demonstrate competitive performance against significantly larger models like `DeepSeek-R1`.
`Qwen3` is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. The 4B variant provides an efficient entry point into the Qwen3 family.

**Key Capabilities**:

* **Thinking/Non-Thinking Modes**: Uniquely supports seamless switching between thinking mode (for complex logical reasoning, math, coding) and non-thinking mode (for efficient, general-purpose dialogue) within a single model (see the sketch after this list).
* **Enhanced Reasoning**: Significant improvements in reasoning, surpassing previous `QwQ` (thinking mode) and `Qwen2.5 Instruct` (non-thinking mode) models in mathematics, code generation, and logical reasoning. `Qwen3-4B` can rival the performance of `Qwen2.5-72B-Instruct` in some aspects.
* **Human Preference Alignment**: Excels in creative writing, role-playing, multi-turn dialogues, and instruction following.
* **Agent Capabilities**: Precise integration with external tools in both modes, with leading performance among open-source models in complex agent-based tasks.
* **Multilingual Support**: Supports 100+ languages and dialects with strong capabilities for multilingual instruction following and translation.
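Qwen3's mode switch can be driven from the prompt itself: Qwen documents `/think` and `/no_think` as soft switches appended to a user message. A sketch, assuming the `ollama` client and the `qwen3:4b` tag:

```python
import ollama

# /no_think skips the thinking phase for quick, general-purpose replies;
# /think forces step-by-step reasoning before the final answer.
quick = ollama.chat(
    model="qwen3:4b",
    messages=[{"role": "user", "content": "What is the capital of France? /no_think"}],
)
careful = ollama.chat(
    model="qwen3:4b",
    messages=[{"role": "user", "content": "Is 1001 prime? /think"}],
)
print(quick["message"]["content"])
print(careful["message"]["content"])
```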
`Qwen3` is the latest generation of large language models in the Qwen series. The 8B variant offers a balanced blend of performance and efficiency.

**Key Capabilities**:

* **Thinking/Non-Thinking Modes**: Supports seamless switching between modes for complex reasoning/coding and general dialogue.
* **Enhanced Reasoning**: Significant improvements in mathematics, code generation, and logical reasoning over previous Qwen generations.
* **Human Preference Alignment**: Excels in creative writing, role-playing, multi-turn dialogues, and instruction following.
* **Agent Capabilities**: Precise integration with external tools, leading in complex agent-based tasks among open-source models.
* **Multilingual Support**: Supports 100+ languages with strong instruction following and translation.
* **Extended Context**: This variant supports a 128k context window.
`Qwen3` is the latest generation of Qwen LLMs. The 14B model provides enhanced capabilities for more demanding tasks.

**Key Capabilities**:

* **Dual Modes**: Seamlessly switches between thinking (complex logic, math, code) and non-thinking (general dialogue) modes.
* **Advanced Reasoning**: Outperforms previous Qwen models (`QwQ`, `Qwen2.5 Instruct`) in math, coding, and logical reasoning.
* **Superior Alignment**: Strong in creative writing, role-playing, multi-turn conversations, and following instructions.
* **Expert Agent**: Integrates precisely with external tools, leading in agent-based tasks.
* **Broad Multilingualism**: Supports over 100 languages and dialects.
* **Large Context**: Features a 128k context window.
`Qwen3-30B-A3B` is a Mixture-of-Experts (MoE) model from the latest `Qwen3` LLM series, designed for high efficiency and performance.

**Key Capabilities**:

* **MoE Architecture**: Provides strong performance, outcompeting `QwQ-32B` with 10 times fewer activated parameters.
* **Dual Operational Modes**: Supports seamless switching between a "thinking mode" for complex reasoning, math, and coding tasks, and a "non-thinking mode" for efficient, general-purpose dialogue.
* **Significantly Enhanced Reasoning**: Surpasses previous `QwQ` (in thinking mode) and `Qwen2.5 Instruct` models (in non-thinking mode) on mathematics, code generation, and commonsense logical reasoning.
* **Superior Human Preference Alignment**: Excels in creative writing, role-playing, multi-turn dialogues, and instruction following, delivering a more natural and engaging conversational experience.
* **Expertise in Agent Capabilities**: Enables precise integration with external tools in both modes and achieves leading performance among open-source models in complex agent-based tasks.
* **Extensive Multilingual Support**: Supports over 100 languages and dialects with strong capabilities for multilingual instruction following and translation.
* **Large Context Window**: Supports a context length of 128k tokens.
The `Qwen3-32B` model is a powerful dense model from the latest `Qwen3` LLM series, offering strong all-around capabilities.

**Key Capabilities**:

* **Dual Operational Modes**: Features seamless switching between "thinking mode" (for complex tasks like logical reasoning, math, and coding) and "non-thinking mode" (for efficient, general-purpose dialogue).
* **Enhanced Reasoning Abilities**: Demonstrates significant improvements in reasoning, surpassing previous generations like `QwQ` (in thinking mode) and `Qwen2.5 Instruct` models (in non-thinking mode) in mathematics, code generation, and logical reasoning.
* **Excellent Human Preference Alignment**: Strong performance in creative writing, role-playing, multi-turn dialogues, and instruction following, leading to more natural and engaging interactions.
* **Advanced Agent Functionality**: Capable of precise integration with external tools in both operational modes, achieving leading performance among open-source models for complex agent-based tasks.
* **Comprehensive Multilingual Support**: Supports over 100 languages and dialects, with robust capabilities for multilingual instruction following and translation.
* **Large Context Window**: Supports a 128k token context length.