Supported LLM Base Models
Navigator supports a variety of open source LLMs, and is adding more all the time. Below is the current list of supported models with links to their Hugging Face model pages where you can learn more about each model’s strengths and weaknesses.
For most use cases that involve training or inference on consumer hardware, we recommend a 7B-parameter model. That size is the sweet spot between model performance and resource requirements on consumer devices, and all LLM features in Navigator work well with 7B-parameter models.
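Navigator downloads and runs these models for you, but most entries below are standard MLX-format repositories, so you can also try one directly. Here is a minimal sketch, assuming the open-source mlx-lm Python package on an Apple Silicon Mac (mlx-lm is not part of Navigator; this only illustrates the underlying model format):

```python
# Sketch: trying one of the listed MLX models outside Navigator.
# Assumes `pip install mlx-lm` on an Apple Silicon Mac; Navigator's
# own runtime is internal, so this only shows the model format in use.
from mlx_lm import load, generate

# Any MLX repo ID from the list below works; here, a recommended 7B model.
model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.3")

response = generate(
    model,
    tokenizer,
    prompt="Explain 4-bit quantization in one sentence.",
    max_tokens=100,
)
print(response)
```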
Full List of Supported Models
- mlx-community/codegemma-7b-it-8bit
  Parameters: 7 billion
  Description: Google's CodeGemma model optimized for code generation, with 8-bit quantization requiring 5.30GB RAM.
  View on Hugging Face: https://huggingface.co/mlx-community/codegemma-7b-it-8bit
- mlx-community/Codestral-22B-v0.1-4bit
  Parameters: 22 billion
  Description: Mistral AI's code-specialized model with 4-bit quantization requiring 6.48GB RAM.
  View on Hugging Face: https://huggingface.co/mlx-community/Codestral-22B-v0.1-4bit
- mlx-community/Codestral-22B-v0.1-8bit
  Parameters: 22 billion
  Description: 8-bit quantized version of Codestral requiring 11.66GB RAM for higher precision.
  View on Hugging Face: https://huggingface.co/mlx-community/Codestral-22B-v0.1-8bit
- mlx-community/Meta-Llama-3.1-8B-Instruct-4bit
  Parameters: 8 billion
  Description: Meta's Llama 3.1 model with 4-bit quantization requiring only 2.34GB RAM.
  View on Hugging Face: https://huggingface.co/mlx-community/Meta-Llama-3.1-8B-Instruct-4bit
- mlx-community/Meta-Llama-3.1-8B-Instruct-8bit
  Parameters: 8 billion
  Description: 8-bit quantized version of Llama 3.1 requiring 4.21GB RAM for better precision.
  View on Hugging Face: https://huggingface.co/mlx-community/Meta-Llama-3.1-8B-Instruct-8bit
- mlx-community/Meta-Llama-3.1-70B-Instruct-4bit
  Parameters: 70 billion
  Description: Large-scale Llama 3.1 model with 4-bit quantization requiring 20.54GB RAM.
  View on Hugging Face: https://huggingface.co/mlx-community/Meta-Llama-3.1-70B-Instruct-4bit
- mlx-community/Ministral-8B-Instruct-2410-bf16
  Parameters: 8 billion
  Description: Unquantized Mistral AI model requiring 14.94GB RAM for full precision.
  View on Hugging Face: https://huggingface.co/mlx-community/Ministral-8B-Instruct-2410-bf16
- mlx-community/Mistral-7B-Instruct-v0.3
  Parameters: 7 billion
  Description: Original Mistral instruction model requiring 13.50GB RAM.
  View on Hugging Face: https://huggingface.co/mlx-community/Mistral-7B-Instruct-v0.3
- mistralai/Mixtral-8x7B-Instruct-v0.1
  Parameters: 46.7 billion (8 experts of 7B each)
  Description: Mistral AI's sparse mixture-of-experts model requiring 86.99GB RAM for unquantized operation.
  View on Hugging Face: https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1
- mistralai/Mixtral-8x22B-Instruct-v0.1
  Parameters: 141 billion (8 experts of 22B each)
  Description: Larger mixture-of-experts model requiring significant RAM (261.94GB) for unquantized operation.
  View on Hugging Face: https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1
- mlx-community/Mistral-NeMo-Minitron-8B-Instruct
  Parameters: 8.41 billion
  Description: NVIDIA's Minitron model, pruned and distilled from Mistral NeMo 12B, requiring 15.67GB RAM for unquantized operation.
  View on Hugging Face: https://huggingface.co/mlx-community/Mistral-NeMo-Minitron-8B-Instruct
- mlx-community/nvidia_Llama-3.1-Nemotron-70B-Instruct-HF_4bit
  Parameters: 70 billion
  Description: NVIDIA's Llama 3.1 implementation with 4-bit quantization requiring 20.54GB RAM.
  View on Hugging Face: https://huggingface.co/mlx-community/nvidia_Llama-3.1-Nemotron-70B-Instruct-HF_4bit
- mlx-community/Phi-3-medium-128k-instruct-bf16
  Parameters: 14 billion
  Description: Microsoft's Phi-3 model with extended context window requiring 26.00GB RAM.
  View on Hugging Face: https://huggingface.co/mlx-community/Phi-3-medium-128k-instruct-bf16
- mlx-community/Phi-3.5-mini-instruct-bf16
  Parameters: 3.8 billion
  Description: Compact version of Phi requiring 7.12GB RAM for unquantized operation.
  View on Hugging Face: https://huggingface.co/mlx-community/Phi-3.5-mini-instruct-bf16
- mlx-community/quantized-gemma-7b-it
  Parameters: 7 billion
  Description: Google's Gemma model with 4-bit quantization requiring 3.72GB RAM.
  View on Hugging Face: https://huggingface.co/mlx-community/quantized-gemma-7b-it
- mlx-community/Qwen2-7B-Instruct-8bit
  Parameters: 7.62 billion
  Description: Qwen's instruction model with 8-bit quantization requiring 3.99GB RAM.
  View on Hugging Face: https://huggingface.co/mlx-community/Qwen2-7B-Instruct-8bit
- mlx-community/Qwen2-72B-Instruct-4bit
  Parameters: 72.7 billion
  Description: Large Qwen model with 4-bit quantization requiring 21.16GB RAM.
  View on Hugging Face: https://huggingface.co/mlx-community/Qwen2-72B-Instruct-4bit
- mlx-community/Qwen2.5-72B-Instruct-4bit
  Parameters: 72.7 billion
  Description: Updated Qwen 2.5 model with 4-bit quantization requiring 21.16GB RAM.
  View on Hugging Face: https://huggingface.co/mlx-community/Qwen2.5-72B-Instruct-4bit
- mlx-community/sum-small-unquantized
  Parameters: 3.82 billion
  Description: Omi Health's compact medical summarization model requiring 7.12GB RAM for unquantized operation.
  View on Hugging Face: https://huggingface.co/mlx-community/sum-small-unquantized
- Qwen/Qwen2.5-14B-Instruct
  Parameters: 14.8 billion
  Description: Mid-sized Qwen 2.5 model requiring 27.51GB RAM for unquantized operation.
  View on Hugging Face: https://huggingface.co/Qwen/Qwen2.5-14B-Instruct
- Qwen/Qwen2.5-Coder-7B-Instruct
  Parameters: 7.62 billion
  Description: Code-specialized Qwen model requiring 14.19GB RAM for unquantized operation.
  View on Hugging Face: https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct
- Qwen/QwQ-32B-Preview
  Parameters: 32.8 billion
  Description: Preview version of QwQ requiring 61.03GB RAM for unquantized operation.
  View on Hugging Face: https://huggingface.co/Qwen/QwQ-32B-Preview
- meta-llama/Llama-3.3-70B-Instruct
  Parameters: 70.6 billion
  Description: Latest Llama 3.3 model requiring 131.42GB RAM for unquantized operation.
  View on Hugging Face: https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct
- microsoft/phi-4
  Parameters: 14.7 billion
  Description: Microsoft's latest Phi model requiring 27.32GB RAM for unquantized operation.
  View on Hugging Face: https://huggingface.co/microsoft/phi-4
- deepseek-ai/DeepSeek-V3
  Parameters: 671 billion
  Description: DeepSeek's massive mixture-of-experts model requiring 1275.04GB RAM for unquantized operation.
  View on Hugging Face: https://huggingface.co/deepseek-ai/DeepSeek-V3
- deepseek-ai/DeepSeek-R1-Distill-Llama-8B
  Parameters: 8 billion
  Description: Llama 8B model distilled from DeepSeek-R1, requiring 14.96GB RAM for unquantized operation.
  View on Hugging Face: https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B
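A useful sanity check when picking a model: the unquantized (bf16) RAM figures above track parameters × 2 bytes quite closely, while 4-bit and 8-bit variants shrink roughly proportionally. Below is a minimal sketch of that back-of-the-envelope estimate (the helper function is illustrative, not part of Navigator; quantized figures also depend on runtime and packaging details, so treat the result as a weights-only lower bound):

```python
# Back-of-the-envelope, weights-only RAM estimate in GiB:
# parameters * bits-per-weight / 8 bytes, ignoring KV cache,
# activations, and quantization metadata.

def estimate_weight_gib(params_billion: float, bits_per_weight: float) -> float:
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1024**3

if __name__ == "__main__":
    # bf16 entries above track this estimate closely:
    print(round(estimate_weight_gib(7.25, 16), 2))  # ~13.5  -> Mistral-7B-Instruct-v0.3
    print(round(estimate_weight_gib(46.7, 16), 2))  # ~86.99 -> Mixtral-8x7B-Instruct-v0.1
    # Quantized entries diverge more, since measured figures vary by runtime:
    print(round(estimate_weight_gib(8.0, 4), 2))    # ~3.73 vs. 2.34GB listed for Llama-3.1-8B-4bit
```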