Overview of OnlySq models
OnlySq offers a wide range of models for various user needs. All models support the "system" role for additional customisation. You can retrieve the full list of models as JSON with:
GET https://api.onlysq.ru/ai/models
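A minimal sketch of fetching this list with Python's requests package; the endpoint URL comes from the line above, but the exact JSON schema it returns is not documented here, so inspect the response:

```python
import requests

# List the models currently available on OnlySq.
resp = requests.get("https://api.onlysq.ru/ai/models", timeout=30)
resp.raise_for_status()

models = resp.json()
# The response schema is not documented in this overview; print it to inspect the fields.
print(models)
```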
ChatGPT models
These are the OpenAI models behind ChatGPT, a free and easy-to-use app that can help you with writing, learning, brainstorming, and more.
| Model name | Description | Type | Modality | Endpoints |
|---|---|---|---|---|
| gpt-4o-mini | A small model with superior textual intelligence and multimodal reasoning. | Keys | Text | API2.0 OpenAI SDK |
| gpt-4o-mini-2024-07-18 | A July 18, 2024 release of GPT‑4o Mini, optimized for speed and low-cost usage while maintaining strong reasoning capabilities for common tasks. | Keys | Text | API2.0 OpenAI SDK |
| gpt-4o-2024-05-13 | A May 13, 2024 version of GPT‑4o, providing robust performance for general-purpose and multimodal tasks with improved efficiency. | Keys | Text | API2.0 OpenAI SDK |
| gpt-4o-2024-08-06 | An August 6, 2024 release of GPT‑4o, featuring updates for improved alignment, reasoning, and cost-effectiveness in production scenarios. | Keys | Text | API2.0 OpenAI SDK |
| gpt-4o-2024-11-20 | A November 20, 2024 version of GPT‑4o, tuned for better performance on complex reasoning and long-context conversations. | Keys | Text | API2.0 OpenAI SDK |
| gpt-4o | GPT‑4o (“o” for “omni”) is a step towards much more natural human-computer interaction. | Keys | Text | API2.0 OpenAI SDK |
| chatgpt-4o-latest | The latest ChatGPT experience powered by GPT‑4o, providing the most up-to-date behaviour for conversational and assistant-style interactions. | Keys | Text | API2.0 OpenAI SDK |
| gpt-4.1 | GPT‑4.1 is a highly capable general-purpose model with strong reasoning, coding, and multilingual abilities, suitable for most production workloads. | Keys | Text | API2.0 OpenAI SDK |
| gpt-4.1-2025-04-14 | An April 14, 2025 release of GPT‑4.1, with refinements for reliability, alignment, and broader domain coverage in real-world scenarios. | Keys | Text | API2.0 OpenAI SDK |
| gpt-4.1-mini | A compact variant of GPT‑4.1, designed for lower latency and cost while retaining strong performance on everyday tasks. | Keys | Text | API2.0 OpenAI SDK |
| gpt-4.1-mini-2025-04-14 | An April 14, 2025 version of GPT‑4.1 Mini with further optimizations for speed, efficiency, and typical assistant workloads. | Keys | Text | API2.0 OpenAI SDK |
| gpt-4.1-nano | An ultra-small GPT‑4.1-based model designed for on-device or very low-resource environments and simple assistance tasks. | Keys | Text | API2.0 OpenAI SDK |
| gpt-4.1-nano-2025-04-14 | A refined April 14, 2025 build of GPT‑4.1 Nano, offering improved quality for resource-constrained deployments. | Keys | Text | API2.0 OpenAI SDK |
| o3-mini | OpenAI o3‑mini is a small reasoning model that supports function calling, structured outputs, and developer messages, making it production-ready out of the gate. | Keys | Text | API2.0 OpenAI SDK |
| o3-mini-2025-01-31 | A January 31, 2025 release of o3‑mini with improved reasoning stability, speed, and integration for developer workflows. | Keys | Text | API2.0 OpenAI SDK |
| o3 | A larger OpenAI o3 reasoning model focused on highly complex, multi-step problem solving, coding, and analytical tasks. | Keys | Text | API2.0 OpenAI SDK |
| o3-2025-04-16 | An April 16, 2025 build of o3, offering refined reasoning traces and stronger performance on challenging benchmarks. | Keys | Text | API2.0 OpenAI SDK |
| o4-mini | A next-generation small reasoning model in the o‑series, balancing cost, speed, and advanced reasoning for mainstream applications. | Keys | Text | API2.0 OpenAI SDK |
| o4-mini-2025-04-16 | An April 16, 2025 release of o4‑mini, featuring the latest updates in reasoning quality and latency improvements. | Keys | Text | API2.0 OpenAI SDK |
| o1 | An OpenAI reasoning model designed for complex step-by-step problem solving, coding, and analytical work with detailed intermediate reasoning. | Keys | Text | API2.0 OpenAI SDK |
| o1-2024-12-17 | A December 17, 2024 snapshot of o1 with updated capabilities and refinements to reasoning reliability. | Keys | Text | API2.0 OpenAI SDK |
| gpt-4-turbo | GPT‑4 Turbo is a fast, cost-effective variant of GPT‑4, optimized for conversation and common assistant tasks. | Keys | Text | API2.0 OpenAI SDK |
| gpt-4-turbo-2024-04-09 | An April 9, 2024 release of GPT‑4 Turbo that improves speed, cost, and quality balance for production applications. | Keys | Text | API2.0 OpenAI SDK |
| gpt-4-turbo-preview | A preview build of GPT‑4 Turbo offering access to the latest experimental features and behaviors before general release. | Keys | Text | API2.0 OpenAI SDK |
| gpt-4o-search-preview | A GPT‑4o-based search-optimized model preview, tuned for retrieval-augmented generation and search-style tasks. | Keys | Text | API2.0 OpenAI SDK |
| gpt-4o-search-preview-2025-03-11 | A March 11, 2025 release of GPT‑4o Search Preview, with refined ranking, retrieval, and grounded answer generation. | Keys | Text | API2.0 OpenAI SDK |
| gpt-4o-mini-search-preview | A lightweight GPT‑4o Mini model optimized for search and retrieval-style tasks with lower latency. | Keys | Text | API2.0 OpenAI SDK |
| gpt-4o-mini-search-preview-2025-03-11 | A March 11, 2025 version of GPT‑4o Mini Search Preview offering improved relevance and grounded responses at low cost. | Keys | Text | API2.0 OpenAI SDK |
| gpt-5 | GPT‑5 is a next-generation large language model providing superior reasoning, coding, and multilingual capabilities for demanding applications. | Keys | Text | API2.0 OpenAI SDK |
| gpt-5-2025-08-07 | An August 7, 2025 build of GPT‑5 featuring the latest training data and architectural improvements for state-of-the-art performance. | Keys | Text | API2.0 OpenAI SDK |
| gpt-5-mini | A smaller, efficient GPT‑5 variant aimed at cost-effective deployment while inheriting many of GPT‑5’s capabilities. | Keys | Text | API2.0 OpenAI SDK |
| gpt-5-mini-2025-08-07 | An August 7, 2025 release of GPT‑5 Mini with optimizations for responsiveness and typical assistant tasks. | Keys | Text | API2.0 OpenAI SDK |
| gpt-5-nano | An ultra-small GPT‑5-based model targeted at extremely low-resource environments and simple, fast responses. | Keys | Text | API2.0 OpenAI SDK |
| gpt-5-nano-2025-08-07 | An August 7, 2025 version of GPT‑5 Nano with incremental quality improvements while remaining highly efficient. | Keys | Text | API2.0 OpenAI SDK |
| gpt-5.1 | GPT‑5.1 is an updated generation building on GPT‑5, offering better reasoning reliability, safety, and domain coverage. | Keys | Text | API2.0 OpenAI SDK |
| gpt-5.1-2025-11-13 | A November 13, 2025 snapshot of GPT‑5.1 incorporating the latest fine-tuning and safety updates. | Keys | Text | API2.0 OpenAI SDK |
| gpt-5-chat-latest | The latest chat-focused configuration of GPT‑5, providing an optimized conversational experience. | Keys | Text | API2.0 OpenAI SDK |
| gpt-5.1-chat-latest | The latest chat-optimized build of GPT‑5.1, tuned for multi-turn dialogue, safety, and helpfulness. | Keys | Text | API2.0 OpenAI SDK |
| gpt-5-search-api | A GPT‑5-based model optimized as a search API, designed for integrating search and retrieval into applications. | Keys | Text | API2.0 OpenAI SDK |
| gpt-5-search-api-2025-10-14 | An October 14, 2025 build of the GPT‑5 Search API with updated ranking, retrieval, and grounding behavior. | Keys | Text | API2.0 OpenAI SDK |
| gpt-4 | GPT‑4 is OpenAI’s flagship model generation, delivering strong general-purpose performance across many tasks. | Keys | Text | API2.0 OpenAI SDK |
| gpt-4-0613 | A June 13, 2023 build of GPT‑4 with stable function‑calling and tool-use support widely used in production. | Keys | Text | API2.0 OpenAI SDK |
| gpt-4-1106-preview | A November 6, 2023 preview of GPT‑4 featuring newer capabilities and experimental improvements. | Keys | Text | API2.0 OpenAI SDK |
| gpt-4-0125-preview | A January 25, 2024 GPT‑4 preview with updated training data and enhanced reasoning in complex scenarios. | Keys | Text | API2.0 OpenAI SDK |
| gpt-3.5-turbo | GPT‑3.5‑turbo is a fast, cost-effective version of GPT‑3.5, optimized for dialogue and general tasks with broad capabilities but less powerful than GPT‑4. | Keys | Text | API2.0 OpenAI SDK |
| gpt-3.5-turbo-16k | A 16k-context version of GPT‑3.5‑turbo, ideal for longer documents and extended conversations. | Keys | Text | API2.0 OpenAI SDK |
| gpt-3.5-turbo-1106 | A November 6, 2023 build of GPT‑3.5‑turbo with updates for reliability and cost efficiency. | Keys | Text | API2.0 OpenAI SDK |
| gpt-3.5-turbo-0125 | A January 25, 2024 release of GPT‑3.5‑turbo with improved instruction-following and robustness. | Keys | Text | API2.0 OpenAI SDK |
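All rows above list the OpenAI SDK as an endpoint, so a request can be sketched with the official openai Python package. The base URL and key handling below are placeholders, not documented OnlySq values; the snippet mainly illustrates the "system" role mentioned in the overview:

```python
from openai import OpenAI

# base_url is a hypothetical placeholder; substitute OnlySq's actual
# OpenAI-compatible endpoint and your real key.
client = OpenAI(
    base_url="https://api.onlysq.ru/ai/openai",  # assumption, not a documented path
    api_key="YOUR_ONLYSQ_KEY",
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a concise technical assistant."},
        {"role": "user", "content": "Explain retrieval-augmented generation in two sentences."},
    ],
)
print(response.choices[0].message.content)
```

The same pattern applies to the other chat models in this overview; only the `model` string changes.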
DeepSeek models
DeepSeek is a series of large language models from the company DeepSeek, known for strong performance in code generation, reasoning, and natural language tasks.
| Model name | Description | Type | Modality | Endpoints |
|---|---|---|---|---|
| deepseek-r1 | DeepSeek-R1 is a specialized AI model by DeepSeek, optimized for reasoning and code generation tasks, with strong performance in logical problem-solving and programming. It’s designed to handle complex, multi-step tasks efficiently. | Keys | Text | API2.0 OpenAI SDK |
| deepseek-v3 | DeepSeek-V3 is an advanced AI model by DeepSeek, excelling in reasoning, code generation, and multi-step problem-solving. It builds on prior versions with enhanced capabilities for complex tasks and improved efficiency. | Keys | Text | API2.0 OpenAI SDK |
Llama models
Llama is a series of open-source large language models developed by Meta. Known for their versatility and multilingual support, Llama models excel in natural language understanding, code generation, and reasoning tasks.
| Model name | Description | Type | Modality | Endpoints |
|---|---|---|---|---|
| llama-3.3-8b | Llama‑3.3‑8B is a compact 8‑billion‑parameter model optimized for efficient deployment with strong language understanding and coding support. | Keys | Text | API2.0 OpenAI SDK |
| llama-3.3-70b | Llama‑3.3‑70B is a larger, more capable variant suited for complex reasoning, coding, and multilingual tasks. | Keys | Text | API2.0 OpenAI SDK |
| llama-4-maverick | Llama‑4 Maverick is a next-gen Llama model designed for strong general-purpose performance and efficient deployment. | Keys | Text | API2.0 OpenAI SDK |
| llama-4-maverick-17b-128e-instruct | An instruct-tuned Llama‑4 Maverick mixture-of-experts model with 17B active parameters and 128 experts (128E), suited for advanced reasoning and long-document tasks. | Keys | Text | API2.0 OpenAI SDK |
| llama-4-scout | Llama‑4 Scout is optimized for fast inference and exploratory tasks while maintaining strong language capabilities. | Keys | Text | API2.0 OpenAI SDK |
| llama-3-3-70b | Llama‑3.3‑70B offers high-end reasoning, coding, and multilingual support suitable for complex enterprise tasks. | Keys | Text | API2.0 OpenAI SDK |
| llama-3-1-8b | Llama‑3.1‑8B is an 8‑billion‑parameter model focused on efficiency and high-quality language understanding for everyday workloads. | Keys | Text | API2.0 OpenAI SDK |
| llama-3.1 | A provider-based Llama‑3.1 configuration, offering strong general-purpose performance and multilingual capabilities. | Provider | Text | API2.0 OpenAI SDK |
Claude models
Claude is a series of large language models developed by Anthropic, known for their strong natural language understanding, reasoning, and code generation capabilities.
| Model name | Description | Type | Modality | Endpoints |
|---|---|---|---|---|
| claude-3.5-sonnet | Claude‑3.5‑Sonnet is a powerful AI model by Anthropic, excelling in natural language tasks, reasoning, and code generation, with a focus on safety, accuracy, and handling complex, multi-step problems. | Provider | Text | API2.0 OpenAI SDK |
Le Chat models
Mistral is a series of open-weight large language models developed by Mistral AI, known for their efficiency, scalability, and strong performance in natural language tasks, reasoning, and code generation.
| Model name | Description | Type | Modality | Endpoints |
|---|---|---|---|---|
| mistral-small-3.1 | Mistral‑Small‑3.1 is a compact, efficient variant of the Mistral series, designed for cost-effective and fast performance on simpler tasks while maintaining strong language understanding and reasoning capabilities. | Keys | Text | API2.0 OpenAI SDK |
| e5-mistral-7b | E5‑Mistral‑7B is a 7‑billion‑parameter model optimized for embeddings and semantic understanding while still supporting general text tasks. | Keys | Text | API2.0 OpenAI SDK |
Qwen models
Qwen is a series of large language models developed by Alibaba Cloud, designed for versatility and high performance across a wide range of tasks.
| Model name | Description | Type | Modality | Endpoints |
|---|---|---|---|---|
| qwen-3-235b-a22b-instruct-2507 | A large-scale Qwen‑3 instruct mixture-of-experts model with 235B total and 22B activated parameters (A22B), tuned for complex reasoning, long-context tasks, and precise instruction-following. | Keys | Text | API2.0 OpenAI SDK |
| qwen3-max-2025-10-30 | Qwen‑3 Max (October 30, 2025) is a highly capable general-purpose model suited for demanding workloads requiring advanced reasoning and coding. | Keys | Text | API2.0 OpenAI SDK |
| qwen3-vl-plus | Qwen‑3 VL Plus is a vision-language model able to process and reason over both images and text. | Keys | Text | API2.0 OpenAI SDK |
| qwen3-vl-32b | A 32B-parameter Qwen‑3 vision-language model, offering strong multimodal understanding and generation. | Keys | Text | API2.0 OpenAI SDK |
| qwen3-coder-plus | Qwen‑3 Coder Plus is specialized for programming tasks, code generation, and debugging across multiple languages. | Keys | Text | API2.0 OpenAI SDK |
| qwen | Qwen is a large language model developed by Alibaba Cloud, excelling in natural language understanding, code generation, and multi-language support, with a focus on versatility and handling complex tasks efficiently. | Provider | Text | API2.0 OpenAI SDK |
Google models
Gemini is a series of large language models developed by Google, known for advanced reasoning, natural language understanding, and multimodal capabilities. The table below also lists Google's open Gemma models.
| Model name | Description | Type | Modality | Endpoints |
|---|---|---|---|---|
| gemini-2.5-flash | A fast and efficient version of Google's Gemini model, optimized for real-time, low-latency tasks while maintaining strong multimodal understanding. | Keys | Text | API2.0 OpenAI SDK |
| gemini-2.5-flash-lite-preview-06-17 | A lightweight June 17 preview release of the Gemini 2.5 Flash model, designed for cost-effective, high-speed inference on less complex tasks. | Keys | Text | API2.0 OpenAI SDK |
| gemini-2.0-flash | A fast, efficient model optimized for quick responses and cost-effective tasks while maintaining strong performance. | Keys | Text | API2.0 OpenAI SDK |
| gemini-2.0-flash-lite | A lighter version of gemini-2.0-flash, prioritizing speed and lower resource usage for simpler tasks. | Keys | Text | API2.0 OpenAI SDK |
| gemini-1.5-flash | A fast and capable model from the 1.5 series, designed for general-purpose tasks with a balance of speed and accuracy. | Keys | Text | API2.0 OpenAI SDK |
| gemini-1.5-flash-8b | A smaller, 8-billion-parameter variant of gemini-1.5-flash, offering faster inference for less complex tasks. | Keys | Text | API2.0 OpenAI SDK |
| gemini-1.5-pro | A high-performance model in the 1.5 series, optimized for complex, multi-step tasks requiring advanced reasoning and precision. | Keys | Text | API2.0 OpenAI SDK |
| gemma-3-27b-it | gemma-3-27b-it is a 27‑billion‑parameter Gemma model fine-tuned for instruction following, delivering strong performance on complex language and coding tasks. | Keys | Text | API2.0 OpenAI SDK |
| gemma-3-4b-it | gemma-3-4b-it is a 4‑billion‑parameter instruction-tuned (-it) model from the Gemma series. It provides strong performance in natural language understanding and task-specific applications, balancing efficiency and capability. | Keys | Text | API2.0 OpenAI SDK |
| gemma-2-2b-it | A 2‑billion‑parameter Gemma‑2 model tuned for instruction following, ideal for lightweight applications with limited resources. | Keys | Text | API2.0 OpenAI SDK |
| gemma-2-9b-it-fast | A 9‑billion‑parameter Gemma‑2 model optimized for fast inference while maintaining strong instruction-following performance. | Keys | Text | API2.0 OpenAI SDK |
Cohere models
Command is a specialized AI model developed by Cohere, designed for understanding and executing commands or instructions effectively. It focuses on task-specific applications, such as interpreting user inputs, automating workflows, and performing actions based on natural language commands.
| Model name | Description | Type | Modality | Endpoints |
|---|---|---|---|---|
| command-a-03-2025 | A March 2025 iteration of the Command model, optimized for advanced text generation, reasoning, and conversational tasks. | Keys | Text | API2.0 OpenAI SDK |
| command-r7b-12-2024 | A December 2024 Command‑R7B model focusing on reasoning and instruction execution with improved robustness. | Keys | Text | API2.0 OpenAI SDK |
| command-r-plus-04-2024 | An April 2024 enhanced variant of Command‑R+, designed for superior reasoning and task-specific applications. | Keys | Text | API2.0 OpenAI SDK |
| command-r-plus | An advanced version of the Command‑R model, emphasizing improved reasoning and execution capabilities. | Keys | Text | API2.0 OpenAI SDK |
| command-r-08-2024 | An August 2024 release of the Command‑R model, optimized for interpreting and executing commands. | Keys | Text | API2.0 OpenAI SDK |
| command-r-03-2024 | A March 2024 release of the Command‑R model, focused on command-driven tasks and automation. | Keys | Text | API2.0 OpenAI SDK |
| command-r | The standard Command‑R model, tailored for understanding and executing instructions effectively. | Keys | Text | API2.0 OpenAI SDK |
| command | The foundational Command model, designed for general-purpose text generation, conversations, and question-answering. | Keys | Text | API2.0 OpenAI SDK |
| command-nightly | A nightly build of the Command model, offering the latest experimental features and updates. | Keys | Text | API2.0 OpenAI SDK |
| command-light | A lightweight version of the Command model, optimized for speed and efficiency in resource-constrained environments. | Keys | Text | API2.0 OpenAI SDK |
| command-light-nightly | A nightly build of the lightweight Command model, providing rapid updates and experimental features for lightweight use cases. | Keys | Text | API2.0 OpenAI SDK |
| c4ai-aya-expanse-32b | A 32‑billion‑parameter Aya Expanse model from Cohere For AI (C4AI), designed for expansive reasoning, complex tasks, and high-performance applications. | Keys | Text | API2.0 OpenAI SDK |
Other models
Additional providers and specialized models available via OnlySq.
| Model name | Description | Type | Modality | Endpoints |
|---|---|---|---|---|
| phi-4 | Phi‑4 is a compact, efficient model known for a strong reasoning-to-size ratio, ideal for low-cost deployments and experimentation. | Keys | Text | API2.0 OpenAI SDK |
| searchgpt | A specialized GPT-based configuration optimized for search and retrieval applications. | Keys | Text | API2.0 OpenAI SDK |
| grok | Grok is an AI model focused on real-time, internet-aware conversations and reasoning, suited for up-to-date question answering. | Keys | Text | API2.0 OpenAI SDK |
| evil | An experimental model configuration intended for testing robustness and safety systems against adversarial or “jailbreak” prompts. | Keys | Text | API2.0 OpenAI SDK |
| mirexa | Mirexa is a custom model focused on dialogue and creative writing tasks. | Keys | Text | API2.0 OpenAI SDK |
| gpt-oss-120b | GPT‑OSS‑120B is an open-weight 120‑billion‑parameter model providing high-end reasoning and generation capabilities. | Keys | Text | API2.0 OpenAI SDK |
| zai-glm-4.6 | ZAI GLM‑4.6 is a high-capacity model in the GLM family, designed for complex reasoning, multilingual support, and coding. | Keys | Text | API2.0 OpenAI SDK |
| zai-glm-4-5 | ZAI GLM‑4.5 is a previous-generation GLM model offering strong general-purpose language and coding performance. | Keys | Text | API2.0 OpenAI SDK |
| zai-glm-4-5-air | A lightweight “Air” variant of ZAI GLM‑4.5, focused on speed and lower resource usage while keeping good quality. | Keys | Text | API2.0 OpenAI SDK |
| kimi-k2 | Kimi‑K2 is a model focused on multilingual chat and reasoning, with strong performance on practical assistant tasks. | Keys | Text | API2.0 OpenAI SDK |
| c4ai-aya-expanse-32b | A 32‑billion‑parameter Aya Expanse model from Cohere For AI (C4AI), designed for large-scale reasoning, multilingual understanding, and complex tasks. | Keys | Text | API2.0 OpenAI SDK |
Image models
Image Models are AI models designed for tasks involving images, such as recognition, segmentation, generation, enhancement, and style transfer.
| Model name | Description | Type | Modality | Endpoints |
|---|---|---|---|---|
| flux | Flux is a 12-billion-parameter rectified flow transformer capable of generating images from text descriptions. | Keys | Images | ImaGen |
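The ImaGen endpoint's request format is not described in this overview, so the following is only a hypothetical sketch: the URL path, payload fields, and response handling are all assumptions meant to show the general shape of a text-to-image call.

```python
import requests

# Hypothetical ImaGen-style request; the path and field names are assumptions,
# not documented OnlySq parameters.
payload = {"model": "flux", "prompt": "a lighthouse at sunset, in watercolor style"}
resp = requests.post("https://api.onlysq.ru/ai/imagen", json=payload, timeout=120)
resp.raise_for_status()

# Assuming the endpoint returns raw image bytes; adapt if it returns JSON or base64.
with open("flux_output.png", "wb") as f:
    f.write(resp.content)
```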
Sound models
Sound models are designed to convert text to speech and handle audio-focused tasks.
| Model name | Description | Type | Modality | Endpoints |
|---|---|---|---|---|
| gtts | gTTS (Google Text-to-Speech) generates natural-sounding speech audio from text in multiple languages. | Provider | Sound | TTS |
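The TTS endpoint's request format is not documented here, but since the row names gTTS, the standalone gtts Python package shows what the underlying provider does (this calls Google Text-to-Speech directly, not OnlySq's endpoint):

```python
from gtts import gTTS

# Generate speech with the gTTS library itself; OnlySq's TTS endpoint may wrap
# this provider, but its own request format is not documented in this overview.
speech = gTTS("Hello from the OnlySq model overview.", lang="en")
speech.save("hello.mp3")
```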