Text-to-Text (LLM)

Chat completions and text generation with large language models

Generate human-like text responses using state-of-the-art language models through Cxmpute's distributed AI network.

Overview

Cxmpute's Text-to-Text service provides access to powerful large language models (LLMs) for chat completions, text generation, and conversational AI. Our OpenAI-compatible API makes it easy to integrate with existing applications.

Key Features

  • OpenAI Compatibility: Drop-in replacement for OpenAI's chat completions API
  • Multiple Models: Access to dozens of popular LLMs
  • Streaming Support: Real-time response generation
  • Global Network: Low-latency access through distributed providers
  • Advanced Features: Tool calling, JSON mode, and custom formats

Quick Start

Basic Chat Completion

Shell
curl -X POST https://cxmpute.cloud/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "X-User-Id: YOUR_USER_ID" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.1:8b",
    "messages": [
      {"role": "user", "content": "Explain quantum computing in simple terms."}
    ]
  }'

Python Example

Python
import requests

url = "https://cxmpute.cloud/api/v1/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "X-User-Id": "YOUR_USER_ID",
    "Content-Type": "application/json"
}

data = {
    "model": "llama3.1:8b",
    "messages": [
        {"role": "user", "content": "Write a short story about AI and humanity."}
    ],
    "temperature": 0.7,
    "max_tokens": 500
}

response = requests.post(url, headers=headers, json=data)
result = response.json()

print(result["choices"][0]["message"]["content"])
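
The example above indexes straight into result["choices"][0], which raises an unhelpful KeyError or IndexError if the request failed. The following is a minimal sketch of more defensive extraction. It assumes the response follows the OpenAI chat completion shape; the "error" key and the helper name extract_text are assumptions for illustration, not confirmed parts of the Cxmpute API.

```python
def extract_text(result: dict) -> str:
    """Return the assistant text from an OpenAI-style chat completion dict."""
    # Assumed error envelope: OpenAI-compatible servers often return {"error": ...}
    if "error" in result:
        raise RuntimeError(f"API error: {result['error']}")

    choices = result.get("choices") or []
    if not choices:
        raise ValueError("Response contained no choices")

    # "content" can be missing or null (for example, in a tool-call reply)
    message = choices[0].get("message") or {}
    return message.get("content") or ""


# Offline demonstration with a hand-written response dict
sample = {"choices": [{"message": {"role": "assistant", "content": "Hello!"}}]}
print(extract_text(sample))
```

In the request example, this would replace the final print line with print(extract_text(response.json())).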

Using OpenAI Library

Python
import openai

client = openai.OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://cxmpute.cloud/api/v1",
    default_headers={"X-User-Id": "YOUR_USER_ID"}
)

response = client.chat.completions.create(
    model="llama3.1:8b",
    messages=[
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": "What are the benefits of renewable energy?"}
    ]
)

print(response.choices[0].message.content)

API Reference

Endpoint

HTTP
POST https://cxmpute.cloud/api/v1/chat/completions

Request Parameters

Parameter       | Type    | Required | Description
model           | string  | Yes      | Model name (see Available Models)
messages        | array   | Yes      | Array of message objects
stream          | boolean | No       | Enable streaming responses (default: false)
temperature     | number  | No       | Sampling temperature, 0-2 (default: 0.7)
max_tokens      | number  | No       | Maximum tokens to generate
top_p           | number  | No       | Nucleus sampling parameter (0-1)
response_format | object  | No       | Response format specification
tools           | array   | No       | Available tools for function calling
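
The parameters above can be assembled and range-checked on the client before a request is sent. This is a sketch only: build_chat_request is a hypothetical helper, the ranges come from the table, and the {"type": "json_object"} value for response_format is assumed from the OpenAI convention rather than confirmed for Cxmpute.

```python
from typing import Optional


def build_chat_request(
    model: str,
    messages: list,
    *,
    stream: bool = False,
    temperature: float = 0.7,
    max_tokens: Optional[int] = None,
    top_p: Optional[float] = None,
    response_format: Optional[dict] = None,
    tools: Optional[list] = None,
) -> dict:
    """Build a JSON-serializable request body, validating documented ranges."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages must contain at least one message")
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")

    payload = {
        "model": model,
        "messages": messages,
        "stream": stream,
        "temperature": temperature,
    }
    # Optional parameters are sent only when the caller sets them
    optional = {
        "max_tokens": max_tokens,
        "top_p": top_p,
        "response_format": response_format,
        "tools": tools,
    }
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


# Example: request JSON output (response_format shape assumed from OpenAI)
body = build_chat_request(
    "llama3.1:8b",
    [{"role": "user", "content": "List three colors as a JSON array."}],
    response_format={"type": "json_object"},
    max_tokens=100,
)
print(body)
```

The returned dict can be passed as the json= argument of requests.post, as in the Quick Start example.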

Available Models

Model         | Size | Description            | Best For
llama3.1:8b   | 8B   | Fast, general-purpose  | Most applications
llama3.1:70b  | 70B  | High-quality responses | Complex reasoning
codellama:13b | 13B  | Code generation        | Programming tasks
mixtral:8x7b  | 8x7B | Mixture of experts     | Specialized tasks
qwen2.5:14b   | 14B  | Balanced performance   | General use

See our full model catalog for more options.
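
One way to apply the "Best For" column in application code is a small lookup with a default. The task labels and the pick_model helper below are hypothetical; only the model names come from the table.

```python
# Task labels are illustrative; model names are taken from the Available Models table
MODEL_BY_TASK = {
    "general": "llama3.1:8b",
    "reasoning": "llama3.1:70b",
    "code": "codellama:13b",
}
DEFAULT_MODEL = "llama3.1:8b"  # Listed as suitable for most applications


def pick_model(task: str) -> str:
    """Return a model name for a task label, falling back to the default."""
    return MODEL_BY_TASK.get(task.strip().lower(), DEFAULT_MODEL)


print(pick_model("code"))
```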

Streaming Responses

Enable Streaming

Python
import requests
import json

def stream_chat_completion(messages, model="llama3.1:8b"):
    url = "https://cxmpute.cloud/api/v1/chat/completions"
    headers = {
        "Authorization": "Bearer YOUR_API_KEY",
        "X-User-Id": "YOUR_USER_ID",
        "Content-Type": "application/json"
    }
    
    data = {
        "model": model,
        "messages": messages,
        "stream": True
    }
    
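    response = requests.post(url, headers=headers, json=data, stream=True)
    response.raise_for_status()  # Fail early on HTTP errors instead of parsing an error body
    
    for line in response.iter_lines():
        if line.startswith(b"data: "):
            chunk = line[6:]  # Remove "data: " prefix
            if chunk == b"[DONE]":
                break
            
            try:
                # Use a separate name so the request payload `data` is not shadowed
                event = json.loads(chunk)
                if "choices" in event and event["choices"]:
                    delta = event["choices"][0].get("delta", {})
                    # Skip chunks whose content is missing or null
                    if delta.get("content"):
                        yield delta["content"]
            except json.JSONDecodeError:
                continue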

# Usage
messages = [{"role": "user", "content": "Write a poem about the ocean."}]
for chunk in stream_chat_completion(messages):
    print(chunk, end="", flush=True)
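
The parsing step in the streaming loop can be separated from the network call so that it can be exercised offline. The sketch below assumes the stream uses OpenAI-style server-sent events ("data: {...}" lines ending with "data: [DONE]"), as the example above does; iter_content is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator


def iter_content(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith(b"data:"):
            continue  # Skip keep-alive blanks and non-data fields
        payload = line[len(b"data:"):].strip()
        if payload == b"[DONE]":
            return  # End-of-stream sentinel
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue  # Ignore malformed chunks
        if not isinstance(event, dict):
            continue
        choices = event.get("choices") or []
        if choices:
            text = (choices[0].get("delta") or {}).get("content")
            if text:
                yield text


# Offline demonstration with hand-written stream lines
sample_lines = [
    b'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    b"",
    b'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    b'data: {"choices": [{"delta": {"content": "lo"}}]}',
    b"data: [DONE]",
]
print("".join(iter_content(sample_lines)))
```

With a live request, the same function could be called as iter_content(response.iter_lines()).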

Use Cases

1. Chatbot Development

Build conversational AI applications:

Python
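import requests

# Same authentication headers as in the Quick Start example
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "X-User-Id": "YOUR_USER_ID",
    "Content-Type": "application/json"
}

class ChatBot:
    def __init__(self, system_prompt="You are a helpful assistant."):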
        self.conversation_history = [
            {"role": "system", "content": system_prompt}
        ]
    
    def chat(self, user_message):
        self.conversation_history.append({
            "role": "user", 
            "content": user_message
        })
        
        response = requests.post(
            "https://cxmpute.cloud/api/v1/chat/completions",
            headers=headers,
            json={
                "model": "llama3.1:8b",
                "messages": self.conversation_history,
                "temperature": 0.7
            }
        )
        
        ai_response = response.json()["choices"][0]["message"]["content"]
        
        self.conversation_history.append({
            "role": "assistant",
            "content": ai_response
        })
        
        return ai_response

# Usage
bot = ChatBot("You are a friendly coding assistant.")
response = bot.chat("How do I reverse a string in Python?")
print(response)
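
The ChatBot above appends to conversation_history on every turn, so a long conversation will eventually exceed the model's context window. One simple mitigation, sketched here under assumptions, is to keep the system prompt plus only the most recent messages. The trim_history helper and its max_messages default are hypothetical, and counting messages is only a rough stand-in for counting tokens.

```python
def trim_history(history: list, max_messages: int = 20) -> list:
    """Keep the first system message plus the last `max_messages` other messages."""
    if max_messages < 1:
        raise ValueError("max_messages must be at least 1")
    system = [m for m in history if m.get("role") == "system"][:1]
    rest = [m for m in history if m.get("role") != "system"]
    return system + rest[-max_messages:]


# Offline demonstration
history = [{"role": "system", "content": "You are a helpful assistant."}]
for i in range(3):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})

for message in trim_history(history, max_messages=2):
    print(message["role"], "-", message["content"])
```

In the ChatBot class, the request could send trim_history(self.conversation_history) as "messages" instead of the full list.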

2. Content Generation

Generate blog posts, articles, and marketing content:

Python
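import requests

# Same authentication headers as in the Quick Start example
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "X-User-Id": "YOUR_USER_ID",
    "Content-Type": "application/json"
}

def generate_blog_post(topic, tone="professional", length="medium"):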
    length_map = {
        "short": "Write a concise 300-word blog post",
        "medium": "Write a comprehensive 800-word blog post", 
        "long": "Write a detailed 1500-word blog post"
    }
    
    prompt = f"""
    {length_map[length]} about {topic}.
    Tone: {tone}
    Include:
    - Engaging introduction
    - Clear main points with examples
    - Actionable takeaways
    - Compelling conclusion
    """
    
    response = requests.post(
        "https://cxmpute.cloud/api/v1/chat/completions",
        headers=headers,
        json={
            "model": "llama3.1:70b",
            "messages": [{"role": "user", "content": prompt}],
            "temperature": 0.8,
            "max_tokens": 2000
        }
    )
    
    return response.json()["choices"][0]["message"]["content"]

# Usage
article = generate_blog_post("sustainable energy solutions", "informative", "medium")
print(article)

3. Code Generation

Generate and explain code:

Python
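import requests

# Same authentication headers as in the Quick Start example
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "X-User-Id": "YOUR_USER_ID",
    "Content-Type": "application/json"
}

def code_assistant(task, language="python"):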
    prompt = f"""
    Task: {task}
    Language: {language}
    
    Please provide:
    1. Clean, well-commented code
    2. Explanation of how it works
    3. Example usage
    4. Best practices
    """
    
    response = requests.post(
        "https://cxmpute.cloud/api/v1/chat/completions",
        headers=headers,
        json={
            "model": "codellama:13b",
            "messages": [{"role": "user", "content": prompt}],
            "temperature": 0.3
        }
    )
    
    return response.json()["choices"][0]["message"]["content"]

# Usage
code_help = code_assistant("Create a REST API endpoint for user authentication using Flask")
print(code_help)
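
A reply to the prompt above mixes code with explanation. If only the code is needed, it can be pulled out of the reply text. This sketch assumes the model wraps code in Markdown-style triple-backtick fences, which is common but not guaranteed; extract_code_blocks is a hypothetical helper.

```python
import re

# Built programmatically so this snippet can itself sit inside a fenced block
FENCE = "`" * 3
_BLOCK = re.compile(
    re.escape(FENCE) + r"([\w+-]*)[ \t]*\n(.*?)" + re.escape(FENCE),
    re.DOTALL,
)


def extract_code_blocks(reply: str) -> list:
    """Return (language, code) pairs for each fenced block in a model reply."""
    return [(lang, code.strip()) for lang, code in _BLOCK.findall(reply)]


# Offline demonstration with a hand-written reply
reply = (
    "Here is the code:\n"
    + FENCE + "python\nprint('hi')\n" + FENCE
    + "\nIt prints a greeting."
)
for lang, code in extract_code_blocks(reply):
    print(f"[{lang}]")
    print(code)
```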

Best Practices

1. Temperature Control

Adjust response creativity:

Python
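import requests

# Same authentication headers as in the Quick Start example
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "X-User-Id": "YOUR_USER_ID",
    "Content-Type": "application/json"
}

def generate_with_creativity(prompt, creativity_level="balanced"):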
    temperature_map = {
        "factual": 0.1,      # Very consistent, factual responses
        "balanced": 0.7,     # Good balance of accuracy and creativity
        "creative": 1.2,     # More creative and diverse responses
    }
    
    response = requests.post(
        "https://cxmpute.cloud/api/v1/chat/completions",
        headers=headers,
        json={
            "model": "llama3.1:8b",
            "messages": [{"role": "user", "content": prompt}],
            "temperature": temperature_map[creativity_level]
        }
    )
    
    return response.json()["choices"][0]["message"]["content"]

2. Error Handling

Implement robust error handling:

Python
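import random
import time

import requests

# Same authentication headers as in the Quick Start example
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "X-User-Id": "YOUR_USER_ID",
    "Content-Type": "application/json"
}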

def resilient_chat_completion(messages, model="llama3.1:8b", max_retries=3):
    for attempt in range(max_retries):
        try:
            response = requests.post(
                "https://cxmpute.cloud/api/v1/chat/completions",
                headers=headers,
                json={
                    "model": model,
                    "messages": messages,
                    "temperature": 0.7
                },
                timeout=60
            )
            
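            if response.status_code == 200:
                return response.json()
            elif response.status_code == 503:
                # Service temporarily unavailable: back off exponentially with jitter
                wait_time = (2 ** attempt) + random.uniform(0, 1)
                time.sleep(wait_time)
                continue
            else:
                # Any other error status raises HTTPError, handled below
                response.raise_for_status()
                
        except requests.exceptions.RequestException:
            # On the final attempt, re-raise with the original traceback
            if attempt == max_retries - 1:
                raise
            time.sleep(1)
    
    raise RuntimeError("Failed to get response after all retries")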
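
The wait time used for 503 responses above grows as 2 ** attempt plus up to one second of random jitter. The sketch below isolates that calculation and adds an upper bound so that a large max_retries does not produce very long sleeps. The backoff_delay helper, the cap parameter, and the injectable rng argument are assumptions added for illustration.

```python
import random
from typing import Callable


def backoff_delay(
    attempt: int,
    base: float = 1.0,
    cap: float = 30.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    Exponential growth (base * 2**attempt), limited to `cap`, plus jitter
    in [0, 1) so that many clients do not retry at the same instant.
    """
    return min(cap, base * (2 ** attempt)) + rng()


# Show the schedule without jitter by passing a fixed rng
for attempt in range(6):
    print(attempt, backoff_delay(attempt, rng=lambda: 0.0))
```

In the retry loop, time.sleep(backoff_delay(attempt)) could replace the inline wait_time calculation.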

Pricing

During our testnet phase, all services are free for all users. Pricing for the mainnet launch has not yet been determined.

Join our Discord community to stay updated on pricing announcements, give feedback, and connect with other developers building with Cxmpute.

Ready to build with AI? Start with our OpenAI-compatible API and create intelligent applications powered by the world's best language models!