Using Language Models
Once you've deployed a language model on RunGen.AI, you can interact with it via an OpenAI-compatible client or our Inference API (REST). This guide walks you through both methods.
1. Prerequisites
Before getting started, ensure you have:
- A deployed Language Model on RunGen.AI
- Your API key (See how to get your API key)
- Your app ID (found in the dashboard under your deployed app)
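Rather than hard-coding credentials, you can load them from the environment. A minimal sketch; the variable names RUNGEN_API_KEY and RUNGEN_APP_ID are illustrative choices, not names required by RunGen.AI:

```python
import os

# Keep credentials out of source code by reading them from environment
# variables (the names below are examples, pick whatever fits your setup).
api_key = os.environ.get("RUNGEN_API_KEY", "your-rungen-api-key")
app_id = os.environ.get("RUNGEN_APP_ID", "your-app-id")

# The base URL for the OpenAI-compatible endpoint embeds the app ID.
base_url = f"https://api.rungen.ai/app/{app_id}/llm/v1"
```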
2. Using the OpenAI Client
The easiest way to get started is through our OpenAI-compatible API, which lets you use any OpenAI-compatible client library.
This section includes code examples for the official OpenAI SDK for Python, but other SDKs (such as LangChain) and other languages work as well.
Setting Up the OpenAI Client
from openai import OpenAI

client = OpenAI(
    api_key="your-rungen-api-key",
    base_url="https://api.rungen.ai/app/{app_id}/llm/v1"  # replace {app_id} with your app ID
)
Chat Completions (Streaming & Non-Streaming)
The SDK supports both streaming completions (tokens arrive in real time) and non-streaming completions (the full response is returned once generation finishes). Choose whichever fits your use case.
Note: Some models don't support chat completions. Text completions (next section) will work regardless.
Streaming Response
response_stream = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a joke."}
    ],
    temperature=0.7,
    max_tokens=50,
    stream=True
)

for chunk in response_stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
Non-Streaming Response
response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a joke."}
    ],
    temperature=0.7,
    max_tokens=50,
    stream=False
)

# Prints the full response object; the generated text itself is in
# response.choices[0].message.content
print(response)
Text Completions (Non-Chat Models)
Some models do not support chat-based prompts and instead use direct text completion. This method follows OpenAI’s completions.create() function.
Streaming Text Completion
response_stream = client.completions.create(
    model="deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
    prompt="Explain quantum physics in simple terms.",
    temperature=0.7,
    max_tokens=100,
    stream=True
)

for chunk in response_stream:
    print(chunk.choices[0].text or "", end="", flush=True)
Non-Streaming Text Completion
response = client.completions.create(
    model="deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
    prompt="Explain quantum physics in simple terms.",
    temperature=0.7,
    max_tokens=100,
    stream=False
)

# Prints the full response object; the generated text itself is in
# response.choices[0].text
print(response)
3. Using the REST API
If you don't want to use an OpenAI-compatible client, you can prompt your deployed model by sending requests to the Run Job endpoint.
Endpoint
POST https://api.rungen.ai/app/{app_id}/run_async
Headers
| Header | Description |
|---|---|
| x-api-key | Your API key for authentication |
| Content-Type | application/json |
Request Body Example
{
  "input": {
    "messages": [
      {"role": "system", "content": "You are an AI assistant."},
      {"role": "user", "content": "How does AI work?"}
    ],
    "temperature": 0.7,
    "max_tokens": 100
  }
}
Example Request Using cURL
curl --location 'https://api.rungen.ai/app/{app_id}/run_async' \
  --header 'x-api-key: your_api_key' \
  --header 'Content-Type: application/json' \
  --data '{
    "input": {
      "messages": [
        {"role": "system", "content": "You are an AI assistant."},
        {"role": "user", "content": "How does AI work?"}
      ],
      "temperature": 0.7,
      "max_tokens": 100
    }
  }'
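The same request can be sent from Python using only the standard library. A sketch, assuming the request body shown above; build_run_request is a hypothetical helper written here for illustration, not part of any SDK:

```python
import json
import urllib.request

def build_run_request(app_id: str, api_key: str, payload: dict) -> urllib.request.Request:
    # Assemble the POST request for the Run Job endpoint.
    url = f"https://api.rungen.ai/app/{app_id}/run_async"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )

payload = {
    "input": {
        "messages": [
            {"role": "system", "content": "You are an AI assistant."},
            {"role": "user", "content": "How does AI work?"},
        ],
        "temperature": 0.7,
        "max_tokens": 100,
    }
}

request = build_run_request("your-app-id", "your-api-key", payload)
# To actually send it (requires valid credentials):
# with urllib.request.urlopen(request) as resp:
#     print(json.loads(resp.read()))
```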
Handling Responses
If successful, the request returns a job_id that you can use to retrieve results.
Checking Job Status
GET https://api.rungen.ai/app/{app_id}/job/{job_id}
Example Response:
{
  "data": {
    "status": "COMPLETED",
    "result": {
      "choices": [
        {"message": {"role": "assistant", "content": "AI is a field of computer science that enables machines to learn and make decisions."}}
      ]
    }
  }
}
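In Python, polling the job endpoint and unwrapping this response might look like the sketch below. The COMPLETED status comes from the example above; any other status values, and the polling cadence, are assumptions, so check the API documentation for the full set:

```python
import json
import time  # used by the example polling loop in the comment below
import urllib.request
from typing import Optional

def fetch_job(app_id: str, api_key: str, job_id: str) -> dict:
    # GET the job status endpoint and decode the JSON body.
    url = f"https://api.rungen.ai/app/{app_id}/job/{job_id}"
    request = urllib.request.Request(url, headers={"x-api-key": api_key})
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))

def extract_answer(job: dict) -> Optional[str]:
    # Return the assistant message once the job reports COMPLETED, else None.
    data = job.get("data", {})
    if data.get("status") != "COMPLETED":
        return None
    return data["result"]["choices"][0]["message"]["content"]

# Example polling loop (commented out: requires valid credentials):
# while (answer := extract_answer(fetch_job(app_id, api_key, job_id))) is None:
#     time.sleep(2)  # assumed cadence; tune for your workload
# print(answer)
```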
4. Next Steps
- Experiment with your model in the Playground.
- Integrate your model into your application.
- Explore advanced configurations in the API documentation.
For any questions, reach out to our support team.