featherless
Qrwkv 72B
featherless/qwerky-72b
Context Window: 33K
Online
Qrwkv-72B is a linear-attention RWKV variant of the Qwen 2.5 72B model, designed to reduce computational cost at scale. Using linear attention, it is reported to achieve substantial inference speedups (>1000x) while retaining competitive accuracy on common benchmarks such as ARC, HellaSwag, Lambada, and MMLU. It inherits its knowledge and language coverage from Qwen 2.5, supporting approximately 30 languages, and is suited to efficient inference in large-context applications.
Capabilities
Text Generation, Code Generation, Analysis & Reasoning
Technical Specs
Input Modality
Text
Output Modality
Text
Arch
—
Pricing
Pay per use, no monthly fees
Input Token: < ¥0.001/1K tokens
Output Token: < ¥0.001/1K tokens
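The listed rates are upper bounds ("< ¥0.001 per 1K tokens"), so only a ceiling on the cost of a request can be derived from them. A minimal sketch of that arithmetic follows; `max_cost_cny` is a hypothetical helper, not part of any SDK, and the token counts in the usage note are made-up inputs.

```python
from decimal import Decimal

# Upper-bound rates taken from the pricing lines above, in CNY per 1K tokens.
INPUT_RATE_PER_1K = Decimal("0.001")
OUTPUT_RATE_PER_1K = Decimal("0.001")


def max_cost_cny(input_tokens: int, output_tokens: int) -> Decimal:
    """Ceiling on the charge; the real price is listed as strictly below this."""
    total_per_1k = (Decimal(input_tokens) * INPUT_RATE_PER_1K
                    + Decimal(output_tokens) * OUTPUT_RATE_PER_1K)
    return total_per_1k / 1000
```

For example, `max_cost_cny(30_000, 2_000)` gives a ceiling of ¥0.032 for a request with 30,000 input tokens and 2,000 output tokens.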
Quick Start
from openai import OpenAI

# Point the OpenAI-compatible client at the UnionToken endpoint.
client = OpenAI(
    base_url="https://api.uniontoken.ai/v1",
    api_key="YOUR_UNIONTOKEN_API_KEY",
)

# Send a single-turn chat request to the model.
response = client.chat.completions.create(
    model="featherless/qwerky-72b",
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
)
print(response.choices[0].message.content)
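Because the listed context window is 33K tokens, a long conversation has to be trimmed before it is sent. The sketch below shows one way to do that, keeping the most recent messages that fit. The helpers `approx_tokens`, `trim_history`, and `ask` are hypothetical. Treating "33K" as exactly 33,000 tokens, the 1,000-token reply reserve, and the 4-characters-per-token estimate are all assumptions for illustration; the model's real tokenizer would give exact counts.

```python
# Hypothetical helpers: keep the most recent messages that fit a token budget.
CONTEXT_WINDOW = 33_000   # assumed reading of the "33K" listed on this page
REPLY_RESERVE = 1_000     # assumed head-room left for the model's answer
CHARS_PER_TOKEN = 4       # rough heuristic, not the model's real tokenizer


def approx_tokens(text: str) -> int:
    """Estimate the token count from character length (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def trim_history(messages, budget=CONTEXT_WINDOW - REPLY_RESERVE):
    """Return the longest suffix of `messages` whose estimated size fits `budget`."""
    kept, used = [], 0
    for msg in reversed(messages):      # walk from newest to oldest
        cost = approx_tokens(msg["content"])
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))         # restore chronological order


def ask(client, messages):
    """Send the trimmed history with the same call used in the Quick Start."""
    response = client.chat.completions.create(
        model="featherless/qwerky-72b",
        messages=trim_history(messages),
    )
    return response.choices[0].message.content
```

With the `client` from the Quick Start, `ask(client, history)` would send only the tail of `history` that fits the estimated budget. Dropping the oldest messages first is a design choice here; a system prompt, if used, would need to be pinned separately.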