featherless · Chat

Qrwkv 72B

featherless/qwerky-72b

33K Context Window

Online

Qrwkv-72B is a linear-attention RWKV variant of the Qwen 2.5 72B model, designed to significantly reduce computational cost at scale. Using linear attention, it achieves substantial inference speedups (>1000x) while retaining competitive accuracy on common benchmarks such as ARC, HellaSwag, LAMBADA, and MMLU. It inherits its knowledge and language support (approximately 30 languages) from Qwen 2.5, which makes it suitable for efficient inference in large-context applications.

Capabilities

Text Generation, Code Generation, Analysis & Reasoning

Technical Specs

Input Modality

Text

Output Modality

Text

Arch

—

Pricing

Pay per use, no monthly fees

Input Token: < ¥0.001 / 1K tokens

Output Token: < ¥0.001 / 1K tokens
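Because both rates are quoted only as ceilings (under ¥0.001 per 1K tokens), the worst-case cost of a request can be bounded with simple arithmetic. The sketch below assumes that ceiling applies equally to input and output tokens. The constant and function names are illustrative, not part of any official SDK, and actual charges should come in below this bound.

```python
from decimal import Decimal

# Assumption: the listed ceiling "< ¥0.001 / 1K tokens" for both directions.
PRICE_CEILING_PER_1K = Decimal("0.001")  # CNY per 1,000 tokens


def max_cost_cny(input_tokens: int, output_tokens: int) -> Decimal:
    """Return an upper bound on the cost of one request, in CNY."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total_tokens = input_tokens + output_tokens
    return PRICE_CEILING_PER_1K * total_tokens / 1000


if __name__ == "__main__":
    # Example: 30,000 prompt tokens plus 2,000 completion tokens.
    print(max_cost_cny(30_000, 2_000))
```

Decimal is used instead of float so that very small per-token amounts do not pick up binary rounding noise when summed over many requests.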

Quick Start

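from openai import OpenAI

# Point the OpenAI SDK at the UnionToken endpoint.
client = OpenAI(
    base_url="https://api.uniontoken.ai/v1",
    api_key="YOUR_UNIONTOKEN_API_KEY",  # replace with your own key
)

# Send a single-turn chat request to Qrwkv 72B.
response = client.chat.completions.create(
    model="featherless/qwerky-72b",
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
)

# Print the text of the first returned choice.
print(response.choices[0].message.content)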
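Where the openai package is not available, the same request can be sketched with the Python standard library alone. This assumes the endpoint follows the usual OpenAI-compatible convention of POST {base_url}/chat/completions with a Bearer token, which this page does not state explicitly. build_chat_request and send are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "https://api.uniontoken.ai/v1"  # taken from the Quick Start
MODEL_ID = "featherless/qwerky-72b"


def build_chat_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a single-turn chat completion request.

    Assumption: the server accepts the OpenAI-style route
    POST {BASE_URL}/chat/completions with a Bearer token.
    """
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the text of the first choice."""
    with urllib.request.urlopen(request, timeout=60) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Calling send(build_chat_request("Hello!", "YOUR_UNIONTOKEN_API_KEY")) performs the actual network call. Keeping construction separate from sending makes the payload easy to inspect or log before any tokens are spent.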
