Get Started with MOSS-RLHF

PPO-based Reinforcement Learning for Large Language Models

Getting Started

The MOSS-RLHF team maintains comprehensive docs that cover installation, configuration, and common patterns.

MOSS-RLHF offers a free tier: sign up to get started without any payment.

Our full tool profile covers MOSS-RLHF's strengths, weaknesses, pricing, and how it compares to alternatives.

Teams improving the quality of their pre-trained language models with RLHF techniques

Academic researchers studying reinforcement learning in NLP contexts

Developers who need a flexible and customizable tool for training large language models