Configuration API Reference

free_transformer.config.ModelConfig(vocab_size=32000, hidden_dim=4096, num_layers=32, num_heads=32, num_kv_heads=None, ffn_hidden_dim=11008, max_seq_len=2048, latent_dim=16, split_layer=None, use_rmsnorm=True, use_rope=True, use_swiglu=True, dropout=0.0, attention_dropout=0.0) dataclass

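Configuration for the model architecture: vocabulary size, hidden and feed-forward widths, layer and head counts, maximum sequence length, latent dimension, split layer, and the RMSNorm, RoPE, SwiGLU, and dropout switches.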
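A few derived quantities follow from the defaults and are worth checking before changing them. The sketch below uses a stand-in dataclass, `ModelConfigSketch`, that copies a subset of the documented fields; it is not the library class. It assumes the per-head dimension is `hidden_dim // num_heads` and that `num_kv_heads=None` falls back to `num_heads`. Both are common conventions, but neither is stated on this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelConfigSketch:
    """Hypothetical stand-in mirroring some documented ModelConfig defaults."""

    hidden_dim: int = 4096
    num_layers: int = 32
    num_heads: int = 32
    num_kv_heads: Optional[int] = None
    latent_dim: int = 16


def head_dim(cfg: ModelConfigSketch) -> int:
    # Assumption: attention heads split hidden_dim evenly.
    if cfg.hidden_dim % cfg.num_heads != 0:
        raise ValueError("hidden_dim must be divisible by num_heads")
    return cfg.hidden_dim // cfg.num_heads


def effective_kv_heads(cfg: ModelConfigSketch) -> int:
    # Assumption: None means "use as many key/value heads as query heads".
    return cfg.num_heads if cfg.num_kv_heads is None else cfg.num_kv_heads


cfg = ModelConfigSketch()
print(head_dim(cfg))            # 128
print(effective_kv_heads(cfg))  # 32
```

With the real package installed, the same checks would presumably run against `free_transformer.config.ModelConfig()` instead of the stand-in.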

free_transformer.config.TrainingConfig(learning_rate=0.0003, weight_decay=0.1, beta1=0.9, beta2=0.95, grad_clip=1.0, warmup_steps=2000, max_steps=100000, beta_kl=1.0, kappa_free_bits=0.3466, batch_size=64, gradient_accumulation_steps=1, use_fsdp=False, use_deepspeed=False, fsdp_config=dict(), deepspeed_config=dict(), save_every=5000, eval_every=1000, checkpoint_dir='./checkpoints', log_every=100, wandb_project=None) dataclass

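Configuration for training: optimizer hyperparameters, warmup and step budget, the KL-related settings (`beta_kl`, `kappa_free_bits`), batching, FSDP and DeepSpeed options, checkpointing, evaluation, and logging.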
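Two of the defaults are easier to read once derived values are computed. The sketch below copies the documented defaults into plain variables rather than importing the library. It assumes the effective batch size is `batch_size * gradient_accumulation_steps`, which is the usual convention. The observation that `kappa_free_bits=0.3466` is close to half a bit expressed in nats, ln(2)/2, is a reading of the number and not something this page states.

```python
import math

# Documented TrainingConfig defaults, copied by hand.
batch_size = 64
gradient_accumulation_steps = 1
warmup_steps = 2000
max_steps = 100000
kappa_free_bits = 0.3466

# Assumption: samples per optimizer step = batch size x accumulation steps.
effective_batch_size = batch_size * gradient_accumulation_steps
print(effective_batch_size)  # 64

# Fraction of the step budget spent in warmup.
warmup_fraction = warmup_steps / max_steps
print(warmup_fraction)  # 0.02

# Half a bit, converted to nats.
half_bit_in_nats = math.log(2) / 2
print(round(half_bit_in_nats, 4))  # 0.3466
```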

from_yaml(path) classmethod

Load configuration from YAML file.

Source code in src/free_transformer/config.py
@classmethod
def from_yaml(cls, path: str):
    """Load configuration from YAML file."""
    with open(path, "r") as f:
        config_dict = yaml.safe_load(f)
    return cls(**config_dict)
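
Because `from_yaml` passes the parsed mapping straight to `cls(**config_dict)`, a key that is not a field of the dataclass raises `TypeError`, and an empty file (for which `yaml.safe_load` returns `None`) fails as well. The following is a minimal sketch of one way to pre-check a parsed mapping before constructing the config. `TrainingConfigSketch` and `split_known_keys` are hypothetical names used only for illustration and are not part of the library.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class TrainingConfigSketch:
    """Hypothetical stand-in with a few of the documented fields."""

    learning_rate: float = 0.0003
    batch_size: int = 64
    wandb_project: Optional[str] = None


def split_known_keys(cls, config_dict):
    """Split a parsed YAML mapping into (known fields, unknown key names)."""
    config_dict = config_dict or {}  # treat an empty file (None) as no overrides
    names = {f.name for f in fields(cls)}
    known = {k: v for k, v in config_dict.items() if k in names}
    unknown = sorted(k for k in config_dict if k not in names)
    return known, unknown


# Stands in for the result of yaml.safe_load; note the misspelled key.
parsed = {"learning_rate": 0.0001, "batch_sise": 32}

known, unknown = split_known_keys(TrainingConfigSketch, parsed)
cfg = TrainingConfigSketch(**known)
print(unknown)         # ['batch_sise']
print(cfg.batch_size)  # 64
```

Reporting the unknown keys rather than silently dropping them keeps typos visible; here the misspelled key would otherwise leave `batch_size` at its default.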

to_yaml(path)

Save configuration to YAML file.

Source code in src/free_transformer/config.py
def to_yaml(self, path: str):
    """Save configuration to YAML file."""
    with open(path, "w") as f:
        yaml.dump(self.__dict__, f)
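
`to_yaml` hands `self.__dict__` to `yaml.dump`, so the file gets one top-level key per field. The sketch below shows what that mapping looks like for a stand-in dataclass, `TrainingConfigSketch`, which is hypothetical and copies only three of the documented fields. It also compares the mapping with `dataclasses.asdict`, which returns the same content as an independent deep copy. The `"sharding"` key is made up for the example.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class TrainingConfigSketch:
    """Hypothetical stand-in with a few of the documented fields."""

    learning_rate: float = 0.0003
    checkpoint_dir: str = "./checkpoints"
    fsdp_config: dict = field(default_factory=dict)


cfg = TrainingConfigSketch(fsdp_config={"sharding": "full"})  # made-up key

# The mapping that to_yaml would serialize: field name -> current value.
payload = cfg.__dict__
print(sorted(payload))  # ['checkpoint_dir', 'fsdp_config', 'learning_rate']

# asdict produces equal content, but nested dicts are copied, not shared.
snapshot = asdict(cfg)
print(snapshot == payload)                                # True
print(snapshot["fsdp_config"] is payload["fsdp_config"])  # False
```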