Introduction
Every machine learning engineer has had the same experience: you write a model in Python, the shapes do not match, and you do not find out until 45 minutes into a training run. The error message says RuntimeError: mat1 and mat2 shapes cannot be multiplied (128x512 and 256x1024). You stare at it. You trace backward through six function calls. You fix it, run again, and hit a different shape mismatch three layers deeper.
This is the problem that motivated Synthr. Not Python's syntax. Not its speed. Its fundamental inability to reason about tensor shapes at compile time. Synthr is a programming language designed from the ground up for machine learning, with a type system that catches shape errors before any code runs, automatic differentiation as a first-class primitive, and a compiler that generates GPU-native code.

Why Existing Languages Fall Short
The critique of Python for ML is well-worn, but most discussions focus on the wrong problems. Speed is solvable with compilation. The GIL is solvable with multiprocessing. The real problems are structural.
A torch.Tensor tells you nothing about its rank, dimensions, or dtype. Shape mismatches are caught at runtime, which means they are caught late and expensively. NumPy and PyTorch have made heroic efforts with runtime checks, but the information simply is not available at the type level.

A language designed for ML should make tensors, automatic differentiation, and GPU execution first-class citizens of the type system and compiler, not library features grafted onto a general-purpose language.
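To make the contrast concrete, here is a minimal PyTorch sketch of my own (not taken from Synthr's materials): the annotations carry no shape or dtype information, so the bad inner dimension only surfaces when the matmul actually runs.

# PyTorch (illustrative): the signature says nothing about shapes,
# so the wrong inner dimension is discovered only at runtime.
import torch

def linear(x: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
    # Nothing here requires x to be (batch, 512) or w to be (512, 1024).
    return x @ w

x = torch.randn(128, 512)
w = torch.randn(256, 1024)  # wrong inner dimension: 512 vs 256
y = linear(x, w)            # RuntimeError: mat1 and mat2 shapes cannot be multiplied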
Language Design Goals
Synthr was designed around five non-negotiable goals. Every syntax decision, type system feature, and compiler optimization traces back to one of these.
Shapes are types: Tensor[Float32, 128, 784] is a different type from Tensor[Float32, 128, 512], and shape mismatches are compile-time errors. Differentiation is compilation: the grad keyword is a language primitive, not a library function, so the compiler sees the full computation graph and can optimize the backward pass at compile time.

The Type System
The type system is the heart of Synthr, and the part that required the most design iteration. The key innovation is dependent types for tensor shapes.
In a standard type system, types are fixed at compile time: int, float, String. In a dependent type system, types can depend on values. This lets you express constraints like “this tensor has the same batch dimension as that tensor” directly in the type signature.
// Synthr: A linear layer with shape-checked types
fn linear[B: Nat, In: Nat, Out: Nat](
    x: Tensor[Float32, B, In],
    w: Tensor[Float32, In, Out],
    b: Tensor[Float32, Out]
) -> Tensor[Float32, B, Out]:
    return x @ w + b

The type parameters B, In, and Out are natural number type variables. When you call linear, the compiler unifies these variables with the actual tensor shapes and verifies that all constraints are satisfied. If you pass a weight matrix with the wrong dimensions, the error appears at compile time with a clear message about which dimension does not match.
This extends to more complex patterns. Here is a multi-head attention implementation:
// Synthr: Multi-head attention with compile-time shape checking
fn multi_head_attention[B: Nat, Seq: Nat, D: Nat, H: Nat](
    q: Tensor[Float32, B, Seq, D],
    k: Tensor[Float32, B, Seq, D],
    v: Tensor[Float32, B, Seq, D],
    num_heads: Static[H],
) -> Tensor[Float32, B, Seq, D]
where D % H == 0:
    let head_dim = D / H
    let q_heads = reshape[B, Seq, H, head_dim](q)
    let k_heads = reshape[B, Seq, H, head_dim](k)
    let v_heads = reshape[B, Seq, H, head_dim](v)
    let scores = einsum("bshd,bthd->bsht", q_heads, k_heads)
    let weights = softmax(scores / sqrt(head_dim), axis=-1)
    let output = einsum("bsht,bthd->bshd", weights, v_heads)
    return reshape[B, Seq, D](output)

The where D % H == 0 clause is a type-level constraint. The compiler proves at compile time that the model dimension is evenly divisible by the number of heads. If it is not, the program does not compile. In PyTorch, this would be a runtime assertion that might not trigger until you change a hyperparameter three months later.
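For comparison, here is a rough PyTorch counterpart (my sketch, with a hypothetical MultiHeadAttention class, not Synthr output): the divisibility constraint is an assert that only fires when the module is constructed with a bad configuration.

# PyTorch (illustrative): the divisibility constraint is a runtime assertion.
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    def __init__(self, d_model: int, num_heads: int):
        super().__init__()
        # Fires only when this constructor runs, e.g. after a hyperparameter change.
        assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
        self.head_dim = d_model // num_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)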
A single shape mismatch caught at compile time typically saves 20-60 minutes of debugging. Over a research project with hundreds of experiments, this compounds into days of saved effort.
Automatic Differentiation as a Primitive
In Synthr, grad is a keyword, not a function call. The compiler performs source-to-source transformation to generate the backward pass at compile time.
// Synthr: Automatic differentiation
fn mse_loss(pred: Tensor[Float32, B, D], target: Tensor[Float32, B, D]) -> Scalar[Float32]:
    return mean((pred - target) ** 2)

fn train_step(model: &mut Linear, x: Tensor[Float32, B, In], y: Tensor[Float32, B, Out]):
    let loss, grads = grad(mse_loss, wrt=model.params)(model(x), y)
    model.params -= 0.01 * grads

The grad keyword tells the compiler to differentiate mse_loss with respect to model.params. Because the compiler sees the full computation graph statically, it can perform optimizations that a runtime tape cannot.
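For contrast, roughly the same train step in PyTorch (again an illustrative sketch of mine): autograd records the computation graph at runtime during the forward pass and builds the backward pass when backward() is called.

# PyTorch (illustrative): the backward graph is recorded by autograd at runtime.
import torch

def mse_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    return ((pred - target) ** 2).mean()

def train_step(model: torch.nn.Linear, x: torch.Tensor, y: torch.Tensor) -> None:
    loss = mse_loss(model(x), y)
    loss.backward()  # tape recorded during the forward pass, differentiated here
    with torch.no_grad():
        for p in model.parameters():
            p -= 0.01 * p.grad
            p.grad = None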
Compiler Architecture
Synthr compiles through a series of stages, from source to executable GPU code.
// Compilation pipeline (pseudocode)
source.sn -> [Parser] -> Untyped AST
-> [Type Checker + Shape Inference] -> Typed AST
-> [Lowering] -> Synthr IR (graph-based)
-> [Autodiff Transform] -> Synthr IR with backward graph
-> [Fusion + Memory Planning] -> Optimized Synthr IR
-> [LLVM Codegen] -> LLVM IR / PTX
-> [LLVM Backend] -> Native binary + GPU kernels

The LLVM backend was the pragmatic choice. Writing a production-quality code generator from scratch would have taken years. LLVM gives Synthr access to decades of optimization passes for free. The tradeoff is compile time. A full model compilation takes 3-8 seconds, which is acceptable for training scripts but noticeable in a REPL. The JIT mitigates this by caching compiled functions across sessions.

Comparison to Existing Approaches
Synthr is not the first attempt to improve ML tooling. It is worth positioning it against the alternatives.
| Tool | Shape checking | Autodiff | GPU native | Tradeoff |
|---|---|---|---|---|
| PyTorch | Runtime only | Runtime tape | Via C++ dispatch | Maximum flexibility, minimum static guarantees |
| JAX | Runtime only | Tracing-based | Via XLA | Functional purity enables optimization, but no shape types |
| Julia | Parametric types (partial) | Library (Zygote) | Via CUDA.jl | Multiple dispatch is powerful but not shape-specialized |
| Mojo | Planned | Via MAX | Native | Systems-level performance, but ML semantics are secondary |
| Synthr | Compile-time (dependent types) | Compiler primitive | Native (LLVM/PTX) | Full static guarantees, but smaller ecosystem |
The honest weakness of Synthr is ecosystem maturity. PyTorch has thousands of pretrained models, community libraries, and production tooling. Synthr has none of that. The bet is that the safety and performance guarantees will justify the ecosystem cost for teams where GPU hours are expensive and debugging shape errors is a recurring time sink.
What I Learned About Language Design
Designing a programming language taught me things that building applications never could.
An error message like expected Tensor[Float32, 128, 512] but got Tensor[Float32, 128, 256] in argument 2 of linear() saves ten minutes of debugging. A generic type error saves nothing.

A programming language is an argument about how people should think about a problem domain. Synthr argues that tensor shapes are types, differentiation is compilation, and GPU execution is not special. Whether that argument is right is something only adoption can answer.
Conclusion
Synthr is not finished. The compiler works. The type system catches real bugs. The generated code runs on GPUs. But the ecosystem is nascent, the standard library is thin, and the language needs years of use before the rough edges are smoothed.
What I am most confident about is the core design bet: that a language which treats tensor shapes as types and automatic differentiation as a compiler feature will produce better ML code than one that treats both as runtime concerns. The evidence so far, in compile-time bug catches, in generated code performance, and in the clarity of Synthr programs, supports that bet.
Building a programming language is the most humbling project I have undertaken. Every design decision has consequences that ripple through the entire system. Every shortcut in the compiler becomes a limitation that users hit months later. It has made me a better engineer in every other domain, because it forced me to think about systems at every level of abstraction simultaneously.
Synthr is an ongoing research project. If you are interested in programming language design, type systems, or ML compilers, I would love to hear from you.