Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.

Pages

Posts

Modern ML Practice in 2026

3 minute read

Published:

This is a short snapshot of practical machine learning work as of 2026. The biggest shift is that many projects now start from pretrained foundation models, but the hard work is still data, evaluation, latency, cost, reliability, and deployment.

Regularization

3 minute read

Published:

Regularization is any training choice that helps a model generalize instead of only memorizing the training set. It can be a penalty in the loss, noise during training, constraints on parameters, better data augmentation, or a validation-based stopping rule.
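To make the "penalty in the loss" case concrete, here is a minimal sketch of an L2 penalty added to a mean-squared-error loss. The function names (`mse`, `l2_penalty`, `regularized_loss`) and the strength parameter `lam` are illustrative assumptions, not taken from the post.

```python
def mse(preds, targets):
    """Mean squared error over paired predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)


def l2_penalty(weights, lam):
    """L2 penalty: lam times the sum of squared weights (hypothetical helper)."""
    return lam * sum(w * w for w in weights)


def regularized_loss(preds, targets, weights, lam):
    """Data-fit term plus the penalty; lam trades one off against the other."""
    return mse(preds, targets) + l2_penalty(weights, lam)


# Toy values chosen only for illustration.
loss = regularized_loss([1.0, 2.0], [1.0, 4.0], weights=[1.0, -2.0], lam=0.5)
```

With `lam=0` the loss reduces to the plain data-fit term; larger values push the weights toward zero at the expense of training fit.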

Embeddings

3 minute read

Published:

Machine learning models do not understand raw text directly. They need text to be converted into numeric vectors. An embedding is a learned vector representation for a token, word, sentence, document, image, or other object.
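At its simplest, a token embedding is a lookup table from token IDs to vectors. The sketch below assumes a tiny hypothetical vocabulary and uses random placeholder vectors; in a real model the table entries would be learned during training.

```python
import random

# Hypothetical three-token vocabulary mapping each token to a row index.
vocab = {"cat": 0, "dog": 1, "car": 2}
dim = 4

# Placeholder vectors: random here, learned from data in a real model.
rng = random.Random(0)
table = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in vocab]


def embed(token):
    """Return the vector stored for a token (raises KeyError if unknown)."""
    return table[vocab[token]]


vector = embed("dog")
```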

Python in Detail

3 minute read

Published:

This note reviews Python details that matter in day-to-day ML engineering: the data model, dictionaries, memory management, and concurrency choices.

Optimizers

3 minute read

Published:

Optimizers update model parameters using gradients. The optimizer matters, but it is only one part of the recipe: initialization, normalization, batch size, learning-rate schedule, warmup, gradient clipping, weight decay, and data quality often matter just as much.
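As a minimal illustration of "update parameters using gradients", the sketch below applies plain gradient descent to a one-parameter toy objective. The function `sgd_step`, the objective f(w) = (w - 3)^2, and the learning rate are assumptions chosen for the example, and none of the other recipe ingredients (schedules, clipping, weight decay) are modeled.

```python
def sgd_step(params, grads, lr):
    """One gradient-descent update: move each parameter against its gradient."""
    return [p - lr * g for p, g in zip(params, grads)]


def grad_f(w):
    """Gradient of the toy objective f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)


params = [0.0]
for _ in range(100):
    params = sgd_step(params, [grad_f(params[0])], lr=0.1)
# params[0] is now very close to the minimizer w = 3.
```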

ML: Normalization

2 minute read

Published:

Normalization makes optimization easier by controlling the scale and distribution of activations, features, weights, or gradients. The right normalization depends on the architecture and batch regime.

Ensemble Methods

2 minute read

Published:

Ensembles combine multiple models to improve generalization. The main idea is to reduce variance, bias, or both by combining the predictions of several learners, which are often individually weak.

Activation Functions

1 minute read

Published:

Activation functions introduce nonlinearity. Without them, a stack of linear layers collapses into a single linear transformation, no matter how deep the network is.
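The collapse can be checked numerically. The sketch below uses two small hand-picked weight matrices (`W1`, `W2`, illustrative values only, biases omitted) and pure-Python helpers to show that two linear layers applied in sequence match one layer whose matrix is their product, and that putting a ReLU between them breaks the match.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matvec(m, v):
    """Multiply a matrix by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def relu(v):
    """Elementwise ReLU."""
    return [max(0.0, x) for x in v]


W1 = [[1.0, -2.0], [0.5, 1.0]]
W2 = [[2.0, 1.0], [-1.0, 3.0]]
x = [1.0, 2.0]

# Two linear layers with no activation in between...
stacked = matvec(W2, matvec(W1, x))
# ...match a single linear layer whose weights are the product of W2 and W1.
collapsed = matvec(matmul(W2, W1), x)

# With a ReLU between the layers, the output differs for this input.
with_relu = matvec(W2, relu(matvec(W1, x)))
```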

competitions