Kaggle: Feature Engineering

Posted at 2 August 2025 | 7 min read

Consider “apparent temperature” measures like the heat index and the wind chill. These quantities estimate the temperature humans perceive from air temperature, humidity, and wind speed, all of which we can measure directly. You could think of an apparent temperature as the result of a kind of feature engineering: an attempt to make the observed data more relevant to what we actually care about, namely how it feels outside!
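
To make the idea concrete, here is a minimal sketch that derives a wind chill feature from two directly measured columns. It assumes the US National Weather Service wind chill formula (temperature in °F, wind speed in mph, defined for temperatures at or below 50 °F and winds above 3 mph). The function name `wind_chill_f` and the sample observations are made up for illustration.

```python
def wind_chill_f(temp_f: float, wind_mph: float) -> float:
    """Apparent temperature in °F from air temperature and wind speed.

    Assumes the NWS wind chill formula. Outside its stated range
    (temp > 50 °F or wind <= 3 mph) we fall back to the air temperature.
    """
    if temp_f > 50.0 or wind_mph <= 3.0:
        return temp_f
    v = wind_mph ** 0.16
    return 35.74 + 0.6215 * temp_f - 35.75 * v + 0.4275 * temp_f * v


# Hypothetical raw observations: only directly measurable quantities.
observations = [
    {"temp_f": 30.0, "wind_mph": 10.0},
    {"temp_f": 60.0, "wind_mph": 15.0},
]

# Feature engineering step: add a column closer to "how it feels outside".
for row in observations:
    row["wind_chill_f"] = wind_chill_f(row["temp_f"], row["wind_mph"])
```

The new column carries no information that was not already in the raw columns. It just combines them into a form that is closer to the thing we care about.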

Technical note: What we’re calling uncertainty is measured using a quantity from information theory known as “entropy”. The entropy of a variable means roughly: “how many yes-or-no questions you would need to describe an occurrence of that variable, on average.” The more questions you have to ask, the more uncertain you must be about the variable. Mutual information is how many questions you expect the feature to answer about the target.
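
The “yes-or-no questions” picture can be checked on a toy example. The sketch below estimates entropy in bits from raw counts and uses the identity I(X; Y) = H(X) + H(Y) - H(X, Y) for mutual information. It is only a plug-in estimate for small discrete data, not a general-purpose estimator, and the helper names and sample values are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy in bits, estimated from the observed frequencies."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(feature, target):
    """I(X; Y) = H(X) + H(Y) - H(X, Y), in bits."""
    joint = list(zip(feature, target))
    return entropy(feature) + entropy(target) - entropy(joint)


# Made-up data: the target is a fair 50/50 split, so it takes one
# yes-or-no question on average to pin it down.
target = ["rain", "rain", "dry", "dry"]
informative = ["cloudy", "cloudy", "clear", "clear"]  # determines the target
uninformative = ["mon", "tue", "mon", "tue"]          # independent of the target

h_target = entropy(target)
mi_informative = mutual_information(informative, target)
mi_uninformative = mutual_information(uninformative, target)
```

Here the informative feature answers the whole question about the target, so its mutual information equals the target's entropy, while the uninformative feature answers none of it.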

Creating Features

Tips on Discovering New Features

Tips on Creating Features
It’s good to keep in mind your model’s own strengths and weaknesses when creating features. Here are some guidelines:

Clustering with K-means


Notes:


Questions:

  1. What is a “relationship” here? Which relationship is meant, and what kind is it?