Feature Scaling and Encoding

Why do you need feature scaling?

Simple: you don't want some features dominating your model just because their physical units happen to produce larger numbers. "Large" and "small" don't mean much when the values depend on the unit you picked. So you rescale each input feature. Standardization subtracts the mean and divides by the standard deviation, which gives the feature an average of 0 and a standard deviation of 1; for roughly bell-shaped data, nearly all values then fall within about +/- 3 sigma. Normalization (min-max scaling) instead maps the values onto a fixed range, typically 0 to 1.
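Here is a minimal sketch of both rescalings in plain Python. The function names and the height/income numbers are made up for illustration, and the standard deviation used is the population one (an assumption; libraries differ on this choice).

```python
from statistics import mean, pstdev

def standardize(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature has no spread; map it to all zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def min_max_normalize(values):
    """Rescale values linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature has no range; map it to all zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical features measured in very different units.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
incomes = [20000.0, 35000.0, 50000.0, 65000.0, 80000.0]

scaled_heights = standardize(heights_cm)
scaled_incomes = standardize(incomes)
normalized_heights = min_max_normalize(heights_cm)

print(scaled_heights)
print(scaled_incomes)
print(normalized_heights)
```

Before scaling, the income values are hundreds of times larger than the heights. After standardization both features sit on the same scale, so neither dominates purely because of its units.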

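What about encoding? That takes care of non-numerical values: names, class names, etc. In the simplest form you just assign numerical labels (1, 2, 3, etc.). Keep in mind that such labels imply an ordering (3 > 1) that the categories themselves may not have.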
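The label assignment described above can be sketched like this. The function name and the color data are hypothetical, and labels are handed out in order of first appearance, which is just one possible convention.

```python
def label_encode(categories):
    """Map each distinct category to an integer label, starting at 1.

    Labels are assigned in order of first appearance.
    Returns the encoded list and the category-to-label mapping.
    """
    mapping = {}
    for category in categories:
        if category not in mapping:
            mapping[category] = len(mapping) + 1
    encoded = [mapping[category] for category in categories]
    return encoded, mapping

# Hypothetical categorical feature.
colors = ["red", "green", "blue", "green", "red"]
encoded_colors, color_mapping = label_encode(colors)

print(encoded_colors)
print(color_mapping)
```

Keeping the mapping around lets you encode new data the same way later and translate labels back into names.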


