This book describes the important ideas in a variety of fields such as medicine, biology, finance, and marketing in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of colour graphics. It is a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees, and boosting, the first comprehensive treatment of this topic in any book. This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression and path algorithms for the lasso, non-negative matrix factorisation, and spectral clustering. There is also a chapter on methods for ``wide'' data (p bigger than n), including multiple testing and false discovery rates.