The following outline is provided as an overview of and topical guide to machine learning:
Machine learning – subfield of soft computing within computer science that evolved from the study of pattern recognition and computational learning theory in artificial intelligence.[1] In 1959, Arthur Samuel defined machine learning as a "field of study that gives computers the ability to learn without being explicitly programmed".[2] Machine learning is concerned with the study and construction of algorithms that can learn from data and make predictions on it.[3] Such algorithms build a model from a training set of example input observations and use it to make data-driven predictions or decisions, expressed as outputs, rather than following strictly static program instructions.
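As a minimal sketch of the idea described above (a model built from a training set rather than from hand-written rules), the following fits a straight line to example observations by ordinary least squares and then predicts an output for an unseen input. The data values and the names `fit_line` and `predict` are illustrative assumptions, not taken from the cited sources.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training set: inputs with observed outputs near y = 2x + 1.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [1.1, 2.9, 5.2, 6.8, 9.0]

model = fit_line(train_x, train_y)   # "learning" step: parameters come from data
print(predict(model, 5.0))           # prediction for an input not in the training set
```

Nothing in the program encodes the relationship between input and output directly; the slope and intercept are estimated from the examples, which is the distinction the definition draws.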
What type of thing is machine learning?
- An academic discipline
 - A branch of science
- An applied science
- A subfield of computer science
- A branch of artificial intelligence
 - A subfield of soft computing
 - Application of statistics
 
Branches of machine learning
Subfields of machine learning
- Computational learning theory – studying the design and analysis of machine learning algorithms.[4]
 - Grammar induction
 - Meta-learning
 
Cross-disciplinary fields involving machine learning
Applications of machine learning
- Applications of machine learning
 - Bioinformatics
 - Biomedical informatics
 - Computer vision
 - Customer relationship management
 - Data mining
 - Earth sciences
 - Email filtering
 - Inverted pendulum – balance and equilibrium system.
 - Natural language processing (NLP)
 - Pattern recognition
 - Recommendation system
   - Collaborative filtering
   - Content-based filtering
   - Hybrid recommender systems (collaborative and content-based filtering)
 - Search engine
 - Social Engineering
 
Machine learning hardware
Machine learning tools
Machine learning frameworks
Proprietary machine learning frameworks
- Amazon Machine Learning
 - Microsoft Azure Machine Learning Studio
 - DistBelief – replaced by TensorFlow
 
Open source machine learning frameworks
- Apache SINGA
 - Apache MXNet
 - Caffe
 - PyTorch
 - mlpack
 - TensorFlow
 - Torch
 - CNTK
 - Accord.NET
 - JAX
 - MLJ.jl – A machine learning framework for Julia
 
Machine learning libraries
Machine learning algorithms
- Almeida–Pineda recurrent backpropagation
 - ALOPEX
 - Backpropagation
 - Bootstrap aggregating
 - CN2 algorithm
 - Constructing skill trees
 - Dehaene–Changeux model
 - Diffusion map
 - Dominance-based rough set approach
 - Dynamic time warping
 - Error-driven learning
 - Evolutionary multimodal optimization
 - Expectation–maximization algorithm
 - FastICA
 - Forward–backward algorithm
 - GeneRec
 - Genetic Algorithm for Rule Set Production
 - Growing self-organizing map
 - Hyper basis function network
 - IDistance
 - K-nearest neighbors algorithm
 - Kernel methods for vector output
 - Kernel principal component analysis
 - Leabra
 - Linde–Buzo–Gray algorithm
 - Local outlier factor
 - Logic learning machine
 - LogitBoost
 - Manifold alignment
 - Markov chain Monte Carlo (MCMC)
 - Minimum redundancy feature selection
 - Mixture of experts
 - Multiple kernel learning
 - Non-negative matrix factorization
 - Online machine learning
 - Out-of-bag error
 - Prefrontal cortex basal ganglia working memory
 - PVLV
 - Q-learning
 - Quadratic unconstrained binary optimization
 - Query-level feature
 - Quickprop
 - Radial basis function network
 - Randomized weighted majority algorithm
 - Reinforcement learning
 - Repeated incremental pruning to produce error reduction (RIPPER)
 - Rprop
 - Rule-based machine learning
 - Skill chaining
 - Sparse PCA
 - State–action–reward–state–action
 - Stochastic gradient descent
 - Structured kNN
 - T-distributed stochastic neighbor embedding
 - Temporal difference learning
 - Wake-sleep algorithm
 - Weighted majority algorithm (machine learning)
 
Machine learning methods
Instance-based algorithm
- K-nearest neighbors algorithm (KNN)
 - Learning vector quantization (LVQ)
 - Self-organizing map (SOM)
 
Regression analysis
Dimensionality reduction
- Canonical correlation analysis (CCA)
 - Factor analysis
 - Feature extraction
 - Feature selection
 - Independent component analysis (ICA)
 - Linear discriminant analysis (LDA)
 - Multidimensional scaling (MDS)
 - Non-negative matrix factorization (NMF)
 - Partial least squares regression (PLSR)
 - Principal component analysis (PCA)
 - Principal component regression (PCR)
 - Projection pursuit
 - Sammon mapping
 - t-distributed stochastic neighbor embedding (t-SNE)
 
Ensemble learning
- AdaBoost
 - Boosting
 - Bootstrap aggregating (Bagging)
 - Ensemble averaging – process of creating multiple models and combining them to produce a desired output, as opposed to creating just one model. Frequently an ensemble of models performs better than any individual model, because the various errors of the models "average out."
 - Gradient boosted decision tree (GBDT)
 - Gradient boosting machine (GBM)
 - Random Forest
 - Stacked Generalization (blending)
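
The ensemble averaging entry above can be illustrated with a small sketch. It assumes a toy regression problem, trains each ensemble member on a bootstrap resample of the same training set (as in bagging), and combines the members by taking the mean of their predictions. The data, the seed, and the helper names `fit_line` and `mse` are hypothetical choices made for this example.

```python
import random


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def mse(preds, targets):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical noisy training data scattered around the line y = 3x - 2.
train_x = [i / 2 for i in range(20)]
train_y = [3 * x - 2 + rng.gauss(0, 1.5) for x in train_x]

# Noise-free test targets from the same underlying line.
test_x = [0.25 + i for i in range(10)]
test_y = [3 * x - 2 for x in test_x]

# Train each ensemble member on a bootstrap resample of the training set.
models = []
for _ in range(25):
    idx = [rng.randrange(len(train_x)) for _ in train_x]
    models.append(fit_line([train_x[i] for i in idx], [train_y[i] for i in idx]))

member_preds = [[s * x + b for x in test_x] for s, b in models]

# Ensemble averaging: the combined prediction is the mean over all members.
ensemble_preds = [sum(col) / len(col) for col in zip(*member_preds)]

member_errors = [mse(p, test_y) for p in member_preds]
ensemble_error = mse(ensemble_preds, test_y)
print(f"mean member MSE: {sum(member_errors) / len(member_errors):.4f}")
print(f"ensemble MSE:    {ensemble_error:.4f}")
```

For squared error, the error of the averaged prediction can never exceed the mean of the members' errors, because the loss is convex. It is not guaranteed to beat the single best member, which is why the entry says "frequently" rather than "always".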
 
Meta-learning
Reinforcement learning
Supervised learning
- Averaged one-dependence estimators (AODE)
 - Artificial neural network
 - Case-based reasoning
 - Gaussian process regression
 - Gene expression programming
 - Group method of data handling (GMDH)
 - Inductive logic programming
 - Instance-based learning
 - Lazy learning
 - Learning Automata
 - Learning Vector Quantization
 - Logistic Model Tree
 - Minimum message length (decision trees, decision graphs, etc.)
 - Probably approximately correct (PAC) learning
 - Ripple down rules, a knowledge acquisition methodology
 - Symbolic machine learning algorithms
 - Support vector machines
 - Random Forests
 - Ensembles of classifiers
 - Ordinal classification
 - Conditional Random Field
 - ANOVA
 - Quadratic classifiers
 - k-nearest neighbor
 - Boosting
   - SPRINT
 - Bayesian networks
 - Hidden Markov models
 
Bayesian
- Bayesian knowledge base
 - Naive Bayes
 - Gaussian Naive Bayes
 - Multinomial Naive Bayes
 - Averaged One-Dependence Estimators (AODE)
 - Bayesian Belief Network (BBN)
 - Bayesian Network (BN)
 
Decision tree algorithms
- Decision tree
 - Classification and regression tree (CART)
 - Iterative Dichotomiser 3 (ID3)
 - C4.5 algorithm
 - C5.0 algorithm
 - Chi-squared Automatic Interaction Detection (CHAID)
 - Decision stump
 - Conditional decision tree
 - ID3 algorithm
 - Random forest
 - SLIQ
 
Linear classifier
Unsupervised learning
- Expectation-maximization algorithm
 - Vector Quantization
 - Generative topographic map
 - Information bottleneck method
 - Association rule learning algorithms
 
Artificial neural networks
Association rule learning
Hierarchical clustering
Cluster analysis
Anomaly detection
Semi-supervised learning
- Active learning – special case of semi-supervised learning in which a learning algorithm is able to interactively query the user (or some other information source) to obtain the desired outputs at new data points.[5][6]
 - Generative models
 - Low-density separation
 - Graph-based methods
 - Co-training
 - Transduction
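
The interactive querying described in the active learning entry above can be sketched as follows. The example assumes a pool of unlabeled one-dimensional points and uses uncertainty sampling, one common query strategy, in which the learner asks for the label of the point closest to its current decision boundary. The `oracle` function stands in for the user, and `true_threshold`, the pool, and the query budget are hypothetical values chosen for illustration.

```python
def oracle(x, true_threshold=0.62):
    """Stand-in for the user: returns the desired label for a queried point."""
    return 1 if x >= true_threshold else 0


def estimate_threshold(labeled):
    """Place the boundary midway between the largest known 0 and smallest known 1."""
    lo = max(x for x, y in labeled if y == 0)
    hi = min(x for x, y in labeled if y == 1)
    return (lo + hi) / 2


pool = [i / 100 for i in range(1, 100)]              # unlabeled points 0.01 .. 0.99
labeled = [(0.0, oracle(0.0)), (1.0, oracle(1.0))]   # two seed labels

queries = 0
while pool and queries < 8:
    boundary = estimate_threshold(labeled)
    # Uncertainty sampling: query the unlabeled point nearest the boundary,
    # since that is where the current model is least sure of the label.
    x = min(pool, key=lambda p: abs(p - boundary))
    pool.remove(x)
    labeled.append((x, oracle(x)))
    queries += 1

final_boundary = estimate_threshold(labeled)
print(queries, final_boundary)
```

Each query roughly halves the interval that can contain the boundary, so a handful of labels locates it about as well as labeling the whole pool would.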
 
Deep learning
Other machine learning methods and problems
- Anomaly detection
 - Association rules
 - Bias-variance dilemma
 - Classification
 - Clustering
 - Data Pre-processing
 - Empirical risk minimization
 - Feature engineering
 - Feature learning
 - Learning to rank
 - Occam learning
 - Online machine learning
 - PAC learning
 - Regression
 - Reinforcement Learning
 - Semi-supervised learning
 - Statistical learning
 - Structured prediction
 - Unsupervised learning
 - VC theory
 
Machine learning research
History of machine learning
Machine learning projects
Machine learning organizations
Machine learning conferences and workshops
- Artificial Intelligence and Security (AISec) (co-located workshop with CCS)
 - Conference on Neural Information Processing Systems (NeurIPS, formerly NIPS)
 - ECML PKDD
 - International Conference on Machine Learning (ICML)
 - ML4ALL (Machine Learning For All)
 
Machine learning publications
Books on machine learning
- Mathematics for Machine Learning
 - Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow
 - The Hundred-Page Machine Learning Book
 
Machine learning journals
Persons influential in machine learning
- Alberto Broggi
 - Andrei Knyazev
 - Andrew McCallum
 - Andrew Ng
 - Anuraag Jain
 - Armin B. Cremers
 - Ayanna Howard
 - Barney Pell
 - Ben Goertzel
 - Ben Taskar
 - Bernhard Schölkopf
 - Brian D. Ripley
 - Christopher G. Atkeson
 - Corinna Cortes
 - Demis Hassabis
 - Douglas Lenat
 - Eric Xing
 - Ernst Dickmanns
 - Geoffrey Hinton – co-inventor of the backpropagation and contrastive divergence training algorithms
 - Hans-Peter Kriegel
 - Hartmut Neven
 - Heikki Mannila
 - Ian Goodfellow – inventor of generative adversarial networks (GANs)
 - Jacek M. Zurada
 - Jaime Carbonell
 - Jeremy Slovak
 - Jerome H. Friedman
 - John D. Lafferty
 - John Platt – invented SMO and Platt scaling
 - Julie Beth Lovins
 - Jürgen Schmidhuber
 - Karl Steinbuch
 - Katia Sycara
 - Leo Breiman – invented bagging and random forests
 - Lise Getoor
 - Luca Maria Gambardella
 - Léon Bottou
 - Marcus Hutter
 - Mehryar Mohri
 - Michael Collins
 - Michael I. Jordan
 - Michael L. Littman
 - Nando de Freitas
 - Ofer Dekel
 - Oren Etzioni
 - Pedro Domingos
 - Peter Flach
 - Pierre Baldi
 - Pushmeet Kohli
 - Ray Kurzweil
 - Rayid Ghani
 - Ross Quinlan
 - Salvatore J. Stolfo
 - Sebastian Thrun
 - Selmer Bringsjord
 - Sepp Hochreiter
 - Shane Legg
 - Stephen Muggleton
 - Steve Omohundro
 - Tom M. Mitchell
 - Trevor Hastie
 - Vasant Honavar
 - Vladimir Vapnik – co-inventor of the SVM and VC theory
 - Yann LeCun – invented convolutional neural networks
 - Yasuo Matsuyama
 - Yoshua Bengio
 - Zoubin Ghahramani
 
See also
- Outline of artificial intelligence
 - Outline of robotics
 - Accuracy paradox
 - Action model learning
 - Activation function
 - Activity recognition
 - ADALINE
 - Adaptive neuro fuzzy inference system
 - Adaptive resonance theory
 - Additive smoothing
 - Adjusted mutual information
 - AIVA
 - AIXI
 - AlchemyAPI
 - AlexNet
 - Algorithm selection
 - Algorithmic inference
 - Algorithmic learning theory
 - AlphaGo
 - AlphaGo Zero
 - Alternating decision tree
 - Apprenticeship learning
 - Causal Markov condition
 - Competitive learning
 - Concept learning
 - Decision tree learning
 - Differentiable programming
 - Distribution learning theory
 - Eager learning
 - End-to-end reinforcement learning
 - Error tolerance (PAC learning)
 - Explanation-based learning
 - Feature
 - GloVe
 - Hyperparameter
 - Inferential theory of learning
 - Learning automata
 - Learning classifier system
 - Learning rule
 - Learning with errors
 - M-Theory (learning framework)
 - Machine learning control
 - Machine learning in bioinformatics
 - Margin
 - Markov chain geostatistics
 - Markov chain Monte Carlo (MCMC)
 - Markov information source
 - Markov logic network
 - Markov model
 - Markov random field
 - Markovian discrimination
 - Maximum-entropy Markov model
 - Multi-armed bandit
 - Multi-task learning
 - Multilinear subspace learning
 - Multimodal learning
 - Multiple instance learning
 - Never-Ending Language Learning
 - Offline learning
 - Parity learning
 - Population-based incremental learning
 - Predictive learning
 - Preference learning
 - Proactive learning
 - Proximal gradient methods for learning
 - Semantic analysis
 - Similarity learning
 - Sparse dictionary learning
 - Stability (learning theory)
 - Statistical learning theory
 - Statistical relational learning
 - Tanagra
 - Transfer learning
 - Variable-order Markov model
 - Version space learning
 - Waffles
 - Weka
 - Loss function
 - Low-energy adaptive clustering hierarchy
 
Other
- Anne O'Tate
 - Ant colony optimization algorithms
 - Anthony Levandowski
 - Anti-unification (computer science)
 - Apache Flume
 - Apache Giraph
 - Apache Mahout
 - Apache SINGA
 - Apache Spark
 - Apache SystemML
 - Aphelion (software)
 - Arabic Speech Corpus
 - Archetypal analysis
 - Arthur Zimek
 - Artificial ants
 - Artificial bee colony algorithm
 - Artificial development
 - Artificial immune system
 - Astrostatistics
 - Averaged one-dependence estimators
 - Bag-of-words model
 - Balanced clustering
 - Ball tree
 - Base rate
 - Bat algorithm
 - Baum–Welch algorithm
 - Bayesian hierarchical modeling
 - Bayesian interpretation of kernel regularization
 - Bayesian optimization
 - Bayesian structural time series
 - Bees algorithm
 - Behavioral clustering
 - Bernoulli scheme
 - Bias–variance tradeoff
 - Biclustering
 - BigML
 - Binary classification
 - Bing Predicts
 - Bio-inspired computing
 - Biogeography-based optimization
 - Biplot
 - Bondy's theorem
 - Bongard problem
 - Bradley–Terry model
 - BrownBoost
 - Brown clustering
 - Burst error
 - CBCL (MIT)
 - CIML community portal
 - CMA-ES
 - CURE data clustering algorithm
 - Cache language model
 - Calibration (statistics)
 - Canonical correspondence analysis
 - Canopy clustering algorithm
 - Cascading classifiers
 - Category utility
 - CellCognition
 - Cellular evolutionary algorithm
 - Chi-square automatic interaction detection
 - Chromosome (genetic algorithm)
 - Classifier chains
 - Cleverbot
 - Clonal selection algorithm
 - Cluster-weighted modeling
 - Clustering high-dimensional data
 - Clustering illusion
 - CoBoosting
 - Cobweb (clustering)
 - Cognitive computer
 - Cognitive robotics
 - Collostructional analysis
 - Common-method variance
 - Complete-linkage clustering
 - Computer-automated design
 - Concept class
 - Concept drift
 - Conference on Artificial General Intelligence
 - Conference on Knowledge Discovery and Data Mining
 - Confirmatory factor analysis
 - Confusion matrix
 - Congruence coefficient
 - Connect (computer system)
 - Consensus clustering
 - Constrained clustering
 - Constrained conditional model
 - Constructive cooperative coevolution
 - Correlation clustering
 - Correspondence analysis
 - Cortica
 - Coupled pattern learner
 - Cross-entropy method
 - Cross-validation (statistics)
 - Crossover (genetic algorithm)
 - Cuckoo search
 - Cultural algorithm
 - Cultural consensus theory
 - Curse of dimensionality
 - DADiSP
 - DARPA LAGR Program
 - Darkforest
 - Dartmouth workshop
 - DarwinTunes
 - Data Mining Extensions
 - Data exploration
 - Data pre-processing
 - Data stream clustering
 - Dataiku
 - Davies–Bouldin index
 - Decision boundary
 - Decision list
 - Decision tree model
 - Deductive classifier
 - DeepArt
 - DeepDream
 - Deep Web Technologies
 - Defining length
 - Dendrogram
 - Dependability state model
 - Detailed balance
 - Determining the number of clusters in a data set
 - Detrended correspondence analysis
 - Developmental robotics
 - Diffbot
 - Differential evolution
 - Discrete phase-type distribution
 - Discriminative model
 - Dissociated press
 - Distributed R
 - Dlib
 - Document classification
 - Documenting Hate
 - Domain adaptation
 - Doubly stochastic model
 - Dual-phase evolution
 - Dunn index
 - Dynamic Bayesian network
 - Dynamic Markov compression
 - Dynamic topic model
 - Dynamic unobserved effects model
 - EDLUT
 - ELKI
 - Edge recombination operator
 - Effective fitness
 - Elastic map
 - Elastic matching
 - Elbow method (clustering)
 - Emergent (software)
 - Encog
 - Entropy rate
 - Erkki Oja
 - Eurisko
 - European Conference on Artificial Intelligence
 - Evaluation of binary classifiers
 - Evolution strategy
 - Evolution window
 - Evolutionary Algorithm for Landmark Detection
 - Evolutionary algorithm
 - Evolutionary art
 - Evolutionary music
 - Evolutionary programming
 - Evolvability (computer science)
 - Evolved antenna
 - Evolver (software)
 - Evolving classification function
 - Expectation propagation
 - Exploratory factor analysis
 - F1 score
 - FLAME clustering
 - Factor analysis of mixed data
 - Factor graph
 - Factor regression model
 - Factored language model
 - Farthest-first traversal
 - Fast-and-frugal trees
 - Feature Selection Toolbox
 - Feature hashing
 - Feature scaling
 - Feature vector
 - Firefly algorithm
 - First-difference estimator
 - First-order inductive learner
 - Fish School Search
 - Fisher kernel
 - Fitness approximation
 - Fitness function
 - Fitness proportionate selection
 - Fluentd
 - Folding@home
 - Formal concept analysis
 - Forward algorithm
 - Fowlkes–Mallows index
 - Frederick Jelinek
 - Frrole
 - Functional principal component analysis
 - GATTO
 - GLIMMER
 - Gary Bryce Fogel
 - Gaussian adaptation
 - Gaussian process
 - Gaussian process emulator
 - Gene prediction
 - General Architecture for Text Engineering
 - Generalization error
 - Generalized canonical correlation
 - Generalized filtering
 - Generalized iterative scaling
 - Generalized multidimensional scaling
 - Generative adversarial network
 - Generative model
 - Genetic algorithm
 - Genetic algorithm scheduling
 - Genetic algorithms in economics
 - Genetic fuzzy systems
 - Genetic memory (computer science)
 - Genetic operator
 - Genetic programming
 - Genetic representation
 - Geographical cluster
 - Gesture Description Language
 - Geworkbench
 - Glossary of artificial intelligence
 - Glottochronology
 - Golem (ILP)
 - Google matrix
 - Grafting (decision trees)
 - Gramian matrix
 - Grammatical evolution
 - Granular computing
 - GraphLab
 - Graph kernel
 - Gremlin (programming language)
 - Growth function
 - HUMANT (HUManoid ANT) algorithm
 - Hammersley–Clifford theorem
 - Harmony search
 - Hebbian theory
 - Hidden Markov random field
 - Hidden semi-Markov model
 - Hierarchical hidden Markov model
 - Higher-order factor analysis
 - Highway network
 - Hinge loss
 - Holland's schema theorem
 - Hopkins statistic
 - Hoshen–Kopelman algorithm
 - Huber loss
 - IRCF360
 - Ian Goodfellow
 - Ilastik
 - Ilya Sutskever
 - Immunocomputing
 - Imperialist competitive algorithm
 - Inauthentic text
 - Incremental decision tree
 - Induction of regular languages
 - Inductive bias
 - Inductive probability
 - Inductive programming
 - Influence diagram
 - Information Harvesting
 - Information gain in decision trees
 - Information gain ratio
 - Inheritance (genetic algorithm)
 - Instance selection
 - Intel RealSense
 - Interacting particle system
 - Interactive machine translation
 - International Joint Conference on Artificial Intelligence
 - International Meeting on Computational Intelligence Methods for Bioinformatics and Biostatistics
 - International Semantic Web Conference
 - Iris flower data set
 - Island algorithm
 - Isotropic position
 - Item response theory
 - Iterative Viterbi decoding
 - JOONE
 - Jabberwacky
 - Jaccard index
 - Jackknife variance estimates for random forest
 - Java Grammatical Evolution
 - Joseph Nechvatal
 - Jubatus
 - Julia (programming language)
 - Junction tree algorithm
 - K-SVD
 - K-means++
 - K-medians clustering
 - K-medoids
 - KNIME
 - KXEN Inc.
 - K q-flats
 - Kaggle
 - Kalman filter
 - Katz's back-off model
 - Kernel adaptive filter
 - Kernel density estimation
 - Kernel eigenvoice
 - Kernel embedding of distributions
 - Kernel method
 - Kernel perceptron
 - Kernel random forest
 - Kinect
 - Klaus-Robert Müller
 - Kneser–Ney smoothing
 - Knowledge Vault
 - Knowledge integration
 - LIBSVM
 - LPBoost
 - Labeled data
 - LanguageWare
 - Language identification in the limit
 - Language model
 - Large margin nearest neighbor
 - Latent Dirichlet allocation
 - Latent class model
 - Latent semantic analysis
 - Latent variable
 - Latent variable model
 - Lattice Miner
 - Layered hidden Markov model
 - Learnable function class
 - Least squares support vector machine
 - Leave-one-out error
 - Leslie P. Kaelbling
 - Linear genetic programming
 - Linear predictor function
 - Linear separability
 - Lingyun Gu
 - Linkurious
 - Lior Ron (business executive)
 - List of genetic algorithm applications
 - List of metaphor-based metaheuristics
 - List of text mining software
 - Local case-control sampling
 - Local independence
 - Local tangent space alignment
 - Locality-sensitive hashing
 - Log-linear model
 - Logistic model tree
 - Low-rank approximation
 - Low-rank matrix approximations
 - MATLAB
 - MIMIC (immunology)
 - MXNet
 - Mallet (software project)
 - Manifold regularization
 - Margin-infused relaxed algorithm
 - Margin classifier
 - Mark V. Shaney
 - Massive Online Analysis
 - Matrix regularization
 - Matthews correlation coefficient
 - Mean shift
 - Mean squared error
 - Mean squared prediction error
 - Measurement invariance
 - Medoid
 - MeeMix
 - Melomics
 - Memetic algorithm
 - Meta-optimization
 - Mexican International Conference on Artificial Intelligence
 - Michael Kearns (computer scientist)
 - MinHash
 - Mixture model
 - Mlpy
 - Models of DNA evolution
 - Moral graph
 - Mountain car problem
 - Movidius
 - Multi-armed bandit
 - Multi-label classification
 - Multi expression programming
 - Multiclass classification
 - Multidimensional analysis
 - Multifactor dimensionality reduction
 - Multilinear principal component analysis
 - Multiple correspondence analysis
 - Multiple discriminant analysis
 - Multiple factor analysis
 - Multiple sequence alignment
 - Multiplicative weight update method
 - Multispectral pattern recognition
 - Mutation (genetic algorithm)
 - MysteryVibe
 - N-gram
 - NOMINATE (scaling method)
 - Native-language identification
 - Natural Language Toolkit
 - Natural evolution strategy
 - Nearest-neighbor chain algorithm
 - Nearest centroid classifier
 - Nearest neighbor search
 - Neighbor joining
 - Nest Labs
 - NetMiner
 - NetOwl
 - Neural Designer
 - Neural Engineering Object
 - Neural modeling fields
 - Neural network software
 - NeuroSolutions
 - Neuroevolution
 - Neuroph
 - Niki.ai
 - Noisy channel model
 - Noisy text analytics
 - Nonlinear dimensionality reduction
 - Novelty detection
 - Nuisance variable
 - One-class classification
 - ONNX
 - OpenNLP
 - Optimal discriminant analysis
 - Oracle Data Mining
 - Orange (software)
 - Ordination (statistics)
 - Overfitting
 - PROGOL
 - PSIPRED
 - Pachinko allocation
 - PageRank
 - Parallel metaheuristic
 - Parity benchmark
 - Part-of-speech tagging
 - Particle swarm optimization
 - Path dependence
 - Pattern language (formal languages)
 - Peltarion Synapse
 - Perplexity
 - Persian Speech Corpus
 - Picas (app)
 - Pietro Perona
 - Pipeline Pilot
 - Piranha (software)
 - Pitman–Yor process
 - Plate notation
 - Polynomial kernel
 - Pop music automation
 - Population process
 - Portable Format for Analytics
 - Predictive Model Markup Language
 - Predictive state representation
 - Preference regression
 - Premature convergence
 - Principal geodesic analysis
 - Prior knowledge for pattern recognition
 - Prisma (app)
 - Probabilistic Action Cores
 - Probabilistic context-free grammar
 - Probabilistic latent semantic analysis
 - Probabilistic soft logic
 - Probability matching
 - Probit model
 - Product of experts
 - Programming with Big Data in R
 - Proper generalized decomposition
 - Pruning (decision trees)
 - Pushpak Bhattacharyya
 - Q methodology
 - Qloo
 - Quality control and genetic algorithms
 - Quantum Artificial Intelligence Lab
 - Queueing theory
 - Quick, Draw!
 - R (programming language)
 - Rada Mihalcea
 - Rademacher complexity
 - Radial basis function kernel
 - Rand index
 - Random indexing
 - Random projection
 - Random subspace method
 - Ranking SVM
 - RapidMiner
 - Rattle GUI
 - Raymond Cattell
 - Reasoning system
 - Regularization perspectives on support vector machines
 - Relational data mining
 - Relationship square
 - Relevance vector machine
 - Relief (feature selection)
 - Renjin
 - Repertory grid
 - Representer theorem
 - Reward-based selection
 - Richard Zemel
 - Right to explanation
 - RoboEarth
 - Robust principal component analysis
 - RuleML Symposium
 - Rule induction
 - Rules extraction system family
 - SAS (software)
 - SNNS
 - SPSS Modeler
 - SUBCLU
 - Sample complexity
 - Sample exclusion dimension
 - Santa Fe Trail problem
 - Savi Technology
 - Schema (genetic algorithms)
 - Search-based software engineering
 - Selection (genetic algorithm)
 - Self-Service Semantic Suite
 - Semantic folding
 - Semantic mapping (statistics)
 - Semidefinite embedding
 - Sense Networks
 - Sensorium Project
 - Sequence labeling
 - Sequential minimal optimization
 - Shattered set
 - Shogun (toolbox)
 - Silhouette (clustering)
 - SimHash
 - SimRank
 - Similarity measure
 - Simple matching coefficient
 - Simultaneous localization and mapping
 - Sinkov statistic
 - Sliced inverse regression
 - Snakes and Ladders
 - Soft independent modelling of class analogies
 - Soft output Viterbi algorithm
 - Solomonoff's theory of inductive inference
 - SolveIT Software
 - Spectral clustering
 - Spike-and-slab variable selection
 - Statistical machine translation
 - Statistical parsing
 - Statistical semantics
 - Stefano Soatto
 - Stephen Wolfram
 - Stochastic block model
 - Stochastic cellular automaton
 - Stochastic diffusion search
 - Stochastic grammar
 - Stochastic matrix
 - Stochastic universal sampling
 - Stress majorization
 - String kernel
 - Structural equation modeling
 - Structural risk minimization
 - Structured sparsity regularization
 - Structured support vector machine
 - Subclass reachability
 - Sufficient dimension reduction
 - Sukhotin's algorithm
 - Sum of absolute differences
 - Sum of absolute transformed differences
 - Swarm intelligence
 - Switching Kalman filter
 - Symbolic regression
 - Synchronous context-free grammar
 - Syntactic pattern recognition
 - TD-Gammon
 - TIMIT
 - Teaching dimension
 - Teuvo Kohonen
 - Textual case-based reasoning
 - Theory of conjoint measurement
 - Thomas G. Dietterich
 - Thurstonian model
 - Topic model
 - Tournament selection
 - Training, test, and validation sets
 - Transiogram
 - Trax Image Recognition
 - Trigram tagger
 - Truncation selection
 - Tucker decomposition
 - UIMA
 - UPGMA
 - Ugly duckling theorem
 - Uncertain data
 - Uniform convergence in probability
 - Unique negative dimension
 - Universal portfolio algorithm
 - User behavior analytics
 - VC dimension
 - VIGRA
 - Validation set
 - Vapnik–Chervonenkis theory
 - Variable-order Bayesian network
 - Variable kernel density estimation
 - Variable rules analysis
 - Variational message passing
 - Varimax rotation
 - Vector quantization
 - Vicarious (company)
 - Viterbi algorithm
 - Vowpal Wabbit
 - WACA clustering algorithm
 - WPGMA
 - Ward's method
 - Weasel program
 - Whitening transformation
 - Winnow (algorithm)
 - Win–stay, lose–switch
 - Witness set
 - Wolfram Language
 - Wolfram Mathematica
 - Writer invariant
 - XGBoost
 - Yooreeka
 - Zeroth (software)
 
Further reading
- Trevor Hastie, Robert Tibshirani and Jerome H. Friedman (2001). The Elements of Statistical Learning, Springer. ISBN 0-387-95284-5.
 - Pedro Domingos (September 2015), The Master Algorithm, Basic Books, ISBN 978-0-465-06570-7
 - Mehryar Mohri, Afshin Rostamizadeh, Ameet Talwalkar (2012). Foundations of Machine Learning, The MIT Press. ISBN 978-0-262-01825-8.
 - Ian H. Witten and Eibe Frank (2011). Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann, 664 pp. ISBN 978-0-12-374856-0.
 - David J. C. MacKay. Information Theory, Inference, and Learning Algorithms Cambridge: Cambridge University Press, 2003. ISBN 0-521-64298-1
 - Richard O. Duda, Peter E. Hart, David G. Stork (2001) Pattern classification (2nd edition), Wiley, New York, ISBN 0-471-05669-3.
 - Christopher Bishop (1995). Neural Networks for Pattern Recognition, Oxford University Press. ISBN 0-19-853864-2.
 - Vladimir Vapnik (1998). Statistical Learning Theory. Wiley-Interscience, ISBN 0-471-03003-1.
 - Ray Solomonoff, An Inductive Inference Machine, IRE Convention Record, Section on Information Theory, Part 2, pp. 56–62, 1957.
 - Ray Solomonoff, "An Inductive Inference Machine" A privately circulated report from the 1956 Dartmouth Summer Research Conference on AI.
 
References
- [1] http://www.britannica.com/EBchecked/topic/1116194/machine-learning
 - [2] Phil Simon (March 18, 2013). Too Big to Ignore: The Business Case for Big Data. Wiley. p. 89. ISBN 978-1-118-63817-0.
 - [3] Ron Kohavi; Foster Provost (1998). "Glossary of terms". Machine Learning. 30: 271–274. doi:10.1023/A:1007411609915.
 - [4] "ACL - Association for Computational Learning".
 - [5] Settles, Burr (2010), "Active Learning Literature Survey" (PDF), Computer Sciences Technical Report 1648. University of Wisconsin–Madison, retrieved 2014-11-18.
 - [6] Rubens, Neil; Elahi, Mehdi; Sugiyama, Masashi; Kaplan, Dain (2016). "Active Learning in Recommender Systems". In Ricci, Francesco; Rokach, Lior; Shapira, Bracha (eds.). Recommender Systems Handbook (2 ed.). Springer US. doi:10.1007/978-1-4899-7637-6. hdl:11311/1006123. ISBN 978-1-4899-7637-6. S2CID 11569603.
 
External links
- Data Science: Data to Insights from MIT (machine learning)
 - Popular online course by Andrew Ng at Coursera, which uses GNU Octave. It is a free version of Stanford University's course taught by Ng, whose materials are also available for free at see.stanford.edu/Course/CS229.
 - mloss is an academic database of open-source machine learning software.