Data Mining - Bayesian Classification
Bayesian classification is based on Bayes' Theorem. Bayesian classifiers are statistical classifiers: they can predict class membership probabilities, such as the probability that a given tuple belongs to a particular class.
Bayes' Theorem
Bayes' Theorem is named after Thomas Bayes. It involves two types of probabilities −
Posterior Probability [P(H|X)]
Prior Probability [P(H)]
where X is a data tuple and H is some hypothesis.
According to Bayes' Theorem,
P(H|X) = P(X|H) P(H) / P(X)
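As a minimal sketch, the theorem can be applied directly in code. The scenario and all probability values below are hypothetical, chosen only to illustrate the calculation.

```python
# Hypothetical example: H = "tuple belongs to class C",
# X = "tuple has the observed attribute values".
p_h = 0.5          # prior probability P(H), assumed value
p_x = 0.2          # evidence probability P(X), assumed value
p_x_given_h = 0.3  # likelihood P(X|H), assumed value

# Bayes' Theorem: P(H|X) = P(X|H) * P(H) / P(X)
p_h_given_x = p_x_given_h * p_h / p_x

print(p_h_given_x)  # → 0.75
```

Note how the posterior revises the prior: observing X raises the probability of H from 0.5 to 0.75 because X is more likely under H than overall.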
Bayesian Belief Network
Bayesian Belief Networks specify joint conditional probability distributions. They are also known as Belief Networks, Bayesian Networks, or Probabilistic Networks.
A Belief Network allows class conditional independencies to be defined between subsets of variables.
It provides a graphical model of causal relationships on which learning can be performed.
We can use a trained Bayesian Network for classification.
There are two components that define a Bayesian Belief Network −
A directed acyclic graph
A set of conditional probability tables
Directed Acyclic Graph
Each node in a directed acyclic graph represents a random variable.
These variables may be discrete or continuous-valued.
These variables may correspond to actual attributes given in the data.
Directed Acyclic Graph Representation
The following diagram shows a directed acyclic graph for six Boolean variables.
The arcs in the diagram allow representation of causal knowledge. For example, lung cancer is influenced by a person's family history of lung cancer, as well as by whether or not the person is a smoker. It is worth noting that the variable PositiveXray is independent of whether the patient has a family history of lung cancer or is a smoker, given that we know the patient has lung cancer.
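The structure just described can be sketched as a mapping from each node to its parents. FamilyHistory, Smoker, LungCancer, and PositiveXray are named in the text; Emphysema and Dyspnea are assumed names for the remaining two of the six Boolean variables.

```python
# Parent sets encode the directed acyclic graph of the belief network.
# Emphysema and Dyspnea are assumed placeholder nodes, not from the text.
parents = {
    "FamilyHistory": [],
    "Smoker": [],
    "LungCancer": ["FamilyHistory", "Smoker"],
    "Emphysema": ["FamilyHistory", "Smoker"],
    "PositiveXray": ["LungCancer"],
    "Dyspnea": ["LungCancer", "Emphysema"],
}

def ancestors(node):
    """Collect every ancestor of a node by walking parent links."""
    seen = set()
    stack = list(parents[node])
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend(parents[p])
    return seen

# PositiveXray is influenced by FamilyHistory and Smoker only
# indirectly, through its sole parent LungCancer:
print(sorted(ancestors("PositiveXray")))
```

The graph makes the conditional independence in the text visible: the only arc into PositiveXray comes from LungCancer, so once LungCancer is known, family history and smoking status carry no extra information about the X-ray result.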
Conditional Probability Table
The conditional probability table for the values of the variable LungCancer (LC) shows each possible combination of the values of its parent nodes, FamilyHistory (FH) and Smoker (S), as follows −
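The table itself is not reproduced here, but its mechanics can be sketched: one row per combination of parent values, each giving the probability of LC being true. All probability values below are hypothetical placeholders, not taken from the original table.

```python
# Conditional probability table for LungCancer (LC) given its parents
# FamilyHistory (FH) and Smoker (S). The values are assumed, for
# illustration only.
cpt_lc = {
    # (FH, S): P(LC = True | FH, S)
    (True,  True):  0.8,
    (True,  False): 0.5,
    (False, True):  0.7,
    (False, False): 0.1,
}

def p_lung_cancer(lc, fh, s):
    """Return P(LC = lc | FH = fh, S = s) from the table."""
    p_true = cpt_lc[(fh, s)]
    return p_true if lc else 1.0 - p_true

# Each row's True/False probabilities must sum to 1, as in any CPT:
for fh, s in cpt_lc:
    total = p_lung_cancer(True, fh, s) + p_lung_cancer(False, fh, s)
    assert abs(total - 1.0) < 1e-9
```

Only P(LC = True) needs to be stored per row, since LC is Boolean and the complementary probability follows from it.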