## Implementing a Maximum Likelihood Classifier and using it to predict heart disease

### What is this thing about?

The main idea of Maximum Likelihood Classification is to predict the class label **y** that maximizes the likelihood of our observed data **x**. We treat **x** as a random vector and **y** as a (non-random) parameter on which the distribution of **x** depends. First, we need to make an assumption about the distribution of **x** (usually a Gaussian distribution).

Learning from our data then consists of the following steps:

- We split our dataset into subsets corresponding to each label **y**.
- For each subset, we estimate the parameters of our assumed distribution for **x** using only the data inside that subset.

When making a prediction on a new data vector **x**:

- We evaluate the PDF of our assumed distribution using our estimated parameters for each label **y**.
- We return the label **y** for which the evaluated PDF had the maximum value.
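
The training and prediction steps above can be sketched for a 1-dimensional input as follows. This is a minimal illustration, not the article's own code: the function names `fit_1d`, `gaussian_pdf` and `predict_1d` and the toy data are made up for this example, and a Gaussian distribution is assumed for **x**.

```python
import numpy as np

def fit_1d(x, y):
    """Estimate (mean, std) of x separately for each label found in y."""
    params = {}
    for label in np.unique(y):
        subset = x[y == label]  # only the data belonging to this label
        # np.std defaults to ddof=0, which is the maximum likelihood estimate
        params[label] = (subset.mean(), subset.std())
    return params

def gaussian_pdf(x, mu, sigma):
    """PDF of a 1-dimensional Gaussian with mean mu and standard deviation sigma."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def predict_1d(x_new, params):
    """Return the label whose fitted PDF is largest at x_new."""
    return max(params, key=lambda label: gaussian_pdf(x_new, *params[label]))

# Toy data, invented purely for illustration
x = np.array([0.5, 1.0, 1.5, -2.5, -2.0, -1.5])
y = np.array([0, 0, 0, 1, 1, 1])

params = fit_1d(x, y)
print(predict_1d(0.8, params))
print(predict_1d(-1.9, params))
```

Each label gets its own pair of parameters, and prediction is just a comparison of the fitted PDFs at the new point.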

Let’s start with a simple example considering a 1-dimensional input **x**, and 2 classes: **y = 0**, **y = 1**.

Let’s say that after estimating our parameters under both the **y = 0** and **y = 1** scenarios, we get the 2 PDFs plotted above. The blue one (**y = 0**) has mean μ = 1 and standard deviation σ = 1; the orange one (**y = 1**) has μ = −2 and σ = 1.5. Now, if we have a new data point **x** = −1 and we want to predict the label **y**, we evaluate both PDFs: p₀(−1) ≈ 0.05; p₁(−1) ≈ 0.21. The larger value is 0.21, which we got when we considered **y = 1**, so we predict label **y = 1**.
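
The two PDF values in this example can be checked with a few lines of Python. The helper name `normal_pdf` is an assumption made for this sketch; it only uses the standard library.

```python
import math

def normal_pdf(x, mu, sigma):
    """PDF of a 1-dimensional Gaussian with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

p0 = normal_pdf(-1, mu=1, sigma=1)     # PDF under y = 0
p1 = normal_pdf(-1, mu=-2, sigma=1.5)  # PDF under y = 1

# Predict the label with the larger PDF value
predicted = 0 if p0 > p1 else 1
print(round(p0, 3), round(p1, 3), predicted)  # 0.054 0.213 1
```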

That was just a simple example, but in real-world situations, we will have more input variables that we want to use in order to make predictions. So, we need a Multivariate Gaussian distribution, which has the following PDF:
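
For reference, the standard density of a d-dimensional Gaussian with mean vector μ and covariance matrix Σ can be written as follows (the symbol d for the number of input variables is notation assumed here):

$$
p(\mathbf{x}) = \frac{1}{\sqrt{(2\pi)^d \, |\Sigma|}} \exp\left(-\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^\top \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})\right)
$$

Here |Σ| is the determinant of Σ. For d = 1 and Σ = σ², this reduces to the familiar 1-dimensional Gaussian PDF.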

For this method to work, the covariance matrix Σ should be **positive definite**; i.e. it should be symmetric and all eigenvalues should be positive. The covariance matrix Σ is the matrix that contains the covariances between all pairs of components of **x**: Σᵢⱼ = Cov(xᵢ, xⱼ). It is therefore a symmetric matrix, since Cov(xᵢ, xⱼ) = Cov(xⱼ, xᵢ), and all we have to check is that all eigenvalues are positive; otherwise, we will show a warning. If there are more observations than variables and the variables are not highly correlated with each other, this condition should be met and Σ should be positive definite.
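
One possible way to run this eigenvalue check with NumPy is sketched below. The function name `check_positive_definite`, the tolerance `tol` and the two example matrices are assumptions made for illustration.

```python
import warnings
import numpy as np

def check_positive_definite(cov, tol=1e-10):
    """Return True if the symmetric matrix cov has only positive eigenvalues."""
    # eigvalsh is meant for symmetric matrices and returns real eigenvalues
    eigenvalues = np.linalg.eigvalsh(cov)
    if eigenvalues.min() <= tol:
        warnings.warn("Covariance matrix is not positive definite")
        return False
    return True

# A well-behaved covariance matrix (weakly correlated variables)
good = np.array([[2.0, 0.3],
                 [0.3, 1.0]])

# Two perfectly correlated variables give a singular matrix (one eigenvalue is 0)
bad = np.array([[1.0, 1.0],
                [1.0, 1.0]])

print(check_positive_definite(good))
```

Calling `check_positive_definite(bad)` issues the warning and returns `False`, which matches the case of highly correlated variables described above.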

### Now, let’s implement it
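
A minimal sketch of such a classifier is shown below. It is not necessarily the implementation from the notebook: the method names, the attribute names and the synthetic data are assumptions made for this example. It also compares log-PDFs rather than PDFs, which gives the same predicted label (the logarithm is increasing) and is numerically safer.

```python
import warnings
import numpy as np

class MLClassifier:
    """Gaussian maximum likelihood classifier (illustrative sketch)."""

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes_ = np.unique(y)
        self.means_, self.covs_ = [], []
        for label in self.classes_:
            subset = X[y == label]  # rows belonging to this label
            # bias=True divides by n, i.e. the maximum likelihood estimate
            cov = np.atleast_2d(np.cov(subset, rowvar=False, bias=True))
            if np.linalg.eigvalsh(cov).min() <= 0:
                warnings.warn(f"Covariance for class {label} is not positive definite")
            self.means_.append(subset.mean(axis=0))
            self.covs_.append(cov)
        return self

    def _log_pdf(self, X, mean, cov):
        d = X.shape[1]
        diff = X - mean
        # (x - mu)^T Sigma^{-1} (x - mu) for every row of X
        # (np.linalg.inv raises LinAlgError if cov is singular)
        maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
        _, logdet = np.linalg.slogdet(cov)
        return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        scores = np.column_stack([
            self._log_pdf(X, m, c) for m, c in zip(self.means_, self.covs_)
        ])
        # Pick, for each row, the class with the largest log-PDF
        return self.classes_[scores.argmax(axis=1)]

# Synthetic 2-dimensional data, invented purely for illustration
rng = np.random.default_rng(0)
X0 = rng.normal(loc=[1.0, 1.0], scale=1.0, size=(100, 2))
X1 = rng.normal(loc=[-2.0, -2.0], scale=1.5, size=(100, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 100 + [1] * 100)

clf = MLClassifier().fit(X, y)
print(clf.predict([[1.0, 1.2], [-2.5, -1.8]]))
```

`fit` stores one mean vector and one covariance matrix per class, and `predict` evaluates every class density on each row before taking the maximum.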

### Using MLClassifier to predict heart disease

For this task, we will use the dataset provided here. It consists of a CSV file with 303 rows; each row has 13 columns that we can use for prediction and 1 label column. A short description of each field is shown in the table below:

We got **80.33%** test accuracy. Although this accuracy is not as good as what other methods achieve, I still think this is an interesting way of thinking about the problem, and it gives reasonable results for its simplicity.

The Jupyter notebook can be found here.

*I hope you found this information useful and thanks for reading!*

This article is also posted on Medium here. Feel free to have a look!