# Logistic loss via Bernoulli Distribution

## Ever wondered whether logistic loss can be derived from the Bernoulli distribution?

Before we start, let's get familiar with a few terms.

# Maximum likelihood estimation

**Maximum likelihood estimation** (**MLE**) is a method of estimating the parameters of a probability distribution by maximizing a likelihood function.

OR

In simple words, it is a way of finding the parameters that make the observed data most probable under our model.

If the likelihood function is differentiable, the MLE can be found by differentiating it, setting the derivative to zero, and reading off the parameter value at the local maximum.
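As a quick illustration (the data and function name here are made up for the sketch, not from the article), we can maximize the likelihood of a coin's head-probability numerically and see that the maximizing parameter is simply the sample mean:

```python
import math

# Illustrative sketch: MLE for a coin's probability of heads.
# Made-up data: 1 = heads, 0 = tails (7 heads in 10 flips).
data = [1, 1, 0, 1, 1, 1, 0, 1, 0, 1]

def log_likelihood(p):
    """Log-likelihood of the data when P(heads) = p."""
    return sum(math.log(p) if x == 1 else math.log(1 - p) for x in data)

# Search a grid of candidate parameters for the maximizer.
grid = [i / 1000 for i in range(1, 1000)]
p_hat = max(grid, key=log_likelihood)

print(p_hat)  # 0.7 -- the sample mean, which is also the closed-form MLE
```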

# Bernoulli Distribution

What is the Bernoulli distribution?

The **Bernoulli distribution** is the discrete probability distribution of a random variable which takes the value 1 with probability **p** and the value 0 with probability **q = 1 - p**. It is a special case of the binomial distribution (with a single trial). For an outcome *k* ∈ {0, 1}, the probability is p^k * (1 - p)^(1 - k).

OR

In simple words, it gives you the probability of class 1 (positive) or class 0 (negative) depending on the value of **k**.
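The definition above can be written as a one-line function (a minimal sketch; the function name is ours, not from any particular library):

```python
# A minimal sketch of the Bernoulli probability, p^k * (1 - p)^(1 - k).
def bernoulli_pmf(k, p):
    """Probability of observing k (0 or 1) when P(class 1) = p."""
    return p**k * (1 - p)**(1 - k)

print(bernoulli_pmf(1, 0.7))  # probability of class 1 -> 0.7
print(bernoulli_pmf(0, 0.7))  # probability of class 0 -> roughly 0.3
```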

# Now, how do we get there?

Most readers will already know that binary cross entropy and **logistic loss** have the same mathematical formula, so it is often said that logistic loss is obtained from **binary cross entropy**. Let's go through the details of how to reach it via the **Bernoulli distribution**.

**Logistic loss’s mathematical form:**

loss = -(1/N) * Σ_i [ k_i * log(p_i) + (1 - k_i) * log(1 - p_i) ]   *(eq0)*

where *N* is the number of samples, *k_i* ∈ {0, 1} is the true label, and *p_i* is the predicted probability of class 1.
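This formula can be implemented directly (a minimal sketch; the labels and predicted probabilities below are made up):

```python
import math

def logistic_loss(y_true, y_prob):
    """-(1/N) * sum of k*log(p) + (1 - k)*log(1 - p) over all N samples."""
    n = len(y_true)
    return -sum(k * math.log(p) + (1 - k) * math.log(1 - p)
                for k, p in zip(y_true, y_prob)) / n

# Made-up labels and predicted probabilities of class 1:
print(logistic_loss([1, 0, 1], [0.9, 0.2, 0.8]))  # roughly 0.18
```

Notice that a confident correct prediction (0.9 for a true 1) contributes little to the loss, while a confident wrong one would contribute a lot.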

Let's go step by step:

- First, we have to select a model *M* that can be used for estimating the probability of each class.
- As we have seen, the **Bernoulli distribution** gives us the probability of class 1 or class 0, so we can use it as our model for the two binary classes:

  P(k) = p^k * (1 - p)^(1 - k),  k ∈ {0, 1}   *(eq1)*

- Once we have chosen our model, we can apply **Maximum Likelihood Estimation** to *M* in order to improve the probabilities by updating the parameter *p*.
- Now, in order to find the maximum of *M*, we have to differentiate it and read off the value of *p* at the local maximum.
- To make the differentiation simpler, we can apply **log** on both sides of *eq1*:

  log P(k) = k * log(p) + (1 - k) * log(1 - p)   *(eq2)*

- After differentiating *eq2* (summed over all samples) and setting the derivative to zero, we get the optimal value of *p*.
- If we compare *eq0* with *eq2*, they are almost the same, except for the negative sign and the averaging over the number of samples *(N)*.
- Hence, if we put a negative sign (-) on *eq2* and average it over the *N* samples, it becomes something like *minimum likelihood estimation*, i.e. the negative log-likelihood, and that is exactly what we use as the loss function.
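The whole argument can be checked numerically (a sketch with made-up labels and probabilities; the variable names are ours): the negative average log of the Bernoulli probability, eq1 negated and averaged as described above, coincides with logistic loss computed from its own formula.

```python
import math

# Made-up labels and predicted probabilities of class 1:
y = [1, 0, 1, 1, 0]
p = [0.8, 0.3, 0.9, 0.6, 0.2]
n = len(y)

# Negative average Bernoulli log-probability: -(1/N) * sum of log P(k_i)
nll = -sum(math.log(pi**ki * (1 - pi)**(1 - ki)) for ki, pi in zip(y, p)) / n

# Logistic loss computed directly from its formula (eq0):
loss = -sum(ki * math.log(pi) + (1 - ki) * math.log(1 - pi)
            for ki, pi in zip(y, p)) / n

print(abs(nll - loss) < 1e-12)  # True: the two quantities coincide
```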

So we have successfully arrived at **logistic loss** starting from the **Bernoulli distribution**, and this concludes our journey.

Thanks for reading; I hope this article helped you find a new perspective on logistic loss.