How should we get started in the Data Science domain? What is the right pathway to learn it efficiently and productively? Where should one start, and what are good resources for each concept? So far, we have seen an overview of Data Science and the tools and technologies required to learn it. We have also gone through the various terminologies and career paths in the industry. Now we will address the questions above.

The timeline for learning the different data science skills is depicted in the graph below. We will briefly look at each technology and try to understand why it is essential to learn them in a proper, structured order.

## Basic Probability

Probability is the foundation of many of the algorithms used in Machine Learning, and we come across many real-world scenarios where it is used. A fair understanding of Probability Distributions, Bayes' Theorem and Conditional Probability is essential.
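To make Bayes' Theorem and Conditional Probability concrete, here is a minimal sketch in plain Python. The function name `bayes_posterior` and all the numbers are made up for illustration only.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(A | B) using Bayes' Theorem.

    prior               -- P(A)
    likelihood          -- P(B | A)
    false_positive_rate -- P(B | not A)
    """
    # Law of total probability: P(B) = P(B|A) P(A) + P(B|not A) P(not A)
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    # Bayes' Theorem: P(A|B) = P(B|A) P(A) / P(B)
    return likelihood * prior / evidence


# Hypothetical numbers: a condition affects 1% of people, a test detects it
# 95% of the time, and gives a false positive 5% of the time.
posterior = bayes_posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(posterior, 3))  # 0.161
```

Even with a fairly accurate test, the probability of having the condition after a positive result is only about 16% here, because the condition itself is rare. This kind of reasoning is what Conditional Probability helps us get right.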

## Statistics

Data Analysis and Statistics are closely related in terms of use cases. For a proper understanding of Data Analysis, we must have a good grasp of statistical concepts, which include:

1. Descriptive Statistics

2. Inferential Statistics

We will explore both in more depth in upcoming sessions.
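As a quick preview of the difference between the two, here is a small sketch using only Python's standard library. The sample values are hypothetical, and the confidence interval uses a normal approximation as a simplifying assumption (for a sample this small, a t-distribution would usually be preferred).

```python
import math
import statistics

# Hypothetical sample of 10 measurements
sample = [52, 48, 55, 50, 47, 53, 49, 51, 54, 46]

# Descriptive statistics: summarize the data we actually have.
mean = statistics.mean(sample)
median = statistics.median(sample)
stdev = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)

# Inferential statistics: use the sample to say something about the
# population it came from, here an approximate 95% confidence interval
# for the population mean.
z = statistics.NormalDist().inv_cdf(0.975)  # roughly 1.96
margin = z * stdev / math.sqrt(len(sample))
confidence_interval = (mean - margin, mean + margin)

print(f"mean={mean}, median={median}, stdev={stdev:.2f}")
print(f"approx. 95% CI for the mean: {confidence_interval[0]:.2f} to {confidence_interval[1]:.2f}")
```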

There are plenty of resources available on the internet to learn Probability and Statistics, and we can choose according to our convenience. One free option is the Probability and Statistics course provided by Khan Academy.

## Python Basics

After learning Probability and Statistics, it's time to apply some of the statistical concepts to Data Analysis in code. For that, one can opt for Python or R, both of which are widely used in the Data Science domain. Here we will go with Python.

Here is a list of some free resources for learning Python:

1. W3Schools

2. Programiz

## Python Libraries for Data Analysis

After learning the Python concepts, it's time to analyze data. But wait! Can we do that with just core Python?

The answer is no. We need specific libraries for performing different tasks.

These libraries are:

For Data Analysis - Pandas and NumPy

For Exploratory Data Analysis - Matplotlib and Seaborn

There are various other libraries, but these are among the most widely used.
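To give a feel for what these libraries add on top of core Python, here is a short sketch using Pandas and NumPy. The table and its column names (`city`, `sales`) are invented for illustration, and both libraries are assumed to be installed.

```python
import numpy as np
import pandas as pd

# Hypothetical sales records with one missing value
df = pd.DataFrame({
    "city": ["Pune", "Delhi", "Pune", "Delhi", "Pune"],
    "sales": [120.0, 95.0, np.nan, 110.0, 130.0],
})

# Pandas: fill the missing value with the mean of the column
df["sales"] = df["sales"].fillna(df["sales"].mean())

# Pandas: average sales per city
by_city = df.groupby("city")["sales"].mean()

# NumPy: fast element-wise math on the underlying array
log_sales = np.log(df["sales"].to_numpy())

print(by_city)
```

A plotting step with Matplotlib or Seaborn would typically follow, for example a bar chart of `by_city`, to explore the data visually.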

## Machine Learning

So far, we have learnt to analyze data. But what about future predictions? This is where Machine Learning comes in. Machine Learning is the ability of computers or machines to predict future trends or make decisions based on the data provided to them as experience.

So now we need to learn the algorithms and the mathematics behind each of them. This is one of the most crucial learning steps in the whole data science journey.
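As an example of what "the mathematics behind an algorithm" means, here is a sketch of simple linear regression written from scratch using the standard least-squares formulas. The function name `fit_line` and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x) ** 2)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data that follows y = 2x + 1 exactly
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

slope, intercept = fit_line(xs, ys)
prediction = slope * 6 + intercept  # predict y for an unseen x = 6
print(slope, intercept, prediction)  # 2.0 1.0 13.0
```

Libraries hide these details, but knowing the formula makes it much easier to understand what a model is doing and when it can go wrong.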

Some good resources and courses that explain Machine Learning well are:

1. Machine Learning by Andrew Ng (Coursera)

2. Machine Learning A-Z (Udemy)

3. Javatpoint

4. Sentdex YouTube Channel

## Application of Machine Learning using Scikit-Learn

Now that we have learnt Machine Learning algorithms, it's time to apply them in Python. For this, Python has a great library that covers most common Machine Learning tasks: Scikit-Learn.

It is a convenient and easy-to-use library that offers extensive customization of model parameters.

The Scikit-Learn documentation is excellent and worth keeping open while you work.
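Here is a minimal sketch of a typical Scikit-Learn workflow: load data, split it, train a model, and evaluate it. The choice of the built-in Iris dataset and Logistic Regression is only for illustration, and Scikit-Learn is assumed to be installed.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load a small built-in dataset: flower measurements and species labels
X, y = load_iris(return_X_y=True)

# Hold out 25% of the data to check how the model does on unseen examples
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# Train the model; max_iter is raised so the solver has room to converge
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Evaluate on the held-out data
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.2f}")
```

Most Scikit-Learn models follow the same `fit` and `predict` pattern, so swapping in a different algorithm usually means changing only the line that creates `model`.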

## Web-application framework

Once we are done with all the above steps, we are ready to do a Data Science project. But wait! Do you want your project to be available and accessible to everyone? This is where web development comes in. Here we need to develop a small web application using a web framework such as Flask (a micro-framework) or Django (a full-featured framework).

After developing our web app, we can deploy it to a cloud platform such as GCP, Azure or Heroku.

Our Data Science project is then accessible to everyone through a public link.
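As a rough idea of how small such a web app can be, here is a sketch of a Flask application that exposes a prediction endpoint. The `/predict` route, the JSON field `features` and the `predict` function are all hypothetical; in a real project, `predict` would call a trained model instead of returning an average.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict(features):
    """Placeholder for a trained model: returns the average of the inputs."""
    return sum(features) / len(features)


@app.route("/predict", methods=["POST"])
def predict_route():
    # Expect a JSON body such as {"features": [1, 2, 3]}
    payload = request.get_json(silent=True) or {}
    features = payload.get("features")

    # Reject anything that is not a non-empty list of numbers
    is_valid = (
        isinstance(features, list)
        and len(features) > 0
        and all(isinstance(v, (int, float)) for v in features)
    )
    if not is_valid:
        return jsonify(error="'features' must be a non-empty list of numbers"), 400

    return jsonify(prediction=predict(features))
```

Saved as, say, `app.py`, this can be started locally with `flask --app app run` on recent Flask versions, and the same code can then be deployed to a cloud platform.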

## Business Intelligence Tools

Moving ahead, it is always a plus to have a fair understanding of BI tools, which can simplify parts of the visualization process and sometimes handle other data operations as well.

Some of the powerful and popular tools are:

1. Microsoft Excel

2. Tableau

## Project from Kaggle

After learning all these concepts, we can participate in various competitions and hackathons. We can get the data and problem statement from Kaggle or a similar online platform and then build our own project.

This is the complete timeline that one can follow to learn things efficiently and quickly.

If you have any suggestions or comments, then please let us know to make our service better.

In the next session, we will briefly discuss some of the concepts of Probability.
