
Introduction to Python for Data Science


Source: https://thedatascientist.com/data-science-considered-own-discipline/

Python is a robust, high-level, object-oriented programming language with an easy syntax. It has many applications, but the most important ones are web development, software development and data science. Data science is a field where meaningful insights are extracted from data to support decision making and planning in businesses. It combines math, statistics, programming, machine learning, artificial intelligence and advanced analytics.

The data science project life cycle consists of various processes, which include data collection, data cleaning, exploratory data analysis, model building and model deployment. Python has multiple libraries that facilitate these processes, which makes it suitable for data science. Examples of these libraries are pandas for data analysis, wrangling and cleaning, matplotlib and seaborn for data visualization, TensorFlow and scikit-learn for machine learning, Keras and PyTorch for deep learning, SciPy and NumPy for mathematical computations, and many others.

Data Science project life cycle. Source: https://towardsdatascience.com/the-data-science-process-a19eb7ebc41b

Other aspects that make Python a popular language for data science are its easy syntax, which makes it simple to learn, the fact that it is open source, its support for test-driven development, and its compatibility with multiple operating systems such as Windows, Linux and macOS.

To work with Python, one first has to download it and then work in a preferred Integrated Development Environment (IDE), for instance IDLE, Visual Studio Code, Spyder, Jupyter Notebook and many more.

One can also work in a web-based IDE, in which case there is no need to download anything. Examples are Google Colab, JupyterLab, etc.

To learn Python for data science, one first has to learn the Python fundamentals, which include data types, operators, sequences/compound data structures, conditional statements, loops, functions and external libraries.

Data types comprise:

Strings: sequences of characters enclosed in single or double quotes. Written as str in Python. Examples: 'apple', "python", '1', '1+3', "3.65", "", " ".

# Check the data type; prints <class 'str'>
fruit = 'apple'
print(type(fruit))

Integers: whole numbers that can be positive, negative or 0. Written as int in Python. Examples: -3, 2, 0.

# Check the data type; prints <class 'int'>
number = -3
print(type(number))

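Booleans: truth values. Written as bool in Python. There are only two, True and False. Other values such as 1, 1.0, 0 and 0.0 are not Booleans themselves, but they evaluate to True or False when converted with bool().

# Check the data type; prints <class 'bool'>
a = True
print(type(a))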

Floating point numbers: numbers written with a decimal point or an exponent that can be positive or negative. Written as float in Python. Examples: 3.3, -3.3, 5e10, -5e10, 4., 0.0 and many more.

# Check the data type; prints <class 'float'>
m = -5e10
print(type(m))
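The difference between a value's type and its truth value can be sketched as follows. The helper name describe is hypothetical and only used for illustration; it relies on the built-in type() and bool() functions.

```python
def describe(value):
    """Return the type name of a value and how it converts to bool."""
    return type(value).__name__, bool(value)


# Only True and False are bool objects; the rest are converted by bool().
for sample in [True, 1, 1.0, 0, 0.0, "", " ", "apple"]:
    print(repr(sample), describe(sample))
```

Note that an empty string converts to False, while a string holding a single space converts to True because it contains one character.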

Sequences/compound data structures store collections of data in a single variable and comprise:

Lists: These are ordered, indexed, changeable collections of data enclosed in square brackets and separated by commas e.g. list_A = ['apple', 1, True, 3.3]

# Check the type, which should be list
list_A = ['apple', 1, True, 3.3]
print(type(list_A))

Sets: These are unordered and unindexed collections of data enclosed in curly brackets and separated by commas, which cannot contain duplicates e.g. Set_A = {'apple', 0, True, 3.3}

# Check the type, which should be set
Set_A = {'apple', 0, True, 3.3}
print(type(Set_A))

Tuples: These are ordered, unchangeable and indexed collections of data enclosed in parentheses and separated by commas e.g. Tuple_A = ('apple', 0, True, 3.3)

# Check the type, which should be tuple
Tuple_A = ('apple', 0, True, 3.3)
print(type(Tuple_A))

Dictionaries: These are collections of key-value pairs that are ordered (they keep insertion order as of Python 3.7), changeable and indexed by key. Each key is separated from its value by a colon, the pairs are separated from one another by commas, and everything is enclosed in curly brackets e.g. Dict_A = {'Name': 'Jane', 'Age': 24, 'Country': 'Kenya'}

# Check the type, which should be dict
Dict_A = {'Name': 'Jane', 'Age': 24, 'Country': 'Kenya'}
print(type(Dict_A))
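The properties described above (changeable, unchangeable, no duplicates, access by key) can be seen in a short sketch. The variable names are made up for illustration.

```python
# Lists are changeable: an item can be replaced by index.
list_a = ['apple', 1, True, 3.3]
list_a[0] = 'mango'
first_item = list_a[0]

# Tuples are unchangeable: item assignment raises a TypeError.
tuple_a = ('apple', 0, True, 3.3)
try:
    tuple_a[0] = 'mango'
    tuple_changed = True
except TypeError:
    tuple_changed = False

# Sets drop duplicates: 'apple' is stored only once.
set_a = {'apple', 'apple', 3.3}
set_size = len(set_a)

# Dictionaries are indexed by key and are changeable.
dict_a = {'Name': 'Jane', 'Age': 24, 'Country': 'Kenya'}
dict_a['Age'] = 25
age = dict_a['Age']

print(first_item, tuple_changed, set_size, age)
```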

Arithmetic operators are addition, subtraction, division, multiplication, exponentiation and modulo. Python has no ++ or -- operators, so incrementing and decrementing are done with += and -=. Below are examples of how some are executed:

a = 10
b = 5

# addition
print(a + b)

# subtraction
print(a - b)

# multiplication
print(a * b)

# classic division; the result is always a float (2.0 here)
print(a / b)

# floor division: rounds the result down to the nearest integer
# the result will be 1 instead of 1.25
c = 5
d = 4
print(c // d)
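The operators named above but not shown in the examples can be sketched in the same way. Python has no ++ or -- operators, so incrementing and decrementing use += and -=. The variable names are chosen for illustration only.

```python
a = 10
b = 3

# exponentiation: a raised to the power of b
power = a ** b

# modulo: the remainder of dividing a by b
remainder = a % b

# floor division pairs with modulo: a == quotient * b + remainder
quotient = a // b

# incrementing and decrementing with augmented assignment
counter = 0
counter += 1   # counter is now 1
counter += 1   # counter is now 2
counter -= 1   # counter is back to 1

print(power, remainder, quotient, counter)
```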

Conditional statements control the flow of the program's execution. In Python they include if statements, if-else statements, if-elif statements and if-elif-else statements. Below is an example of a conditional statement that determines whether a number is odd or even:

# prompt asking the user to enter a number
integer = int(input('Enter an integer:'))
# % is the modulo operator: it gives the remainder of dividing the
# number on the left by the number on the right.
# If dividing by 2 leaves a remainder of 0 the number is even, otherwise it is odd.
if integer % 2 == 0:
    print('The number is even')
else:
    print('The number is odd')
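An if-elif-else chain, which the example above does not cover, can be sketched as follows. The function name classify_number is hypothetical, and the value is passed as an argument instead of being read with input() so that the snippet runs without prompting.

```python
def classify_number(n):
    """Return whether n is positive, negative or zero."""
    if n > 0:
        return 'positive'
    elif n < 0:
        return 'negative'
    else:
        return 'zero'


for value in [5, -2, 0]:
    print(value, 'is', classify_number(value))
```

Only the first branch whose condition is true runs; the else branch runs when none of the conditions hold.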

Loops facilitate iteration, which is repeatedly performing a set of instructions until a certain condition is reached. There are for loops, while loops and nested loops. Below is an example of a while loop that prints the multiplication table of any number, from 10 down to 1:

# prompt for the user to enter a number
number = int(input('Enter a number:'))

# while loop
# count is the loop variable, initialized to start at 10
count = 10
# repeat while count is less than or equal to 10 and greater than or equal to 1
while count <= 10 and count >= 1:
    product = count * number
    print(f"{number} * {count} = {product}")
    count -= 1
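For comparison, here is a minimal sketch of a for loop and a nested loop. The function name multiplication_table and its start and stop parameters are assumptions made for this example, and the number is passed in directly instead of being read with input().

```python
def multiplication_table(number, start=1, stop=10):
    """Build the lines of a multiplication table with a for loop."""
    lines = []
    for count in range(start, stop + 1):
        lines.append(f"{number} * {count} = {number * count}")
    return lines


for line in multiplication_table(7):
    print(line)

# Nested loop: the inner loop runs in full for every pass of the outer loop.
products = []
for row in range(1, 3):
    for col in range(1, 4):
        products.append((row, col, row * col))

print(products)
```

A for loop fits when the number of repetitions is known in advance, while the while loop above keeps running for as long as its condition holds.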

Functions are blocks of code written to perform specific tasks. Python has built-in functions such as len(), range() and many others, but you can also create custom functions. Below is a function to check whether a letter is a vowel or a consonant:


# defining the function
def letter_check(letter):
    if letter in ('a', 'e', 'i', 'o', 'u'):
        print("Vowel")
    elif letter == 'y':
        print("Sometimes y is a vowel, and sometimes y is a consonant")
    else:
        print('Consonant')

# calling the function
letter_check('a')
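A function can also return a value instead of printing it, which lets other code reuse the result. The sketch below combines the built-in len() and range() functions mentioned above; count_vowels is a hypothetical name used only for illustration.

```python
def count_vowels(word):
    """Return how many letters in word are vowels (a, e, i, o, u)."""
    total = 0
    # range(len(word)) yields every valid index of the string
    for index in range(len(word)):
        if word[index].lower() in ('a', 'e', 'i', 'o', 'u'):
            total += 1
    return total


vowels_found = count_vowels('data science')
print(vowels_found)
```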

Once these basics are learnt, one moves on to the data processing, cleaning, data wrangling and Exploratory Data Analysis (EDA) libraries, which include pandas, NumPy, matplotlib, seaborn and many more. Their applications are best seen by taking a hands-on approach and using them in a project.

After this, the next step is learning statistics. Topics include univariate analysis, bivariate analysis, multivariate analysis, sampling, distributions and hypothesis testing. The best approach is to learn the theory and then carry out a hands-on project with the appropriate Python libraries.
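As a small taste of univariate analysis, the sketch below summarizes a single variable using only the statistics module from Python's standard library. The list of ages is made-up sample data, not taken from any real dataset; a real project would more likely use pandas or SciPy.

```python
import statistics

# Made-up sample data for illustration only
ages = [24, 30, 27, 24, 36, 33]

mean_age = statistics.mean(ages)      # arithmetic average
median_age = statistics.median(ages)  # middle value of the sorted data
mode_age = statistics.mode(ages)      # most frequent value
spread = statistics.stdev(ages)       # sample standard deviation

print(mean_age, median_age, mode_age, round(spread, 2))
```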

Machine learning
Source: https://www.fsm.ac.in/blog/an-introduction-to-machine-learning-its-importance-types-and-applications/

The last step is machine learning, which involves learning various algorithms and then implementing them in projects. Machine learning comprises three types: supervised, unsupervised and reinforcement learning. Each type has its own algorithms, with Python libraries to facilitate model creation, testing and deployment.

Python is a very powerful tool for data science and is crucial for anyone pursuing the field of data science or wanting to get insights from data for their business.
