Lists, Tuples, Dictionaries, And Data Frames in Python: The Complete Guide

admin

May 27, 2023

Lists
Tuples
Dictionaries
Data frames
Conclusions

Lists

Definition and creation examples

In Python, a list is an ordered collection of elements that can be of any type: strings, integers, floats, and so on.

To create a list, we put the items between square brackets and separate them with commas. For instance, here’s how we can create a list of integers:

# Create list of integers
my_integers = [1, 2, 3, 4, 5, 6]

But lists can also store “mixed” types. For instance, let’s create a list with both integers and strings:

# Create a mixed list
mixed_list = [1, 3, "dad", 101, "apple"]

To create a list, we can also use the Python built-in function list(). This is how we can use it:

# Create list and print it
my_list = list((1, 2, 3, 4, 5))
print(my_list)
>>>
[1, 2, 3, 4, 5]

This built-in function can be very useful in some particular cases. For instance, let’s say we want to create a list of the numbers from 1 up to, but not including, 10. Here’s how we can do it:

# Create a list from a range
my_list = list(range(1, 10))
print(my_list)
>>>
[1, 2, 3, 4, 5, 6, 7, 8, 9]

Note: remember that the built-in function "range" includes the first value
and excludes the last one.
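
Since range() stops just before its second argument, a list that really ends at 10 needs 11 as the stop value. Here is a minimal sketch of that behavior; the variable names are only illustrative:

```python
# range(start, stop) includes start and excludes stop
one_to_nine = list(range(1, 10))
one_to_ten = list(range(1, 11))   # use stop = 11 to include 10

# An optional third argument sets the step
odd_numbers = list(range(1, 10, 2))

print(one_to_nine)   # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(one_to_ten)    # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(odd_numbers)   # [1, 3, 5, 7, 9]
```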

Now, let’s see how we can manipulate lists.

List manipulation

Because lists are mutable, we have many ways to manipulate them. For instance, let’s say we have a list of names, but we made a mistake and we want to change one. Here’s how we can do it:

# List of names
names = ["James", "Richard", "Simon", "Elizabeth", "Tricia"]
# Change the wrong name
names[0] = "Alexander"
# Print list
print(names)
>>>
['Alexander', 'Richard', 'Simon', 'Elizabeth', 'Tricia']

So, in the above example, we’ve changed the first name of the list from James to Alexander.

In case you didn't know, note that in Python the first element
is always accessed with index 0, regardless of the type of sequence we're manipulating.
So, in the above example, "names[0]" represents the first element
of the list "names".

Now, suppose we’ve forgotten a name. We can add it to our list like so:

# List of names
names = ["James", "Richard", "Simon", "Elizabeth", "Tricia"]
# Append another name
names.append("Alexander")
# Print list
print(names)
>>>
['James', 'Richard', 'Simon', 'Elizabeth', 'Tricia', 'Alexander']

If we want to concatenate two lists, we have two possibilities: the + operator or the extend() method. Let’s see them:

# Create list1
list1 = [1, 2, 3]
# Create list2
list2 = [4, 5, 6]
# Concatenate lists
concatenated_list = list1 + list2
# Print concatenated list
print(concatenated_list)
>>>
[1, 2, 3, 4, 5, 6]

So, this approach creates a new list that is the concatenation of the other two. Now let’s see the extend() method:

# Create list1
list1 = [1, 2, 3]
# Create list2
list2 = [4, 5, 6]
# Extend list1 with list2
list1.extend(list2)
# Print the updated list1
print(list1)
>>>
[1, 2, 3, 4, 5, 6]

As we can see, the results are the same, but the syntax is different. This method extends list1 in place with the elements of list2.
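
The printed results match, but the two approaches differ in what happens to the original lists. The following sketch (variable names such as combined and result are only illustrative) shows that + builds a new list, while extend() changes the list it is called on and returns None:

```python
list1 = [1, 2, 3]
list2 = [4, 5, 6]

# The + operator builds a new list; list1 and list2 are left untouched
combined = list1 + list2
print(combined)   # [1, 2, 3, 4, 5, 6]
print(list1)      # [1, 2, 3]

# extend() modifies list1 in place and returns None
result = list1.extend(list2)
print(list1)      # [1, 2, 3, 4, 5, 6]
print(result)     # None
```

So, writing something like list1 = list1.extend(list2) would replace the list with None, which is a common slip.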

If we want to remove elements, we have two possibilities: we can use the remove() method or the del statement. Let’s see them:

# Create list
my_list = [1, 2, 3, 'four', 5.0]
# Remove one element and print
my_list.remove('four')
print(my_list)
>>>
[1, 2, 3, 5.0]

Let’s see the other approach:

# Create list
my_list = [1, 2, 3, 'four', 5.0]
# Delete one element and print
del my_list[3]
print(my_list)
>>>
[1, 2, 3, 5.0]

So, we get the same result with both approaches, but remove() lets us explicitly write the element to remove, while del needs the position of the element in the list.

If you've gained familiarity with accessing positions, in the above
example my_list[3] is 'four'. Because, remember: in Python we start counting
positions from 0.
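
A few edge cases are worth seeing in code. This is a small sketch built on a list similar to the one above, with a duplicated value added on purpose: remove() deletes only the first match and raises ValueError when the value is missing, while del works purely by position.

```python
my_list = [1, 2, 3, 'four', 5.0, 'four']

# remove() deletes only the first element equal to the given value
my_list.remove('four')
print(my_list)   # [1, 2, 3, 5.0, 'four']

# remove() raises ValueError when the value is not in the list
try:
    my_list.remove('4')
except ValueError as error:
    print(f"Could not remove: {error}")

# del works by position (and also accepts slices)
del my_list[0]
print(my_list)   # [2, 3, 5.0, 'four']
```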

List comprehension

There are many cases where we need to create lists starting from existing lists, generally applying some filters to the existing data. To do so, we have two possibilities:

We use loops and statements.
We use list comprehension.

In practice, they are two ways to write the same thing, but list comprehension is more concise and elegant.

Now, let’s see a few examples using loops and statements directly.

Suppose we have a shopping list. We want our program to print that we like one fruit and that we don’t like the others on the list. Here’s how we can do it:

# Create shopping list
shopping_list = ["banana", "apple", "orange", "lemon"]
# Print the one I like
for fruit in shopping_list:
    if fruit == "lemon":
        print(f"I like {fruit}")
    else:
        print(f"I do not like {fruit}")
>>>
I do not like banana
I do not like apple
I do not like orange
I like lemon

Another example could be the following. Suppose we have a list of numbers and we want to print just the even ones. Here’s how we can do it:

# Create list
numbers = [1,2,3,4,5,6,7,8]
# Create empty list
even_list = []
# Collect the even numbers
for even in numbers:
    if even % 2 == 0:
        even_list.append(even)
    else:
        pass
# Print even numbers
print(even_list)
>>>
[2, 4, 6, 8]

If you are not familiar with the syntax % 2 == 0, it means that we're
dividing a number by 2 and expecting a remainder of 0. In other words,
we're asking our program to pick out the even numbers.

So, in the above example, we’ve created a list of numbers. Then, we’ve created an empty list that the loop fills by appending all the even numbers. This way, we’ve created a list of even numbers from a list of “general” numbers.

Now, this way of creating new lists with loops and statements is a bit “heavy”: it requires a lot of code. We can get the same result in a more concise way using list comprehension.

For instance, to create a list with even numbers we can use list comprehension like so:

# Create list
numbers = [1,2,3,4,5,6,7,8]
# Create list of even numbers
even_numbers = [even for even in numbers if even % 2 == 0]
# Print even list
print(even_numbers)
>>>
[2, 4, 6, 8]

So, list comprehension directly creates a new list, and we define the condition inside it. As we can see, we get the same result as before, but in just one line of code: not bad!

Now, let’s create a list with comments on the fruit I love (and the fruit I don’t) with list comprehension:

# Create shopping list
shopping_list = ["banana", "apple", "orange", "lemon"]
# Create commented list and print it
commented_list = [f"I love {fruit}" if fruit == "banana"
                  else f"I don't like {fruit}"
                  for fruit in shopping_list]
print(commented_list)
>>>
['I love banana', "I don't like apple", "I don't like orange",
"I don't like lemon"]

So, we got the same kind of result as before, but with a single expression. The only difference is that here we’ve printed a list (because list comprehension creates one!), while before we just printed the results one by one.
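
The two comprehensions above place the condition in different spots, and the position matters. As a short sketch (the names evens, labels, and even_squares are only illustrative): an if after the for filters elements out, while an if ... else before the for transforms every element and keeps them all.

```python
numbers = [1, 2, 3, 4, 5, 6, 7, 8]

# Filtering: the "if" goes after the "for" and drops elements
evens = [n for n in numbers if n % 2 == 0]

# Transforming: the "if ... else" goes before the "for" and keeps every element
labels = ["even" if n % 2 == 0 else "odd" for n in numbers]

# Both ideas can be combined: square only the even numbers
even_squares = [n ** 2 for n in numbers if n % 2 == 0]

print(evens)          # [2, 4, 6, 8]
print(labels)
print(even_squares)   # [4, 16, 36, 64]
```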

List of lists

There’s also the possibility to create lists of lists, that is, lists nested inside one list. This is useful when we want to represent related lists of data as a single list.

For instance, consider we want to create a list of students and their grades. We could create something like this:

# Create list with students and their grades
students = [
    ["John", [85, 92, 78, 90]],
    ["Emily", [77, 80, 85, 88]],
    ["Michael", [90, 92, 88, 94]],
    ["Sophia", [85, 90, 92, 87]]
]

This is a useful notation if, for instance, we want to calculate the mean grade for each student. We can do it like so:

# Iterate over the list
for student in students:
    name = student[0] # Access names
    grades = student[1] # Access grades
    average_grade = sum(grades) / len(grades) # Calculate mean grades
    print(f"{name}'s average grade is {average_grade:.2f}")
>>>
John's average grade is 86.25
Emily's average grade is 82.50
Michael's average grade is 91.00
Sophia's average grade is 88.50
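
Because each inner list always holds a name followed by a list of grades, we can also unpack it directly in the loop instead of using student[0] and student[1]. Here is a minimal sketch on a shortened version of the list (the averages variable is only illustrative):

```python
students = [
    ["John", [85, 92, 78, 90]],
    ["Emily", [77, 80, 85, 88]],
]

# Unpack each inner list into a name and a list of grades
for name, grades in students:
    print(f"{name}: best grade {max(grades)}")

# Reach a single grade by chaining indexes: first student, second grade
print(students[0][1][1])   # 92

# Build a list of (name, average) pairs with a comprehension
averages = [(name, sum(grades) / len(grades)) for name, grades in students]
print(averages)   # [('John', 86.25), ('Emily', 82.5)]
```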

Tuples

Tuples are another data structure type in Python. They’re defined with round brackets and, like lists, can contain any data type, with the items separated by commas. So, for instance, we can define a tuple like so:

# Define a tuple and print it
my_tuple = (1, 3.0, "John")
print(my_tuple)
>>>
(1, 3.0, 'John')

The difference between a tuple and a list is that a tuple is immutable. This means that the elements of a tuple can’t be modified. So, for instance, if we try to append a value to a tuple we get an error:

# Create a tuple with names
names = ("James", "John", "Elizabeth")
# Try to append a name
names.append("Liza")
>>>
AttributeError: 'tuple' object has no attribute 'append'

So, since we can’t modify tuples, they’re useful when we want our data to be immutable; for instance, in situations where we want to avoid changing data by mistake.
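
Appending is not the only operation that fails: assigning to a position fails too. The sketch below shows the error and the usual workaround, which is to build a new tuple and rebind the name (the original tuple object is never changed):

```python
names = ("James", "John", "Elizabeth")

# Item assignment is not allowed on a tuple
try:
    names[0] = "Alexander"
except TypeError as error:
    print(f"Cannot modify the tuple: {error}")

# To "change" a tuple we build a new one and rebind the name
names = names + ("Liza",)
print(names)   # ('James', 'John', 'Elizabeth', 'Liza')
```

Note the trailing comma in ("Liza",): without it, the parentheses would just wrap a string rather than create a one-element tuple.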

A practical example could be the cart of an e-commerce site. We may want this kind of data to be immutable so that we don’t make any mistakes when manipulating it. Imagine someone bought a shirt, a pair of shoes, and a watch from our e-commerce site. We can record this data, with quantity and price, in one tuple:

# Create a cart as a tuple
cart = (
    ("Shirt", 2, 19.99),
    ("Shoes", 1, 59.99),
    ("Watch", 1, 99.99)
)

Of course, to be precise, this is a tuple of tuples.
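
Reading from a tuple of tuples works like reading from a list of lists. As an illustration, here is one possible way to compute the cart total, assuming each inner tuple is ordered as (item, quantity, unit price); the total and line_total names are only illustrative:

```python
cart = (
    ("Shirt", 2, 19.99),
    ("Shoes", 1, 59.99),
    ("Watch", 1, 99.99),
)

# Unpack each inner tuple into item, quantity, and unit price
total = 0.0
for item, quantity, price in cart:
    line_total = quantity * price
    print(f"{item}: {quantity} x {price:.2f} = {line_total:.2f}")
    total += line_total

print(f"Total: {total:.2f}")   # Total: 199.96
```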

Since tuples are immutable, they’re generally more efficient than lists in terms of performance, meaning they save our computer’s resources. But when it comes to reading their elements (indexing, slicing, or looping), we can use the very same code we’ve seen for lists, so we won’t write it again.
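
One rough way to see the difference in resources is to compare the memory used by the two containers. This sketch assumes CPython, the standard interpreter; the exact byte counts depend on the Python version and platform, so none are shown here:

```python
import sys

as_list = [1, 2, 3, 4, 5, 6, 7, 8, 9]
as_tuple = (1, 2, 3, 4, 5, 6, 7, 8, 9)

# Size in bytes of the container itself (not of the integers it refers to)
print(f"list:  {sys.getsizeof(as_list)} bytes")
print(f"tuple: {sys.getsizeof(as_tuple)} bytes")
```

On CPython the tuple should come out smaller, because a list keeps extra bookkeeping so that it can grow and shrink.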

Finally, similarly to lists, we can create a tuple with the built-in function tuple() like so:

# Create a tuple from a range
my_tuple = tuple(range(1, 10))
print(my_tuple)
>>>
(1, 2, 3, 4, 5, 6, 7, 8, 9)

Dictionaries

A dictionary is a way to store data that are coupled as keys and values. This is how we can create one:

# Create a dictionary
my_dictionary = {'key_1':'value_1', 'key_2':'value_2'}

So, we create a dictionary with curly brackets, and we store in it a few keys and values, with each key separated from its value by a colon. The key-value pairs are then separated by commas.

Now, let’s see how we can manipulate dictionaries.

Dictionary manipulation

Both the keys and the values of a dictionary can be of different types: strings, integers, or floats (values can be of any type, while keys must be hashable). So, for instance, we can create a dictionary like so:

# Create a dictionary of numbers and print it
numbers = {1:'one', 2:'two', 3:'three'}
print(numbers)
>>>
{1: 'one', 2: 'two', 3: 'three'}

But we can also create one like this:

# Create a dictionary of numbers and print it
numbers = {'one':1, 'two':2.0, 3:'three'}
print(numbers)
>>>
{'one': 1, 'two': 2.0, 3: 'three'}

Choosing the types for values and keys depends on the problem we need to solve. Anyway, considering the dictionary we’ve just seen, we can access both values and keys like so:

# Access values and keys
keys = list(numbers.keys())
values = tuple(numbers.values())
# Print values and keys
print(f"The keys are: {keys}")
print(f"The values are: {values}")
>>>
The keys are: ['one', 'two', 3]
The values are: (1, 2.0, 'three')

So, if our dictionary is called numbers, we access its keys with numbers.keys(). And with numbers.values() we access its values. Also, note that we have created a list with the keys and a tuple with the values using the notation we’ve seen before.

Of course, we can also iterate over dictionaries. For instance, suppose we want to print the values that are greater than a certain threshold:

# Create a shopping list with fruits and prices
shopping_list = {'banana':2, 'apple':1, 'orange':1.5}
# Iterate over the values
for values in shopping_list.values():
    # Values greater than threshold
    if values > 1:
        print(values)
>>>
2
1.5
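
Printing only the values loses track of which fruit each price belongs to. A possible variation, sketched below, uses items() to loop over keys and values together (the expensive list is only illustrative):

```python
shopping_list = {'banana': 2, 'apple': 1, 'orange': 1.5}

# items() yields (key, value) pairs, which we unpack in the loop
expensive = []
for fruit, price in shopping_list.items():
    if price > 1:
        print(f"{fruit} costs {price}")
        expensive.append(fruit)

# Iterating over the dictionary itself yields only its keys
for fruit in shopping_list:
    print(fruit)
```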

Like lists, dictionaries are mutable. So, if we want to add a value to a dictionary, we have to define the key and the value to add to it. We can do it like so:

# Create the dictionary
person = {'name': 'John', 'age': 30}
# Add value and key and print
person['city'] = 'New York'
print(person)
>>>
{'name': 'John', 'age': 30, 'city': 'New York'}

To modify a value of a dictionary, we need to access its key:

# Create a dictionary
person = {'name': 'John', 'age': 30}
# Change age value and print
person['age'] = 35
print(person)
>>>
{'name': 'John', 'age': 35}

To delete a key-value pair from a dictionary, we need to access its key:

# Create dictionary
person = {'name': 'John', 'age': 30}
# Delete age and print
del person['age']
print(person)
>>>
{'name': 'John'}
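
All of these operations rely on the key, so it helps to know what happens when the key is not there. This is a small sketch of the built-in tools for that case: get() with a default, the in operator, and pop().

```python
person = {'name': 'John', 'age': 30}

# Accessing or deleting a missing key raises KeyError
try:
    del person['city']
except KeyError as error:
    print(f"Missing key: {error}")

# get() returns a default instead of raising
city = person.get('city', 'unknown')
print(city)   # unknown

# "in" checks whether a key exists
print('age' in person)   # True

# pop() removes a key and returns its value
age = person.pop('age')
print(age)      # 30
print(person)   # {'name': 'John'}
```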

Nested dictionaries

We have seen before that we can create lists of lists and tuples of tuples. Similarly, we can create nested dictionaries. Suppose, for instance, we want to create a dictionary to store the data related to a class of students. We can do it like so:

# Create a classroom dictionary
classroom = {
    'student_1': {
        'name': 'Alice',
        'age': 15,
        'grades': [90, 85, 92]
    },
    'student_2': {
        'name': 'Bob',
        'age': 16,
        'grades': [80, 75, 88]
    },
    'student_3': {
        'name': 'Charlie',
        'age': 14,
        'grades': [95, 92, 98]
    }
}

So, the data of each student are represented as a dictionary, and all the dictionaries are stored in a single dictionary representing the classroom. As we can see, the values of a dictionary can even be lists (or tuples, if we want). In this case, we’ve used lists to store the grades of each student.

To print the values of one student, we just have to remember that, from the perspective of the classroom dictionary, we need to access the key and, in this case, the keys are the students themselves. This means we can do it like so:

# Access student_3 and print
student_3 = classroom['student_3']
print(student_3)
>>>
{'name': 'Charlie', 'age': 14, 'grades': [95, 92, 98]}
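
To go deeper than one level, we can chain keys (and list indexes) one after another. The sketch below uses a shortened version of the classroom dictionary and also loops over the inner dictionaries to compute an average grade; the averages dictionary is only illustrative:

```python
classroom = {
    'student_1': {'name': 'Alice', 'age': 15, 'grades': [90, 85, 92]},
    'student_2': {'name': 'Bob', 'age': 16, 'grades': [80, 75, 88]},
}

# Chain keys (and indexes) to reach a nested value
print(classroom['student_1']['name'])        # Alice
print(classroom['student_2']['grades'][0])   # 80

# Loop over the inner dictionaries to compute each average grade
averages = {}
for student in classroom.values():
    average = sum(student['grades']) / len(student['grades'])
    averages[student['name']] = average
    print(f"{student['name']}: {average:.2f}")
```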

Dictionary comprehension

Dictionary comprehension allows us to create dictionaries concisely and efficiently. It’s similar to list comprehension but, instead of creating a list, it creates a dictionary.

Suppose we have a dictionary where we have stored some objects and their prices. We want to know the objects that cost no more than a certain threshold. We can do it like so:

# Define initial dictionary
products = {'shoes': 100, 'watch': 50, 'smartphone': 250, 'tablet': 120}
# Define threshold
max_price = 150
# Filter for threshold
products_to_buy = {product: price for product, price in products.items() if price <= max_price}
# Print filtered dictionary
print(products_to_buy)
>>>
{'shoes': 100, 'watch': 50, 'tablet': 120}

So, the syntax for dictionary comprehension is:

new_dict = {key:value for key, value in iterable}

Where iterable is any iterable Python object that yields key-value pairs. It can be a list of pairs, a tuple of pairs, the items() of another dictionary, and so on.

Creating dictionaries with the “standard” method would require a lot of code, with conditions, loops, and statements. Instead, as we can see, dictionary comprehension allows us to create a dictionary, based on conditions, with just one line of code.
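
For comparison, here is a sketch of what the “standard” version of the price filter could look like next to the comprehension. The names affordable_loop and affordable_comprehension are only illustrative:

```python
products = {'shoes': 100, 'watch': 50, 'smartphone': 250, 'tablet': 120}
max_price = 150

# "Standard" version: empty dictionary, loop, condition, assignment
affordable_loop = {}
for product, price in products.items():
    if price <= max_price:
        affordable_loop[product] = price

# Equivalent dictionary comprehension
affordable_comprehension = {
    product: price for product, price in products.items() if price <= max_price
}

print(affordable_loop == affordable_comprehension)   # True
```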

Dictionary comprehension is especially useful when we need to create a dictionary retrieving data from other sources or data structures. For instance, say we need to create a dictionary retrieving values from two lists. We can do it like so:

# Define names and cities in lists
names = ['John', 'Jane', 'Bob', 'Alice']
cities = ['New York', 'Boston', 'London', 'Rome']
# Create dictionary from lists and print results
name_city_dict = {name: city for name, city in zip(names, cities)}
print(name_city_dict)
>>>
{'John': 'New York', 'Jane': 'Boston', 'Bob': 'London', 'Alice': 'Rome'}
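
When no filtering or transformation is needed, as in this case, the built-in dict() can consume the pairs produced by zip() directly. A brief sketch (the name_to_city and short names are only illustrative), which also shows that zip() stops at the shortest input:

```python
names = ['John', 'Jane', 'Bob', 'Alice']
cities = ['New York', 'Boston', 'London', 'Rome']

# dict() accepts an iterable of (key, value) pairs, which is what zip() yields
name_to_city = dict(zip(names, cities))
print(name_to_city)

# zip() stops at the shortest input, so extra items are silently dropped
short = dict(zip(names, ['New York', 'Boston']))
print(short)   # {'John': 'New York', 'Jane': 'Boston'}
```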

Data frames

A data frame is the representation of tabular data. Image from the pandas website: https://pandas.pydata.org/docs/getting_started/index.html

A data frame is a two-dimensional data structure consisting of columns and rows. So, it’s somewhat similar to a spreadsheet or a table in an SQL database. Data frames have the following characteristics:

Each row represents an individual observation or record.
Each column represents a variable or a specific attribute of the data.
They have labeled rows (called indexes) and columns, making it easy to manipulate the data.
The columns can contain different kinds of data, like integers, strings, or floats. Even a single column can contain different data types.

While data frames are the typical data structure used in the context of Data Analysis and Data Science, it isn’t unusual that a Python Software Engineer may need to manipulate a data frame, and this is why we’re having an overview of data frames.

Here’s how a data frame appears:

So, on the left (in the blue rectangle) we can see the indexes, meaning the row counts. We can then see that a data frame can contain different kinds of data. In particular, the column “Age” contains different data types (one string and two integers).

Basic data frame manipulation with Pandas

While recently a new library to manipulate data frames called “Polars” started circulating, here we’ll see some data manipulation with Pandas, which is still the most used as of today.

First of all, generally, we can create data frames by importing data from .xlsx or .csv files. In Pandas we can do it like so:

import pandas as pd

# Import csv file
my_dataframe = pd.read_csv('a_file.csv')
# Import xlsx
my_dataframe_2 = pd.read_excel('a_file_2.xlsx')

If we want to create a data frame ourselves, we can start from a dictionary:

import pandas as pd

# Create a dictionary with different kinds of data
data = {
    'Name': ['John', 'Alice', 'Bob'],
    'Age': ['twenty-five', 30, 27],
    'City': ['New York', 'London', 'Sydney'],
    'Salary': [50000, 60000.50, 45000.75],
    'Is_Employed': [True, True, False]
}
# Create the data frame
df = pd.DataFrame(data)

This is the data frame we’ve shown above. So, as we can see, we first create a dictionary, and then we convert it to a data frame with pd.DataFrame().
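
To check how Pandas interpreted each column, we can inspect the shape, the column types, and the labels. This is a minimal sketch that assumes Pandas is installed; the mixed “Age” column should be stored with the generic object type, while the exact type reported for pure text columns may vary with the Pandas version:

```python
import pandas as pd

data = {
    'Name': ['John', 'Alice', 'Bob'],
    'Age': ['twenty-five', 30, 27],
    'City': ['New York', 'London', 'Sydney'],
    'Salary': [50000, 60000.50, 45000.75],
    'Is_Employed': [True, True, False]
}
df = pd.DataFrame(data)

# Number of rows and columns
print(df.shape)    # (3, 5)

# One data type per column; the mixed "Age" column falls back to "object"
print(df.dtypes)

# Column labels and row labels (the index)
print(list(df.columns))
print(list(df.index))   # [0, 1, 2]
```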

We have three possibilities to display a data frame. Suppose we have a data frame called df:

The first one is print(df).
The second is df.head(), which shows the first 5 rows of our data frame. If we have a data frame with a lot of rows, we can show more than the first five. For instance, df.head(20) shows the first 20.
The third one is df.tail(), which works exactly like head(), but shows the last rows.

On the display side, using the above df, this is what df.head() shows:

And this is what print(df) shows:

In the case of small data sets like this one, the difference is only a matter of taste (I prefer head() because it “shows the tabularity” of data). But in the case of large data sets, head() is much better. Try it, and let me know!
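
To make head() and tail() concrete, here is a small sketch on a made-up data frame with 10 rows (the column names n and square are invented for the example, and Pandas is assumed to be installed):

```python
import pandas as pd

# A small, made-up data frame with 10 rows
df = pd.DataFrame({
    'n': list(range(10)),
    'square': [n ** 2 for n in range(10)]
})

first_rows = df.head()     # first 5 rows by default
first_two = df.head(2)     # first 2 rows
last_three = df.tail(3)    # last 3 rows

print(len(first_rows))            # 5
print(list(first_two['n']))       # [0, 1]
print(list(last_three['n']))      # [7, 8, 9]
```

Both methods return a new data frame, so in a plain script we still need print() to see the rows; notebooks render the result automatically when it is the last expression of a cell.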

Consider that Pandas is a very wide library, meaning it allows us to manipulate tabular data in a variety of ways, so it would need to be treated on its own. Here we want to show just the very basics, so we’ll see how we can add and delete a column (the columns of a data frame are also called “Pandas series”).

Suppose we want to add a column to the data frame df we’ve seen above that tells us if people are married or not. We can do it like so:

# Add marital status
df["married"] = ["yes", "yes", "no"]

Note: this is the same notation we used to add values to a dictionary.
Go back in the article and compare the two methods.

And showing the head we have:

The data frame df with the marital status. Image by Author.

To delete one column:

# Delete the "Is_Employed" column
df = df.drop('Is_Employed', axis=1)

And we get:

The data frame without the column related to employment data. Image by Author.

Note that we need to use axis=1 because here we’re telling Pandas to remove a column: since a data frame is a two-dimensional data structure, axis=1 refers to the columns.

Instead, if we want to drop a row, we need to use axis=0. For instance, suppose we want to delete the row with index 1 (that’s the second row because, again, we start counting from 0):

# Delete the second row 
df = df.drop(1, axis=0)

And we get:

The data frame without the second row. Image by Author.
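
Two details about drop() are easy to miss: it returns a new data frame instead of changing the original, and it also accepts the columns= and index= keywords as an alternative to axis. The sketch below uses a reduced version of df and assumes Pandas is installed; the names without_column, same_result, and without_row are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Alice', 'Bob'],
    'City': ['New York', 'London', 'Sydney'],
    'Is_Employed': [True, True, False]
})

# drop() returns a new data frame; df itself is unchanged until we reassign
without_column = df.drop('Is_Employed', axis=1)
print(list(df.columns))               # ['Name', 'City', 'Is_Employed']
print(list(without_column.columns))   # ['Name', 'City']

# The columns= and index= keywords express the same thing without axis
same_result = df.drop(columns='Is_Employed')
without_row = df.drop(index=1)
print(list(without_row.index))        # [0, 2]
```

Note that the remaining rows keep their original labels, so after dropping row 1 the index jumps from 0 to 2.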

Conclusions

So far, we’ve seen the most used data structures in Python. These are not the only ones, but surely the most used.

Also, there is no right or wrong in using one rather than another: we just need to understand what data we need to store and use the best data structure for the task.

I hope this article helped you understand the usage of these data structures and when to use them.