A Proven Method to Remember Data Science Concepts For as Long as You Need

admin

April 23, 2023

And tools to put the method into practice in the age of AI

Image by me. Via my good pal, Midjourney

The problem with self-learning data science

Every time I need to install a library with Anaconda, the -c part of the command keeps moving around. So, like most people, I google it, sometimes 3-4 times a day:

conda install -c conda-forge library_name

Sound familiar?

This little example signals a fundamental flaw in the way most of us learn data science and machine learning today: data science knowledge is cheaper than air, so we don’t take it as seriously as we should.

We see university students racking their brains to remember huge amounts of information to pass exams and tests. If they don’t do well, they get kicked out of the institution they paid so much for.

As self-taught data scientists, we have none of that pressure. All we have is our self-discipline, which keeps persuading us we’re doing a wonderful job as we watch a YouTube course on our couch.

Our learning processes are haphazard. We learn something new and jump to the next shiny thing.

We leave information retention up to chance.

When we actually sit down to practice what we “learned” (air quotes), we realize we have already forgotten 80% of the new knowledge in the time it took to turn on our computers.

So, we start googling. And once this behavior becomes the norm, we brag in our little tweets about how exceptional we are at googling. What we’re actually doing is subtly signaling to others that we have no reliable system whatsoever to learn and retain the overwhelming amount of knowledge in data science.

Through no fault of our own, we became .

The answer

Without effective methods and tools to learn and retain new knowledge, it is hard to become a data scientist.

There is just so much to learn: math, statistics, machine learning theory, the functions and methods in dozens of Python libraries, and so on. It is hard to keep track of all this information.

The Ebbinghaus forgetting curve above shows the rate at which new information leaks from memory.

It is evident from the graph that it can take only six days to lose new information completely. And when the information was learned in our haphazard and careless ways, it can take even less.

But if you make a serious effort to put new knowledge into a reliable repetition system, you consciously choose to remember it for the rest of your life, or for as long as you need it.

Could I possibly be talking about rote learning (🤒)? No, of course not. I’m talking about spaced repetition!

Spaced repetition is a powerful memory technique that takes full advantage of the Ebbinghaus forgetting curve:

Spaced repetition re-exposes you to new information at increasingly longer intervals, with each review coming just when the memory is about to fade.

Each review resets your memory and lengthens the interval before you have to review the material again.
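
To make the idea concrete, here is a minimal sketch of a forgetting curve. It assumes the commonly used exponential simplification, retention = exp(-t / stability), which is an assumption of this example rather than something taken from the graph; the function name `retention` and the stability values are hypothetical and chosen only for illustration.

```python
import math


def retention(days_elapsed: float, stability: float) -> float:
    """Estimated fraction of the material still recalled after `days_elapsed` days.

    Assumes a simple exponential decay; a larger `stability` means slower forgetting.
    """
    return math.exp(-days_elapsed / stability)


# Hypothetical stability values: the assumption is that every successful
# review makes the memory decay more slowly than before.
for stability in (1.0, 3.0, 9.0):
    row = ", ".join(
        f"day {day}: {retention(day, stability):.0%}" for day in (1, 3, 6)
    )
    print(f"stability={stability}: {row}")
```

Under this simplified model, the same six days cost far less retention once a few reviews have raised the stability, which is why the intervals between reviews can keep growing.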

What are the advantages of SR?

Perhaps the most helpful thing about spaced repetition is the way it transfers knowledge from short-term to long-term memory.

Apart from the efficient use of time and improved retention, studies show the following benefits of the system:

Personalization: Customizable to your unique preferences, because it adapts to your pace and level of mastery of the material.
Improved comprehension: By reinforcing concepts and connections repeatedly over time, it becomes easier for you to build a network of knowledge and understand complex topics more deeply.
Increased motivation: Spaced repetition gives me a terrific sense of progress and achievement as my repetition intervals get longer.

These are probably the reasons why many medical students swear by this method: they use it to memorize the names of bones, blood vessels, nerve branches, and all the exhausting details of the human body.

Data science is not as complicated, but we still have a pretty large number of things to remember.

Spaced repetition algorithms

There are many algorithms that implement spaced repetition in practice, the most popular of which is SuperMemo.

SuperMemo is a series of SR algorithms that has been steadily coming out since 1982. The creator, Dr. Piotr Wozniak, was recognized by Wired magazine as the “inventor of a way to turn people into geniuses” in 2008.

So, how do you turn into a genius with this method?

After sufficiently learning the underlying concepts and facts, you first break the material down into chunks using flashcards (yes, I know this is a big problem, but we will get to that).

After creating a database of cards, you begin to review them in sessions. The first session shows the cards in the order they were added or shuffled (based on your preferences). Then, you rate the cards on how well you recall them.

In SuperMemo-2, there are six options:

0: I have no clue whatsoever
1: Incorrect, but after seeing the answer, it rings a bell
2: Incorrect, but after seeing the answer, it came rushing back to me
3: Correct response, but I had to dig deep and make an effort to remember
4: Correct response, but after some hesitation
5: I remember it as if it was minutes ago

Then, the chosen rating is plugged into long calculations that involve the number of times the card was successfully recalled before, the easiness factor of the card (don’t ask), and the inter-repetition interval. The result determines when the card must be shown again.

For cards rated below 4, SuperMemo will ask you to review the card as many times as needed during the current session until the rating reaches at least 4.
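
For the curious, the calculation can be sketched in a few lines of Python. This follows the publicly described SuperMemo-2 update rule as I understand it (first interval of 1 day, second of 6 days, then the previous interval multiplied by the easiness factor, with the easiness factor floored at 1.3); the names `Card` and `review` are hypothetical, and the same-session repeats for cards rated below 4 are not modeled here.

```python
from dataclasses import dataclass


@dataclass
class Card:
    repetitions: int = 0    # consecutive successful recalls so far
    easiness: float = 2.5   # starting easiness factor in SM-2
    interval_days: int = 0  # days to wait before the next review


def review(card: Card, quality: int) -> Card:
    """Return the card's new scheduling state after a rating from 0 to 5."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")

    if quality < 3:
        # Failed recall: restart the sequence, leave the easiness factor alone.
        return Card(repetitions=0, easiness=card.easiness, interval_days=1)

    if card.repetitions == 0:
        interval = 1
    elif card.repetitions == 1:
        interval = 6
    else:
        interval = round(card.interval_days * card.easiness)

    # Easiness goes up for a 5, stays put for a 4, and drops for a 3.
    miss = 5 - quality
    easiness = max(1.3, card.easiness + 0.1 - miss * (0.08 + miss * 0.02))
    return Card(card.repetitions + 1, easiness, interval)


# Usage: rate the same card 4 ("correct, with hesitation") five times in a row.
card = Card()
intervals = []
for _ in range(5):
    card = review(card, quality=4)
    intervals.append(card.interval_days)
print(intervals)
```

Real tools differ in the details (rounding, when the easiness factor is updated, how lapses are handled), so treat this as an illustration of the idea rather than the exact scheduler of any particular app.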

Each correctly recalled card will be shown after increasingly long intervals. For example, if you memorize that the function to convert a timestamp into a datetime is datetime.datetime.fromtimestamp, you only have to review the card showing this information 4–5 times over the span of a month to remember it for the coming six months.
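
As a reminder of what that card is about, here is a short example of the function in use. The timestamp value is arbitrary and chosen for illustration; passing `tz=timezone.utc` is my addition so that the result does not depend on the machine's local time zone.

```python
from datetime import datetime, timezone

ts = 1682208000  # a Unix timestamp, in seconds since 1970-01-01 UTC

# Timezone-aware conversion: the result is the same on every machine.
utc_dt = datetime.fromtimestamp(ts, tz=timezone.utc)

# Without tz, the result is a naive datetime in the machine's local time.
local_dt = datetime.fromtimestamp(ts)

print(utc_dt.isoformat())
print(local_dt.isoformat())
```

A good flashcard for this fact would put "convert a Unix timestamp into a datetime" on the front and the call on the back.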

As you can imagine, this is a much better repetition system than rote learning, fixed-interval repetition, or, worst of all, repetition when the mood strikes you.

Spaced repetition tools

There are various SR tools powered by SuperMemo-like algorithms.

The first (and this one is the king) is Anki. It’s open-source and implements a modified version of SuperMemo-2. Instead of providing six recall ratings, it shows four:

Anki being used to memorize Russian vocabulary. Image from Wikimedia Commons.

Because it is open-source, it has a rather dated look, but it is a cross-platform, free application (apart from the iOS version). The GitHub repo of the software has over 13k stars, which suggests massive support from the community.

Anki has been in development for over ten years, and the current version has the following features:

Available everywhere: Windows, macOS, Linux, Android, and iOS (this one costs money)
Fully customizable: create your own flashcards, organize them into decks, and set your own parameters for the spaced repetition algorithm
Sync across devices: the desktop version of Anki is the main app, and the mobile and web versions are companions that stay in sync with it.
Multimedia support: Add images, audio, video, text formatting, and LaTeX to make flashcards memorable and engaging. There is also support for image occlusion to memorize visual information.
Add-ons: much like Python extensions, you can create and add your own functionality to the software, like custom keyboard shortcuts, themes, and advanced statistics.
Pre-built decks: the community continuously shares decks with pre-made cards for popular topics. This includes hundreds of thousands of cards on language learning, virtually any subject in university exams, and many other great/cool/weird topics.

One obvious pain point is creating flashcards that are not already available from the community.

I know that data science is a relatively young field when it comes to spaced repetition. Anyone would have an enormous amount of information to convert into flashcards, which sounds tedious and sickening. But it is a necessary evil.

I firmly believe that the total time it takes to create flashcards for a topic and review them with spaced repetition will be much lower than the hours of googling or the dozens of cycles of forgetting and relearning.
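
If typing cards into the app one by one feels slow, one option is to keep them in a script and export a text file. The sketch below assumes you will import the result through Anki's text-file import, which accepts one note per line with tab-separated fields; the function name `cards_to_tsv`, the file name, and the two example cards are hypothetical.

```python
import csv
import io


def cards_to_tsv(cards):
    """Turn (front, back) pairs into tab-separated text, one card per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for front, back in cards:
        writer.writerow([front, back])
    return buffer.getvalue()


# Two example cards built from commands mentioned earlier in this article.
cards = [
    ("Install a package from the conda-forge channel",
     "conda install -c conda-forge library_name"),
    ("Convert a Unix timestamp into a datetime",
     "datetime.datetime.fromtimestamp(ts)"),
]

deck_text = cards_to_tsv(cards)
print(deck_text)

# To save the deck for importing, you could write it to a file, for example:
# open("data_science_cards.txt", "w", encoding="utf-8").write(deck_text)
```

Keeping cards in a plain script also means you can put them under version control and regenerate the file whenever you add new ones.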

Besides, we’re lucky to be living in the golden age of AI (we are, aren’t we?). There are already cheap AI-powered flashcard tools like Monic.ai.

I already tried Monic.ai, and it looks great. You upload a screenshot or a PDF file, and it automatically converts the text inside into flashcards in mere seconds. It’s powered by spaced repetition as well.

If you decide to give it a go, consider downloading the GoFullPage Chrome extension to take full-page screenshots, or learn how to save web pages as PDFs, so you can turn any online article, tutorial, or documentation page of a Python framework into flashcards with Monic.ai.

Wrap-up

It’s time to change our approach to learning data science. We should ditch our careless, haphazard ways of watching YouTube videos just for the sake of watching, or taking courses back-to-back in pursuit of yet another worthless e-certificate.

We should stop learning something once and hoping for the best that it stays there. We should stop the wishful thinking.

We should stop leaving memory up to chance.

Instead, we should take deliberate action to memorize every crucial fact, piece of theory, concept, terminal command, Python function, or function argument for as long as we need them.

Yes, this will take some getting used to, but once we are, we can significantly shorten the time it takes to go from “learning data science online” to “doing data science in a job that pays six figures”.

Thanks for reading!
