Data scientists spend a large share of their time cleaning datasets so that they’re easier to work with. In fact, the often-cited 80/20 rule holds that the initial steps of obtaining and cleaning data account for roughly 80% of the time spent on any given project.
So, if you’re just stepping into this field or planning to, it’s important to be able to deal with messy data, whether that means missing values, inconsistent formatting, malformed records, or nonsensical outliers.
In this video course, you’ll leverage Python’s pandas and NumPy libraries to clean data.
Along the way, you’ll learn about:
DataFrame objects
.str methods to clean columns

To get the most out of this course, you should have a basic understanding of the pandas and NumPy libraries, including pandas’ workhorse Series and DataFrame objects, common methods that can be applied to these objects, and NumPy’s NaN values.
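
As a rough illustration of the kind of cleaning described here, the following minimal sketch uses the .str accessor on a Series to tidy text columns and leaves unparseable entries as NaN. The column names and values are invented for this example and are not taken from the course materials.

```python
import numpy as np
import pandas as pd

# Hypothetical messy data, made up purely for illustration.
df = pd.DataFrame(
    {
        "city": ["  London ", "new york", "PARIS", np.nan],
        "year": ["1879 [1878]", "c. 1901", "1850", "unknown"],
    }
)

# Normalize stray whitespace and inconsistent capitalization.
# Missing values pass through the .str methods unchanged.
df["city"] = df["city"].str.strip().str.title()

# Keep the first four-digit run in each string; rows without one become NaN.
# pd.to_numeric then converts the extracted strings to a float column.
df["year"] = pd.to_numeric(df["year"].str.extract(r"(\d{4})", expand=False))

print(df)
print("Missing years:", df["year"].isna().sum())
```

Extracting with a regular expression rather than slicing by position is one possible choice here; it tolerates prefixes such as "c." at the cost of silently turning anything without four consecutive digits into NaN.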