Data gathered for data science is often in an unstructured or raw format, so it must be filtered for content and validity before it can be used. In this course, you'll explore examples of practical tools and techniques for data filtering.
Introduction to Data Filtering
start the course
identify common filtering techniques and tools
extract date elements from common date formats
parse content types in HTTP headers
use csvcut to filter CSV data
use sed to replace values in a text data stream
drop duplicate records from data
extract headers from a JPEG image
use pdfgrep to extract data from searchable PDF files
detect invalid or impossible data combinations
parse robots.txt from a website to decide what should and shouldn't be crawled or indexed
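As a rough illustration of the last objective above, the following is a minimal sketch that uses Python's standard-library `urllib.robotparser` to decide whether a URL may be crawled. The robots.txt content, the `ExampleBot` user agent, and the `example.com` URLs are made up for this example and are not taken from the course.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs offline.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def build_parser(robots_text):
    """Build a RobotFileParser from robots.txt text instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "ExampleBot" and the URLs are placeholders for illustration only.
allowed = parser.can_fetch("ExampleBot", "https://example.com/index.html")
blocked = parser.can_fetch("ExampleBot", "https://example.com/private/report.html")

print("index.html crawlable:", allowed)
print("private/report.html crawlable:", blocked)
```

In a real crawler, the parser would typically be pointed at a live robots.txt with `set_url()` and `read()` rather than an inline string.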
Practice: Filtering Dates
drop records from a CSV file based on date range
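One possible approach to this practice exercise, sketched with Python's standard `csv` and `datetime` modules. The sample data, the `order_date` column name, the `filter_by_date_range` helper, and the choice to keep rows inside the range (dropping everything else, including rows with unparseable dates) are all assumptions made for illustration; the course may frame the exercise differently.

```python
import csv
import io
from datetime import date, datetime

# Hypothetical sample data; a real exercise would read from a file on disk.
SAMPLE_CSV = """\
id,order_date,amount
1,2023-01-15,20.00
2,2023-03-02,35.50
3,2023-07-21,12.25
4,not-a-date,9.99
"""


def filter_by_date_range(csv_text, column, start, end, fmt="%Y-%m-%d"):
    """Return rows whose date in `column` falls within [start, end].

    Rows outside the range, and rows whose date cannot be parsed
    with `fmt`, are dropped.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = []
    for row in reader:
        try:
            value = datetime.strptime(row[column], fmt).date()
        except ValueError:
            continue  # invalid date: treat the record as unusable
        if start <= value <= end:
            kept.append(row)
    return kept


rows = filter_by_date_range(
    SAMPLE_CSV, "order_date", date(2023, 1, 1), date(2023, 6, 30)
)
for row in rows:
    print(row["id"], row["order_date"])
```

With the sample data above, only records 1 and 2 fall in the first half of 2023; record 3 is out of range and record 4 fails date validation.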
The materials within this course focus on the Knowledge, Skills, and Abilities (KSAs) identified within the Specialty Areas of the National Cybersecurity Workforce Framework listed below.