Python Data Analysis

Chapter 4. pandas Primer

pandas takes its name from panel data (an econometric term) and Python data analysis, and it is a popular open source Python project. This chapter is a tutorial on basic pandas functionality, in which we will learn about pandas data structures and operations.

Note

The official pandas documentation insists on writing the project name as pandas, in all lowercase letters. The other convention it insists on is this import statement: import pandas as pd. We will follow these conventions as much as possible.

In this chapter, we will install and explore pandas. Then, we will acquaint ourselves with the two central pandas data structures: DataFrame and Series. After this, we will learn how to perform SQL-like operations on the data contained in these data structures. pandas also has statistical utilities, including time-series routines, some of which will be demonstrated. The topics we will cover are as follows:

  • Installing and exploring pandas
  • DataFrame and Series data structures
  • Querying data in pandas
  • Statistics with pandas DataFrames
  • Data aggregation with pandas DataFrames
  • Concatenating, joining, and appending DataFrames
  • Handling missing values
  • Dealing with dates
  • Pivot tables
  • Remote data access
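
As a small preview of the two data structures and the SQL-like operations mentioned above, here is a minimal sketch. The column names and values are hypothetical, invented only for illustration; the SQL statements in the comments are rough analogies rather than exact equivalents.

```python
import pandas as pd  # the conventional import alias

# Hypothetical sample data, made up for this illustration.
df = pd.DataFrame({
    "city": ["Oslo", "Lima", "Pune", "Lima"],
    "sales": [10, 20, 30, 40],
})

# Selecting a single column of a DataFrame yields a Series.
sales = df["sales"]

# Row filter, roughly: SELECT * FROM df WHERE sales > 15
high = df[df["sales"] > 15]

# Aggregation, roughly: SELECT city, SUM(sales) FROM df GROUP BY city
totals = df.groupby("city")["sales"].sum()

print(type(sales).__name__)
print(high)
print(totals)
```

Each of these operations is covered in more detail in the sections of this chapter.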