Important Methods in DataFrame

Methods

  1. value_counts()

    • Works for both series and data frames

    • Gives the frequency of every unique item in a series

    • Most useful when applied to a series

    • When applied to an entire data frame, it counts the frequency of each unique row
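
The behaviour on a series versus a whole data frame can be sketched as follows. The sample data is made up for illustration and is not from the original notes.

```python
import pandas as pd

# Hypothetical sample data, made up for illustration
courses = pd.Series(["Spark", "Pandas", "Spark", "Python", "Spark", "Pandas"])

# Frequency of every unique item in the series, most frequent first
course_counts = courses.value_counts()
print(course_counts)

df = pd.DataFrame({
    "Courses": ["Spark", "Pandas", "Spark", "Spark"],
    "Discount": [1000, 2000, 1000, 1500],
})

# On a whole data frame, each unique row (combination of column values) is counted
row_counts = df.value_counts()
print(row_counts)
```
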
  1. sort_values()

    • Applicable to both series and data frames

    • series.sort_values() sorts in ascending order by default

    • When there are missing values, sort_values() places them last by default

    • We can change this behavior with the na_position parameter, which accepts 'first' or 'last'

    • By default, sorting returns a new sorted object and the original is left unchanged

    • To modify the original instead, pass inplace=True

        df.sort_values(['Courses', 'Discount'],
                      ascending = [True, True])
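
A minimal sketch of how missing values and in-place sorting behave, assuming a small made-up frame with one missing discount:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing discount
df = pd.DataFrame({
    "Courses": ["Spark", "Pandas", "Python", "Hadoop"],
    "Discount": [1000, np.nan, 1200, 2500],
})

# Missing values go last by default
default_sorted = df.sort_values("Discount")

# na_position="first" moves them to the top instead
na_first = df.sort_values("Discount", na_position="first")

# The original frame keeps its order until inplace=True is passed
original_order = df["Courses"].tolist()
df.sort_values("Discount", ascending=False, inplace=True)
print(df)
```

Note that with inplace=True the call returns None, so its result should not be assigned back to a variable.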
      
  2. sort_index()

    • Sorts based on the index labels rather than the values

    • Applicable to both series and data frames
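
As a rough illustration, sorting by index on a made-up series and data frame might look like this:

```python
import pandas as pd

# Hypothetical series with an unordered string index
marks = pd.Series([67, 89, 45], index=["maths", "english", "science"])

by_label = marks.sort_index()                      # ascending index order
by_label_desc = marks.sort_index(ascending=False)  # descending index order

# Hypothetical data frame with an unordered integer index
df = pd.DataFrame({"score": [10, 30, 20]}, index=[2, 0, 1])
df_sorted = df.sort_index()
print(df_sorted)
```
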

  3. rank()

    • Applicable to both series and data frames

    • By default, the lowest value gets rank 1 and higher values get higher ranks; pass ascending=False to reverse this
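
A small sketch of ranking a series, using made-up run totals. Tied values share the average of their ranks under the default settings.

```python
import pandas as pd

# Hypothetical run totals; B and D are tied
runs = pd.Series([450, 620, 300, 620], index=["A", "B", "C", "D"])

# Lowest value gets rank 1; ties share the average of their positions
ascending_rank = runs.rank()

# ascending=False gives rank 1 to the highest value instead
descending_rank = runs.rank(ascending=False)
print(ascending_rank)
print(descending_rank)
```
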

  4. set_index()

    • We can change the default index to any column using this

    • Applicable to data frames only

  5. sort_index()

    • Applicable on both series and data frames

    • Sorts based on an index

  6. reset_index()

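    • Applicable to both series and data frames

    • When applied to a series, it converts the series into a data frame

    • On a data frame, it moves the current index into a regular column and restores the default integer index

    • To replace the existing index without losing it, reset the index first and then set the new one: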

        batsman.reset_index().set_index('batting_rank')
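
The line above can be tried end to end with a stand-in for `batsman`. The frame below is hypothetical, with made-up names and numbers; only the `batting_rank` column name comes from the notes.

```python
import pandas as pd

# Hypothetical stand-in for the `batsman` frame; values are made up
batsman = pd.DataFrame({
    "batter": ["batter_a", "batter_b", "batter_c"],
    "batting_rank": [1, 2, 3],
    "runs": [500, 450, 400],
}).set_index("batter")

# reset_index() first moves 'batter' back into a column, so it is not lost
by_rank = batsman.reset_index().set_index("batting_rank")
print(by_rank)

# On a series, reset_index() returns a data frame
runs_frame = batsman["runs"].reset_index()
print(type(runs_frame))
```

Calling set_index('batting_rank') directly on `batsman` would discard the old 'batter' index, which is why the reset comes first.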
      
  7. rename()

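    • Mostly used on data frames to rename columns or index labels (a series also has rename(), which relabels its index)

    • Pass a dictionary whose keys are the existing names and whose values are the new names, using the columns or index parameter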
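
A minimal sketch of the dictionary mapping, assuming a made-up two-column frame. The new names are arbitrary examples.

```python
import pandas as pd

# Hypothetical frame for illustration
df = pd.DataFrame({"Courses": ["Spark", "Pandas"], "Discount": [1000, 2000]})

# Keys are the existing column names, values are the new names
renamed = df.rename(columns={"Courses": "course_name", "Discount": "discount"})

# The same mapping style works for index labels
relabelled = df.rename(index={0: "first", 1: "second"})

print(renamed.columns.tolist())
print(relabelled.index.tolist())
```

rename() returns a new object here, so `df` itself keeps its original column names.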

  8. unique()

    • gives the unique values in the series

    • applicable only on series

    • includes missing values (NaN) in the result

  9. nunique()

    • returns the number of unique values

    • does not count missing values by default
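
The difference between listing unique values and counting them can be sketched with Series.unique() and Series.nunique() on a made-up series containing NaN:

```python
import numpy as np
import pandas as pd

# Hypothetical series with repeated and missing values
s = pd.Series([1, 2, 2, np.nan, 3, np.nan])

# unique() returns the distinct values, with NaN included once
unique_values = s.unique()

# nunique() counts distinct values and skips NaN by default
n_unique = s.nunique()

# dropna=False makes nunique() count NaN as a value too
n_unique_with_nan = s.nunique(dropna=False)

print(unique_values, n_unique, n_unique_with_nan)
```
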

  10. isnull()

    • applicable on series and data frame

    • checks whether each value is missing; the result has True in place of missing values and False everywhere else

  11. notnull()

    • exact opposite of isnull()

  12. hasnans

    • returns True if the series contains any missing values, and False otherwise (it is a property, so it is used without parentheses)

    • applicable only on series
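
A short sketch of the three missing-value checks side by side, assuming made-up data. `hasnans` is accessed as a Series property rather than called as a method.

```python
import numpy as np
import pandas as pd

# Hypothetical series with one missing value
s = pd.Series([10, np.nan, 30])

missing_mask = s.isnull()    # True where the value is missing
present_mask = s.notnull()   # exact opposite of isnull()
has_missing = s.hasnans      # single True/False for the whole series

# isnull() also works element-wise on a data frame
df = pd.DataFrame({"a": [1, np.nan], "b": ["x", None]})
df_mask = df.isnull()

print(missing_mask.tolist(), present_mask.tolist(), has_missing)
print(df_mask)
```
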

  13. dropna

    • by default, this removes every row that has a null value in any column

    • when we pass a subset, only the listed columns are checked: a row is removed if any of the columns in the subset is null

    • applicable to both series and data frames

    • how parameter -> how='any' (the default) works like OR: drop the row if any value is null; how='all' works like AND: drop the row only if all values are null
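
The effect of `how` and `subset` can be compared on a small made-up frame. This is only a sketch, and the column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: row 0 is complete, row 2 is entirely null
df = pd.DataFrame({
    "name": ["A", "B", None, "D"],
    "score": [90, np.nan, np.nan, 70],
    "city": ["Pune", "Delhi", None, None],
})

# Default how="any": drop a row if any column is null
drop_any = df.dropna()

# how="all": drop a row only if every column is null
drop_all = df.dropna(how="all")

# subset: only 'name' and 'score' are checked for nulls
drop_subset = df.dropna(subset=["name", "score"])

print(drop_any.index.tolist())
print(drop_all.index.tolist())
print(drop_subset.index.tolist())
```
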

  14. fillna

    • Fill NA/NaN values using the specified method.

    • applicable to both series and data frames
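
A few common ways to fill missing values, sketched on made-up data. Forward filling is shown with ffill() rather than the `method` argument of fillna(), since recent pandas versions deprecate that argument.

```python
import numpy as np
import pandas as pd

# Hypothetical series with two missing values
s = pd.Series([1.0, np.nan, 3.0, np.nan])

filled_constant = s.fillna(0)        # replace NaN with a fixed value
filled_mean = s.fillna(s.mean())     # replace NaN with the mean of the rest
filled_forward = s.ffill()           # carry the previous value forward

# On a data frame, a dictionary gives a different fill value per column
df = pd.DataFrame({"a": [1.0, np.nan], "b": [np.nan, "y"]})
filled_per_column = df.fillna({"a": 0, "b": "unknown"})

print(filled_constant.tolist(), filled_mean.tolist(), filled_forward.tolist())
print(filled_per_column)
```
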

  15. apply

    • Applies a function to each element of a series, or to each row or column of a data frame (chosen with the axis parameter)
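
apply() calls a function on each element of a series, or on each row or column of a data frame. A minimal sketch, assuming a made-up frame with hypothetical `Fee` and `Discount` columns:

```python
import pandas as pd

# Hypothetical frame for illustration
df = pd.DataFrame({"Fee": [20000, 25000], "Discount": [1000, 2500]})

# Series.apply: the function receives one element at a time
fee_in_thousands = df["Fee"].apply(lambda fee: fee / 1000)

# DataFrame.apply with axis=1: the function receives each row as a series
net_fee = df.apply(lambda row: row["Fee"] - row["Discount"], axis=1)

# DataFrame.apply with the default axis=0: the function receives each column
column_max = df.apply(lambda col: col.max())

print(fee_in_thousands.tolist(), net_fee.tolist())
print(column_max)
```
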