Pandas dataframe filter with multiple conditions

Let's take a look at a few different ways to filter and select rows in a pandas dataframe based on multiple conditions.

To start we're going to create a simple dataframe in python:

pd.dataframe created

6 functions to filter dataframe rows on multiple conditions

Let's look at six different ways to filter rows in a dataframe based on multiple conditions:

What conditions do we want to filter on?

Get all rows having hourly wage greater than or equal to 100 and age < 60 and favorite football team name starts with ‘S’.

Using Loc to Filter With Multiple Conditions

The loc function in pandas can be used to access groups of rows or columns by label.

Add each condition you want to be included in the filtered result and concatenate them with the & operator. You'll see our code sample will return a pd.dataframe of our filtered rows. Don't forget to include "import pandas as pd" at the top!

.loc on dataframe

Using The Numpy Function np.where To Filter With Multiple Conditions

The Numpy python library interacts great with dataframes, especially when dealing with indexing.

Let's use the numpy.where function with a few conditions to produce the same result we wanted above.

applying np.where to our df

Let's take the returned row index list that matches the combined conditions into loc

output from .loc

Using The Pandas Function Query With Multiple Conditions

We can use the pandas dataframe function query() and boolean expressions to get our filtered rows back. The main benefit of the query function is it uses numexpr which improves efficiency, especially in larger dataframes.

print query output

Filtering On Multiple Conditions Using Pandas Boolean Indexing

This is a good method to go with if you want to remove columns as well, as you can exclude any dataframe columns you don't want in the last statement. Boolean indexing is also very efficient as it does not make a copy of the data.

Output has all three columns

Pandas Eval Function With Multiple Conditions

We can use the pandas dataframe function eval inside a df[] tag to filter on these conditions. Note the .values() at the end.

eval func

Looking To Filter Where Only 1 Condition Has To Be True?

This numpy array and functools implementation allows you to use disjunction to filter your dataframe. Here's an example:

each column is compared to the conditions

Note: you still need "import pandas as pd"

Dataframe Comparison Tools For Multiple Condition Filtering

Post pandas .22 update, there's multiple functions you can use as well to compare column values to conditions. Make sure your dtype is the same as what you want to compare to.