How to randomly shuffle dataframe rows with pandas?

Published: September 18, 2020

Tags: Python; Pandas; DataFrame;

Examples of how to randomly shuffle dataframe rows with pandas

Create a dataframe with pandas

Let's first create a dataframe with pandas:

import numpy as np
import pandas as pd

data = np.arange(20)

df = pd.DataFrame(data=data, columns=['Column A'])

print(df)

returns

    Column A
0          0
1          1
2          2
3          3
4          4
5          5
6          6
7          7
8          8
9          9
10        10
11        11
12        12
13        13
14        14
15        15
16        16
17        17
18        18
19        19

Randomly shuffle dataframe rows

One way to randomly shuffle the rows of a dataframe is to call pandas.DataFrame.sample with frac=1, which returns all rows in random order:

df = df.sample(frac=1)

print(df)

returns

    Column A
11        11
16        16
1          1
4          4
6          6
3          3
19        19
12        12
10        10
0          0
18        18
8          8
2          2
5          5
13        13
7          7
9          9
14        14
17        17
15        15
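
The shuffled output above keeps the original index labels and changes on every run. The following is a minimal sketch of how both points might be handled, assuming the same 20-row dataframe as above. The seed value 42 and the names shuffled and shuffled_reset are arbitrary choices for illustration, not part of the original example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(data=np.arange(20), columns=['Column A'])

# random_state fixes the seed, so the same permutation is produced on every run
# (42 is an arbitrary illustrative value)
shuffled = df.sample(frac=1, random_state=42)

# sample returns a new dataframe; df itself is left in its original order.
# reset_index(drop=True) renumbers the rows from 0 and discards the old labels
shuffled_reset = shuffled.reset_index(drop=True)

print(shuffled_reset)
```

Dropping the old index is a design choice: keep it if you need to trace each row back to its original position, and reset it if the shuffled dataframe should look like a fresh one.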

Note: to draw a random sample instead, decrease the fraction. For example, frac=0.5 randomly selects half of the rows:

df = df.sample(frac=0.5)

print(df)

returns

    Column A
14        14
8          8
7          7
11        11
19        19
9          9
17        17
18        18
0          0
1          1
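
When an exact number of rows is wanted rather than a fraction, sample also accepts the n parameter (n and frac cannot be passed together). A short sketch under the same assumptions as above. The name subset, the row count 5 and the seed 0 are illustrative choices.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(data=np.arange(20), columns=['Column A'])

# Draw exactly 5 rows; sampling is without replacement by default,
# so no row can be selected twice
subset = df.sample(n=5, random_state=0)

print(subset)
```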
