Examples of how to randomly shuffle dataframe rows with pandas
Create a dataframe with pandas
Let's first create a dataframe with a single column of 20 rows:
import numpy as np
import pandas as pd
data = np.arange(20)
df = pd.DataFrame(data=data, columns=['Column A'])
print(df)
returns
    Column A
0          0
1          1
2          2
3          3
4          4
5          5
6          6
7          7
8          8
9          9
10        10
11        11
12        12
13        13
14        14
15        15
16        16
17        17
18        18
19        19
Randomly shuffle dataframe rows
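A solution to randomly shuffle dataframe rows is to use pandas.DataFrame.sample with frac=1. By default, sample draws rows at random without replacement, so frac=1 returns every row exactly once, in random order. Each row keeps its original index label: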
df = df.sample(frac=1)
print(df)
returns
    Column A
11        11
16        16
1          1
4          4
6          6
3          3
19        19
12        12
10        10
0          0
18        18
8          8
2          2
5          5
13        13
7          7
9          9
14        14
17        17
15        15
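The shuffle above gives a different order on every run and keeps the original index labels. If a repeatable order or a fresh 0..n-1 index is needed, the following minimal sketch may help. It uses the random_state parameter of sample and the reset_index method; the seed value 42 and the variable names shuffled and shuffled_reset are arbitrary choices made for this illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(data=np.arange(20), columns=['Column A'])

# random_state fixes the seed, so the same shuffled order comes back on every run
# (42 is an arbitrary value chosen for this example)
shuffled = df.sample(frac=1, random_state=42)

# sample keeps the original index labels; reset_index(drop=True) renumbers
# the rows from 0 and discards the old labels instead of adding them as a column
shuffled_reset = shuffled.reset_index(drop=True)

print(shuffled_reset.head())
```

Without drop=True, reset_index would keep the old labels in a new column named index, which is usually not wanted after a shuffle.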
Note: to draw a random subset instead of shuffling all rows, lower the fraction. For example, frac=0.5 randomly selects half of the rows:
df = df.sample(frac=0.5)
print(df)
returns
    Column A
14        14
8          8
7          7
11        11
19        19
9          9
17        17
18        18
0          0
1          1
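As a small variation on the sampling shown above, the sketch below draws the subset into a new variable so the full dataframe is not overwritten, and also shows the n parameter of sample for requesting an exact number of rows. The names half, five and rest, and the seed value 0, are assumptions made for this illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(data=np.arange(20), columns=['Column A'])

# frac=0.5 selects half of the rows (10 of 20) without repeating any row
half = df.sample(frac=0.5, random_state=0)

# n selects an exact number of rows instead of a fraction
five = df.sample(n=5, random_state=0)

# the rows that were not selected can be recovered through their index labels
rest = df.drop(half.index)

print(len(half), len(five), len(rest))
```

Note that frac and n cannot be passed together in the same call; pandas raises an error if both are given.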