Showing content from https://www.geeksforgeeks.org/python/python-pandas-dataframe-sample/ below:

Pandas Dataframe.sample() | Python - GeeksforGeeks

Last Updated : 11 Jul, 2025

The pandas DataFrame.sample() method randomly selects rows or columns from a DataFrame. It is particularly helpful with large datasets, where we want to test or analyze a small representative subset. We control the number or proportion of items to sample with n or frac, and make the randomness reproducible with random_state.

Example: Sampling a Single Random Row

In this example, we load a dataset and select a single random row by calling the sample() method with n=1.

Python
import pandas as pd

# Load dataset
d = pd.read_csv("employees.csv")

# Sample one random row
r_row = d.sample(n=1)

# Display the result
r_row

Output

one row of dataframe

Calling sample(n=1) selects one random row from the DataFrame. The result is still a DataFrame (with a single row), and the row keeps its original index label.
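Because employees.csv may not be at hand, the same call can be tried on a small in-memory DataFrame. This is a minimal sketch: the column names and values below are made up for illustration and are not taken from the original dataset.

```python
import pandas as pd

# Hypothetical stand-in for employees.csv (names and values are invented)
df = pd.DataFrame(
    {
        "name": ["Ana", "Ben", "Chen", "Dee", "Eli"],
        "team": ["Sales", "Ops", "Ops", "Legal", "Sales"],
        "salary": [52000, 61000, 58000, 75000, 49000],
    }
)

# Draw one row at random
one_row = df.sample(n=1)

print(one_row)
print(type(one_row).__name__)         # DataFrame, not Series
print(one_row.index[0] in df.index)   # the original index label is kept
```

The row printed will differ from run to run because no random_state is set.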

Syntax

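DataFrame.sample(n=None, frac=None, replace=False, weights=None, random_state=None, axis=None, ignore_index=False)

Parameters:

n: int, optional. Number of items to return. Cannot be used together with frac. If neither n nor frac is given, one item is returned.
frac: float, optional. Fraction of the items along the axis to return. Cannot be used together with n.
replace: bool, default False. Whether the same item may be sampled more than once.
weights: str or array-like, optional. Sampling weights. By default every item is equally likely. Weights that do not sum to 1 are normalized.
random_state: int, numpy.random.RandomState or numpy.random.Generator, optional. Seed or generator that makes the sample reproducible.
axis: 0 or 'index' to sample rows, 1 or 'columns' to sample columns. For a DataFrame the default is rows.
ignore_index: bool, default False. If True, the result is relabeled 0, 1, ..., n - 1 (available in pandas 1.3 and later).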

Return Type: New object of same type as caller.
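The weights and axis parameters are not covered by the examples in this article, so here is a small sketch of both. The DataFrame and the weight values are hypothetical and chosen only to make the effect easy to see.

```python
import pandas as pd

# Hypothetical data, not the employees.csv file used elsewhere
df = pd.DataFrame(
    {
        "name": ["Ana", "Ben", "Chen", "Dee", "Eli"],
        "team": ["Sales", "Ops", "Ops", "Legal", "Sales"],
        "salary": [52000, 61000, 58000, 75000, 49000],
    }
)

# weights: one weight per row; larger weights are drawn more often.
# Rows with weight 0 (index 0 and 1 here) can never be selected.
row_weights = [0, 0, 1, 3, 6]
weighted = df.sample(n=2, weights=row_weights, random_state=0)
print(weighted)

# axis=1: sample columns instead of rows; every row is kept
two_cols = df.sample(n=2, axis=1, random_state=0)
print(two_cols.columns.tolist())
print(two_cols.shape)
```

The weights do not need to sum to 1, since pandas normalizes them before sampling.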

To download the CSV file used, Click Here.

Examples of Pandas Dataframe.sample()

Example 1: Sample 25% of the DataFrame

In this example, we generate a random sample consisting of 25% of the entire DataFrame by using the frac parameter.

Python
import pandas as pd
d = pd.read_csv("employees.csv")

# Sample 25% of the data
sr = d.sample(frac=0.25)

# Verify the number of rows
print(f"Original rows: {len(d)}")
print(f"Sampled rows (25%): {len(sr)}")

# Display the result
sr

Output

25% of dataframe

As the output shows, the sample contains 25% of the rows of the DataFrame (frac multiplied by the number of rows, rounded to a whole number), and the rows are chosen at random.
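The row count produced by frac can be checked on a small example. The sketch below assumes a made-up DataFrame of 20 rows, so frac=0.25 yields exactly 5 rows. It also shows two related behaviors: frac=1 returns all rows in shuffled order, and passing n and frac together raises a ValueError.

```python
import pandas as pd

# 20 hypothetical rows, so 25% is exactly 5 rows
df = pd.DataFrame({"id": range(20)})

quarter = df.sample(frac=0.25, random_state=1)
print(len(df), len(quarter))

# frac=1 returns every row once, in random order (a shuffle)
shuffled = df.sample(frac=1, random_state=1)
print(sorted(shuffled["id"]) == list(range(20)))

# n and frac are mutually exclusive
try:
    df.sample(n=3, frac=0.25)
    both_rejected = False
except ValueError as err:
    both_rejected = True
    print("ValueError:", err)
```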

Example 2: Sampling with Replacement and a Fixed Random State

This example shows how to sample multiple rows with replacement (allowing the same row to appear more than once) and how to make the result reproducible with a fixed random seed.

Python
import pandas as pd
d = pd.read_csv("employees.csv")

# Sample 3 rows with replacement and fixed seed
sd = d.sample(n=3, replace=True, random_state=42)

sd

Output

sampling with replacement

Setting replace=True allows the same row to be sampled more than once, which is what bootstrapping requires. Setting random_state=42 makes the result reproducible across runs, which is very useful during testing and debugging.
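Both claims can be verified with a short sketch. The data is hypothetical, and the bootstrap shown is only a minimal illustration (resampling the mean 100 times), not a full statistical procedure.

```python
import pandas as pd

# Hypothetical data
df = pd.DataFrame({"value": [10, 20, 30, 40, 50]})

# The same seed gives the same sample
a = df.sample(n=3, replace=True, random_state=42)
b = df.sample(n=3, replace=True, random_state=42)
print(a.equals(b))

# With replacement, n may exceed the number of rows,
# so some rows must appear more than once
big = df.sample(n=10, replace=True, random_state=0)
print(big.index.has_duplicates)

# Without replacement, asking for more rows than exist raises ValueError
try:
    df.sample(n=10)
    oversample_rejected = False
except ValueError:
    oversample_rejected = True

# Minimal bootstrap: resample the column with replacement, record each mean
boot_means = [
    df["value"].sample(n=len(df), replace=True, random_state=seed).mean()
    for seed in range(100)
]
print(min(boot_means), max(boot_means))
```

Using a different seed per resample keeps each bootstrap draw distinct while the whole experiment stays repeatable.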


