Extract sample randomly in Excel

Last Updated on 04/05/2024

When you have a huge list of customers, products, files, employees, .... extracting samples by copy-paste is a waste of time. This article will explain how to extract a perfect random list of data 👍😀

Let's say you have a list of 19,000 customers. You want to extract a random list of mail, name, and city (only these 3 columns).

Step 1 : Create a list of random number

The very first step to extracting a sample randomly is to create a list of random numbers, and this is very easy with the function RANDBETWEEN.

1. Identify the first row of your dataset (here row 2)
2. Identify the last row of your dataset (here row 19,001)
3. Write the following formula

=RANDBETWEEN(2,19000)

Step 2 : Copy the formula to increase the list

Copy-paste your formula to generate a list of random numbers. These numbers will be the Excel row number 😲

Step 3: Convert the formulas into values

Of course, you can't keep your formulas. Because each time you will refresh your worksheet, these formulas will re-generate new values. So you must convert your formulas with the Copy-Paste in value

Tips: With this technique, you can't avoid duplicates. So, the trick is to generate more random numbers that you need and remove duplicates 😉

Step 4: Use the INDIRECT function to customize a reference

The INDIRECT function helps you to create a custom function with variable values. And here, it's exactly what we need here (follow the link if you have never used this function)

Now, we will build a custom cell's reference with the result of the previous random number (in J2) to return the email (column H). The formula for the email is

=INDIRECT("H"&J2)

Copy this formula for all your random numbers and you will have your list of email

For the country, just change the value of the column reference

=INDIRECT("G"&J2)

Build the formula from another worksheet

Now, if you want to extract the value in a workbook that does not contain the data, the formula is a little bit different. Here, you also have to include the worksheet reference.

Select any cell with an email

=Customer!H5

Now, keep only the reference of the sheet name and the column between double quotes, "Customer!H". The reference of the row will be taken in the cell B4 in our example

=INDIRECT("Customer!H"&B4)

Tutorial video

If there is something you didn't understand, have a look at the following video.

Frédéric LE GUEN

1. Rick Rothstein
03/06/2022 @ 02:32

May I suggest instead of using the Volatile INDIRECT function as you did with this..

=INDIRECT("H"&J2)

to use the non-Volatile INDEX function like this instead...

=INDEX(H:H,J2)

• Frédéric LE GUEN
05/06/2022 @ 21:47

Very good point Rick

Extract sample randomly in Excel

Last Updated on 04/05/2024

When you have a huge list of customers, products, files, employees, .... extracting samples by copy-paste is a waste of time. This article will explain how to extract a perfect random list of data 👍😀

Let's say you have a list of 19,000 customers. You want to extract a random list of mail, name, and city (only these 3 columns).

Step 1 : Create a list of random number

The very first step to extracting a sample randomly is to create a list of random numbers, and this is very easy with the function RANDBETWEEN.

1. Identify the first row of your dataset (here row 2)
2. Identify the last row of your dataset (here row 19,001)
3. Write the following formula

=RANDBETWEEN(2,19000)

Step 2 : Copy the formula to increase the list

Copy-paste your formula to generate a list of random numbers. These numbers will be the Excel row number 😲

Step 3: Convert the formulas into values

Of course, you can't keep your formulas. Because each time you will refresh your worksheet, these formulas will re-generate new values. So you must convert your formulas with the Copy-Paste in value

Tips: With this technique, you can't avoid duplicates. So, the trick is to generate more random numbers that you need and remove duplicates 😉

Step 4: Use the INDIRECT function to customize a reference

The INDIRECT function helps you to create a custom function with variable values. And here, it's exactly what we need here (follow the link if you have never used this function)

Now, we will build a custom cell's reference with the result of the previous random number (in J2) to return the email (column H). The formula for the email is

=INDIRECT("H"&J2)

Copy this formula for all your random numbers and you will have your list of email

For the country, just change the value of the column reference

=INDIRECT("G"&J2)

Build the formula from another worksheet

Now, if you want to extract the value in a workbook that does not contain the data, the formula is a little bit different. Here, you also have to include the worksheet reference.

Select any cell with an email

=Customer!H5

Now, keep only the reference of the sheet name and the column between double quotes, "Customer!H". The reference of the row will be taken in the cell B4 in our example

=INDIRECT("Customer!H"&B4)

Tutorial video

If there is something you didn't understand, have a look at the following video.

1. Rick Rothstein
03/06/2022 @ 02:32

May I suggest instead of using the Volatile INDIRECT function as you did with this..

=INDIRECT("H"&J2)

to use the non-Volatile INDEX function like this instead...

=INDEX(H:H,J2)

• Frédéric LE GUEN
05/06/2022 @ 21:47

Very good point Rick