Extract sample randomly in Excel

Extract sample randomly in Excel
Last Updated on 27/02/2024
Reading time: 3 minutes

When you have a huge list of customers, products, files, employees, .... extracting samples by copy-paste is a waste of time. This article will explain how to extract a perfect random list of data 👍😀

Let's say you have a list of 19,000 customers. You want to extract a random list of mail, name, and city (only these 3 columns).

Excel file with customer details

Step 1 : Create a list of random number

The very first step to extracting a sample randomly is to create a list of random numbers, and this is very easy with the function RANDBETWEEN.

  1. Identify the first row of your dataset (here row 2)
  2. Identify the last row of your dataset (here row 19,001)
  3. Write the following formula

=RANDBETWEEN(2,19000)

Random number returns by the function RANDBETWEEN

Step 2 : Copy the formula to increase the list

Copy-paste your formula to generate a list of random numbers. These numbers will be the Excel row number 😲

Copy the random formula

Step 3: Convert the formulas into values

Of course, you can't keep your formulas. Because each time you will refresh your worksheet, these formulas will re-generate new values. So you must convert your formulas with the option Copy-Paste Special (option Value) in this video.

Menu Paste Special value

Tips: With this technique, you can't avoid duplicates. So, the trick is to generate more random numbers that you need and remove duplicates 😉

Step 4: Use the INDIRECT function to customize a reference

The INDIRECT function helps you to create a custom function with variable values. And here, it's exactly what we need here (follow the link if you have never used this function)

Now, we will build a custom cell's reference with the result of the previous random number (in J2) to return the email (column H). The formula for the email is

=INDIRECT("H"&J2)

Creation of a custom reference with INDIRECT

Copy this formula for all your random numbers and you will have your list of email

List of email extract randomly

For the country, just change the value of the column reference

=INDIRECT("G"&J2)

Return the country randomly

Build the formula from another worksheet

Now, if you want to extract the value in a workbook that is not the one that contains the data, the formula is a little bit different. Here, you have to include also the worksheet reference.

Select any cell with an email

=Customer!H5

Reference from another worksheet

Now, keep only the reference of the sheet name and the column between double quotes, "Customer!H". The reference of the row will be taken in the cell B4 in our example

=INDIRECT("Customer!H"&B4)

Tutorial video

If there is something you didn't understand, have a look at the following video.

2 Comments

  1. Rick Rothstein
    03/06/2022 @ 02:32

    May I suggest instead of using the Volatile INDIRECT function as you did with this..

    =INDIRECT("H"&J2)

    to use the non-Volatile INDEX function like this instead...

    =INDEX(H:H,J2)

    Reply

    • Frédéric LE GUEN
      05/06/2022 @ 21:47

      Very good point Rick

      Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Extract sample randomly in Excel

Reading time: 3 minutes
Last Updated on 27/02/2024

When you have a huge list of customers, products, files, employees, .... extracting samples by copy-paste is a waste of time. This article will explain how to extract a perfect random list of data 👍😀

Let's say you have a list of 19,000 customers. You want to extract a random list of mail, name, and city (only these 3 columns).

Excel file with customer details

Step 1 : Create a list of random number

The very first step to extracting a sample randomly is to create a list of random numbers, and this is very easy with the function RANDBETWEEN.

  1. Identify the first row of your dataset (here row 2)
  2. Identify the last row of your dataset (here row 19,001)
  3. Write the following formula

=RANDBETWEEN(2,19000)

Random number returns by the function RANDBETWEEN

Step 2 : Copy the formula to increase the list

Copy-paste your formula to generate a list of random numbers. These numbers will be the Excel row number 😲

Copy the random formula

Step 3: Convert the formulas into values

Of course, you can't keep your formulas. Because each time you will refresh your worksheet, these formulas will re-generate new values. So you must convert your formulas with the option Copy-Paste Special (option Value) in this video.

Menu Paste Special value

Tips: With this technique, you can't avoid duplicates. So, the trick is to generate more random numbers that you need and remove duplicates 😉

Step 4: Use the INDIRECT function to customize a reference

The INDIRECT function helps you to create a custom function with variable values. And here, it's exactly what we need here (follow the link if you have never used this function)

Now, we will build a custom cell's reference with the result of the previous random number (in J2) to return the email (column H). The formula for the email is

=INDIRECT("H"&J2)

Creation of a custom reference with INDIRECT

Copy this formula for all your random numbers and you will have your list of email

List of email extract randomly

For the country, just change the value of the column reference

=INDIRECT("G"&J2)

Return the country randomly

Build the formula from another worksheet

Now, if you want to extract the value in a workbook that is not the one that contains the data, the formula is a little bit different. Here, you have to include also the worksheet reference.

Select any cell with an email

=Customer!H5

Reference from another worksheet

Now, keep only the reference of the sheet name and the column between double quotes, "Customer!H". The reference of the row will be taken in the cell B4 in our example

=INDIRECT("Customer!H"&B4)

Tutorial video

If there is something you didn't understand, have a look at the following video.

2 Comments

  1. Rick Rothstein
    03/06/2022 @ 02:32

    May I suggest instead of using the Volatile INDIRECT function as you did with this..

    =INDIRECT("H"&J2)

    to use the non-Volatile INDEX function like this instead...

    =INDEX(H:H,J2)

    Reply

    • Frédéric LE GUEN
      05/06/2022 @ 21:47

      Very good point Rick

      Reply

Leave a Reply

Your email address will not be published. Required fields are marked *