DataWright Information Services

Consulting and Resources for Excel and Access




Merging lists to create a single unique list

If you’ve ever had to combine two or more lists in Excel so that
you get a unique list from the result, you’ll know that it can be a
tedious job. A common example is two address lists with the same
layout, and overlapping data. This article shows three ways to achieve
the final result. They are:

  • You can download a zip file with 3 sample
    files, here.

Using a key field to get a unique list

To get a unique list using a key field, we need to go through the
following steps:

  • Copy both lists to another sheet. That way, you don’t affect
    the original data
  • Create a formula that identifies duplicates. You may need to
    create a calculated column that combines two or more columns, so
    that you have a unique key; if so, you will have two calculated
    columns. The first will be to create the key, and the second
    will identify duplicates.
  • Filter the list for duplicates, and delete these rows.

Try it out

In the file download is a workbook called MergeLists.xls. Open it
up, and you will see that Sheet1 and Sheet2 each have a list of
names and addresses. Both lists have the same layout, which
simplifies the process for us. There are only 18 names in the two
lists, two of which are duplicated, but the process is similar for
lists with thousands of names.

The list on Sheet1 looks like this:
Starting list on Sheet1

Create a key

Name and address details can be difficult to identify as unique.
There can be duplication in any of the name or address fields
without duplicating the whole record, so it’s best to create a key
that uses all of the available fields. In G2 of Sheet1, enter this
formula:

=A2&B2&C2&D2&E2&F2

Copy and paste the formula down column G to create a unique key
for each address record. Do the same on Sheet2. It isn’t pretty but
it doesn’t need to be. Once you have the combined list you can
delete the additional columns.

Check for duplicates

In cell H2 of Sheet2, place this formula:

=COUNTIF(Sheet1!$G:$G,G2)

Fill down — records in this sheet that do not have a match in
Sheet1 will have a 0 in Column H. If there are duplicates, you will
get a value higher than 0.

Filter the unique records

Still on Sheet2, do the following:

  • Type Key in G1
  • Type Count in H1
  • Autofilter the list for 0 in Column H.
  • Copy the filtered records across to Sheet1, and paste them
    below the existing list.
You should now have a list of 16 addresses on Sheet1,
with no duplicates. This image shows the final result.

The combined list has no duplicated rows

Using a query to extract unique data

The second option is to use the power of a database query to
extract the data into a unique list. The SQL language can combine
two or more lists into a single, unique list with a UNION query.
This approach requires the following steps:

  • If the lists are in different workbooks, copy them to a new
    workbook so that they are on Sheet1 and Sheet2. This has the
    advantage that you won’t need to change any of the code. If you
    did create a new workbook, save it.
  • Paste this code into a new module in the workbook.
''================================================
''This code combines two tables on separate sheets
''into a third worksheet in the same workbook.
''Requires: A reference to the Microsoft ActiveX
''Data Objects 2.x Library, where x is a number
''between 1 and 8.
''Assumptions: The two starting lists are on Sheet1
''and Sheet2. The combined list will go on Sheet3.
''You will need to change the sheet references in 
''this code to reflect the names or your worksheets.
''Created: 29 November 2007 by Denis Wright
''================================================

Sub ADO_Merge_Sheets()
  Dim cnn As ADODB.Connection
  Dim rst As ADODB.Recordset
  Dim sSQL As String
  Dim sPath As String
  Dim MyConn

  sPath = ActiveWorkbook.FullName

  'Define the SQL statement that merges the sheets.
  sSQL = "SELECT * FROM [Sheet1$]"
  sSQL = sSQL & " UNION SELECT * FROM [Sheet2$]"

  'Establish connection to the same file
  'When connecting to Excel instead of a database, you need to define
  'the extended properties as Excel 8.0 (The first Excel version to use ADO)

  MyConn = sPath

  Set cnn = New ADODB.Connection
  With cnn
    .Provider = "Microsoft.Jet.OLEDB.4.0"
    .Properties("Extended Properties").Value = "Excel 8.0"
    .Open MyConn
  End With

  'Define a recordset based on the SQL statement

  Set rst = New ADODB.Recordset
  rst.CursorLocation = adUseServer
  rst.Open Source:=sSQL, _
      ActiveConnection:=cnn, _
      CursorType:=adOpenForwardOnly, _
      LockType:=adLockOptimistic, _
      Options:=adCmdText

  Application.ScreenUpdating = False

  'Delete existing data on the destination sheet, then
  'transfer the results of the latest filter, starting at cell A2.
  'When done, clean up references to avoid memory leaks.

  With Sheets("Sheet3") 'Sheet3 is the DESTINATION sheet
    .Range("A1").CurrentRegion.Clear
    .Range("A2").CopyFromRecordset rst
  End With
  rst.Close
  cnn.Close

  'Copy the headings across from Sheet1 and Autofit the final list.
  Sheets("Sheet1").Range("A1").EntireRow.Copy _
  Destination:=Sheets("Sheet3").Range("A1")
  Application.CutCopyMode = False
  Sheets("Sheet3").Range("A1:F1").EntireColumn.AutoFit

  Application.ScreenUpdating = True

End Sub

Try it out

In the sample files provided with this example is a workbook
called MergingLists_Code.xls. It already has the lists in Sheet1 and
Sheet2, and a button on Sheet3 that runs the code.

The code-based method is much quicker to use than the
formula-based one. If you are starting to experiment with ADO code
in Excel, getting an understanding of SQL can save you a lot of
effort.

How it works

SQL is a language common to all relational databases. If you
build a query in Access, and then look behind the scenes, you will
find that the graphical layout has created a SQL statement to do the
work of selecting, grouping, and sorting your data. ADO is a generic
technology that brings SQL capabilities (among other things) to
Office applications like Excel. With ADO you can interact with a
whole range of database systems, including (in this case) Excel
files.

The SQL statement in the code strings together to produce this:

  SELECT * FROM [Sheet1$] UNION SELECT * FROM [Sheet2$]

A SQL SELECT statement requires you to select one or more tables
or queries to pull your data from. When you are using a worksheet as
the data source, enclosing the sheet name in square brackets (with
the $ symbol) causes SQL to treat the worksheet as a table.

The UNION operator will combine 2 or more tables with the
same structure
into a single list. You need to ensure that:

  • The tables / sheets have the same number of fields (data
    columns), and
  • Those fields are in the same order in all of the lists.

If the fields are not in the same order you will get an error and
the code will fail or (possibly worse) create a garbage result.

Creating a unique list with
Advanced Filter

The third approach is also quite easy to use. The steps are:

  • Copy and paste to create a starting list with all the
    records, including duplicates.
  • Use Advanced Filter to extract the unique items

Try it out

In the sample files, open MergeLists_AdvFilter.xls. Sheet1
contains 16 names, and Sheet2 a further 9 names.

Combine the lists into one list with all records

Copy the lists from Sheet1 and Sheet2 onto Sheet3, so that you
have a single list with one set of headings.

Extract the unique records

Place your cursor in the list and then follow the instructions
below for your version of Excel.

Excel 2007

On the Ribbon, click the Data tab. The third group of icons is
Sort & Filter. In this list, select Advanced Filter.

Earlier versions

Select Data > Filter > Advanced Filter.

All versions

The Advanced Filter dialog will display. The screen shot
is for 2007, but the layout for earlier versions of Excel is
virtually identical.

The Advanced Filter dialog, ready to show unique records only

Make sure that you select Unique Records Only in
the bottom left corner of the dialog, then click OK. The list will
filter down to show all of the unique records. You can now select
the table, and copy the new list to another sheet.

Clear the filter

Once you have copied the unique list somewhere else, return to
Sheet3 and undo the filter.

Excel 2007

In the Data tab of the ribbon, just above the Advanced button is
Clear. Push this to clear the filter and redisplay
all rows.

Earlier versions

Select Data > Filter > Show All to remove the
filter.

Wrapping up

This article shows three different approaches for combining two
or more lists into one. There is always more than one way to get the
job done. These tips should increase your options if you need to
tackle a chore like this.