SCRIPT & CODE EXAMPLE
 
CODE EXAMPLE FOR PYTHON

How to find duplicates in a CSV file using Python

import csv
import collections

with open(r"T:DataDumpBook1.csv", newline="") as f:  # newline="" is what the csv module recommends
    csv_data = csv.reader(f, delimiter=",")

    next(csv_data)  # skip title line

    count = collections.Counter()

    # first pass: read the file
    for row in csv_data:
        address = row[6]
        count[address] += 1

    # second pass: display duplicate info & compute total
    total_dups = 0
    for address,nb in count.items():
        if nb>1:
            total_dups += nb
            print('{} is a duplicate address, seen {} times'.format(address,nb))
        else:
            print('{} is a unique address'.format(address))
    print("Total duplicate addresses {}".format(total_dups))
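
The same counting approach can be wrapped in a function so it works on any column and any source of CSV lines. This is only a minimal sketch: the helper name `count_duplicates`, its `column` and `has_header` parameters, and the in-memory sample data are assumptions made for illustration, not part of the script above.

```python
import collections
import csv
import io


def count_duplicates(lines, column=6, has_header=True):
    """Return {value: count} for values seen more than once (hypothetical helper)."""
    reader = csv.reader(lines, delimiter=",")
    if has_header:
        next(reader, None)  # skip title line; tolerates empty input
    # count every value in the chosen column, ignoring rows that are too short
    counts = collections.Counter(
        row[column] for row in reader if len(row) > column
    )
    # keep only the values that occur more than once
    return {value: nb for value, nb in counts.items() if nb > 1}


# made-up sample data: the address sits in column 1 here
sample = io.StringIO(
    "name,address\n"
    "Ann,1 Main St\n"
    "Bob,2 Oak Ave\n"
    "Cy,1 Main St\n"
)
dups = count_duplicates(sample, column=1)
for address, nb in dups.items():
    print("{} is a duplicate address, seen {} times".format(address, nb))
print("Total duplicate addresses {}".format(sum(dups.values())))
```

To run it on a real file, pass an open file object instead of the `io.StringIO` sample, for example `count_duplicates(f, column=6)` inside a `with open(...) as f:` block.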
 
Tagged: #find #duplicates #csv #file #python