Bulk Screening

Submitted File Specifications

Data should be arranged in rows and columns, in a file format that can easily be converted to CSV or Excel. Every column should have a column header. If the file contains multiple sheets, or multiple columns contain name data, please indicate which ones you would like screened; otherwise all of them may be screened. Each field of data must be in its own column; do not combine name and address data. Charges are calculated on a per-row, per-column basis, and we do not deduplicate your data. If you are submitting multiple files, please compress them into a single archive. We support .zip, .gz, and .7z.
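As a rough illustration of the per-row, per-column charging rule, the sketch below counts the billable units in a CSV file. The function name billable_units and the sample column headers are hypothetical, and the sketch assumes that every data row counts once for each screened column, blank cells and duplicate rows included. The actual invoice calculation may differ.

```python
import csv
import io


def billable_units(csv_text, screened_columns):
    """Estimate billable units as (data rows) x (screened columns).

    Assumption: duplicates and blank cells are counted, since no
    deduplication is performed on submitted data.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    headers = reader.fieldnames or []
    missing = [c for c in screened_columns if c not in headers]
    if missing:
        raise ValueError(f"missing column headers: {missing}")
    row_count = sum(1 for _ in reader)
    return row_count * len(screened_columns)


# Hypothetical sample file: the first two rows are duplicates.
sample = (
    "name,street1,country\n"
    "Acme Corp,1 Main St,US\n"
    "Acme Corp,1 Main St,US\n"
    "Jane Doe,2 Oak Ave,CA\n"
)
units = billable_units(sample, ["name", "street1"])
```

Here the duplicate row is still counted, so removing duplicates before submission lowers the estimate.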

The default matching logic in our Batch Screening product is a 50% word match with a minimum of 2 matching words (or 1 if either your entry or the DPL entry contains only 1 word), unless the end user specifically requests different settings. We then filter the results further using a Levenshtein distance of 35 or less, and an analyst manually reviews the remaining matches to reduce the report further. We prefer to have an ISO 2-letter country code for each row of data, as it greatly helps with the filtering process.
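The default filter described above can be sketched as follows. This is only an approximation under stated assumptions, not the production matcher: the tokenization, the way the word-match percentage is computed (matching words on both sides over all words on both sides), the choice to measure Levenshtein distance in raw character edits between the two lowercased entries, and the names words, levenshtein, and is_candidate are all illustrative.

```python
import re


def words(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def is_candidate(entry, dpl_entry, min_percent=50, min_words=2, max_distance=35):
    """Return True if the pair passes the word-match and distance filters."""
    left, right = words(entry), words(dpl_entry)
    if not left or not right:
        return False
    shared = set(left) & set(right)
    # One shared word is enough when either side has only one word.
    required = 1 if min(len(left), len(right)) == 1 else min_words
    if len(shared) < required:
        return False
    matched = sum(w in shared for w in left) + sum(w in shared for w in right)
    percent = 100 * matched / (len(left) + len(right))
    if percent < min_percent:
        return False
    # Assumption: distance is measured on the whole lowercased strings.
    return levenshtein(entry.lower(), dpl_entry.lower()) <= max_distance
```

Matches that pass this automated stage would still go to an analyst for manual review.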

You can submit the file via our website, our FTP(S) site (access can be requested from Customer Support), or email. We accept GPG-encrypted files; our public key is available from our support site.

Returned Report Specifications

Column A: IDNUM is an arbitrary unique identification number for each listing in our Denied Party List (DPL). Clicking an IDNUM takes you to a detailed view of that listing in our DPL. You will need a username and password; please contact Customer Support if you do not already have credentials.

Column B: Denied Party Data is the data in the DPL that matched your data. It should correspond to either the name or the street1 field.

Column C: Percentage is the number of words involved in the match, counted in both your entry and the DPL entry, as a percentage of the total number of words in both entries. Words on your common word list are excluded from both counts.

  • "Smith, Robert Tomas" vs. "Smith, Jack Tomas": 4 words are involved in the match (both Smiths and both Tomases) out of 6 words total = 66%.
  • "Amalgamated, Inc." vs. "Amalgamated Chemicals", assuming "Inc" is a common word: 2 words (both Amalgamateds) out of 3 words total = 66%. A one-word match is allowed here because only 1 word remains on the left-hand side.
  • "North American Mercantile Savings and Loan" vs. "Association of North American Business Interests", assuming "and" and "of" are common words: 4 words (both Norths and both Americans) out of 10 words total = 40%.

Using this percentage helps to weed out matches that have 2 words in common but are not closely related.
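The three examples above can be reproduced with a short sketch. The function names, the tokenization rule, and the choice to round down (inferred from the examples reporting 66% rather than 67%) are assumptions for illustration, not a description of the production code.

```python
import re


def tokenize(text, common_words=frozenset()):
    """Split text into lowercase words, dropping any common words."""
    found = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in found if w not in common_words]


def match_percentage(your_entry, dpl_entry, common_words=frozenset()):
    """Matching words on both sides over total words on both sides."""
    left = tokenize(your_entry, common_words)
    right = tokenize(dpl_entry, common_words)
    total = len(left) + len(right)
    if total == 0:
        return 0
    shared = set(left) & set(right)
    matched = sum(w in shared for w in left) + sum(w in shared for w in right)
    # Assumption: the percentage is rounded down to a whole number.
    return 100 * matched // total


p1 = match_percentage("Smith, Robert Tomas", "Smith, Jack Tomas")
p2 = match_percentage("Amalgamated, Inc.", "Amalgamated Chemicals", frozenset({"inc"}))
p3 = match_percentage(
    "North American Mercantile Savings and Loan",
    "Association of North American Business Interests",
    frozenset({"and", "of"}),
)
print(p1, p2, p3)  # prints: 66 66 40
```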

Column D: Data Matched is a copy of the field in your data that our data matched. It should contain name or address data.

Column E: Name of the individual, organization, entity, or vessel in our database.

Column F: Country is the ISO 2-letter country code of the listing.

Column G: Code indicates the source list code.

Column H and onward contain your (the end user's) data.
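For readers who process the returned report programmatically, the column layout above might be handled as in the sketch below. The ReportRow type, its field names, and the sample values are hypothetical; the sketch assumes each report row has already been read into a list of cell strings, for example from a CSV export.

```python
from typing import List, NamedTuple


class ReportRow(NamedTuple):
    idnum: str              # Column A
    denied_party_data: str  # Column B
    percentage: str         # Column C (kept as text; exact format not assumed)
    data_matched: str       # Column D
    name: str               # Column E
    country: str            # Column F
    code: str               # Column G
    your_data: List[str]    # Column H and onward


def split_report_row(cells):
    """Separate the seven fixed report columns from the submitted data."""
    if len(cells) < 7:
        raise ValueError("expected at least 7 columns (A through G)")
    return ReportRow(*cells[:7], your_data=list(cells[7:]))


# Made-up sample row, for illustration only.
row = ["12345", "Smith, Jack Tomas", "66", "Smith, Robert Tomas",
       "Smith, Jack Tomas", "US", "XYZ", "cust-001", "Smith, Robert Tomas"]
parsed = split_report_row(row)
```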