Risk Detector
Detects various types of possibly fake data in person records.
Developers: see the technical specification of the REST service.
Input/Output
Possible input values are the person's name, telephone numbers, email addresses, and physical addresses. The API is flexible in how the data is supplied; for example, the name can be passed as a single "full name" string or split into separate name fields.
The output contains an overall risk score in the range -1 to +1, plus detailed information about every detected risk.
A score > 0 means there is a risk. Zero is the neutral value: nothing bad was detected, but nothing good either. A negative value means that the record looks genuine.
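The score ranges above can be turned into a coarse label on the client side. The following is a minimal sketch, not part of the service; the function name interpret_score and the three labels are assumptions made for illustration only.

```python
def interpret_score(score: float) -> str:
    """Map an overall risk score in the range -1 to +1 to a coarse label.

    The labels are hypothetical; the service itself only returns the number
    plus details about each detected risk.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must be in the range -1 to +1")
    if score > 0:
        return "risk"      # something suspicious was detected
    if score < 0:
        return "genuine"   # the record looks genuine
    return "neutral"       # nothing bad detected, nothing good either


if __name__ == "__main__":
    for s in (0.8, 0.0, -0.4):
        print(s, interpret_score(s))
```

A caller would typically choose its own threshold above zero before rejecting a record; the cut-off at exactly 0 shown here only mirrors the description above.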
Performed checks
Fake risks
Completely invalid data that is entered only to get past mandatory input fields.
Random typing detection
Example: "asdf asdf".
Placeholder entities
Examples: "John Doe", "Anytown"
Famous and fictional entities
Examples: "James Bond", "Barack Obama"
Humorous, invalid, vulgar input
Examples: "Sandy Beach", "Timbuckthree", "None of your business", "firstname lastname"
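To give a rough idea of how two of the fake-risk checks could work, here is a small sketch. It is an illustration under stated assumptions, not the service's actual method: the lookup set PLACEHOLDER_NAMES, the QWERTY-row heuristic, and all function names are hypothetical, and a real check would need far larger, culture-specific data.

```python
# Hypothetical, deliberately tiny lookup list (assumption for illustration).
PLACEHOLDER_NAMES = {"john doe", "firstname lastname"}

# Letter rows of a QWERTY keyboard, used to spot runs of adjacent keys.
KEYBOARD_ROWS = ("qwertyuiop", "asdfghjkl", "zxcvbnm")


def is_placeholder(full_name: str) -> bool:
    """True if the normalized name is in the (hypothetical) placeholder list."""
    normalized = " ".join(full_name.lower().split())
    return normalized in PLACEHOLDER_NAMES


def looks_like_keyboard_run(token: str, min_len: int = 4) -> bool:
    """True if the token is a run of adjacent keys on a single keyboard row."""
    t = token.lower()
    if len(t) < min_len:
        return False
    return any(t in row or t in row[::-1] for row in KEYBOARD_ROWS)


def looks_randomly_typed(full_name: str) -> bool:
    """True if every token of the name is a keyboard run, e.g. "asdf asdf"."""
    tokens = full_name.split()
    return bool(tokens) and all(looks_like_keyboard_run(t) for t in tokens)


if __name__ == "__main__":
    for name in ("asdf asdf", "John Doe", "Peter Miller"):
        print(name, looks_randomly_typed(name), is_placeholder(name))
```

A heuristic this simple will flag some real words and miss most random input; it only shows the general shape of such a check.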
Disguise risks
Input is deliberately mangled to circumvent machine processing. Humans can still understand the modified values, but machines cannot unless they detect the patterns and clean the input.
Padding
Padding is adding content to the left/right of a value.
Example: XXXJohnXXX
Stutter typing
Example: Petttttttttterson
Spaced typing
Example: P e t e r M i l l e r
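The three disguise patterns above can be approximated with regular expressions. This is a minimal sketch assuming simple rules (a run of three or more identical characters counts as padding or stutter); the patterns, thresholds, and the function name detect_disguise are assumptions, not the service's implementation. In line with the service, it only reports what it finds and does not return cleaned data.

```python
import re

# Same character repeated 3+ times at the start, mirrored at the end: XXXJohnXXX
PADDING = re.compile(r"^(?P<pad>(.)\2{2,})(?P<core>.+?)(?P=pad)$", re.IGNORECASE)
# A letter repeated 3+ times in a row: Petttttttttterson
STUTTER = re.compile(r"([A-Za-z])\1{2,}")
# Single characters separated by single spaces: P e t e r M i l l e r
SPACED = re.compile(r"(?:\S ){2,}\S")


def detect_disguise(value: str) -> list[str]:
    """Return the names of the disguise patterns found in the value."""
    value = value.strip()
    found = []
    if SPACED.fullmatch(value):
        found.append("spaced typing")
    padded = PADDING.match(value)
    if padded:
        found.append("padding")
        # Look for stutter only inside the core, so the padding run
        # itself is not reported a second time.
        value = padded.group("core")
    if STUTTER.search(value):
        found.append("stutter typing")
    return found


if __name__ == "__main__":
    for sample in ("XXXJohnXXX", "Petttttttttterson", "P e t e r M i l l e r"):
        print(sample, detect_disguise(sample))
```

Checking for stutter only after the padding has been set aside keeps the two categories distinct; otherwise "XXXJohnXXX" would be reported as both.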
Demarcation
This is not a spell checker service, nor an address verification service. The service does not return cleaned data.