ZixCorp Lexicons: An Overview
March 2013
Table of Contents

Introduction
Healthcare Lexicons
  Example #1: (Standard rule covering official business messages)
  Example #2: (Standard rule covering official business messages)
Financial Lexicons
  Example #1: (Match on financial identifier and financial terms)
  Example #2: (Match on financial identifier and financial terms)
Credit Card Lexicon
SSN Lexicon
State Regulation Lexicon
Profanity Lexicon
Medical Research Lexicon
Customized Lexicons
Lexicon Development Process
Content of Zix Lexicons
Introduction

ZixCorp Email Encryption Services use a set of comprehensive lexicons to scan for sensitive information, such as personal health information (PHI) or personal financial information, in electronic messages. Searches are conducted by scanning all message subjects, bodies and attachments for sensitive information defined within the lexicons.

A lexicon is a file consisting of a comprehensive set of terms, phrases, expressions and pattern masks that identify sensitive types of information. Sensitive information is defined as any information that, when inappropriately disclosed, can lead to significant contractual or legal liabilities; serious damage to your organization's image and reputation; or legal, financial, or business losses. ZixCorp uses many sources to generate the lexicon content that is used to search for sensitive information, including federal regulations, authoritative reference sources on the subject and standard of care practices.

The following is a description of the lexicons that are typically used in ZixCorp Email Encryption Services, followed by a basic list of the formats inside each of the standard lexicons. In addition to these standard lexicons, custom lexicons can be created to detect sensitive information that is unique to an organization, such as customer codes or classified project identifiers.

Healthcare Lexicons

Healthcare lexicons are designed to identify PHI as defined by the Health Insurance Portability and Accountability Act (HIPAA). The Healthcare lexicons are a set of two lexicons, identifiers and health terms, that work together to identify PHI. The lexicons search for PHI by taking the intersection of identifying information with health terms or claims information. This provides the highest level of confidence that the content is actually PHI. An example of this would be a document containing a patient's date of service and diagnosis.
The date of service would constitute an identifier, and the diagnosis would constitute health information.

To search for PHI, both of the healthcare lexicons are combined using the following logic:

Identifiers AND Health Terms

The Identifiers Lexicon looks for indications of official business communications, such as SSNs, Subscriber IDs, dates of birth, etc. The Health Terms Lexicon scans for diagnoses, diseases, insurance information, pharmaceutical information, etc.
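The "Identifiers AND Health Terms" logic above can be illustrated with a minimal Python sketch. The patterns and term list here are hypothetical stand-ins chosen for illustration; they are not the actual contents of the Zix lexicons, which are far larger.

```python
import re

# Hypothetical, deliberately tiny stand-ins for the two healthcare lexicons.
IDENTIFIER_PATTERNS = [
    re.compile(r"\bss#?\s*\d{9}\b", re.IGNORECASE),          # labeled 9-digit SSN
    re.compile(r"\bmbr\s+num:?\s*\d{5,}\b", re.IGNORECASE),  # labeled member number
]
HEALTH_TERMS = {"fluorouracil", "cancer", "rehab", "therapy"}


def has_identifier(text: str) -> bool:
    """True if any identifier pattern occurs in the text."""
    return any(p.search(text) for p in IDENTIFIER_PATTERNS)


def has_health_term(text: str) -> bool:
    """True if any word in the text is a listed health term."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return not words.isdisjoint(HEALTH_TERMS)


def looks_like_phi(text: str) -> bool:
    """Flag a message only when BOTH lexicons match (the AND rule)."""
    return has_identifier(text) and has_health_term(text)
```

Under these assumptions, a message with a labeled identifier but no health term, or a health term but no identifier, is not flagged; only the intersection is.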
The healthcare lexicons can be used on the ZixGateway to effectively identify messages that contain PHI and then manage those messages in a manner compliant with HIPAA legislation.

The following are several example messages that would be identified as PHI by the healthcare lexicons. Bold font indicates terms that are contained in the lexicons.

Example #1: (Standard rule covering official business messages)

From: Sue
To: Linda
Subject: RE: Shared patient

Linda,

Here's the info you requested on patient Jane Doe, ss# 123456789. She sees Dr. A. at General Hospital. She began fluorouracil approximately 5/15/2011. When he saw her in 2012, he stated that she had been on fluorouracil for a year. Her last visit was 10/14/2012. No cancer!

Example #2: (Standard rule covering official business messages)

From: Sue
To: Linda
Subject: RE: Daily Inpatient Report

General Hospital does have an acute rehab service. Both members are improving considerably with their therapy. Members are Mr. Smith, Mbr Num: 123456 & Mr. Jones, Mbr Num: 234567. They are on a rehab unit.
Financial Lexicons

Personal financial lexicons consist of a set of two lexicons: financial terms and financial identifiers. These lexicon files are designed to work in combination to recognize Nonpublic Personally Identifiable Financial Information as defined in the Gramm-Leach-Bliley Act (GLBA). The lexicons work in conjunction to recognize the intersection of financial identifiers, such as SSNs, account numbers or loan numbers, AND financial terms, such as balance transfer, refinance or deposit.

The following logic is used to identify messages containing nonpublic personally identifiable financial information:

Financial Identifiers AND Financial Terms

ZixCorp personal financial lexicons can be used on the ZixGateway to assist companies in identifying personally identifiable financial information in email traffic.

Below are several example messages that would trigger the personal financial lexicons. The expressions shown in bold font indicate terms that are identified in the lexicons.
Example #1: (Match on financial identifier and financial terms)

From: Linda
To: Sue
Subject: Your Account

Dear Miss Jones,

We here at Big-Mortgage-Finance Corp. have noticed that you have defaulted on loan #123456. We are happy to assist you however possible. Perhaps an automatic payroll deduction could help you make regular bill payments. Please see the attached account summary and submit payment in full as soon as possible to avoid foreclosure.

Example #2: (Match on financial identifier and financial terms)

From: Mike
To: Daniel
Subject: Prepayment Fees

In order to complete the monthly billing, please verify the prepayment fee for the following accounts:

JOHN DOE     111001111    2,630.00
SUE JONES    222002222    4,250.00

Please respond as soon as possible, so we may complete the billing process. Thank you for your assistance.

Credit Card Number Lexicon

Major credit card companies and banks use standard numbering sequences that are unique to each brand of card, such as Visa, MasterCard, or Discover. The Credit Card Number Lexicon can identify most credit card numbers and bank card numbers with matching technology that recognizes the identifiable patterns of numbers that all major credit card companies and banks use.
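As an illustration of brand-specific numbering patterns, the Python sketch below checks a candidate number against simplified prefix and length rules for four brands, then applies the Luhn checksum that payment card numbers commonly satisfy. Both the simplified patterns and the use of a Luhn check are assumptions made for this example; this overview does not state which checks the Credit Card Number Lexicon actually performs.

```python
import re

# Simplified, illustrative brand patterns (prefix + length); not the Zix lexicon.
BRAND_PATTERNS = {
    "Visa": re.compile(r"4\d{12}(?:\d{3})?"),      # starts with 4; 13 or 16 digits
    "MasterCard": re.compile(r"5[1-5]\d{14}"),     # starts with 51-55; 16 digits
    "American Express": re.compile(r"3[47]\d{13}"),  # starts with 34 or 37; 15 digits
    "Discover": re.compile(r"6011\d{12}"),         # starts with 6011; 16 digits
}


def luhn_valid(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, test mod 10."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def card_brand(candidate: str):
    """Return the matching brand name, or None if no pattern and checksum match."""
    digits = re.sub(r"[ -]", "", candidate)  # allow spaces or hyphens between groups
    if not digits.isdigit():
        return None
    for brand, pattern in BRAND_PATTERNS.items():
        if pattern.fullmatch(digits) and luhn_valid(digits):
            return brand
    return None
```

Combining a pattern match with a checksum is one way to reduce false matches on arbitrary long numbers, such as order or tracking numbers.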
SSN Lexicon

The SSN lexicon is designed to identify social security numbers in emails. The lexicon is used to detect 9-digit numbers that meet the format requirements of an SSN and are found in close proximity to a label that identifies the number as an SSN. The SSN lexicon is included in many of the other lexicons, but can also be used independently to identify emails containing SSNs.

State Regulation Lexicons

To assist organizations with state compliance requirements, such as the privacy regulations in Massachusetts, Nevada, California, Texas and many other states, the State Regulation lexicons can be used to detect emails with sensitive content as defined by those laws. The wording in these regulations typically defines sensitive content as personal information, which includes a resident's first name and last name or first initial and last name in combination with any one or more of the following data elements that relate to such resident: (a) Social Security number; (b) driver's license number or state-issued identification card number; or (c) financial account number, or credit or debit card number, with or without any required security code, access code, personal identification number or password, that would permit access to a resident's financial account.

The State Regulation lexicons are designed to detect social security numbers, state-specific driver's license numbers, financial account numbers, and credit and debit card numbers.

Profanity Lexicon

The profanity lexicon is designed to recognize profane and obscene language in email messages. According to Merriam-Webster, profane means "to debase by a wrong, unworthy, or vulgar use." Obscene means "marked by violation of accepted language inhibitions and by the use of words regarded as taboo in polite usage." These definitions form the basis on which this lexicon was designed and developed.
Medical Research Lexicon

The Medical Research lexicon is designed to help organizations identify emails that contain nonsensitive information directly related to research activities. Research information can often be incorrectly identified by the Healthcare and Financial lexicons as being sensitive because it has many common attributes of PHI and personal financial information. In a research environment, the email traffic often contains test results of de-identified patients or animals, and information on grant funding. None of these emails are sensitive, so the Medical Research lexicon is used to identify these messages so that they can be processed appropriately. The ZixResearch Center™ has identified complex expressions that are standard and exclusive to research environments. This lexicon is very effective at identifying messages that deal with nonsensitive research-related topics.

Customized Lexicons

ZixCorp can help customers develop and deploy custom lexicons for ZixGateway or design effective ZixGateway policies that can best implement their corporate email policies. For instance, a client may have specific account number or medical record formats. In this case, the ZixResearch Center will create a lexicon to scan for those specific formats, thereby increasing the accuracy of that client's scanning capabilities. All customizations are performed as a client service, and there is never any charge for this service.
Lexicon Development Process

ZixCorp goes to great lengths to ensure that lexicons are accurate and precise. This is accomplished through a comprehensive definition and design of the lexicons, coupled with exhaustive manual analysis to ensure that the lexicon results agree with the judgment of the lexicon designers. The following example provides a high-level overview of the design process and validation of the lexicons:

1. Standard lexicons designed based on definitions from HIPAA, GLBA, State Regulations or standard of care practices.
2. Jury standard document developed.
3. Message samples gathered from participating partner organizations.
4. Samples manually examined using the jury standard document as a reference.
5. Reference sources identified to ensure comprehensive content, including medical dictionaries, professionally-accepted terminology lists, legislation, etc.
6. Lexicons constructed and run against message samples.
7. Lexicon results compared to manual results.
8. Lexicons tuned and rerun against sample until performance is optimized.
9. Revisions made based on changes in the definition of sensitive information and continuous collection of message samples.
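The comparison of lexicon results to manual results (steps 7 and 8 above) can be sketched in Python as follows. The function name and the choice of precision and recall as the measures are assumptions made for illustration; this overview does not specify which metrics ZixCorp uses to judge when performance is optimized.

```python
def compare_to_manual(lexicon_flags, manual_flags):
    """Compare lexicon decisions with manual decisions on the same samples.

    Each list holds one boolean per message sample:
    True means "sensitive", False means "not sensitive".
    """
    if len(lexicon_flags) != len(manual_flags):
        raise ValueError("both lists must cover the same message samples")

    pairs = list(zip(lexicon_flags, manual_flags))
    true_pos = sum(1 for lex, man in pairs if lex and man)       # both say sensitive
    false_pos = sum(1 for lex, man in pairs if lex and not man)  # lexicon over-flags
    false_neg = sum(1 for lex, man in pairs if not lex and man)  # lexicon misses

    flagged = true_pos + false_pos
    sensitive = true_pos + false_neg
    return {
        "false_positives": false_pos,
        "false_negatives": false_neg,
        # Share of flagged messages that reviewers also judged sensitive.
        "precision": true_pos / flagged if flagged else 0.0,
        # Share of truly sensitive messages that the lexicon caught.
        "recall": true_pos / sensitive if sensitive else 0.0,
    }
```

In a tuning loop of this kind, false positives would point to terms or patterns that are too broad, and false negatives to content the lexicon does not yet cover.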
Content of Zix Lexicons

The section below lists the basic information that each of the standard Zix lexicons includes in its scanning formats.

Health Identifiers
  SSNs
  Vehicle Identification Numbers
  Member Numbers
  Medical Savings Account numbers
  Medical Record Numbers
  Subscriber Numbers
  Patient ID numbers
  (All of the above are only found when in close proximity to a number at least 5 digits long)
  Admit dates
  Dates of Birth
  Dates of Death
  Dates of Discharge
  Dates of Service
  (All of the above are only found when in close proximity to a date)
Health Terms
  Diseases
  Chemicals, Drugs, and Analytic, Diagnostic or Therapeutic Techniques
  Substance Use or Abuse
  Mental Health Terms
  Medical Records Information
  Insurance Information
  Medications

Personal Financial Identifiers
  SSNs
  Vehicle Identification Numbers
  Account Numbers
  Certificate Numbers
  Loan Numbers
  Policy Numbers
  Customer Numbers
  (All of the above are only found when in close proximity to a number at least 5 digits long)

Personal Financial Terms
  Banking Terms
  Investment Terms
  Mortgage Terms
  General Financial Terms

Credit Card Number Lexicon
  Mastercard formats
  Visa formats
  American Express formats
  Carte Blanche / Diners Club formats
  Discover formats
  Enroute formats
  JCB formats
Social Security Number Lexicon
  Hyphenated 9-digit valid SSN sequence (nnn-nn-nnnn)
  9-digit valid SSN sequence (nnnnnnnnn) in proximity of an SSN identifier (the phrase SSN, or SS, etc.)
  9-digit valid SSN sequence separated by spaces (nnn nn nnnn) in proximity of an SSN identifier (the phrase SSN, or SS, etc.)

State Regulation Lexicons
  SSNs
  Account Numbers
  State-specific driver's license formats
  Generic driver's license formats
  Debit/Credit Card Numbers
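The three Social Security Number Lexicon formats listed above can be sketched in Python as shown below. Several details are assumptions made for this example: the 20-character proximity window, the short list of labels, and the validity rules (no area number of 000, 666 or 900-999, no group number of 00, no serial number of 0000). This overview does not define "valid SSN sequence" or "proximity" precisely.

```python
import re

# One pattern for all three formats; the separator must be used consistently.
SSN = re.compile(
    r"\b(?P<area>\d{3})(?P<sep>[- ]?)(?P<group>\d{2})(?P=sep)(?P<serial>\d{4})\b"
)
# Hypothetical label list standing in for "the phrase SSN, or SS, etc."
LABEL = re.compile(r"\b(?:ssn|ss|social security)\b", re.IGNORECASE)
WINDOW = 20  # assumed proximity window, in characters, on each side of the number


def plausible(area: str, group: str, serial: str) -> bool:
    """Reject sequences that are never issued as SSNs (assumed validity rules)."""
    return (
        area not in ("000", "666")
        and not area.startswith("9")
        and group != "00"
        and serial != "0000"
    )


def find_ssns(text: str) -> list:
    """Return SSN-like strings: hyphenated ones always, others only near a label."""
    hits = []
    for m in SSN.finditer(text):
        if not plausible(m["area"], m["group"], m["serial"]):
            continue
        nearby = text[max(0, m.start() - WINDOW): m.end() + WINDOW]
        if m["sep"] == "-" or LABEL.search(nearby):
            hits.append(m.group(0))
    return hits
```

Requiring a nearby label for the unhyphenated and space-separated forms reflects the list above: a bare 9-digit number is too ambiguous to flag on its own.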
About Zix Corporation

Zix Corporation (ZixCorp) provides the only email encryption services designed with your most important relationships in mind. The most influential companies and government organizations use the proven ZixCorp Email Encryption Services, including WellPoint, Humana, the SEC and more than 1,200 hospitals and 1,300 financial institutions. ZixCorp Email Encryption Services are powered by ZixDirectory℠, the largest email encryption community in the world. The tens of millions of ZixDirectory members can feel secure knowing their most important relationships are protected. For more information, visit www.zixcorp.com.

For more information about ZixCorp Email Encryption Services, contact ZixCorp at 866-257-4949 or email sales@zixcorp.com.
Copyright and Trademarks Notice

This manual, ZixGateway™ software and other computer software offered by ZixCorp Systems, Inc. and its affiliates (collectively "ZixCorp") are the property of ZixCorp and are copyrighted. Your use of ZixCorp property and services is governed by the services agreement and/or license accompanying the original media. Your right to copy ZixCorp property is limited by copyright law. Unauthorized duplication or distribution of the software, or any portion of it, may result in severe civil or criminal penalties, and will be prosecuted to the maximum extent possible under the law.

ZixCorp Systems, Inc. 2002-2010 All Rights Reserved. Protected Under U.S. Patent Laws.

The following are registered marks of ZixCorp or its affiliates and are protected by trademark laws under U.S. and international law: ZixAuditor, ZixCorp, ZixGateway and ZixResearch Center. All other brand and product names are trademarks or registered trademarks of their respective holders.

Contact Information

Zix Corporation
2711 N. Haskell Avenue
Suite 2300, LB 36
Dallas, TX 75204-2960
Telephone: (214) 370-2000, (888) 771-4049
Fax (Main): (214) 370-2070