NORTHEAST CONSORTIUM NORTHEAST RESEARCH CONSORTIUM: Guide to Using Real Time Data for LMI Analysts. March, 2012


NORTHEAST CONSORTIUM
Understanding the Labor Market in New Ways

NORTHEAST RESEARCH CONSORTIUM: Guide to Using Real Time Data for LMI Analysts
March, 2012

This workforce solution was funded by a grant awarded by the U.S. Department of Labor's Employment and Training Administration. The solution was created by the grantee and does not necessarily reflect the official position of the U.S. Department of Labor. The Department of Labor makes no guarantees, warranties, or assurances of any kind, express or implied, with respect to such information, including any information on linked sites and including, but not limited to, accuracy of the information or its completeness, timeliness, usefulness, adequacy, continued availability, or ownership. This solution is copyrighted by the institution that created it. Internal use by an organization and/or personal use by an individual for non-commercial purposes is permissible. All other uses require the prior authorization of the copyright

Section 1: THE LMI UNCERTAINTY PRINCIPLE

Perhaps the most important distinction to make when dealing with online job postings is that they are not the same as job openings or vacancy rates. They are created by employers for their own unique and specific purposes, which do not include developing reporting data for use by LMI analysts. Online job postings may be used to attract applicants, but they are also often written to solicit a specific response or simply to test the market. Online job postings are prevalent in some occupational areas and limited or nonexistent in others. The actual postings are usually vague about geographic location and frequently omit information such as salary, educational requirements, and experience requirements.

Also, there is not necessarily a one-to-one relationship between an online job posting and a job opening. An employer might:

- Create a single posting for several openings with the same or similar qualifications in the same location or multiple locations;
- Use an online job posting to better understand the local workforce without having an actual job available;
- Post several times on different websites for a single job opening, sometimes using different language in the description for each site, making de-duplication difficult or impossible;
- Never make some vacancies public, but fill them internally, through word of mouth, etc.;
- Use advertising methods that do not generate online postings, such as signs in the windows, union hiring halls, radio ads, and employee referrals.

As a result, in the world of online job postings, there are no standard practices, rules, or conventions as to form or function.

How Online Job Postings Differ from Job Vacancy Data

THE LMI UNCERTAINTY PRINCIPLE 1: Output, i.e., projections and conclusions, cannot be better than input, i.e., the online job postings themselves.

- Postings are sourced, using spidering technology, from thousands of sites, frequently grabbing multiple copies of a single posting.
- Both the postings and the websites are of varying quality.
- The postings are designed primarily for recruitment, not for analysis.
- Analysis has shown that the postings are biased upscale in the labor market.
- There is significant daily fluctuation in online job postings.
- Online job postings are influenced by factors unrelated to real changes in labor demand.
- Trying to extract real information from short time periods produces more noise than signal; three- to six-month chunks of data seem to be required to smooth out the noise.
- Geographic units of analysis are likely to be large, i.e., at the State level, because of the way job locations are posted. Zip code level data is not reliable because very few jobs post an exact location that can be tied to a zip code.

THE LMI UNCERTAINTY PRINCIPLE 2: The more granular you make any parameter, the less you can know about other parameters.

- Analyzing skills at the occupational level will require large spans of both time and geography.
- Projections may be possible, but may not be capable of being isolated to individual occupations except in areas with very high concentrations of that occupation.

What Have We Gleaned From Online Job Postings Data?

Advertising Behaviors: An analysis of postings based on job family (two-digit SOC level) showed common patterns of advertising behavior. Posting methods differed by job family, even for the same employer. Large employers and corporations were more likely to post all jobs online. Smaller businesses were more likely to post only executive or high-skill positions online.

Data Quality and Quantity: The ability of automated software programs to generate accurate data from online postings is affected by both the quality and quantity of the postings. In some job families, larger quantities of postings have allowed the software to produce data that is measurably accurate for that job family. In other job families, postings are commonly of high quality, i.e., detailed in content, which also allows the software to produce measurably accurate data. Job families with a negligible number of postings (e.g., agriculture, forestry and fishing, building and grounds cleaning and maintenance, etc.) yielded data of limited accuracy. Job families with quality postings have the potential to be analyzed at the four-digit SOC level if there is also sufficient quantity (e.g., computer and math, healthcare practitioners, technical occupations, etc.).

Context Is Important: Understanding the context of terminology is important. Some terms and phrases have different meanings depending on the occupation to which they are applied. For example, "reuse" applies to software reuse, workforce reuse, and water reuse, but the meaning is quite different for each. This context matters because the meaning of the words can change how the skill is classified. The word "reuse" alone could be green, for instance, but it is not green when modified by "software" and is probably green when modified by "water." Analysis software can be trained to recognize these context items, but such training requires significant work by trained LMI analysts.
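
The "reuse" example can be illustrated with a small rule-based sketch. This is not the vendor's parser, which the guide describes as natural-language based; it is a minimal illustration that assumes the word immediately before "reuse" is the only context consulted. The modifier sets and the function name `classify_reuse` are hypothetical.

```python
# Hypothetical modifier lists; a real taxonomy would be built and vetted by LMI analysts.
GREEN_MODIFIERS = {"water"}
NON_GREEN_MODIFIERS = {"software"}


def classify_reuse(text):
    """Label each occurrence of 'reuse' by the word that precedes it."""
    words = [w.strip(".,;:()").lower() for w in text.split()]
    labels = []
    for i, word in enumerate(words):
        if word != "reuse":
            continue
        modifier = words[i - 1] if i > 0 else ""
        if modifier in GREEN_MODIFIERS:
            labels.append("green")
        elif modifier in NON_GREEN_MODIFIERS:
            labels.append("not green")
        else:
            # Unmodified or unfamiliar context is left for an analyst to decide.
            labels.append("needs analyst review")
    return labels


print(classify_reuse("Design water reuse systems."))
print(classify_reuse("Experience with software reuse and component libraries."))
```

Even this toy version shows why analyst time is needed: every new modifier has to be placed in one list or the other by someone who understands the occupation.
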
Postings Data Are Complementary to Existing LMI: As with the majority of LMI data, online job postings do not represent a complete picture of the labor market. The potential value of this piece is in its timeliness and in the fact that it measures the flow of jobs, as opposed to the stock measures usually found in LMI. In preliminary comparisons, the online postings data was found to have similar patterns to JOLTS data, one of the few flow measures in traditional LMI. Potentially, existing LMI could be used as a benchmark for online job posting patterns, providing confidence that past data had been extracted within acceptable error limits and thus providing a measure of confidence that future data is also being extracted within acceptable error limits.

Not All Areas Are Created Equal: Geographic location for an advertised job is not universally available. Rural areas were found to have a limited share of job ads, while urban areas hold the greatest share. Frequently the actual location of a job is not included in the job posting, causing the data parser to default the listing to the central zip code of a large labor market, clearly an unreliable zip code location. Because of this limited location accuracy, the Consortium's goal of providing data for all substate areas was not achievable at present and may never be. Real time data reported at any level below a major metropolitan area is suspect and will substantially under- or overstate the online posting counts. Real time LMI providers that produce city, county, or zip code level data should be asked to disclose their coding and modeling methodologies for review, and data produced at those levels should be considered questionable for any use until the analyst is satisfied with the response.

Data Source Is Important: Sources primarily affect the data in two ways. First, data from some spidered sites simply lack content, compromising the parsing and occupational coding processes, i.e., you cannot analyze what does not exist. It is important to understand your vendor's spidering and de-duplication processes. Second, there can be artificial increases in the volume of job postings when source domains are added to the spidering process. This can be irrelevant if you are doing a point-in-time analysis, but it is an impediment to time series analysis. It is important to know whether your vendor is continuously adding new sites in an attempt to provide a reasonable representation of the scope of the labor market or whether your vendor is limiting the number of sites to a stable subset of reliable and representative sites.

There may not be a single version of a real time database that can be used for all purposes, and maintaining multiple versions may prove to be prohibitively expensive. Pursuing the expansion of spidering to cover all large corporate sites, large national boards, and smaller regional and occupation niche boards may be the only way to get representative counts in all local areas and all occupations. However, narrowing the number of sites to a stable but representative group so that a time series can be created produces data which tracks more closely with other external measures (the Help Wanted Online database is the best example of this approach).

For analysis of the skills and other requirements of a particular class of job, it might prove useful to create a database that is fed only by high quality corporate, national, and niche job boards. High quality in this context means boards that contain extensive job postings with some or all of the information fielded (e.g., education or experience requirements are always found in the same place). Free posting sites like Craigslist or Snagajob are likely to be excluded. This constrained source database would not be used to gauge absolute volume, but it may be very useful in showing proportions, such as skill mix, education requirement distributions, and experience requirement distributions.

What Issues Must Be Addressed When Publishing Postings Data?

Data Labeling: Results should be labeled "online job postings," "job ad volume," "online job ads," or something similar, to accurately reflect the data being analyzed. The data does not represent job openings or vacancy rates. Incorrect uses of these terms, i.e., confusing job ads with job openings, have already been seen in media stories on various published data series.

Data Accuracy: Data made available to the public should come only from data fields with a reasonable measure of accuracy. Data fields that cannot be verified to some standardized measure, whether due to lack of information within the postings or due to error in the analysis software, should not be published. Continuing standardization will be needed for ongoing research. Job posting data is fluid, and continued analysis will be needed.

Buyer Beware: When publishing information based on online job postings, you must also present data limitations, caveats, potential error rates, etc. Some end users may choose to use the data inappropriately anyway, but the same holds true for any LMI that is published (and is frequently misquoted and used out of context). We do recommend that only data with minimal caveats or disclaimers be released to the public. We also recommend that high level policy makers use online postings data in their decision making process only if it is combined with traditional LMI.

Although there is still a great deal of analytical and procedural work to be done, the information extracted from online job postings is still worthwhile, if used appropriately.

Realities and Limitations of the Data

(It is important to note that there was not complete consensus by all of the Consortium members on all of these points on this page and the next, or on the value of online job postings in the decision making process. However, these two pages represent the best aggregation that everyone could live with.)

DATA FIELD: CURRENT LEVEL OF CONFIDENCE

Occupation: Generally 2-digit, depending on job family.

Geography: State level.

Industries: Unknown.

Skills: Currently of limited utility and only with analyst review. Shows promise, but needs more research with an analyst looking at context and job family.

Job Title: Yes.

Firm Name: Sometimes, but questionable, due to so many variations of the name. Requires an analyst to vet.

Education: Sometimes, but interpretation and context are important (degree might be required, might be preferred). Also, many online postings for jobs with clear degree requirements (e.g., lawyers, nurses, etc.) will not include any stated requirement, letting the degree level be implied.

Certification: Caveat: low % of fields listed. When it is listed, it is generally valid, but it still needs an analyst to review, based on the low percentage of listings.

Greenness: Sometimes; should only be used for research purposes with an analyst review for interpretation and context. Analysis based on skill words generally produces much more accurate results than coding by occupation. The primary problem is the rate of false positives (calling a job green when it is not).

Timeframe: Better to use quarterly data or a 3-month moving average to smooth out spikes; monthly is possible (calendar and 30-day timeframes), depending on the report.

Job source: Yes.

WHAT THE DATA CAN DO:

- Can show which websites post job ads
- Can show which firms post job ads
- Can describe data at the state level or other large geographic area
- Can display accurate information at most 2-digit occupational codes
- Can compare job titles with other variables
- Can use postings on a quarterly or, sometimes, on a monthly basis, depending on the report; usually better to use quarterly data or a 3-month moving average to smooth out spikes
- Can improve quality by selectively removing bad sources of data, using Quality Assurance & Data workgroups
- Can show educational levels matched to job titles, especially in high-skill jobs, when education is listed or inferred and with an analyst looking at interpretation and context
- Can provide information on certifications within certain occupations, when it is posted and with an analyst review
- Can look at skills and certifications together to get a more complete picture, with an analyst reviewing, especially for context and job family
- May be able to do firm name based on query (good for green jobs, comparative analysis, and profiling online job ad posters)
- May be able to get an up/down trend on a known list of skills
- May be able to show increase/decrease in job titles
- Can show job titles in demand
- Can show top volume O*NET codes

WHAT THE DATA CANNOT DO:

- Cannot equate job posting with job vacancy
- Cannot eliminate all duplicates
- Cannot get the universe of job openings
- Cannot get all 6-digit occupational codes
- Cannot do industries
- Cannot know negative biases on anything not posted or parsed
- Cannot use skills without vetting
- Cannot project job postings yet, but modeling shows promise out to 6 months
- Cannot provide consistent representation of postings from month to month
- Cannot show salary or benefits
- Cannot publish zip code or any substate level data
- Cannot always determine educational needs in most postings
- Cannot always determine required vs. preferred educational level (where listed)

Section 2: OVERVIEW OF REAL TIME DATA

With employers increasingly turning to the Internet to advertise opportunities, online job postings have become the primary vehicle for many job seekers. There have been mixed responses to this development. Some argue that given the size of today's applicant pool, firms have become ever more selective when sorting through résumés, prompting rumors of discrimination against the unemployed. [1] Others contend that online intermediaries alleviate labor market imperfections, often associated with imperfect information and adverse selection. [2]

Labor market analysts, workforce professionals, and researchers are increasingly in need of diverse and robust sources of data in order to be more responsive to the changing economy. In recent years, real time data, most commonly in the form of online job postings, has become a staple of some labor market information (LMI) staff, and its use is expected to increase in the years ahead. For example, the Projections Managing Partnership, which coordinates the nationwide employment projections, has stated that the new projections interface will likely include data from online postings alongside the standard information, such as total employment and new and replacement job growth. [3]

In this changing environment it is critical for researchers, policy makers, and other data users to have a broad understanding of what real time data is, how it is obtained, and what it can and cannot accomplish, as well as how it relates to more commonly used sources of data or traditional labor market information. This section provides an overview of the analytical uses of online job postings, outlines the methodology used by the Northeast Consortium's data provider (Burning Glass Technologies), and briefly discusses the relationship between online postings and traditional sources of labor market information.

[1] The Help-Wanted Sign Comes With a Frustrating Asterisk, NYT ( wanted ads exclude the long term jobless.html?_r=1&smid=fb nytimes&wt.mc_id=bu SM E FB SM LIN HWA NYT NA&WT.mc_ev=click)
[2] The Economics of Labor Market Intermediation, VOX Policy Research ( crisis debate.com/index.php?q=node/2500)
[3] Presentation by Ms. Alexandra Hall, Director, Office of Government, Policy and Public Relations, Colorado Dept. of Labor & Employment

The Promise and the Pitfalls

Traditional LMI includes systematically collected data from either administrative records, such as the Quarterly Census of Employment and Wages (QCEW), or from surveys, such as the Occupational Employment Statistics (OES) wage survey. This methodologically sound data is both valid and reliable; however, there is a significant time lag between when data is collected and when it is released for analytical purposes. Online job postings can be compiled into a database and analyzed as a proxy for recent demand in the labor market. In contrast to traditional LMI, online job postings ("real time data" and "online job postings" are used interchangeably in this report) offer a more timely data source for analysis. As previously stated, online job postings were designed for recruitment, not analysis; therefore, postings databases lack the validity and reliability of traditional LMI. Even so, they have the potential to provide an innovative tool for analysis of issues in labor demand, especially related to industries, such as green, where many of the occupations and their accompanying knowledge, skills, and credentials are still emerging.

Aggregated job postings databases include many variables that are absent from traditional data sources, such as skills, education requirements, industry-based certifications, and employee benefits associated with job postings. The promise of the data has resulted in increased interest in their use for analysis. The potential benefits of job postings analysis include: 1) unique features that position postings as the solution for curriculum development; 2) incorporation into models for more accurate short-term projections; and 3) a source of emerging skills that have not made it into the standardized knowledge, skills, and abilities taxonomies produced by O*NET. These benefits and others have resulted in the increased interest in and use of postings in recent years.

Job postings databases compiled from various Internet sources are not without limitations. While the diversity of sites used as a source of data is an asset, it also creates problems not found in traditional labor market data. The quality varies widely. Postings often do not include all of the skills, knowledge, or credentials required for a position (and it is impossible to determine what proportion of the requirements is included). In addition, postings are a proxy for demand, but they do not represent actual hiring. While not all postings are for open positions, it is also true that not all open positions are posted on the Internet. Word of mouth, referrals, and even signs in the window continue to be preferred recruitment methods for some industries and firms. There are also occupations where recruitment rarely occurs online. The number of jobs posted online is higher in fields such as computer science and management, but lower in fields like construction and farming. Many of these factors result in the postings having a generally upscale bias in terms of the types of positions listed. That upscale bias may actually be a benefit for the use of real time data to help guide curriculum development because many of the positions not recruited online are either low skill or require only on-the-job training.

Data Field Specifics

What the data can do

There are several variables in the online postings that are of value for data users and consumers.

Unit of time: The job postings data used by the Consortium includes a variable for the date of the posting. Data quality reviews concluded that the minimum unit of time for analysis is one month because of the variability of postings on a daily or weekly basis, but that it was generally better to use at least quarterly data or a 3-month moving average. The use of a moving average is common for highly volatile data series like the unemployment claims data. As noted above, the more granular the other elements of analysis are (e.g., occupations, levels of geography, skills), the longer the time period of analysis should be.

Geography: State level and large geographic areas are the best geographic units of analysis. As described in the limitations section, the location information available in the postings is not as specific as it may seem. Zip code level data was deemed unreliable and misleading.

Job title: Job titles from postings are useful variables for analysis. Titles can be compared over time to determine if some are increasing or decreasing in frequency. They are also valuable when used in conjunction with other variables (e.g., education, where available). Titles may provide a window into emerging occupations and, with further analysis, emerging skills.

Firm name: The name of the firm is not available for all postings (because it is not listed on some sites), but it can be analyzed when it is included. Useful information that can be gained from firm name analysis includes the types of postings (occupations and job titles) listed for specific firms, seasonal hiring patterns, and a better understanding of which firms post jobs online.
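
The 3-month moving average recommended under "Unit of time" can be sketched as follows. This is a minimal illustration using a trailing window; the monthly counts are made-up numbers, not Consortium data, and `moving_average` is a hypothetical helper name.

```python
from statistics import mean


def moving_average(counts, window=3):
    """Trailing moving average; positions without a full window are None."""
    smoothed = []
    for i in range(len(counts)):
        if i + 1 < window:
            smoothed.append(None)
        else:
            smoothed.append(mean(counts[i + 1 - window : i + 1]))
    return smoothed


# Hypothetical monthly posting counts for one job family in one state.
monthly_postings = [1200, 1800, 900, 1500, 2100, 1200]
smoothed = moving_average(monthly_postings, window=3)
print(smoothed)
```

The raw series swings between 900 and 2,100, while the smoothed series rises steadily, which is the kind of spike suppression the guide has in mind. A longer window would be appropriate when the analysis is also narrowed by occupation, geography, or skill.
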

Skills and certifications: Skills are not clearly represented in the postings, and certifications are included in only limited groups of occupations. "Skills," in this context, is shorthand for a mix of qualifications and job duty statements that might more broadly be thought of as skills, abilities, theoretical and applied knowledge, common tasks, and types of experience. Terms and abbreviations can have more than one meaning; for example, P.T. might mean Physical Therapist or Part Time. Postings frequently do not specify qualifications, apparently on the assumption that those who are qualified already know them. For example, lawyers know they need to pass the Bar, and nurses know they need a degree of some kind and need to be licensed. Furthermore, the data analysis showed that the skills field often functions as a catch-all category into which skills, certifications, occupations, and qualifications were parsed. Specific skills and knowledge areas, such as knowledge of a computer language, are relatively easy to identify, as are basic skills such as communication. Other terms do not parse as neatly. This area requires a great deal more research. Further research focused on extracting and categorizing skills and certifications should bear fruit, because online postings are the only real source of this data. As noted earlier, one of the key efforts is to increase the ability of analytical software to understand context.

Source Domain: The website from which the job posting was captured can be a useful analytical tool. It can point to websites that generate disproportionate errors from the analytical software because of how the job information is formatted or how the site displays search terms and related jobs. It is potentially useful in generating stable time series databases. However, analysts need to be aware of shifts in domains that are unrelated to real changes in the data stream. Tracking successor and predecessor domains is a critical component of being able to track real changes, rather than simply a company's decision to redesign its website. This successor/predecessor issue should be familiar to any analyst who has tried to clean up wage record or QCEW files. Analysts need to be cautious about spikes in data from particular domains. Increases or decreases may be the result of changes in spidering, such as the addition of new websites or a broken spider. Such shifts may change the relative importance of a site in terms of producing master jobs, i.e., the original jobs against which duplicates are measured.
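
The idea of a master job, and of a domain's share of master jobs, can be made concrete with a small sketch. The guide does not describe the vendor's de-duplication algorithm, so everything here is an assumption: postings are treated as duplicates only when their normalized title, employer, and description match exactly, and the first copy seen becomes the master. The field names and the `.example` domains are hypothetical.

```python
import hashlib
import re
from collections import Counter


def fingerprint(posting):
    """Hash of the posting text after lowercasing and collapsing punctuation."""
    text = " ".join([posting["title"], posting["employer"], posting["description"]])
    normalized = re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def assign_masters(postings):
    """Return master-job counts per domain and (duplicate_id, master_id) pairs."""
    masters = {}
    master_counts = Counter()
    duplicates = []
    for posting in postings:  # assumed to be in order of capture
        key = fingerprint(posting)
        if key in masters:
            duplicates.append((posting["id"], masters[key]["id"]))
        else:
            masters[key] = posting
            master_counts[posting["domain"]] += 1
    return master_counts, duplicates


# Hypothetical postings: ids 1 and 2 are the same job captured from two sites.
sample = [
    {"id": 1, "domain": "board-a.example", "title": "Registered Nurse",
     "employer": "General Hospital", "description": "Full-time RN, night shift."},
    {"id": 2, "domain": "board-b.example", "title": "REGISTERED NURSE",
     "employer": "General Hospital", "description": "Full time RN - night shift"},
    {"id": 3, "domain": "board-a.example", "title": "Line Cook",
     "employer": "Corner Diner", "description": "Evenings and weekends."},
]
master_counts, duplicates = assign_masters(sample)
print(dict(master_counts), duplicates)
```

Because the first copy captured becomes the master, adding a new domain or fixing a broken spider changes which site is credited with the master job even when the underlying jobs are unchanged. Exact matching also misses postings that are reworded from site to site, which is the harder case the guide warns about.
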

What the data cannot do

In addition to the types of analysis that can be conducted with online job postings data, there are also a number of analyses which are not advisable at this time.

Location: Exact job locations are difficult to pinpoint. Recruitment often takes place regionally, and job postings commonly list the largest city near the work location, such as "the Boston area." Unless the posting includes a precise address for the company (and even this is not always the place the new employee would be working), the general location is at best a proxy. The location related variables (e.g., city, county, zip code, latitude and longitude) should be used with caution. Geographies below the metro area level are likely to contain significant over- or undercounts, depending on where the analysis software decided to locate a job that lists only a broad recruitment area.

Salary: In most online job postings, salary information is not included. When a salary is listed, it can be presented in many different forms (e.g., annually, monthly, weekly, or hourly). Postings with some salary information may include dollar signs or other abbreviations that cause inconsistency when parsing the information into variables. For all of these reasons, this data field is not considered usable.

Occupational Coding: Occupational coding beyond the two-digit SOC code level is problematic. Overall, two factors influence the ability to analyze data by job family (two-digit SOC). First, job postings do not, as a general rule, include occupational codes. Thus, the accuracy of occupational coding depends on the data provider's ability to accurately classify a job title into an occupational taxonomy. Within the data file used by the Consortium, the overall accuracy varied by type of job. For example, computer and mathematics jobs were usually coded correctly, but farming, fishing, and forestry jobs were rarely coded correctly.

The second factor influencing job family analysis is that the jobs posted online do not represent the universe of all occupations. Even if correctly coded into a taxonomy, there is limited representation for some job families (e.g., construction), while others have substantial representation in the postings (e.g., information technology and computer related occupations). The extent of the underrepresentation can be estimated for job families that have a tight link to a 3-digit NAICS code. For instance, NAICS 722 (Food Services and Drinking Places) would be tightly linked to SOC job family 35 (Food Preparation and Serving Related Occupations). Using separations and new hire data from Census LED, one could estimate the proportion of turnover that would be expected in that industry compared to all others. That estimate of turnover could then be compared to the proportion of job postings found in job family 35 to see the extent of the likely underrepresentation.
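
The comparison described above reduces to a ratio of two shares. The sketch below assumes new hires are used as the turnover measure and uses invented counts; it is one possible reading of the method, not a procedure the Consortium documented. The function name `representation_ratio` is hypothetical.

```python
def representation_ratio(industry_hires, all_hires, family_postings, all_postings):
    """Observed share of postings divided by the expected share of turnover.

    A value near 1.0 suggests postings track turnover; a value well below 1.0
    suggests the job family is underrepresented in online postings.
    """
    expected_share = industry_hires / all_hires
    observed_share = family_postings / all_postings
    return observed_share / expected_share


# Hypothetical figures: the industry accounts for 12.5% of new hires,
# but the linked job family accounts for only 3% of postings.
ratio = representation_ratio(
    industry_hires=50_000,
    all_hires=400_000,
    family_postings=3_000,
    all_postings=100_000,
)
print(f"{ratio:.2f}")
```

With these made-up inputs the job family appears in postings at roughly a quarter of the rate its turnover would predict. The estimate is only as good as the link between the industry and the job family, which is why the text limits it to tightly linked pairs.
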

Qualifications: Qualifications for job applicants, such as education, can be difficult to determine. This manifests itself in a number of ways: for example, most postings do not include the desired level of education, or when they do, it can be unclear whether it is a required or preferred qualification. In addition, some basic prerequisites (e.g., passing the Bar exam for an attorney) that may be an industry or occupational standard are rarely listed because it is assumed that qualified applicants already know this requirement. As noted above, the full skill set required for each posting is typically not included.

Industries: Online job postings are just that: jobs. Industry information, and thus the ability to accurately assign a NAICS code, is rarely included in the postings. Frequently the posting is spidered from a site other than that of the employer, and a firm name is not listed.

Burning Glass Methodology

The Consortium's job postings data was provided by Burning Glass Technologies, a Boston-based firm that aggregates online postings and offers a variety of products and services related to job matching. Postings are collected daily from thousands of private and government job boards and websites, newspapers and other media outlets, corporate job boards and websites, and community sites for employment opportunities. They are collected through spidering technology, which crawls the web for suitable content. The text of each posting is then analyzed using a natural language based artificial intelligence, which allows context, not simply a list of rules or look-up tables, to be taken into account when parsing the text into variables. A dataset of over 60 variables is produced, ranging from the most basic (job title, company name, city) to more complex concepts (skills and credentials). For the Northeast Consortium project, a taxonomy of green skills was developed to serve as a reference, so that when the parsing technology encounters a term representing a green skill, the term can be standardized in the database.

Job Postings Data and Traditional LMI

Limited analysis has been done comparing trends in online job posting data to traditional labor market information. The Job Openings and Labor Turnover Survey (JOLTS) is not identical to postings data, but it is a helpful source of comparison. The two data sources differ in size; however, the gap has changed over time. In September 2004, there were over 4.1 million total private job openings from JOLTS, versus about 1.3 million total online postings from Burning Glass; by summer 2010 this gap had narrowed to about three million openings for JOLTS and two million online postings for Burning Glass. This was likely due to a number of factors, such as increased use of online posting by firms and an increase in the number of sites spidered by Burning Glass. More importantly, each series follows the same broad trends over time. And, although job postings data do track closely with Current Employment Statistics (CES), there is insufficient evidence to suggest that the data is a leading indicator.
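
The narrowing gap is easier to see as a ratio of postings to openings. The short calculation below uses the rounded figures quoted above and treats the 1.3 figure as millions, which is an assumption based on the surrounding numbers.

```python
def postings_to_openings_ratio(postings_millions, openings_millions):
    """Online postings as a fraction of JOLTS job openings."""
    return postings_millions / openings_millions


# Rounded figures from the text, in millions.
ratio_2004 = postings_to_openings_ratio(1.3, 4.1)  # September 2004
ratio_2010 = postings_to_openings_ratio(2.0, 3.0)  # summer 2010
print(f"2004: {ratio_2004:.2f}, 2010: {ratio_2010:.2f}")
```

On these rounded figures, postings rose from roughly a third of JOLTS openings to roughly two thirds. The ratio says nothing about overlap between the two series, since a posting is not an opening.
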

Section 3: DEFINING GREEN

Despite the recent momentum behind green jobs, there is still no universally accepted definition of green. A number of scholars have explored the workings of the green economy, but they all face the challenge of defining and quantifying this nebulous concept. In an effort to advance the conceptualization of the green economy, the Consortium chose to deviate from the traditional survey-based approach and concentrate instead on analysis of online job postings. Data mining a universe of job ads gives researchers not only a time advantage, but also greater flexibility with respect to defining green. It also gives them the ability to modify the framework for what constitutes green (i.e., clusters, search triggers, etc.). More importantly, this method allows us to track the greening of the economy more efficiently than using green revenue as a proxy. 4

Despite its strengths, this approach also has limitations. The Consortium had intended to conduct an analysis of robust data; however, much of the time and focus was spent on improving the artificial intelligence parser and on the ongoing identification of green occupations and green skills. Initially it was believed that an industry-based (NAICS) approach would yield the most effective outcomes. It quickly became evident that not all jobs within an industry, or even a job family, could be considered green and, most importantly, industry coding proved to be one of the most error-prone variables in those rare instances where it was included in an online job posting. Therefore, a two-tiered approach was established: first, identification of green skills phrases, and second, creation of a green firms list.

The foundation of this methodology is the green taxonomy, which is primarily derived from O*NET's descriptions of green occupations.
In cooperation with our vendor, we closely followed each occupation's description and, in turn, built and continuously updated a database of nearly 900 key phrases pertaining to green. 5 In effect, this is the basis of real-time green demand analysis: the prevalence of these tasks and phrases in real-time data serves as a gauge of demand for green jobs. More specifically, postings that are validated through this taxonomy are flagged as green, signaling that certain green skills were listed in the job posting. Consider, for example, O*NET's description for Construction Managers: Apply green building strategies to reduce energy costs or minimize carbon output or other sources of harm to the environment. Working from the above description, the following strings would all be included in the taxonomy to serve as trigger words in identifying green jobs: green, building, minimize carbon output, environment. The taxonomy can be easily improved with the addition of like terms and, similarly, the elimination of those deemed obsolete.

4 To proxy green jobs, some earlier studies used firms' green revenue as a share of the total. The accuracy of such estimates is unsettled.
5 The Appendix contains the full taxonomy and clusters.
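The flagging step described above can be sketched as a phrase match against the taxonomy. This is a minimal illustration and not the vendor's parser, which the guide describes as context-aware. The three phrases are a hypothetical excerpt drawn from the O*NET sentence quoted above, and the function names are assumptions.

```python
import re

# Hypothetical excerpt; the Consortium's taxonomy holds nearly 900 phrases.
GREEN_PHRASES = ["green building", "minimize carbon output", "energy costs"]


def compile_taxonomy(phrases):
    """Build one case-insensitive pattern that matches whole phrases only."""
    ordered = sorted(phrases, key=len, reverse=True)  # prefer longer phrases
    body = "|".join(re.escape(p) for p in ordered)
    return re.compile(r"\b(" + body + r")\b", re.IGNORECASE)


def green_hits(text, pattern):
    """Return the distinct taxonomy phrases found in a posting, lower-cased."""
    return sorted({m.group(1).lower() for m in pattern.finditer(text)})


def is_green(text, pattern):
    """A posting is flagged green if at least one taxonomy phrase appears."""
    return bool(green_hits(text, pattern))


pattern = compile_taxonomy(GREEN_PHRASES)
posting = "Construction manager to apply Green Building strategies and minimize carbon output."
print(green_hits(posting, pattern))
```

A plain phrase match like this has no notion of context, so it is exposed to the false positives the guide warns about when a trigger word carries a different meaning in another occupation.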

The green jobs report developed by the Consortium shows the volume of online job ads containing terms from the green skills taxonomy. The report shows a point-in-time snapshot of the top green skills for the specified timeframe and geography. LMI users should be aware that, due to changes in parsing software, changes to the green skills list and fluctuations in the number of websites spidered, there are too many inconsistencies to conduct a reliable time series analysis of individual skills.

While the first approach provides a proxy for green labor demand via select keywords, the second tier was expected to give us parity with part of the BLS definition: green goods-producing and service-providing establishments. Postings from establishments that only produce green goods or provide green services are indisputably green, so it follows that such analysis would shed light on recent green developments, such as skills and certifications demanded by the green business community. By matching a Consortium-wide list of green firms 6 to real-time data, we had hoped to reveal postings that may not have been triggered by our taxonomy, but may still be considered green since they come from a green employer.

This method has proven to be very challenging in practice, however. Agreeing on green-only producing establishments is problematic, but accurately matching establishment names to those appearing in real-time data is simply not feasible at this time. This is largely due to the nature of online job postings (explained in the next section), whose format and content can vary substantially. Firstly, roughly half of the postings contain no employer information. Of those ads with the employer name present, slight discrepancies in spelling account for the remaining incongruity. For instance, in real time, company XYZ may take the form of XYZ, XYZ Inc., XYZ Corp., XYZ Ltd., etc., making the matching process not viable.
One possible solution would be to add a handful of variations to each establishment, along with the official name, as seen in tax records. For these reasons, the advancement of this approach was not pursued as initially planned.

As with most empirical work, we faced broader methodological limitations in defining green. Our reliance on algorithms to sort through the complexity of online job postings was the largest source of concern. As previously stated, postings are crafted for recruitment purposes and not for analysis, and therefore often omit information of interest to researchers. Even when the algorithms are on target, the context of each posting is perhaps the most delicate factor in the process, as many key words are acutely context sensitive, i.e., some terms have different meanings depending on the occupation to which they are applied. For example, reuse can apply to software reuse, workforce reuse, or water reuse, guaranteeing skewed green posting counts. Context also plays a factor when the parser cannot distinguish between a web site's page navigation, advertisements, and other source code, and the text of a job posting. When a green skill is pulled from a source outside the actual job posting text, it creates a false positive, leading to an overcounting of green jobs. Also, it is wrong to assume that the distribution of green job postings mirrors the patterns of the real green economy; as previously explained, the analysis clearly shows that online job postings are not representative of all economic sectors.

6 See Green Firm Identification.
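The employer name variations mentioned above (XYZ, XYZ Inc., XYZ Corp., XYZ Ltd.) can be partly reconciled by normalizing names before matching. The sketch below is one possible approach and not a method the Consortium reports having used. The suffix list and function names are hypothetical, and it does nothing for the roughly half of postings with no employer name.

```python
import re

# Hypothetical list of legal suffixes to ignore when comparing names.
SUFFIXES = {"inc", "corp", "corporation", "ltd", "llc", "co", "company"}


def normalize_employer(name: str) -> str:
    """Lower-case, drop punctuation, and strip trailing legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]+", " ", name.lower()).split()
    while tokens and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def match_green_firms(posting_employers, green_firms):
    """Map each non-empty posting employer to a green firm name, or None."""
    lookup = {normalize_employer(firm): firm for firm in green_firms}
    return {
        employer: lookup.get(normalize_employer(employer))
        for employer in posting_employers
        if employer  # postings without employer information cannot be matched
    }


matches = match_green_firms(["XYZ Inc.", "", "ABC Corp."], ["XYZ Ltd."])
print(matches)
```

Normalization of this kind can also merge distinct firms that share a base name, so any matches would still need manual review.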

With the current occupational and industrial coding systems there is no perfect method for identifying green jobs. More specifically, green crosses the boundaries of today's job classification systems, making the green economy difficult to measure. Three different methods have attempted to overcome the difficulties: 1) counting green revenue share and apportioning employment via the given ratio; 2) directly surveying employers and asking them to list green jobs; and 3) using real-time job ads to proxy green labor demand. The Consortium attempted to fill the void in previous studies by addressing a retrospective look at the green economy, and we believe our methodology will enable researchers to track the gradual greening of the economy more precisely than other measures. While attempting to overcome the rigid nature of existing studies, our approach is not error free, and we have learned that real time requires further refinement and standardization to be a credible tool. With further improvement in artificial intelligence, analysis of the green economy through the lens of real-time data should eventually result in more robust findings.

Section 4: THE DATA ITSELF

There have been mixed responses to the increasing role of the Internet in recruitment. Some argue that, given the size of today's applicant pool, firms have become ever more selective when sorting through résumés, prompting rumors of discrimination against the unemployed. 7 Others contend that online intermediaries alleviate labor market imperfections, often associated with imperfect information and adverse selection. 8 The prevalent use of the Internet in job search has greatly expanded the geographic and skill scope for employers, but, more importantly, the explosive growth in online job postings holds great potential for the research and LMI communities as a cutting-edge tool to examine labor market conditions with less lag time than standard LMI data.

As this approach continues to gain traction, end users are in need of a better understanding of the underlying data. One goal of this Consortium was to fill that void by providing an insider view of real-time data. At present, the Consortium appears to be the first to critically evaluate the robustness of real-time data. This section outlines critical issues of which analysts must be aware prior to working with real-time data.

7 The Help-Wanted Sign Comes With a Frustrating Asterisk NYT (
8 The Economics of Labor Market Intermediation VOX Policy Research (

Nature of Real Time: Not Traditional LMI Data

It is important to highlight our ability to closely examine the data in raw form, a notion not well embraced by the industry: most vendors provide preset, aggregated job postings and related analytics, which prevent gaining deeper insight into the subject matter. To mine the data, the Northeast Consortium partnered with Burning Glass Technologies (BG), a Boston-based firm that develops technological methods of matching résumés to job postings. The data collection process relies on spiders that crawl the web, aggregating data from job boards, employer sites, government agencies, and newspapers. The goal of this process is to gather a comprehensive jobs database. The data is then locally stored, parsed and coded to over 60 variables, such as location, employer name, and occupational and industry codes. Due to data quality, only a handful of variables have been found to be suitable for analysis at this time. There is, however, room for optimism about expanding the number of reliable variables, given the considerable improvements witnessed in our short time of working with real-time data. 9

To understand the complexity of real-time data, one need not look further than the data deluge that has accompanied the Internet boom. Online content is growing exponentially, with only a fraction of it being verifiable, valid, and usable material. The same can be argued about online job postings. The limitations of real time are largely a consequence of: (1) how the postings are originally crafted, and (2) the effectiveness of artificial intelligence, or the parser, in properly coding the data. Unlike working with a static, homogeneous database structure, the postings are aggregated from over 16,000 non-standardized sources. For example, Craigslist, a free classifieds service, is a significant source of postings volume, but it also creates a substantial share of data problems.
Unfortunately, artificial intelligence cannot yet effectively distinguish between a job, a personal ad, or spam. Whatever ends up on a job board is spidered, parsed, and becomes part of the real-time database.

This highlights a major difference between real-time data and other job measures. Real-time data depends on how firms and recruiters craft their ads, and a posting does not necessarily translate into a hire or represent a vacancy. In other words, postings are designed for recruitment and not for research and analysis; employers do not have to reveal much information to receive a flood of applications. In fact, it is not uncommon for key variables, such as the exact location, salary and even the employer's name, to be omitted.

While some variables are often omitted in job postings, this does not negate the value of those variables when they are present. For example, quality control reviews found that educational requirements were specifically identified in just one third of total postings. Among those postings that did not specify education, the educational requirement could be inferred for nearly 20 percent.

9 See Appendices for a summary of variables, their definitions and quality control results.
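The two shares quoted above can be combined into an overall coverage figure for the education variable. The sketch below is only arithmetic on the numbers in the text; it treats "nearly 20 percent" as exactly 20 percent, which is an assumption, and the function name is illustrative.

```python
from fractions import Fraction


def education_coverage(stated, inferable_among_unstated):
    """Return (inferred, usable, unknown) shares of all postings."""
    inferred = (1 - stated) * inferable_among_unstated
    usable = stated + inferred
    return inferred, usable, 1 - usable


# One third of postings state an educational requirement.
# Assumption: "nearly 20 percent" of the rest is taken as exactly 1/5.
inferred, usable, unknown = education_coverage(Fraction(1, 3), Fraction(1, 5))

print(f"inferred: {float(inferred):.1%}")
print(f"stated or inferred: {float(usable):.1%}")
print(f"no usable education information: {float(unknown):.1%}")
```

Under these assumptions, a little under half of all postings carry a stated or inferable educational requirement, and the remainder carry none.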

Postings without educational requirements were generally for occupations with common educational levels for qualification, i.e., professional or technical occupations in health care, computer science, finance, or law. When an education level is specific to a position, it is often an employer requirement, so the employer is more likely to include that information in a job posting. On the other hand, if education is an occupational requirement, as frequently established for State-issued licenses, it is not uncommon for the educational level to be omitted, as qualified applicants are presumed to have already met the requirement.

This observation presented an interesting dynamic in online job postings. The 2010 American Community Survey estimates show that about 37 percent of the region's population aged 25 to 64 has a bachelor's degree or higher. 10 In a sample of 8,000 job postings for the entire region, among those that specified an educational level, just over 70 percent required a baccalaureate or higher. 11 This measure indicates an upscale bias in online job postings, meaning that jobs with lower skill requirements are less likely to be found online than jobs requiring higher levels of educational preparation.

Analysis of zip-code-level and other sub-state locations has indicated a significant issue. While value was found in the available educational requirement information, geographic location questions were not so easily resolved. Some job postings are designed for national recruitment, while others are not. By the same token, some postings are only advertised internally or by word of mouth. With specific location data commonly omitted from ads, the parser infers a zip code for the location of the job, sometimes using the firm headquarters, or the nearest metropolitan area, or even the job board's physical location. For example, a posting for a job in Westchester, New York might be advertised on multiple job boards as being in the greater New York area. The parser infers a zip code for New York City, since that is the only geographic area mentioned, but it is incorrect. This common lack of detailed geographic information calls into question the usefulness of geographic specificity below the State level, or perhaps below the level of large metropolitan areas.

Unlike traditional survey-based methods, real-time data is not subject to standard sampling errors, but it is influenced by factors other than true labor demand. In essence, it is extremely difficult to control exogenous factors, such as the full extent of removal of duplicates (de-duplication), job board under- or over-coverage, and spidering processes, among other issues. For instance, by plotting daily postings it becomes evident that the spiders are inactive on some days while spiking on others. This volatility, along with an inability to adjust for seasonality, makes one month the minimum period for conducting time series analyses, with a three-month period frequently being preferred.

More problematic are the approximately 16,000 sites from which the data are sourced. In some cases, the spider that gathers data from these sites cannot differentiate between job posting text and the unique HTML code and detailed meta information 12 contained in a web page's source code. If page code and metadata are parsed into the real-time database, they can undermine the parser's coding accuracy. Close examination of the data has shown that a single month may have as many records of parsed page code and metadata as the three previous months combined. As a result, the data has had to be continuously monitored and the parser updated with unique patches to address quality issues. Creating good real-time data requires substantial and continued investment in human analysts who will not only monitor the system for quality, but also be involved in the necessary efforts to improve the analytical software and provide context for the words and phrases found in the job ads.

10 American Community Survey 1-Year Estimates, Table B23006: Educational Attainment by Employment Status for the Population 25 to 64 Years. Data summed for the eight-state Consortium region.
11 Estimated from a sample of 8,000 postings from the period of 8/1/10 to 2/5/11 from all eight Consortium states.
12 Meta information, or metadata, describes other data, such as length of document, author, or date created. Web page HTML source code can include page navigation, text from other advertisements, page headers and footers, etc.
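The guidance above on time periods (one month as the minimum, three months often preferred) can be sketched as an aggregation of daily posting counts. This is an illustration only. The data layout and function names are assumptions, the sample counts are invented for the example, and the three-month average assumes the months present are consecutive.

```python
from collections import defaultdict
from datetime import date


def monthly_totals(daily_counts):
    """Sum daily posting counts (date -> count) into (year, month) totals."""
    totals = defaultdict(int)
    for day, count in daily_counts.items():
        totals[(day.year, day.month)] += count
    return dict(sorted(totals.items()))


def three_month_average(totals):
    """Trailing three-month mean; assumes the months are consecutive."""
    months = list(totals)
    averages = {}
    for i in range(2, len(months)):
        window = months[i - 2 : i + 1]
        averages[months[i]] = sum(totals[m] for m in window) / 3
    return averages


# Invented daily counts with an idle spider day and a spike.
daily = {
    date(2011, 1, 1): 0,
    date(2011, 1, 2): 900,
    date(2011, 1, 3): 100,
    date(2011, 2, 10): 1200,
    date(2011, 3, 5): 800,
}
totals = monthly_totals(daily)
print(totals)
print(three_month_average(totals))
```

Aggregating in this way smooths spider inactivity and spikes within a month, but it does not adjust for seasonality or for changes in the set of sites spidered.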

Duplicates and De-Duplication

Another major challenge in producing reliable analysis is the prevalence of job scraping, an industry-wide practice that consists of copying postings from corporate client sites, state job banks, and other sources to other job boards. Job postings are produced to attract applicants, and some are written by employers to solicit a specific response or to test the labor market. Duplication is beneficial for employers, as expanded market coverage increases the odds of reaching the talent pool, further reinforcing the concept that real-time data does not directly track job openings or vacancies. Thus, the interests of job seekers, employers and job boards are usually in conflict with those of an analyst. This highlights the need for a standardized methodology to identify duplicates and to de-duplicate data.

Industry research suggests that most vacancies are filled within two to three months; therefore, postings that re-appear within a 60-day period are flagged as duplicates by our vendor. (Other vendors may use a different time period, and analysts should always query the vendor about the specifics of the de-duplication methodology.) This is subject to debate, however, as those estimates were developed prior to the recent recession. The Great Recession has created an employer's market, with more job seekers than job openings. Further delaying the hiring process is the gap between the candidates' skills and experience and those desired by the firm, leading recruiters to woo candidates from competitors. The 60-day window is a subtle element, and to date the Consortium has not produced significant analyses on the role of time frames in producing more robust data. Generally, duplicates are believed to complicate the reconciliation of total job count reports and, in turn, cause fluctuations in the distribution of key variables until 60 days after the original posting.
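The 60-day rule described above can be sketched as follows. This is not the vendor's algorithm. It assumes a match key of job title, employer, city and state, and it measures the window from the most recent posting treated as an original; both choices are assumptions made for illustration, as are the field and function names.

```python
from datetime import date, timedelta

WINDOW = timedelta(days=60)
KEY_FIELDS = ("title", "employer", "city", "state")  # assumed match key


def flag_duplicates(postings):
    """Return a list of booleans (input order): True if flagged as duplicate."""
    last_original = {}
    flags = [False] * len(postings)
    # Process in date order so each posting is compared with earlier ones.
    for i in sorted(range(len(postings)), key=lambda j: postings[j]["posted"]):
        posting = postings[i]
        key = tuple(posting[f].strip().lower() for f in KEY_FIELDS)
        previous = last_original.get(key)
        if previous is not None and posting["posted"] - previous <= WINDOW:
            flags[i] = True
        else:
            last_original[key] = posting["posted"]  # start a new window
    return flags


ads = [
    {"title": "Nurse", "employer": "XYZ", "city": "Albany", "state": "NY", "posted": date(2011, 1, 1)},
    {"title": "nurse", "employer": "XYZ", "city": "Albany", "state": "NY", "posted": date(2011, 2, 15)},
    {"title": "Nurse", "employer": "XYZ", "city": "Albany", "state": "NY", "posted": date(2011, 3, 20)},
    {"title": "Nurse", "employer": "XYZ Inc.", "city": "Albany", "state": "NY", "posted": date(2011, 1, 5)},
]
print(flag_duplicates(ads))
```

The last record shows the weakness of exact-key matching: a small change in the employer name is enough for a likely duplicate to pass as a unique posting.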
To further distinguish duplicates from unique postings, the de-duplication algorithm relies on a number of select variables (namely, job title, employer, city, state, skills) and the presence of employer information. The standard de-duplication algorithm used by our vendor is able to identify and remove a share of the duplicates based on defined criteria, such as a recurring job title posted under the same employer name. But it still takes an experienced human eye to verify the extent of duplicates in the data, which is costly. De-duplication concerns are often de-emphasized, because the algorithm relies on a script with a strict set of rules that does not always pick up small subtleties in the postings. One cannot fully rule out the possibility of a duplicate even if the content inside the posting has been modified. Analyses of the rates at which select postings and/or occupations reappear may provide a useful gauge for determining the difficulty in filling certain vacancies.

Even after being filtered, the dataset may still contain percent duplicates, primarily due to slight variations in either the job title or the employer name, such as Vanguard, Vanguard Inc., or Vanguard, Inc. This variation can be

WEIE Labor Market Context: Methods & Resources

WEIE Labor Market Context: Methods & Resources WEIE Labor Market Context: Methods & Resources The labor market component of the partnership case studies consisted of secondary analysis of existing public and proprietary data sources, supplemented and

More information

Evaluating Salary Survey Methodologies By Jonas Johnson, Ph.D. Senior Researcher

Evaluating Salary Survey Methodologies By Jonas Johnson, Ph.D. Senior Researcher Evaluating Salary Survey Methodologies By Jonas Johnson, Ph.D. Senior Researcher (800) 627-3697 - info.eri@erieri.com - www.erieri.com - 8575 164th Avenue NE, Redmond, WA 98052 In the field of compensation,

More information

Jobs In Maine. Online Job Postings by Industry, Occupation, Skills, and Education

Jobs In Maine. Online Job Postings by Industry, Occupation, Skills, and Education Jobs In Maine Online Job Postings by Industry, Occupation, Skills, and Education third quarter 2013 December 2013 Online Job Postings by Industry, Occupation, Skills, and Education third quarter 2013

More information

MARKET RESEARCH. Three key data types for assessing the viability of new distance learning programs and recruiting prospective students

MARKET RESEARCH. Three key data types for assessing the viability of new distance learning programs and recruiting prospective students A UNIVERSITY S GUIDE TO MARKET RESEARCH Three key data types for assessing the viability of new distance learning programs and recruiting prospective students a 2015 guidebook Summary Two common scenarios

More information

Real-Time Labor Market Information New Hampshire Computer and Information Technology Job Postings

Real-Time Labor Market Information New Hampshire Computer and Information Technology Job Postings Introduction to Real-Time Labor Market Information Real time labor market information is derived from online job postings. Details included in online job postings can provide information such as the type

More information

Matching Workers with Registered Nurse Openings: Are Skills Scarce?

Matching Workers with Registered Nurse Openings: Are Skills Scarce? Matching Workers with Registered Nurse Openings: Are Skills Scarce? A new DEED study found that a lack of skilled candidates is a small factor in the inability of employers to fill openings for registered

More information

WHAT AN INDICATOR OF LABOR DEMAND MEANS FOR U.S. LABOR MARKET ANALYSIS: INITIAL RESULTS FROM THE JOB OPENINGS AND LABOR TURNOVER SURVEY

WHAT AN INDICATOR OF LABOR DEMAND MEANS FOR U.S. LABOR MARKET ANALYSIS: INITIAL RESULTS FROM THE JOB OPENINGS AND LABOR TURNOVER SURVEY WHAT AN INDICATOR OF LABOR DEMAND MEANS FOR U.S. LABOR MARKET ANALYSIS: INITIAL RESULTS FROM THE JOB OPENINGS AND LABOR TURNOVER SURVEY Kelly A. Clark, Bureau of Labor Statistics 2 Massachusetts Ave. NE,

More information

META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING

META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING Ramesh Babu Palepu 1, Dr K V Sambasiva Rao 2 Dept of IT, Amrita Sai Institute of Science & Technology 1 MVR College of Engineering 2 asistithod@gmail.com

More information

Appendix B Data Quality Dimensions

Appendix B Data Quality Dimensions Appendix B Data Quality Dimensions Purpose Dimensions of data quality are fundamental to understanding how to improve data. This appendix summarizes, in chronological order of publication, three foundational

More information

This study is an extension of a research

This study is an extension of a research Business Employment Dynamics data: survival and longevity, II A study that extends previous research on the longevity of businesses shows that survival decreases at a decreasing rate; establishments that

More information

Actuarial Standard of Practice No. 23. Data Quality. Revised Edition

Actuarial Standard of Practice No. 23. Data Quality. Revised Edition Actuarial Standard of Practice No. 23 Data Quality Revised Edition Developed by the General Committee of the Actuarial Standards Board and Applies to All Practice Areas Adopted by the Actuarial Standards

More information

Preliminary Case Study 3: Health, Science, and Medical Technology Real Time Labor Data. Career Ladders Project

Preliminary Case Study 3: Health, Science, and Medical Technology Real Time Labor Data. Career Ladders Project Preliminary Case Study 3: Health, Science, and Medical Technology Real Time Labor Data Executive Summary Career Ladders Project In this case study, we conducted research to find out what a real time labor

More information

Supply & Demand Report

Supply & Demand Report Supply & Demand Report Job Title: Manufacturing Engineer Location: Phoenix, AZ (within a 50 mile radius) Timeframe: December 2012 to November 2014 Filters Applied: Occupations : Industrial Engineers, Mechanical

More information

Short-Term Forecasting in Retail Energy Markets

Short-Term Forecasting in Retail Energy Markets Itron White Paper Energy Forecasting Short-Term Forecasting in Retail Energy Markets Frank A. Monforte, Ph.D Director, Itron Forecasting 2006, Itron Inc. All rights reserved. 1 Introduction 4 Forecasting

More information

CRISP-DM, which stands for Cross-Industry Standard Process for Data Mining, is an industry-proven way to guide your data mining efforts.

CRISP-DM, which stands for Cross-Industry Standard Process for Data Mining, is an industry-proven way to guide your data mining efforts. CRISP-DM, which stands for Cross-Industry Standard Process for Data Mining, is an industry-proven way to guide your data mining efforts. As a methodology, it includes descriptions of the typical phases

More information

KEYWORDS: Risk Assessment, Competitive Intelligence, National Security, Web Security, Defense, Information Security

KEYWORDS: Risk Assessment, Competitive Intelligence, National Security, Web Security, Defense, Information Security The Competitive Intelligence and National Security Threat from Website Job Listings Jay D. Krasnow Georgetown University (M.A., May 2000) Communications, Culture and Technology Program 10706 Kings Riding

More information

Getting More From Your Actuarial Analysis

Getting More From Your Actuarial Analysis Getting More From Your Actuarial Analysis For Companies Retaining Property/Casualty Insurance Risks PwC 1 Introduction Many companies retain property/casualty insurance (P&C) risks, such as workers' compensation,

More information

JOB OPENINGS AND LABOR TURNOVER APRIL 2015

JOB OPENINGS AND LABOR TURNOVER APRIL 2015 For release 10:00 a.m. (EDT) Tuesday, June 9, Technical information: (202) 691-5870 JoltsInfo@bls.gov www.bls.gov/jlt Media contact: (202) 691-5902 PressOffice@bls.gov USDL-15-1131 JOB OPENINGS AND LABOR

More information

Education Pays in Colorado:

Education Pays in Colorado: Education Pays in Colorado: Earnings 1, 5, and 10 Years After College Mark Schneider President, College Measures Vice President, American Institutes for Research A product of the College Measures Economic

More information

How To Monitor A Project

How To Monitor A Project Module 4: Monitoring and Reporting 4-1 Module 4: Monitoring and Reporting 4-2 Module 4: Monitoring and Reporting TABLE OF CONTENTS 1. MONITORING... 3 1.1. WHY MONITOR?... 3 1.2. OPERATIONAL MONITORING...

More information

Northeast Minnesota Labor Market Trends Pathways 2 Postsecondary Summit October 10, 2014

Northeast Minnesota Labor Market Trends Pathways 2 Postsecondary Summit October 10, 2014 Northeast Minnesota Labor Market Trends Pathways 2 Postsecondary Summit October 10, 2014 Cameron Macht Regional Analysis & Outreach Manager Minnesota Dept. of Employment & Economic Development Labor Market

More information

BUSINESS EMPLOYMENT DYNAMICS FIRST QUARTER 2015

BUSINESS EMPLOYMENT DYNAMICS FIRST QUARTER 2015 For release 10:00 a.m. (EST), Wednesday, November 18, 2015 USDL-15-2204 Technical Information: (202) 691-6553 BDMInfo@bls.gov www.bls.gov/bdm Media Contact: (202) 691-5902 PressOffice@bls.gov BUSINESS

More information

PROPOSAL TO UNIVERSITY OF WISCONSIN-SUPERIOR

PROPOSAL TO UNIVERSITY OF WISCONSIN-SUPERIOR PROPOSAL TO UNIVERSITY OF WISCONSIN-SUPERIOR Burning Glass proposes to provide University of Wisconsin-Superior with detailed real-time job posting data and a custom research report to assess the labor,

More information

The Scientific Data Mining Process

The Scientific Data Mining Process Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In

More information

IMPLEMENTATION NOTE. Validating Risk Rating Systems at IRB Institutions

IMPLEMENTATION NOTE. Validating Risk Rating Systems at IRB Institutions IMPLEMENTATION NOTE Subject: Category: Capital No: A-1 Date: January 2006 I. Introduction The term rating system comprises all of the methods, processes, controls, data collection and IT systems that support

More information

Quarterly Census of Employment and Wages (QCEW) Business Register Metrics August 2005

Quarterly Census of Employment and Wages (QCEW) Business Register Metrics August 2005 Quarterly Census of Employment and Wages (QCEW) Business Register Metrics August 2005 Sheryl Konigsberg, Merissa Piazza, David Talan, and Richard Clayton Bureau of Labor Statistics Abstract One of the

More information

Jobs4TN Online. Digital Government Government to Citizen. Contact: Leesa Bray, Information Technology Administrator. Tennessee

Jobs4TN Online. Digital Government Government to Citizen. Contact: Leesa Bray, Information Technology Administrator. Tennessee Jobs4TN Online Digital Government Government to Citizen Contact: Leesa Bray, Information Technology Administrator Tennessee Project Initiation Date: January 12, 2011 Project Completion Date: May 14, 2012

More information

!"!!"#$$%&'()*+$(,%!"#$%$&'()*""%(+,'-*&./#-$&'(-&(0*".$#-$1"(2&."3$'45"

!!!#$$%&'()*+$(,%!#$%$&'()*%(+,'-*&./#-$&'(-&(0*.$#-$1(2&.3$'45 !"!!"#$$%&'()*+$(,%!"#$%$&'()*""%(+,'-*&./#-$&'(-&(0*".$#-$1"(2&."3$'45"!"#"$%&#'()*+',$$-.&#',/"-0%.12'32./4'5,5'6/%&)$).2&'7./&)8'5,5'9/2%.%3%&8':")08';:

More information

South Carolina Nurse Supply and Demand Models 2008 2028 Technical Report

South Carolina Nurse Supply and Demand Models 2008 2028 Technical Report South Carolina Nurse Supply and Demand Models 2008 2028 Technical Report Overview This document provides detailed information on the projection models used to estimate the supply of and demand for Registered

More information

How To Train Green Workers

How To Train Green Workers San Fernando Valley Green Employer Report Executive Summary January 2012 Acknowledgements Los Angeles Valley College is grateful to Congressman Howard Berman for securing federal funding for this workforce

More information

Iowa State University University Human Resources Classification and Compensation Unit 3810 Beardshear Hall uhrcc@iastate.edu

Iowa State University University Human Resources Classification and Compensation Unit 3810 Beardshear Hall uhrcc@iastate.edu Iowa State University University Human Resources Classification and Compensation Unit 3810 Beardshear Hall uhrcc@iastate.edu Table of Contents INTRODUCTION... - 3 - SECTION I - EXTERNAL COMPETITIVENESS...

More information

Commonwealth of Virginia Job Vacancy Survey 2011-2012

Commonwealth of Virginia Job Vacancy Survey 2011-2012 a Commonwealth of Virginia Job Vacancy Survey 2011-2012 Prepared for: Virginia Employment Commission Richmond, Virginia Prepared by: Virginia Center for Urban Development and the Survey and Evaluation

More information

Monitor Your Brand. Do Your Customers and Competitors Know More About Your Brand than You Do? By: Beth Lee-Browning. Sponsored By:

Table of Contents: Overview; Is Your Brand Healthy?; Why Old Metrics No Longer

The 2006 Earnings Public-Use Microdata File:

An Introduction, by Michael Compson. This article introduces the 2006 Earnings Public-Use File (EPUF) and provides important background information on the file

ANALYSIS OF HR METRICS for the Northern Non Profit Service Providers

Part of the 2011/12 Shared Human Resource (HR) Services Pilot Program for Small Non Profit Agencies Serving A Large Geographic Area

A better way to calculate equipment ROI

A West Monroe Partners white paper by Aaron Lininger. Copyright 2012 by CSCMP's Supply Chain Quarterly (www.supplychainquarterly.com), a division of Supply

The U.S. Producer Price Index for Management Consulting Services (NAICS 541610)

Andrew Baer, U.S. Bureau of Labor Statistics, 2 Massachusetts Avenue NE, Washington, DC 20212. August 8, 2006. * The views expressed

DATA MASHUPS: PUBLIC + PRIVATE LMI. APDU Annual Conference, September 1, 2015

NYC LABOR MARKET INFORMATION SERVICE. We help education and workforce practitioners and policy makers make data-driven

Education Pays in Colorado:

Earnings 1, 5, and 10 Years After College. Mark Schneider, President, College Measures; Vice President, American Institutes for Research. A product of the College Measures Economic

APPENDIX N. Data Validation Using Data Descriptors

Data validation is often defined by six data descriptors: 1) reports to decision maker 2) documentation 3) data sources 4) analytical method and detection

Recruitment and Selection

Recruitment and selection belong to the value-added HR processes. Recruitment is about the ability of the organization to source new employees, to keep the organization

Ch 1 - Conduct Market Research for Price Analysis

1.0 - Chapter Introduction; 1.1 - Reviewing The Purchase Request And Related Market Research; 1.1.1 - How Was The Estimate Made?; 1.1.2 - What Assumptions

Populating a Data Quality Scorecard with Relevant Metrics WHITE PAPER

SAS White Paper. Table of Contents: Introduction; Useful vs. So-What Metrics; The So-What Metric; Defining Relevant Metrics

Markups and Firm-Level Export Status: Appendix

Jan De Loecker (Princeton University, NBER and CEPR) and Frederic Warzynski (Aarhus School of Business). Forthcoming, American Economic Review. Abstract: This is

Vendor Perspective, Question #1

September 25th, 2013. HIT Policy Committee Privacy and Security Tiger Team: Epic appreciates this opportunity to provide testimony related to accounting of disclosures and access reports from

UNDERSTANDING THE CHARACTERISTICS OF YOUR PORTFOLIO

Although I normally use this space to ruminate about various economic indicators and their implications, smartly advancing asset prices have encouraged

Data Quality Assessment. Approach

Prepared By: Sanjay Seth. Introduction: Data quality is crucial to the success of Business Intelligence initiatives. Unless data in source

Fort McPherson. Atlanta, GA MSA. Drivers of Economic Growth February 2014. Prepared By: chmuraecon.com

Diversified and fast-growing economies are more stable and are less sensitive to external economic shocks. This report examines recent

MEASURING INCOME DYNAMICS: The Experience of Canada's Survey of Labour and Income Dynamics

By Maryanne Webber, Statistics Canada, for presentation at Seminar on Poverty Statistics

Development of an ECI excluding Workers Earning Incentive Pay. Anthony J. Barkume and Thomas G. Moehrle * U.S. Bureau of Labor Statistics

NOTE: This paper has been prepared for presentation to the Federal

Optimizing Customer Service in a Multi-Channel World

An Ovum White Paper sponsored by Genesys. Publication Date: October 2010. Introduction: The way in which customer service is delivered has changed. Customers

EQR PROTOCOL 4 VALIDATION OF ENCOUNTER DATA REPORTED BY THE MCO

OMB Approval No. 0938-0786. A Voluntary Protocol for External Quality Review (EQR). Protocol 1: Assessment of Compliance with Medicaid Managed

Web Analytics Definitions Approved August 16, 2007

Web Analytics Association, 2300 M Street, Suite 800, Washington DC 20037. standards@webanalyticsassociation.org, 1-800-349-1070. Licensed under a Creative

Do you know? "7 Practices" for a Reliable Requirements Management. by Software Process Engineering Inc. translated by Sparx Systems Japan Co., Ltd.

In this white paper, we focus on the "Requirements Management,"

MACHINE LEARNING & INTRUSION DETECTION: HYPE OR REALITY?

Summary: The potential use of machine learning techniques for intrusion detection is widely discussed amongst security experts. At Kudelski Security, we looked

Analyzing survey text: a brief overview

IBM SPSS Text Analytics for Surveys. Contents: Introduction; The role of text in survey research; Approaches to text mining

A Portrait of Seattle's Low-Income Working Population

December 2011. Support provided by the City of Seattle Office of Economic Development. Introduction: The Great Recession, now over two years gone, has

Coastal Restoration Spending in Louisiana Economic Impact Analysis

Louisiana Workforce Commission, www.lmi.laworks.net/green. September 2011. In 2009, Louisiana and Mississippi partnered to research economic

Private Equity Performance Measurement BVCA Perspectives Series

Authored by the BVCA's Limited Partner Committee and Investor Relations Advisory Group. Spring 2015

Small Business Checkup

How healthy is your business? www.aretehr.com. Table of Contents: The Four Keys to Business Health; Management & Operations; Marketing; Financial & Legal; Human Resources

Managing Records and Information within Your Organization

By: Carl E. Weise, CRM, ERM m, ECM m, EMM m, SharePoint s. You may already have a records management program in your organization, or recognize

Jobs Online Background and Methodology

DEPARTMENT OF LABOUR, LABOUR MARKET INFORMATION. December 2009. Acknowledgements: The Department of Labour gratefully acknowledges the support of our partners in Jobs

5 Secrets to Automation Success. A White Paper by Paul Merrill, Consultant and Trainer at Beaufort Fairmont, LLC

Secret #1: Practice Exceptional Leadership. If you

Anatomy of a Decision

BI Platform vs. Tool: Choosing Birst Over Tableau for Enterprise Business Intelligence Needs. Contact: research@bluehillresearch.com, @BlueHillBoston, 617.624.3600. What You Need To Know: The demand

Real-Time Data Analytics into Action

Centers of Excellence (COE). Lori Sanchez, Director, Desert/Inland Empire Region; Evgeniya Zhenya Lindstrom, Director, San Diego-Imperial Region. California Community College

Methodological Issues for Interdisciplinary Research

J. T. M. Miller, Department of Philosophy, University of Durham. Much of the apparent difficulty of interdisciplinary research stems from the nature

10426: Large Scale Project Accounting Data Migration in E-Business Suite

Objective of this Paper: Large engineering, procurement and construction firms leveraging Oracle Project Accounting cannot withstand

A Management Report. Prepared by:

7 Steps to Increase the Return on Your Business Development Investment & Increase Revenues Through Improved Analysis and Sales Management. Prepared by: 2014 Integrated Management Services

A Guide to Creating Dashboards People Love to Use Part 1: Foundation

Dashboard Design Matters: Dashboards have become standard business practice over the last decade. Dozens of dashboard building tools

Executive Summary. Specifically, the project gathered both primary and secondary data to meet four main research objectives:

The overall goal of the research reported here is to provide an objective and credible assessment of the future workforce needs for lawyers in the state of California through the year

ORACLE ENTERPRISE DATA QUALITY PRODUCT FAMILY

The Oracle Enterprise Data Quality family of products helps organizations achieve maximum value from their business critical applications by delivering fit

Building a Strategic Workforce Planning Capability at the U.S. Census Bureau 1

Joanne Crane, Sally Obenski, and Jonathan Basirico, U.S. Census Bureau, and Colleen Woodard, Federal Technology Services,

Building a Database to Predict Customer Needs

INFORMATION TECHNOLOGY. TopicalNet, Inc (formerly Continuum Software, Inc.). Since the early 1990s, organizations have used data warehouses and data-mining tools

ETL Anatomy 101 Tom Miron, Systems Seminar Consultants, Madison, WI

Paper AD01-2011. Abstract: Extract, Transform, and Load isn't just for data warehouse. This paper explains ETL principles that can be applied

Solvency II Data audit report guidance. March 2012

Contents: Introduction; Purpose of the Data Audit Report; Report Format and Submission; Ownership and Independence; Scope and Content; Scope of the

Volume Author/Editor: Katharine G. Abraham, James R. Spletzer, and Michael Harper, editors

This PDF is a selection from a published volume from the National Bureau of Economic Research. Volume Title: Labor in the New Economy

Direct Marketing of Insurance. Integration of Marketing, Pricing and Underwriting

As insurers move to direct distribution and database marketing, new approaches to the business, integrating the marketing,

The Orthopaedic Surgeon Online Reputation & SEO Guide

Presented and provided by the Texas Orthopaedic Association. This physician rating and SEO guide was paid for by the

730 Yale Avenue Swarthmore, PA 19081 www.raabassociatesinc.com info@raabassociatesinc.com

Lead Scoring: Five Steps to Getting Started. Introduction: Lead scoring applies mathematical formulas to rank potential

Predicting the Stock Market with News Articles

Kari Lee and Ryan Timmons, CS224N Final Project. Introduction: Stock market prediction is an area of extreme importance to an entire industry. Stock price is

US Behavior Analyst Workforce: Understanding the National Demand for Behavior Analysts

Produced by Burning Glass Technologies on behalf of the Behavior Analyst Certification Board. Electronic and/or paper

Removing Web Spam Links from Search Engine Results

Manuel EGELE, pizzaman@iseclab.org. Overview: Search Engine Optimization and definition of web spam; Motivation; Approach; Inferring importance of features

Occupational Demand/ Program Supply Analysis using Web Sources

MCCA Student Success Summit, September 19, 2013. Institutional Research Dept, Washtenaw Community College. I. Online Data Sources; II. Application

Measurement Information Model

This chapter describes one of the fundamental measurement concepts of Practical Software, the Information Model. The Information Model provides

STATE OF WASHINGTON EMPLOYMENT SECURITY DEPARTMENT Labor Market and Economic Analysis Branch P.O. Box 9046, Olympia, WA 98507-9046

ANNUAL PERFORMANCE REPORT. September 23, 2005. As required in Training and

Position Classification Flysheet for Logistics Management Series, GS-0346

Table of Contents: Series Definition; Series Coverage; Exclusions; Distinguishing Between Logistics Management and Other

2007 Denver Regional Workforce Gap Analysis

September 14, 2007. About Development Research Partners: Development Research Partners specializes in economic research

How To Choose the Right Vendor Information you need to select the IT Security Testing vendor that is right for you.

Netragard, Inc. Main: 617-934-0269. Email: sales@netragard.com. Website: http://www.netragard.com. Blog: http://pentest.netragard.com

Spam Testing Methodology Opus One, Inc. March, 2007

This document describes Opus One's testing methodology for anti-spam products. This methodology has been used, largely unchanged, for four tests published

Franchise Success Statistics and Factors:

Are Franchised Businesses More Successful Than Independent Businesses? What Information Should Individuals Rely on Before Buying a Franchise? Industry Claims US

Billions of dollars are spent every year

Forecasting Practice: Sales Quota Accuracy and Forecasting, by Mark Blessington. Preview: Sales-forecasting authority Mark Blessington examines an often overlooked topic in this field: the efficacy of different

If you have any questions or need additional information, please feel free to contact Carlos Cracraft at 502-564-7976. Thank you.

September 22, 2004. Ms. Helen Parker, Regional Administrator, U.S. Department of Labor, Employment and Training Administration, 61 Forsyth Street, S.W., Room 6M12, Atlanta, Georgia 30303. Dear Ms. Parker: We

Errors in Operational Spreadsheets: A Review of the State of the Art

Stephen G. Powell, Tuck School of Business, Dartmouth College, sgp@dartmouth.edu; Kenneth R. Baker, Tuck School of Business, Dartmouth College

Performing a data mining tool evaluation

Start with a framework for your evaluation. Data mining helps you make better decisions that lead to significant and concrete results, such as increased revenue

User Stories Applied

For Agile Software Development. Mike Cohn. Chapter 2: Writing Stories

SalesStaff White Paper Collection. All Leads Are Not Created Equal: Why Lead Quality Matters

Lead generation is not simply a game of producing as many leads as possible. That's because not all leads are

Getting the Most from Demographics: Things to Consider for Powerful Market Analysis

Charles J. Schwartz, Principal, Intelligent Analytical Services. Demographic analysis has become a fact of life in market

Contribution of S ESOPs to participants' retirement security

Prepared for the Employee-Owned S Corporations of America, March 2015. Executive summary: Since 1998, S corporations have been permitted to maintain

Intelligent Systems: Unlocking hidden business value with data. 2011 Microsoft Corporation. All Rights Reserved

Microsoft Corporation, September 2011. Applies to: Windows Embedded. Summary: An intelligent system enables data to flow