Risk Analysis Approaches to Rank Outliers in Trade Data

Vytis Kopustinskas and Spyros Arsenis

V. Kopustinskas, S. Arsenis: European Commission, Joint Research Center, Institute for the Protection and Security of the Citizen, Via E. Fermi 2748, Ispra (VA), Italy. E-mail: vytis.kopustinskas@jrc.ec.europa.eu

In: A. Di Ciaccio et al. (eds.), Advanced Statistical Methods for the Analysis of Large Data-Sets, Studies in Theoretical and Applied Statistics, Springer-Verlag Berlin Heidelberg

Abstract

The paper discusses ranking methods for outliers in trade data based on statistical information, with the objective of prioritizing anti-fraud investigation activities. The paper presents a ranking method based on the risk analysis framework and discusses a comprehensive trade fraud indicator that aggregates a number of individual numerical criteria.

1 Introduction

The detection of outliers in trade data can be important for various practical applications, in particular for the prevention of customs fraud and for data quality. From the point of view of a customs inspector, trade transactions detected as outliers may be of interest because they can point to ongoing fraud. For example, low price outliers might indicate that the specific transaction is undervalued to evade import duties. As another example, low and high price outliers may be indicators of other frauds: VAT fraud or trade-based money laundering.

The statistical algorithms used to detect outliers in large trade datasets typically classify a high number of transactions as outliers (Perrotta et al. 2009). A large number of transactions flagged as suspicious is difficult to handle. Therefore the detected outliers must be ranked according to certain criteria in order to prioritize the investigation actions. Different criteria could be used for ranking purposes, and they are derived from at least two very distinct information sources: the statistical information of outlier diagnostics and customs in-house information systems.

This paper discusses methods for ranking low price outliers based on statistical information, with the objective of prioritizing anti-fraud investigation actions. The presented methodology to rank low price outliers is not generic, but it can be extended to other types of fraud patterns by using the same principles.

2 Risk Analysis Framework

The risk analysis framework is applicable to the ranking problem of outliers in trade data. The fundamental questions in quantitative risk analysis are the following:

1. What can go wrong?
2. How likely is it to happen?
3. If it happens, what consequences are expected?

To answer question 1, a list of initiating events should be defined. The likelihood of the events should be estimated and the consequences of each scenario should be assessed. Therefore, risk can be defined quantitatively as the following set of triplets (Kaplan and Garrick 1981):

$$R = \langle S_i, P_i, C_i \rangle, \quad i = 1, \ldots, n \qquad (1)$$

where $S_i$ is the $i$th scenario of the initiating events; $P_i$ is the likelihood (probability or frequency) of scenario $i$; $C_i$ is the consequence of the scenario; and $n$ is the number of scenarios.

In the case of outliers in trade data, the triplet can be interpreted in the following way: $P$ is the likelihood that an outlier is a real fraudulent transaction; $C$ is the consequence of the fraudulent trade transaction (e.g. unpaid taxes or duties). The interpretation of $S$ can be important only if several methods are used to detect outliers or more than one fraud pattern is behind the observed outlier.

3 Approaches to Rank Low Price Outliers in Trade Data

There are several approaches to obtain a numerical ranking of low price outliers in trade data based on the risk analysis framework. The most suitable method is to multiply $P$ and $C$, where $C$ is an estimate of the loss to the budget (unpaid duties) and $P$ is the probability that the specific transaction is a fraud.
The product $R = P \cdot C$ provides an average fraud-related damage estimate caused by a specific trade activity and allows such activities to be ranked according to their severity.

In order to use the risk analysis framework (1), we have to estimate the probability ($P$) of fraud in the trade transaction. This is not an easy quantity to estimate, but we assume that the p-value produced by statistical tests for outliers could be a suitable measure. It means that the lower the p-value, the more likely the transaction is fraudulent. In practice, an outlier can also be a data typing error and not a fraud, but in general low price outliers are suspicious and should be investigated. As the relationship between $P$ and the p-value is inverse, the following transformation is used:

$$P = \begin{cases} -\log_{10}(\text{pvalue})/10, & \text{if pvalue} \geq 10^{-10} \\ 1, & \text{if pvalue} < 10^{-10} \end{cases} \qquad (2)$$

By transformation (2) the p-value is mapped onto the scale [0, 1]. The scale here is arbitrary and chosen mainly for convenience, driven by the fact that extremely low p-values add no further information for ranking purposes.

The consequence part ($C$) of (1) can be estimated by multiplying the traded quantity ($Q$) by the difference in transaction unit price ($\Delta U$) between the estimated fair price and the recorded price:

$$C = Q \cdot \Delta U = Q \cdot (U_F - U)$$

where $U_F$ is the estimated fair transaction unit price determined by the regression after outliers have been removed; $U$ is the unit price as recorded ($U = V/Q$); and $V$ is the value of the transaction as recorded. $C$ is interpreted as an average loss to the budget if the underlying transaction is fraudulent. In fact, the value of $C$ already provides a ranking of outliers, and such a ranking has been applied.

The fraud risk indicator (RI) can be computed as follows: $RI = P \cdot Q \cdot \Delta U$. The indicator can also be transformed into the [0, 10] scale to make its use more standard for investigators. The investigator should start the investigation process from the outlying trade transactions with the highest values of RI.

The RI is a simple and easy to understand indicator; however, the dataset of detected outliers contains additional statistical information. Table 1 provides a number of criteria which could be used for the development of a comprehensive ranking structure for low price outliers. The criteria listed in Table 1 are all numerical, and a higher value of each is associated with a higher impact on the trade fraud risk indicator (FI).
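The risk indicator RI described above can be illustrated with a minimal Python sketch. It assumes transformation (2) with the negative base-10 logarithm, and it takes the fair unit price as given, whereas the paper obtains it from a regression fitted after outlier removal. The function names and the three sample outliers are hypothetical and serve only as illustration.

```python
import math


def fraud_probability(p_value: float) -> float:
    """Transformation (2): map an outlier-test p-value to P in [0, 1]."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if p_value < 1e-10:
        return 1.0  # extremely low p-values are capped at P = 1
    return -math.log10(p_value) / 10.0


def average_loss(quantity: float, value: float, fair_unit_price: float) -> float:
    """C = Q * (U_F - U), with the recorded unit price U = V / Q."""
    unit_price = value / quantity
    return quantity * (fair_unit_price - unit_price)


def risk_indicator(p_value: float, quantity: float, value: float,
                   fair_unit_price: float) -> float:
    """RI = P * Q * (U_F - U)."""
    return fraud_probability(p_value) * average_loss(quantity, value, fair_unit_price)


# Hypothetical outliers: (label, p-value, quantity, value, fair unit price)
outliers = [
    ("A", 1e-5, 100.0, 20000.0, 300.0),
    ("B", 1e-12, 10.0, 1500.0, 250.0),
    ("C", 0.05, 500.0, 90000.0, 200.0),
]

# Investigate the outliers with the highest RI first.
ranked = sorted(outliers, key=lambda o: risk_indicator(*o[1:]), reverse=True)
for label, *args in ranked:
    print(label, round(risk_indicator(*args), 1))
```

In this made-up sample, outlier B has the lowest p-value but a small traded quantity, so it ranks below A and C. This is the intended effect of weighting the likelihood by the consequence.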
Most of the criteria ($I_1$ to $I_7$) are easy to understand and compute, as they reflect basic statistical information about the dataset. The criterion $I_8$ reflects inferential statistics from the method that was used to detect outliers. The ranking structure can be adapted to other price outlier detection methods by using the same principles.

Table 1 Numerical criteria for the development of the ranking structure, with their original scale and rescaling method. $V_{PO}$: trade value aggregated over all destinations; $Q_{PO}$: trade quantity aggregated over all destinations; MaxN: maximum number of non-zero trade transactions

No        Criteria                       Original scale   Rescaling
$I_1$     Quantity traded, $Q$           [0, ∞]           log and min-max transformation to [0, 1]
$I_2$     Value traded, $V$              [0, ∞]           log and min-max transformation to [0, 1]
$I_3$     Average loss, $Q \cdot \Delta U$   [0, ∞]       log and min-max transformation to [0, 1]
$I_4$     Ratio $U_F/U$                  [0, ∞]           log and min-max transformation to [0, 1]
$I_5$     Ratio $V/V_{PO}$               [0, 1]           No
$I_6$     Ratio $Q/Q_{PO}$               [0, 1]           No
$I_7$     Number of obs./MaxN            [0, 1]           No
$I_8$     P-value                        [0, 0.1]         log transformation to [0, 1] as in (2)
$I_9$     Final goodness of fit $R^2$    [0, 1]           No
$I_{10}$  Final $R^2$/initial $R^2$      [0, ∞]           log and min-max transformation to [0, 1]

The criteria $I_9$ and $I_{10}$ take into account the model goodness of fit by using the coefficient of determination of the linear regression for the $Q$ versus $V$ variables. The initial $R^2$ is computed on all the trade data in a particular trade flow, assuming linear regression as the base model, while the final $R^2$ is computed on the data points remaining after removal of the outliers. Both the initial and the final $R^2$ are used because it might be important to reflect the change in the model goodness of fit after removal of the outliers.

After rescaling as shown in Table 1, the individual criteria ($I_i$) are comparable among themselves. The log transformation was used for a number of criteria to make the ranking smoother, because for many real data cases it is rather stair-like. The conditions under which a transformation is actually needed could be elaborated further in the future.

Specific weights ($w_i$) must be assigned to each criterion to determine its relative impact on the final indicator score. The most popular method to combine different criteria into a single numerical indicator is to compute a weighted sum:

$$FI = \sum_{i=1}^{m} w_i I_i$$

where $m$ is the number of individual criteria. A complication arises from the fact that some criteria could be highly correlated, and therefore their correlation matrix must be examined before weights are assigned to each criterion. Without prior knowledge, equal weights could be assigned to non-correlated criteria. However, the correlation matrix analysis is not enough: weights cannot be derived from statistical considerations only, but must be defined by subject matter experts and be closely related to the specific type of fraud in mind.
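The rescaling and aggregation steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the log transformation is taken as base 10, the input values are strictly positive, and the function names, sample quantities, and weights are hypothetical rather than taken from the paper.

```python
import math


def log_min_max(values):
    """Rescale strictly positive values to [0, 1]: log10 first, then min-max."""
    if any(v <= 0 for v in values):
        raise ValueError("log rescaling needs strictly positive values")
    logs = [math.log10(v) for v in values]
    lo, hi = min(logs), max(logs)
    if hi == lo:
        # All values identical: no spread to rescale, so return zeros.
        return [0.0 for _ in logs]
    return [(x - lo) / (hi - lo) for x in logs]


def fraud_indicator(criteria, weights):
    """Weighted sum FI = sum_i w_i * I_i for a single outlier."""
    if len(criteria) != len(weights):
        raise ValueError("criteria and weights must have the same length")
    return sum(w * c for w, c in zip(weights, criteria))


# Hypothetical traded quantities for three outliers, rescaled as criterion I_1.
quantities = [10.0, 100.0, 1000.0]
i1 = log_min_max(quantities)

# Hypothetical second criterion already on the [0, 1] scale (no rescaling).
i6 = [0.2, 0.9, 0.4]

weights = [0.5, 1.0]  # illustrative weights only
scores = [fraud_indicator([a, b], weights) for a, b in zip(i1, i6)]
print(scores)
```

The log step compresses the long right tail of quantities and values, so that a few very large transactions do not push all other outliers towards zero after min-max rescaling.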
One possible method is the analytic hierarchy process (AHP), which provides a rational framework for integrating the opinions of many subject matter experts into a single quantitative estimate (Zio 1996).

The list of possible criteria presented in Table 1 is not complete and could be modified in future applications. One type of analysis that could improve the ranking structure is correspondence analysis. Various associations between trade flow variables could be important, and quantitative information about the existing links in the data could be integrated in the ranking. For example, the quantitative relationship between products and exporting countries (for import datasets) among all the outliers could help to determine whether fraudulent activities might be linked to specific origin countries or products.

The presented ranking methodology was developed for the situation when trade data represent a single population and low price outliers are detected within it, assuming linear regression to be a model of the data. However, the problem of ranking becomes more interesting when outliers are detected in mixtures of populations (Atkinson et al. 2004): several populations with different price levels exist, and it is not obvious from which population the outliers come. This is an important problem in fraud detection, where fraudulent transactions are hidden within a mixture of several populations. For example, continuous systematic underpricing of selected imports into one country could not be detected by single country analysis. In the case of mixed populations, the ranking structure needs to be developed further.

4 Application of the Ranking Criteria

The ranking was applied to the low price outliers detected in the monthly aggregated trade data of agricultural product imports into the EU 27 member states (dataset containing: product, reporting countries, volume and value). In total, 1,109 low price outliers were detected by using a backward search based outlier detection method. The numerical criteria shown in Table 1 were computed, and their mutual correlation is presented in Table 2.

As evident from Table 2, several pairs are highly correlated (higher than 0.6). It is not surprising that the quantity and value based numerical criteria ($I_1$ and $I_2$, $I_5$ and $I_6$) are highly correlated, because larger quantity trade transactions are associated with larger value transactions. Including all these criteria in the ranking at equal weights would double count their effect on the total score. In customs fraud, misdeclaration of value happens to be much more frequent than misdeclaration of quantity. Considering this, the numerical criteria $I_2$ (value traded) and $I_5$ (ratio of value) should be eliminated from the ranking structure. In fact, the decision to eliminate them could have been made before the computations, following the same reasoning.

The high correlation of quantity ($I_1$) and average loss ($I_3$) is also expected, as average loss is a function of quantity. In this case the weight can be divided equally between the two criteria. The same approach can be used for the remaining two highly correlated numerical criteria, $I_4$ and $I_{10}$. This correlation is very interesting: the ratio of fair price to recorded price gives similar information to the ratio of the final model (without outliers) $R^2$ to the initial (all the data) model $R^2$. In this case, equal weights of 0.5 were applied.
Table 2 Correlation matrix of the numerical criteria $I_1$ to $I_{10}$ for the selected application

        I1     I2     I3     I4     I5     I6     I7     I8     I9     I10
I1      1.00   0.79   0.70   0.07   0.20   0.13   0.12   0.13   0.05   0.02
I2      0.79   1.00   0.74   0.34   0.22   0.01   0.22   0.05   0.06   0.20
I3      0.70   0.74   1.00   0.33   0.05   0.23   0.20   0.19   0.15   0.25
I4      0.07   0.34   0.33   1.00   0.20   0.37   0.06   0.19   0.27   0.69
I5      0.20   0.22   0.05   0.20   1.00   0.76   0.38   0.06   0.12   0.23
I6      0.13   0.01   0.23   0.37   0.76   1.00   0.38   0.05   0.07   0.22
I7      0.12   0.22   0.20   0.06   0.38   0.38   1.00   0.22   0.26   0.06
I8      0.13   0.05   0.19   0.19   0.06   0.05   0.22   1.00   0.07   0.16
I9      0.05   0.06   0.15   0.27   0.12   0.07   0.26   0.07   1.00   0.24
I10     0.02   0.20   0.25   0.69   0.23   0.22   0.06   0.16   0.24   1.00
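The screening of Table 2 for highly correlated pairs can be reproduced with a short Python sketch. The matrix values are copied from Table 2 as printed; the function name and the use of absolute values are assumptions made for illustration, while the 0.6 threshold follows the text.

```python
# Correlation values as printed in Table 2 (rows and columns are I1..I10).
TABLE_2 = [
    [1.00, 0.79, 0.70, 0.07, 0.20, 0.13, 0.12, 0.13, 0.05, 0.02],
    [0.79, 1.00, 0.74, 0.34, 0.22, 0.01, 0.22, 0.05, 0.06, 0.20],
    [0.70, 0.74, 1.00, 0.33, 0.05, 0.23, 0.20, 0.19, 0.15, 0.25],
    [0.07, 0.34, 0.33, 1.00, 0.20, 0.37, 0.06, 0.19, 0.27, 0.69],
    [0.20, 0.22, 0.05, 0.20, 1.00, 0.76, 0.38, 0.06, 0.12, 0.23],
    [0.13, 0.01, 0.23, 0.37, 0.76, 1.00, 0.38, 0.05, 0.07, 0.22],
    [0.12, 0.22, 0.20, 0.06, 0.38, 0.38, 1.00, 0.22, 0.26, 0.06],
    [0.13, 0.05, 0.19, 0.19, 0.06, 0.05, 0.22, 1.00, 0.07, 0.16],
    [0.05, 0.06, 0.15, 0.27, 0.12, 0.07, 0.26, 0.07, 1.00, 0.24],
    [0.02, 0.20, 0.25, 0.69, 0.23, 0.22, 0.06, 0.16, 0.24, 1.00],
]


def correlated_pairs(matrix, threshold=0.6):
    """Return 1-based index pairs (i, j), i < j, with |correlation| > threshold."""
    pairs = []
    n = len(matrix)
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only, skipping the diagonal
            if abs(matrix[i][j]) > threshold:
                pairs.append((i + 1, j + 1))
    return pairs


pairs = correlated_pairs(TABLE_2)
print(pairs)  # [(1, 2), (1, 3), (2, 3), (4, 10), (5, 6)]
```

The flagged pairs are the quantity and value criteria, quantity and average loss, and the fair price ratio paired with the $R^2$ ratio. These are the pairs for which a criterion is either dropped or given a shared weight.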

The weights applied for the ranking procedure are shown in Table 3. The ranking indicator value can be further normalized to the scale [0, 1] by dividing it by the sum of the weights. The computed FI values are shown in Fig. 1. The figure reflects the Pareto-like distribution typical of risk rankings, where the highest risk is associated with a small number of outliers, while the risk of the rest is distributed more smoothly.

Table 3 The weighting of the ranking structure

No        Criteria                               Weight $w_i$
$I_1$     Quantity traded, $Q$                   0.5
$I_2$     Value traded, $V$                      0
$I_3$     Average loss, $Q \cdot \Delta U$       0.5
$I_4$     Ratio $U_F/U$                          0.5
$I_5$     Ratio $V/V_{PO}$                       0
$I_6$     Ratio $Q/Q_{PO}$                       1
$I_7$     Number of obs./(MaxN = 36)             1
$I_8$     P-value                                1
$I_9$     Coefficient of determination $R^2$     1
$I_{10}$  Final $R^2$/initial $R^2$              0.5

Fig. 1 Ranking of the detected outliers in the EU import trade data (sorted decreasingly). Axes: trade flow number (horizontal), fraud indicator value (vertical)

The highest and the lowest ranked trade outliers are shown in Figs. 2 and 3. The results of the ranking are as expected: the highest ranked outliers are severe outliers, with a low price far away from the regression line, while the lowest ranked outliers are close to it.

Fig. 2 The highest ranked low price outlier (EU import trade dataset). Axes: quantity, tons (horizontal); value, thousands euro (vertical). Legend: outliers, non-outliers, linear trend (no outliers), linear trend (all data)

Fig. 3 The lowest ranked low price outlier (EU import trade dataset). Axes: quantity, tons (horizontal); value, thousands euro (vertical). Legend: outliers, non-outliers, linear trend (no outliers), linear trend (all data)

The next step to improve the ranking procedure would be to validate the ranking based on real fraud cases and to involve fraud experts in the process of weight estimation. Both options require a lot of resources for implementation, and especially feedback for the ranking validation. Preliminary validation information suggests that severe price outliers could be linked more to data errors than to fraudulent activities. Further development of the ranking structure by adding other indicators could address the data quality issues.

5 Final Remarks

The paper discusses methods for ranking trade data outliers with the objective of prioritizing anti-fraud investigation actions. The risk analysis framework was used to develop a ranking structure based only on the statistical information available in the trade dataset. A comprehensive trade fraud risk indicator is discussed that combines a number of individual numerical criteria.

An application study is presented that produced a ranking of the detected outliers in the aggregated European import trade data. The ranking produced cannot be considered final because of the arbitrary weights that were used for the computations. Derivation of weights is an important part of the ranking methodology; however, it cannot be produced by statistical considerations alone. Subject matter expert opinions would be valuable to define the weights based on the type of fraud under investigation.

The results of the test study show that even arbitrary weights can produce reasonable results. Further fine-tuning of the methodology depends on feedback and judgments from fraud experts.

References

Atkinson, A.C., Riani, M., Cerioli, A. (2004) Exploring Multivariate Data with the Forward Search. Springer, New York.
Perrotta, D., Riani, M., Torti, F. (2009) New robust dynamic plots for regression mixture detection. Advances in Data Analysis and Classification, 3(3).
Kaplan, S., Garrick, B.J. (1981) On the quantitative definition of risk. Risk Analysis, 1(1).
Zio, E. (1996) On the use of the analytic hierarchy process in the aggregation of expert judgments. Reliability Engineering and System Safety, 53(2).