Our experimental SIP server, installed for testing and development of the ACD quality routing [101] [102] [103] [104] [105] [106] [107] [108] [109] and of the system designed for the routing of emergency calls [110], was hacked in October 2010. The first call logged via the hacked server is dated October 13th. A significant volume was registered during the weekend from 2010-10-15 through 2010-10-17 [111] [112] [113] [114] [115]. The fraudulent traffic was terminated via the hacked server to several destinations. The server was not integrated into the main billing system, and the calls were not accounted for in the central CDR database. Only the syslog files [116] [117] [118] of the hacked UNIX server were used to recover the traffic details.

The Kamailio server [119] [120] [121] logged the SIP transactions via the Unix syslog service. This document presents the construction of a CDR file from the syslog file (see sections 2 to 5), provides the output CDR file of fraudulent calls (section 6.4), different statistics on the number of simultaneous calls and destinations dialed (sections 6.1, 6.2, 6.3, 7, and 8.2), and several hypotheses on possible motives of the fraud (section 8.2).

This document has three target audiences. The first is training in the processing of text log files and testing of skills such as the ability to form dialog (call) records from transaction logs (SIP methods and responses) [122] [123] [124] [125]. The logs that do not belong to traffic generated by legitimate users will be used publicly for pre-recruitment tests and for internal staff training. Second, the authorities processing the complaints related to the fraudulent calls to Slovenia mobiles can find additional statistics on the traffic. Finally, this document publicly provides the data to other operators for comparison of patterns and for the prevention of fraud in their own networks.

2. Extracting the sufficient subset of records for building the CDR of answered calls

2.1. Syslog file

Our objective is to select the transactions that determine call durations and to assemble them into phone calls.

The syslog file [126] [127] [128] contains records of completed (answered) SIP transactions. A SIP transaction is often an exchange of two SIP packets, a method followed by a response [129] [130] [131] [132]. The transaction is sent to syslog upon reception of the response concluding the transaction, and is recorded as a single syslog line: the Kamailio server sends to syslog a single string of semicolon-separated fields. The following is an example of such a record, where for visual clarity each field is shown on a separate line. The newlines and spacing shown are not present in the syslog file.

101014-syslog.txt:Oct 13 13:13:08 ks301129 ser[9108]: NOTICE: acc [acc.c:275]: ACC: transaction answered:

timestamp=1286968388;

method=INVITE;

from_tag=7e6d1b44;

to_tag=65212929765820101013131233;

call_id=ae5d7f4125027e66;

code=183;

reason=Session Progress;

src_user=101;

src_domain=91.121.73.130;

dst_ouser=00972599870738;

dst_user=+972599870738;

dst_domain=212.249.15.9

All lines of the syslog file containing Kamailio transaction records in which the “call_id” field is present are extracted into a separate log file, which serves as the main input for building the CDR file.
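The record layout described above can be read field by field. The following is a minimal Python sketch, not the tooling used for this report; the function name `parse_acc_record` is hypothetical, and it assumes that field values never contain a semicolon.

```python
def parse_acc_record(line: str) -> dict:
    """Split one 'ACC: transaction answered:' syslog line into a dict of fields."""
    marker = "transaction answered:"
    # Everything after the marker is the semicolon-separated field list.
    _, _, payload = line.partition(marker)
    fields = {}
    for item in payload.strip().split(";"):
        if "=" in item:
            # Split on the first "=" only, so values may themselves contain "=".
            name, _, value = item.partition("=")
            fields[name.strip()] = value.strip()
    return fields


# The example record of section 2.1, written as the single line found in syslog.
line = (
    "101014-syslog.txt:Oct 13 13:13:08 ks301129 ser[9108]: NOTICE: acc [acc.c:275]: "
    "ACC: transaction answered: timestamp=1286968388;method=INVITE;from_tag=7e6d1b44;"
    "to_tag=65212929765820101013131233;call_id=ae5d7f4125027e66;code=183;"
    "reason=Session Progress;src_user=101;src_domain=91.121.73.130;"
    "dst_ouser=00972599870738;dst_user=+972599870738;dst_domain=212.249.15.9"
)
record = parse_acc_record(line)
print(record["method"], record["code"], record["call_id"])
```

A line without the marker yields an empty dictionary, which makes it easy to skip unrelated syslog entries.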

Description:	All transactions with call_id field
File:	data1\101013+6-11'callid.zip
Size:	6.44MB

We limit our research to answered calls only; the output CDR contains only answered calls. You can skip the processing of the file and go directly to section 6.4, which contains a link to the output CDR file. Sections 2.2, 3, 4, and 5 are provided only for training purposes and are not needed for administrative work related to fraud complaints.

2.2. Extraction of sufficient subset

The transaction record appears in the syslog file [133] [134] [135] when the reply to a method is received [136] [137] [138] [139]. At that point, both the key data of the method’s request and of the reply are logged into a single syslog line. To obtain phone call data, we need the records of all INVITE methods having 200 OK replies and of all BYE methods.

The syslog file contains slightly more BYE transactions than INVITE transactions with 200 success replies. The excess of BYE records may be due to losses and retransmissions. Additionally, as shown below, the transaction records are duplicated. The reason for the duplicates is unknown (it may be the multi-process nature of the Kamailio server).

$ grep ";method=INVITE;.*;code=200;" 101013+6-callid.txt | wc -l

137536

$ grep ";method=BYE;" 101013+6-callid.txt | wc -l

144532

$ grep ";method=INVITE;.*;code=200;" 101013+6-callid.txt | sort | uniq | wc -l

68766

$ grep ";method=BYE;" 101013+6-callid.txt | sort | uniq | wc -l

72249

We eliminate the duplicates and save the successful INVITE methods as well as all the BYE methods into a new text file.

$ egrep "(;method=INVITE;.*;code=200;|;method=BYE;)" 101013+6-callid.txt | sort | uniq | wc -l

141015

$ expr 68766 + 72249

141015

$ egrep "(;method=INVITE;.*;code=200;|;method=BYE;)" 101013+6-callid.txt | sort | uniq > 101013+6-answered.txt

$ u2d 101013+6-answered.txt

101013+6-answered.txt:

$ wc -l 101013+6-answered.txt

141015 101013+6-answered.txt

Description:	All transactions of answered calls
File:	data1\101013+6-12'answered.zip
Size:	3.69 MB
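The selection and de-duplication performed above with egrep, sort and uniq can also be sketched in Python. This is an illustrative equivalent under the same pattern, not the command actually used; the function name `answered_subset` and the shortened sample lines are hypothetical.

```python
import re

# Same alternation as the egrep pattern: INVITEs answered with 200, and all BYEs.
ANSWERED = re.compile(r";method=INVITE;.*;code=200;|;method=BYE;")


def answered_subset(lines):
    """Return the sorted, de-duplicated lines for 200 OK INVITEs and all BYEs."""
    return sorted({ln.rstrip("\r\n") for ln in lines if ANSWERED.search(ln)})


# Hypothetical, shortened records (real lines carry more fields).
sample = [
    "x: timestamp=1;method=INVITE;call_id=a;code=183;reason=Session Progress;src_user=101",
    "x: timestamp=2;method=INVITE;call_id=a;code=200;reason=OK;src_user=101",
    "x: timestamp=2;method=INVITE;call_id=a;code=200;reason=OK;src_user=101",  # duplicate
    "x: timestamp=7;method=BYE;call_id=a;code=200;reason=OK;src_user=101",
]
subset = answered_subset(sample)
print(len(subset))
```

Using a set removes the duplicated records in one pass; sorting afterwards only makes the output order reproducible.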

3. Reading the fields of INVITE and BYE transactions

The Kamailio service sends to syslog a string of semicolon-separated fields containing both the field name and the value. This section presents the script used for extracting the fields we are interested in. In our regex [140] [141] the quantifiers are followed by the less commonly used greediness modifiers. If you wish to understand the regex used in our script, read subsection 3.1; otherwise skip it and continue with subsection 3.2.

3.1. The greediness options of regex quantifiers

The behavior of commonly used quantifiers “*”, “?”, “+”, “{n,m}” can be tuned by greediness options. If the quantifier (such as “*”) is followed by “?” the quantified subpattern will match the minimum number of times. If the quantifier is followed by “+” it will match the maximum number of times.

By default a quantified subpattern “+” or “*” is greedy. In the following example, the character “a” is matched the maximum possible number of times by the expression “a+” or “a*” while still allowing the rest of the pattern (the last “a”) to match:

$ echo aaaa | perl -ne '/a+a/; print $&'

aaaa

$ echo aaaa | perl -ne '/a*a/; print $&'

aaaa

If you want it to match the minimum number of times possible, follow the quantifier with a “?”. In the following example “a+” is matched only once, and “a*” zero times (the respective minimums).

$ echo aaaa | perl -ne '/a+?a/; print $&'

aa

$ echo aaaa | perl -ne '/a*?a/; print $&'

a

Perl also provides the “possessive” quantifier form: follow the quantifier with “+”. A possessive quantifier matches as much as possible and never gives characters back, whether or not the rest of the regex can then match.

The example below is without the possessive option “+” (after “a+”), and we see that it allows matching of both halves of the regular expression: “a+” and the following “a”.

$ echo aaaa | perl -ne '/(a+)a/; print $1'

aaa

However, when the possessive option is added, the first half of the regular expression eats up all “a”s, without leaving any character for the rest of the regex.

$ echo aaaa | perl -ne '/(a++)a/; print $1'

With the possessive quantifier, the regex matches only when the second half is removed.

$ echo aaaa | perl -ne '/(a++)/; print $1'

aaaa
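The same three behaviours can be reproduced outside Perl. The following Python sketch is given for comparison only; it relies on the standard `re` module, in which possessive quantifiers are available from Python 3.11 onwards, so that part is guarded by a version check.

```python
import re
import sys

text = "aaaa"

# Greedy: "a+" takes three characters and leaves one for the final "a".
greedy = re.search(r"a+a", text).group(0)
# Lazy: "a+?" takes one character, "a*?" takes none.
lazy_plus = re.search(r"a+?a", text).group(0)
lazy_star = re.search(r"a*?a", text).group(0)
print(greedy, lazy_plus, lazy_star)

# Possessive quantifiers are only accepted by the re module from Python 3.11.
if sys.version_info >= (3, 11):
    # "a++" keeps all four characters, so the trailing "a" can never match.
    possessive_fails = re.search(r"a++a", text) is None
    possessive_alone = re.search(r"a++", text).group(0)
    print(possessive_fails, possessive_alone)
```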

3.2. Retrieving the fields

The following command-line script retrieves the Unix timestamp (seconds since 1970-01-01), the SIP method (INVITE or BYE), the unique call-id, the source user (SIP From field), the destination user (SIP To field before translation), and finally the next-hop SIP server (i.e. our vendor). A non-greedy quantifier stops each match at the first occurrence of the semicolon separator, and a possessive quantifier ensures the full capture of the IP address (normally redundant, as the quantifier is greedy by default).

$ head 101013+6-answered.txt | perl -ne '/ timestamp=(\d+);method=(\w+);.*;call_id=(.*?);.*;src_user=(.*?);.*;dst_ouser=(.*?);.*;dst_domain=([^\r\n]*+)/; printf "%10s %10s %20s %20s %20s %20s\n",$1,$2,$3,$4,$5,$6'

1286968402 INVITE ae5d7f4125027e66 101 00972599870738 212.249.15.9

1286968407 BYE ae5d7f4125027e66 +972599870738 101 188.161.231.133

1286968438 INVITE 9a428e1758434f2e 101 00972599870738 212.249.15.9

1286968510 INVITE f0073042b0657a13 101 00972597516161 212.249.15.9

1286968532 INVITE ab071445d631b666 101 00972597516161 212.249.15.9

1286968590 INVITE b8009e4b3178c70a 101 00972599870738 212.249.15.9

1286968602 BYE b8009e4b3178c70a 101 +972599870738 213.71.2.208

1286968606 BYE ab071445d631b666 101 +972597516161 213.71.2.208

1286968607 BYE f0073042b0657a13 101 +972597516161 213.71.2.208

1286968611 BYE 9a428e1758434f2e 101 +972599870738 213.71.2.208

As expected, the number of extracted lines matches the total number of sufficient syslog records extracted in section 2.2.

141015

$ wc -l 101013+6-answered.txt

141015 101013+6-answered.txt

The fields are then displayed in comma-separated format with the call-id field first (so that it can be used as a matching key between INVITE and BYE records). The symbol “A” in the second field marks the beginning of the call charge and “B” the end of the call.

ae5d7f4125027e66,A,1286968402,101,00972599870738,212.249.15.9

ae5d7f4125027e66,B,1286968407,+972599870738,101,188.161.231.133

9a428e1758434f2e,A,1286968438,101,00972599870738,212.249.15.9

f0073042b0657a13,A,1286968510,101,00972597516161,212.249.15.9

ab071445d631b666,A,1286968532,101,00972597516161,212.249.15.9

b8009e4b3178c70a,A,1286968590,101,00972599870738,212.249.15.9

b8009e4b3178c70a,B,1286968602,101,+972599870738,213.71.2.208

ab071445d631b666,B,1286968606,101,+972597516161,213.71.2.208

f0073042b0657a13,B,1286968607,101,+972597516161,213.71.2.208

9a428e1758434f2e,B,1286968611,101,+972599870738,213.71.2.208
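The extraction into the comma-separated A/B format can be sketched in Python with the same regular expression. This is an illustrative port, not the script used for this report; the function name `to_csv_record` is hypothetical, and the possessive quantifier is replaced by a plain greedy one so that the pattern also works on Python versions before 3.11.

```python
import re

# Port of the Perl pattern; "[^\r\n]*" is greedy instead of possessive.
FIELDS = re.compile(
    r" timestamp=(\d+);method=(\w+);.*;call_id=(.*?);.*;src_user=(.*?);"
    r".*;dst_ouser=(.*?);.*;dst_domain=([^\r\n]*)"
)


def to_csv_record(line):
    """Return 'call_id,A|B,timestamp,src_user,dst_ouser,dst_domain' or None."""
    m = FIELDS.search(line)
    if m is None:
        return None
    ts, method, call_id, src, dst, domain = m.groups()
    flag = "B" if method == "BYE" else "A"  # BYE ends the call, INVITE starts it
    return ",".join([call_id, flag, ts, src, dst, domain])


# First line of the answered-calls file, as quoted in section 4.
line = (
    "101014-syslog.txt:Oct 13 13:13:22 ks301129 ser[9109]: NOTICE: acc [acc.c:275]: "
    "ACC: transaction answered: timestamp=1286968402;method=INVITE;from_tag=7e6d1b44;"
    "to_tag=65212929765820101013131233;call_id=ae5d7f4125027e66;code=200;reason=OK;"
    "src_user=101;src_domain=91.121.73.130;dst_ouser=00972599870738;"
    "dst_user=+972599870738;dst_domain=212.249.15.9"
)
print(to_csv_record(line))
```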

3.3. Used vendors

Further processing of the INVITE records of the output shows that calls were routed via only two outgoing vendors. The sum of the two counts matches the number of answered INVITE records of section 2.2.

$ cat 101013+6-answered.txt | perl -ne '/ timestamp=(\d+);method=(\w+);.*;call_id=(.*?);.*;src_user=(.*?);.*;dst_ouser=(.*?);.*;dst_domain=([^\r\n]*+)/; printf "%s,%s,%s,%s,%s,%s\n",$3,$2 eq "BYE"?"B":"A",$1,$4,$5,$6' | awk -F, '$2=="A" {print $6}' | sort | uniq -c

51073 212.249.15.9

17693 217.168.45.4

$ expr 51073 + 17693

68766
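The per-vendor count can also be sketched in Python on the comma-separated records. The function name `count_vendors` is hypothetical, and the sample below uses only a handful of the records listed in section 3.2, so its totals are not those of the full file.

```python
from collections import Counter


def count_vendors(csv_records):
    """Count call starts ('A' records) per next-hop domain (sixth field)."""
    counts = Counter()
    for rec in csv_records:
        call_id, flag, ts, src, dst, domain = rec.split(",")[:6]
        if flag == "A":
            counts[domain] += 1
    return counts


sample = [
    "ae5d7f4125027e66,A,1286968402,101,00972599870738,212.249.15.9",
    "ae5d7f4125027e66,B,1286968407,+972599870738,101,188.161.231.133",
    "9a428e1758434f2e,A,1286968438,101,00972599870738,212.249.15.9",
    "f0073042b0657a13,A,1286968510,101,00972597516161,212.249.15.9",
    "9a428e1758434f2e,B,1286968611,101,+972599870738,213.71.2.208",
]
vendors = count_vendors(sample)
print(vendors)
```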

4. Building call records

In the following script we merge the records having the same call-id. This is achieved by sorting the output lines, where the first field is the call-id and the second field is “A” if the line represents the beginning of the conversation and “B” if it represents the end of the call. After merging, each line holds the set of fields representing the beginning of the call followed by the set of fields representing the end of the call. Normally there are only pairs of lines, one “A” and one “B” record sharing the same call-id. In exceptional cases, however, more than two A/B records are merged together, when multiple INVITEs and BYEs are registered under the same call-id.

0000221ba3287435,A,1287334561,684168,002522168768,217.168.45.4,0000221ba3287435,B,1287334580,002522168768,684168,41.206.153.251,

0000ae1a6842d52c,A,1287327409,000000000000,002522200370,217.168.45.4,0000ae1a6842d52c,B,1287327417,000000000000,002522200370,217.168.45.4,

0000f6798a5f4f6f,A,1287217382,101,0023224006762,212.249.15.9,0000f6798a5f4f6f,B,1287218431,101,+23224006762,213.71.2.208,

00024779b134e721,A,1287187156,101,0038643281242,212.249.15.9,00024779b134e721,B,1287187704,101,+38643281242,213.71.2.208,

0005c56a98078d50,A,1287167799,0000,0038643281289,212.249.15.9,0005c56a98078d50,B,1287167828,+38643281289,0000,41.206.158.7,

0005f5698e073461,A,1287313970,000000000000,002522200185,217.168.45.4,0005f5698e073461,B,1287313976,000000000000,002522200185,217.168.45.4,

000678011129b805,A,1287340940,000000000000,002522168855,217.168.45.4,000678011129b805,B,1287340947,000000000000,002522168855,217.168.45.4,

0006dd42d51dad7b,A,1287290372,000000000000,0022479910595,217.168.45.4,0006dd42d51dad7b,B,1287290382,0022479910595,000000000000,109.253.170.238,

000c222ec747ab3f,A,1287317096,888,0023222291847,217.168.45.4,000c222ec747ab3f,B,1287317219,888,0023222291847,217.168.45.4,
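The merge by call-id can be sketched in Python as a sort followed by a grouping step. This is an illustration of the idea, not the sort/awk pipeline used for this report; the function name `merge_by_call_id` is hypothetical.

```python
from itertools import groupby


def merge_by_call_id(csv_records):
    """Concatenate all A/B records of one call-id into a single line."""
    # Plain string sort: records of one call-id become adjacent, "A" before "B".
    ordered = sorted(csv_records)
    merged = []
    for _, group in groupby(ordered, key=lambda rec: rec.split(",", 1)[0]):
        merged.append("".join(rec + "," for rec in group))
    # Order the merged lines chronologically by the first timestamp (third field).
    merged.sort(key=lambda line: int(line.split(",")[2]))
    return merged


# Records taken from section 3.2, deliberately out of order.
records = [
    "9a428e1758434f2e,B,1286968611,101,+972599870738,213.71.2.208",
    "ae5d7f4125027e66,B,1286968407,+972599870738,101,188.161.231.133",
    "9a428e1758434f2e,A,1286968438,101,00972599870738,212.249.15.9",
    "ae5d7f4125027e66,A,1286968402,101,00972599870738,212.249.15.9",
]
merged = merge_by_call_id(records)
for row in merged:
    print(row)
```

Each merged line keeps the trailing comma, as in the listings of this section.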

If we now sort the merged fields again by time (rather than by call-id), the chronologically first call record in the resulting list corresponds to the first INVITE in the file of transactions being processed.

We can now say that the first fraudulent call was placed at Unix timestamp 1286968402, corresponding to October 13th 13:13, to +972599870738. This call was issued from the IP address 188.161.231.133 and lasted 1286968407 - 1286968402 = 5 seconds.

$ head -1 101013+6-answered.txt

101014-syslog.txt:Oct 13 13:13:22 ks301129 ser[9109]: NOTICE: acc [acc.c:275]: ACC: transaction answered: timestamp=1286968402;method=INVITE;from_tag=7e6d1b44;to_tag=65212929765820101013131233;call_id=ae5d7f4125027e66;code=200;reason=OK;src_user=101;src_domain=91.121.73.130;dst_ouser=00972599870738;dst_user=+972599870738;dst_domain=212.249.15.9

ae5d7f4125027e66,A,1286968402,101,00972599870738,212.249.15.9,ae5d7f4125027e66,B,1286968407,+972599870738,101,188.161.231.133,

9a428e1758434f2e,A,1286968438,101,00972599870738,212.249.15.9,9a428e1758434f2e,B,1286968611,101,+972599870738,213.71.2.208,

f0073042b0657a13,A,1286968510,101,00972597516161,212.249.15.9,f0073042b0657a13,B,1286968607,101,+972597516161,213.71.2.208,

ab071445d631b666,A,1286968532,101,00972597516161,212.249.15.9,ab071445d631b666,B,1286968606,101,+972597516161,213.71.2.208,

b8009e4b3178c70a,A,1286968590,101,00972599870738,212.249.15.9,b8009e4b3178c70a,B,1286968602,101,+972599870738,213.71.2.208,

bd20b03dd73c1c1b,A,1286998646,133,38643322585,212.249.15.9,bd20b03dd73c1c1b,B,1286998649,133,+38643322585,213.71.2.208,

d2228418041d064f,A,1286998656,133,38643322585,212.249.15.9,d2228418041d064f,B,1286998698,133,+38643322585,213.71.2.208,

174abf3e07248517,A,1287092708,0000000000,0038643327405,212.249.15.9,174abf3e07248517,B,1287092787,0000000000,+38643327405,213.71.2.208,

44577f70d5141a07,A,1287092720,0000000000,0038643327405,212.249.15.9,44577f70d5141a07,B,1287092786,0000000000,+38643327405,213.71.2.208,

Before generating the call records, let us compute the number of distinct call-ids in the file of all transactions and in the file of transactions corresponding to answered calls only. As expected, fewer call-ids appear in the second file.

$ cat 101013+6-callid.txt | perl -ne '{/;call_id=.*?;/; print $&."\n"}' | head

;call_id=ae5d7f4125027e66;

;call_id=9a428e1758434f2e;

$ cat 101013+6-callid.txt | perl -ne '{/;call_id=.*?;/; print $&."\n"}' | sort | uniq | wc -l

56415

$ cat 101013+6-answered.txt | perl -ne '{/;call_id=.*?;/; print $&."\n"}' | sort | uniq | wc -l

48004

Now the output file “101013+6-calls.txt” is generated. Its line count is one more than the number of distinct call-ids in the file “101013+6-answered.txt” of transactions of answered calls, because the awk script emits an empty line before the first record.

$ cat 101013+6-answered.txt | perl -ne '/ timestamp=(\d+);method=(\w+);.*;call_id=(.*?);.*;src_user=(.*?);.*;dst_ouser=(.*?);.*;dst_domain=([^\r\n]*+)/; printf "%s,%s,%s,U %s,U %s,%s\n",$3,$2 eq "BYE"?"B":"A",$1,$4,$5,$6' | sort | awk -F, '$1!=id{printf "\r\n"} {printf "%s,",$0} {id=$1}' | sort -n -t, -k 3 > 101013+6-calls.txt

$ wc -l 101013+6-calls.txt

48005 101013+6-calls.txt

The call-ids repeated most often occupy the largest number of columns in the output file. They probably correspond to concurrent calls issued under the same call-id (by error or deliberately).

$ cat 101013+6-answered.txt | perl -ne '/ timestamp=(\d+);method=(\w+);.*;call_id=(.*?);.*;src_user=(.*?);.*;dst_ouser=(.*?);.*;dst_doma

| head -2

22 xr3808484263002c1834218e1921681f@188.161.237.156

22 xr11800770335646c9615328e192168f@188.161.229.236

20 xr94449288617869c3113484e192168f@188.161.237.156

Description:	Call Data Records in Text format
File:	data1\101013+6-13'calls.txt.zip
Size:	1.29MB

In the next section we import the text records into an Excel file and we compute the amount of minutes and calls for dialogs sharing the same call-id.

5. Processing the multiple call-id cases

The multiple call-id cases are processed in an Excel sheet described in this section. Rows of the Excel file contain merged field sets with call-start “A” and call-stop “B” fields. We first compute the number of start “A” and stop “B” records.

See the H2 cell for the formula computing the number of call starts:

=COUNTIF($M2:$DR2,"A")

See the cell I2 for the formula computing the number of stops:

=COUNTIF($M2:$DR2,"B")

Because the transactions were sorted so that, within a merged line, all start fields appear before all stop fields (see section 4), the average start time is computed by summing the Unix timestamps of the first block (the start events) and dividing by their count, and the average stop time is computed likewise from the second block (the stop events).

See the formula in cell J2 for the average start time:

=SUM(OFFSET($M2,0,0,1,5*H2))/H2

See the formula in cell K2 for the average stop time. Note that the offset is shifted by 5*H2 positions to skip the start events:

=SUM(OFFSET($M2,0,5*H2,1,5*I2))/I2

The total duration can be computed as follows (see the D2 field):

=TIME(0,0,(K2-J2)*F2)

Here the F2 cell is equal to the number of simultaneous calls under the same call-id:

=AVERAGE(H2:I2)

The value in the H column is normally equal to the value in the I column (except when non-recoverable packet losses result in lost transactions).

Finally, the date and time of the call are computed from the Unix timestamp, taking into account the time shift between the UTC and CET (CEST) time zones:

=DATE(1970,1,1)+J2/(3600*24)+2/24
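The date formula above can be cross-checked outside Excel. The following Python sketch mirrors the fixed two-hour shift of the `+2/24` term (CEST); the function name `local_time` is hypothetical, and the timestamps are those of the first call of section 4.

```python
from datetime import datetime, timedelta, timezone

# Fixed +2 h offset, mirroring the "+2/24" term of the spreadsheet formula.
CEST = timezone(timedelta(hours=2), "CEST")


def local_time(unix_ts):
    """Convert a Unix timestamp to a CEST date and time."""
    return datetime.fromtimestamp(unix_ts, tz=CEST)


start = local_time(1286968402)
duration = 1286968407 - 1286968402
print(start.strftime("%Y-%m-%d %H:%M:%S"), duration)  # 2010-10-13 13:13:22 5
```

The result agrees with the local time written by syslog for the same transaction (Oct 13 13:13:22).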

One may have doubts about the way the duration is computed. When we do not know which start event corresponds to which stop (as a single call-id is used for multiple events), it might be unclear why the average of all start times and the average of all stop times is sufficient for computing the correct call duration.

When the same call-id is used for two different calls, there is no way to know how to match the starts with the stops. We have a set of starts on one side and a set of stops on the other.

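The following drawing visualizes two starts “A” and two stops “B”, denoted here A1, A2 and B1, B2.

Two pairings of starts and stops are possible.

The first possibility: A1 is paired with B1 and A2 with B2, giving the durations (B1 - A1) and (B2 - A2).

The second possibility: A1 is paired with B2 and A2 with B1, giving the durations (B2 - A1) and (B1 - A2).

However, in both cases the sum of the durations of the two calls is the same: (B1 + B2) - (A1 + A2).

The same holds in the general case of n calls: whatever the pairing, the total duration is the sum of all stop times minus the sum of all start times.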

Therefore the total duration of all calls sharing the same call-id is the average of stop times minus the average of start times multiplied by the number of simultaneous calls.
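The argument can be checked numerically. The following Python sketch compares the spreadsheet-style computation with every possible pairing of starts and stops; the function name `total_duration` and the timestamps are hypothetical.

```python
from itertools import permutations


def total_duration(starts, stops):
    """(mean stop - mean start) * number of calls, as in the spreadsheet."""
    n = (len(starts) + len(stops)) / 2  # mirrors =AVERAGE(H2:I2)
    return (sum(stops) / len(stops) - sum(starts) / len(starts)) * n


starts = [100, 130]  # hypothetical start times, in seconds
stops = [160, 200]   # hypothetical stop times, in seconds

# Sum of durations for every possible pairing of a start with a stop.
pairing_totals = {
    sum(b - a for a, b in zip(starts, perm)) for perm in permutations(stops)
}
print(pairing_totals, total_duration(starts, stops))  # {130} 130.0
```

All pairings collapse to a single total, which is why the spreadsheet does not need to know which stop belongs to which start.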

Description:	Calls sharing the same call-id
File:	data1\101013+6-13'calls.xls.zip
Size:	7.21MB

The Excel file of this section contains many formulas and is heavy to open. In the next section we present the compact version of the CDR file containing only values resulting from the computation of the number and duration of simultaneous calls.

6. Call Data Records (CDR)

The CDR file containing the output values can be downloaded in section 6.4. The next sections show several statistics derived from the CDR file. The statistics per country, per from-user field, and per destination number are shown in sections 6.1, 6.2, and 6.3 (also available in the Excel file of section 6.4).

6.1. Statistics per country being called

Country	Code	Calls	Minutes	ACD
Slovenia	386	25'594	153'239.7	6.0
Sierra Leone	232	6'553	32'224.2	4.9
Somalia Republic	252	10'801	24'779.6	2.3
Guinea	224	5'021	12'659.3	2.5
Israel	972	28	118.4	4.2
Macedonia	389	6	4.7	0.8
Zimbabwe	263	1	0.2	0.2
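The country of a call and the ACD column can be derived as follows. This is a Python sketch under two assumptions: the dialed number starts with “00”, “+”, or directly with the country code (all three forms appear in the listings of this document), and ACD is expressed in minutes per call. The names `country_of` and `acd_minutes` are hypothetical.

```python
# Country codes as listed in the table of section 6.1 (all three digits long).
COUNTRY_CODES = {
    "386": "Slovenia", "232": "Sierra Leone", "252": "Somalia Republic",
    "224": "Guinea", "972": "Israel", "389": "Macedonia", "263": "Zimbabwe",
}


def country_of(dialed):
    """Map a dialed number to a country name using its three-digit code."""
    digits = dialed.lstrip("+")
    if digits.startswith("00"):
        digits = digits[2:]  # drop the international access prefix
    return COUNTRY_CODES.get(digits[:3], "unknown")


def acd_minutes(calls, minutes):
    """Average call duration in minutes."""
    return minutes / calls


print(country_of("0038643281239"), round(acd_minutes(25594, 153239.7), 1))
```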

6.2. Statistics over the from user field being used

From user	Calls	Minutes	ACD
101	19'765	76'851.3	3.9
asd300	1'778	22'211.1	12.5
0000	1'648	19'234.2	11.7
000000000000	8'534	18'356.5	2.2
0000000	2'008	16'017.6	8.0
kalnas600	4'451	13'818.2	3.1
dehka	4'272	10'121.5	2.4
000000	868	9'974.5	11.5
foaad_saa	147	6'993.3	47.6
7777	605	6'424.5	10.6
55202033	324	4'294.3	13.3
kalnas500	271	4'036.9	14.9
1111	204	2'923.7	14.3
888888888	350	2'374.3	6.8
888	110	1'745.6	15.9
123	138	1'362.1	9.9
hisham1970	813	1'339.9	1.6
0000000000	468	1'224.7	2.6
684168	779	863.6	1.1
111111111111	85	583.6	6.9
11	71	550.9	7.8
asdf500	98	365.9	3.7
shikso	37	335.2	9.1
marryaina123-1001	17	282.4	16.6
999	13	168.4	13.0
karam155	8	146.4	18.3
RAMY250	10	117.2	11.7
00000000	58	117.0	2.0
5555555	6	101.0	16.8
133	21	48.2	2.3
10	19	32.2	1.7
1001	24	7.2	0.3
anonymous	2	2.0	1.0
441932376101	1	0.6	0.6
250	1	0.1	0.1

6.3. Statistics over the phone number being called

The table of phone numbers being dialed shows only the top used numbers. The full list is available in the CDR Excel file (section 6.4).

To	Calls	Minutes	ACD
0038643281239	1'178	19'643.7	16.7
0038643281242	2'521	11'744.3	4.7
0038643281460	5'267	11'048.3	2.1
0038643281244	3'583	8'592.0	2.4
0023224000936	276	8'249.4	29.9
0023224000935	1'361	7'005.9	5.1
0038643281094	490	6'757.5	13.8
002522200377	1'613	6'564.5	4.1
0038643281081	356	5'629.6	15.8
0023224006762	649	5'587.6	8.6
0038643281286	1'711	5'524.6	3.2
0038643281494	557	5'304.5	9.5
0038643281461	2'259	5'176.2	2.3
0038643281287	366	4'794.5	13.1
0038643281463	695	4'732.7	6.8
0038643281289	330	4'618.0	14.0
0038643281098	779	4'604.2	5.9
0023224000938	2'966	4'385.7	1.5
0038643281234	358	4'065.5	11.4
0038643281498	370	3'840.6	10.4
0038643281288	271	3'811.6	14.1
0038643281230	291	3'742.0	12.9
0038643281233	321	3'741.3	11.7
0038643281499	319	3'711.6	11.6
0038643281238	462	3'619.9	7.8
0038643281231	316	3'566.2	11.3
002522200378	1'387	3'554.2	2.6
0023224006772	919	3'546.2	3.9
0038643281232	292	3'472.5	11.9
0038643281497	243	3'391.4	14.0
0038643281465	505	2'886.7	5.7
0038643281496	272	2'866.9	10.5
0038643281466	374	2'706.8	7.2
002522168653	592	2'477.0	4.2
0038643281241	166	2'323.9	14.0
0038643281476	155	2'292.1	14.8
002522168765	195	1'884.5	9.7
002522168898	403	1'867.7	4.6
002522168652	878	1'840.3	2.1
0022479910583	134	1'714.2	12.8
0022479910596	493	1'678.0	3.4
0038643281080	218	1'654.7	7.6
0022479910594	326	1'484.4	4.6
0022479910584	90	1'214.6	13.5
0022479910595	413	1'198.1	2.9
0022479910597	396	1'171.2	3.0
0022479910598	376	1'125.4	3.0
0023224001570	92	1'024.9	11.1
0038643281190	240	949.9	4.0
0023222291848	29	931.9	32.1
0022479910585	117	835.0	7.1
0038643281464	110	791.9	7.2
0022479910589	191	776.4	4.1

6.4. CDR file

Description:	CDR file created from syslog
File:	data1\101013+6-14'cdr.zip
Size:	1.66MB

7. Traffic distribution chart

The following chart shows how the distribution of the traffic by country evolved over time. The data is presented in hourly intervals. The values represent the number of concurrent calls during a given hour. At first the fraudulent traffic used the connections of Verizon. When Verizon detected the fraud and suspended the calls, the flow was interrupted for a couple of hours and then restarted, this time using the routes of Colt.
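One way to obtain such hourly values from call records is to add up, for each hour, the seconds of every call that fall within that hour and to divide by 3600. The following Python sketch illustrates this; it is an assumption about how the averaging could be done, not a description of the spreadsheet behind the chart, and the function name `hourly_concurrency` and the sample calls are hypothetical.

```python
from collections import defaultdict


def hourly_concurrency(calls):
    """calls: iterable of (start_ts, stop_ts) in seconds.

    Returns {hour_start_ts: average number of concurrent calls in that hour}.
    """
    seconds = defaultdict(int)
    for start, stop in calls:
        hour = start - start % 3600  # beginning of the hour containing the start
        while hour < stop:
            # Part of the call that falls inside [hour, hour + 3600).
            overlap = min(stop, hour + 3600) - max(start, hour)
            seconds[hour] += overlap
            hour += 3600
    return {hour: total / 3600 for hour, total in sorted(seconds.items())}


calls = [(0, 1800), (1800, 5400), (3600, 7200)]  # hypothetical calls
print(hourly_concurrency(calls))  # {0: 1.0, 3600: 1.5}
```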

The following two records show the first and last calls routed via Verizon:

Time: 2010-10-13 13:13:22

From: 101

To: Israel

Phone: 00972599870738

Duration: 00:00:05

Via: verizonbusiness.com

Time: 2010-10-16 18:09:12

From: dehka

To: Sierra Leone

Phone: 0023224000938

Duration: 00:00:46

Via: verizonbusiness.com

The fraudulent traffic was interrupted when Verizon detected the fraud and decided to block the calls. After a couple of hours the fraudulent traffic resumed, this time via Colt. The following two records show the first and last calls routed via Colt. Colt detected the fraud on Sunday and blocked the calls.

Time: 2010-10-16 20:55:18

From: 250

To: Israel

Phone: 00972599916699

Duration: 00:00:07

Via: colt.net

Time: 2010-10-17 23:07:44

From: 000000000000

To: Somalia Republic

Phone: 002522168598

Duration: 00:00:19

Via: colt.net

Description:	Distribution chart by hours
File:	data1\101013+6-15'chart.zip
Size:	2.18MB

8. Calls to Slovenia-Mobile-Kosovo Ipkonet

When Verizon’s fraud department detected the pattern, the records of suspected calls to Slovenia were sent to us.

8.1. Comparison of syslog and vendor records

The CDR that we generated from the syslog files was compared with the Verizon CDR containing the calls to Slovenia mobiles. The calls in the two CDRs matched accurately most of the time. The records in the two files were often identical except for a time shift of 32 to 34 seconds, due to an incorrect clock on one of the two sides.

Description:	Vendor and syslog CDR comparison
File:	data1\101013+6-16'slovenia.zip
Size:	12.8MB
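A comparison of this kind can be sketched in Python by pairing records with the same destination whose start times differ by the observed shift. The record layout (start timestamp, dialed number, duration), the function names and the sample values are all hypothetical; the direction of the shift is not stated in this document, so the absolute difference is used.

```python
def normalize(number):
    """Strip '+' or the '00' prefix so that both CDR formats compare equal."""
    number = number.lstrip("+")
    return number[2:] if number.startswith("00") else number


def match_records(ours, vendor, min_shift=32, max_shift=34):
    """Pair records with equal destination and a start-time shift in range.

    Returns (pairs, unmatched_vendor_records).
    """
    unmatched = list(vendor)
    pairs = []
    for start, number, duration in ours:
        for cand in unmatched:
            v_start, v_number, _ = cand
            same_number = normalize(number) == normalize(v_number)
            if same_number and min_shift <= abs(v_start - start) <= max_shift:
                pairs.append(((start, number, duration), cand))
                unmatched.remove(cand)
                break  # leave the loop right after modifying the list
    return pairs, unmatched


ours = [(1000, "0038643281463", 16), (2000, "0038643281227", 191)]    # hypothetical
vendor = [(1033, "38643281463", 16), (2500, "38643281227", 191)]      # hypothetical
pairs, unmatched = match_records(ours, vendor)
print(len(pairs), len(unmatched))
```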

The following records represent the first and last calls appearing in the fraud report of Verizon for calls to Slovenia mobiles:

Time: 2010-10-15 01:48:46

To: 38643281227

Duration: 191 seconds

Time: 2010-10-16 18:08:58

To: 38643281463

Duration: 16 seconds

8.2. The number of simultaneous calls to Slovenia mobiles

The entire traffic of 7’554’889 seconds, or 125’914.8 minutes, representing a charge of CHF 38'035.47 (without VAT), was sent to only 32 phone numbers. Except for businesses handling simultaneous hotline calls, multiple simultaneous answers from the same phone number suggest fraud. The following table shows the number of parallel calls to each individual mobile phone number. The first row of the table contains the 32 mobile phone numbers in question. The rows that follow represent one-hour intervals. The value under each phone number is the average number of concurrent calls to that phone during the corresponding one-hour interval.

The table shows, for example, that during the entire hour from 2010-10-16 04:00 to 04:59 there were on average as many as 34 simultaneous calls to the single phone number +38643281239, generating a total duration of 2’057.65 minutes during this single hour and corresponding to a cost of CHF 621.56 (for one hour and one phone number). The number of simultaneous calls to a single phone number reached 91, and the total number of simultaneous calls to Slovenia mobiles reached 180 (the capacity of 6 full E1 lines).
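The figures quoted above are consistent with each other, as the following Python sketch shows. It assumes a uniform per-minute rate, derived here from the two totals rather than stated in this document, and the usual 30 voice channels per E1 line.

```python
TOTAL_SECONDS = 7_554_889
TOTAL_CHARGE_CHF = 38_035.47
E1_VOICE_CHANNELS = 30  # a standard E1 line carries 30 voice channels

total_minutes = TOTAL_SECONDS / 60
# Assumption: one uniform rate applies to all calls to this destination.
rate_per_minute = TOTAL_CHARGE_CHF / total_minutes

hour_minutes = 2_057.65                    # minutes to one number within one hour
avg_simultaneous = hour_minutes / 60       # average concurrent calls in that hour
hour_cost = hour_minutes * rate_per_minute
e1_lines = 180 / E1_VOICE_CHANNELS

print(round(total_minutes, 1), round(avg_simultaneous, 1),
      round(hour_cost, 2), e1_lines)
```

The derived rate is roughly CHF 0.30 per minute, and 2'057.65 minutes within one hour correspond to about 34 calls in progress at any moment.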

For real mobile phone subscribers, we see neither a technical possibility nor an economic benefit in sending 126'000 minutes to 32 mobile phones in about one day. It is possible that a vendor of Verizon, or a vendor of its vendor, provided false answer supervision for all calls to Slovenia mobiles. Such an intermediary fake vendor would benefit from the traffic and could therefore be at the origin of the fraudulent calls. The final owner of the number range in the destination country (such as a small MVNO, OLO, or PNS) can also benefit from the incoming traffic and is therefore also a hypothetical suspect for the origin of the fraudulent traffic.

The following chart is the graphical version of the previous table. The horizontal position of each bar represents the hour. The total height of the stacked bar at a given hour is the number of simultaneous calls to Slovenia mobiles. Each color represents one of the 32 individual mobile phone numbers, and the height of a segment of a given color is the number of simultaneous calls to the corresponding number. For example, the chart shows that for one hour starting at 6 o’clock in the morning of October 16th there were 91 simultaneous calls to the single mobile phone number +38643281239.

Description:	Simultaneous calls per phone
File:	data1\101013+6-17'phones.zip
Size:	1.08MB

9. References:

Fraud reports [142] [143] [144] [145] [146] [147]:

http://switzernet.com/3/public/101028-fraud-slovenia/ (this page)

http://switzernet.com/public/060801-web/news_detail.php?id=167

http://switzernet.com/public/060801-web/news_detail.php?id=166

http://switzernet.com/3/folders/101018-fraud-slovenia/ (login: fraud)

http://mirror2.switzernet.com/3/folders/101018-fraud-slovenia/ (login: fraud)

http://www.fedpol.admin.ch/content/fedpol/fr/misc/conform.html

ACD quality routing [148] [149] [150] [151] [152] [153] [154] [155] [156]:

http://switzernet.com/public/091020-acd-routing/

http://www.unappel.ch/2/public/091020-acd-routing/

http://unappel.ch/public/091020-acd-routing/

http://intarnet.com/2/public/091020-acd-routing/

http://parinternet.ch/2/public/091020-acd-routing/

http://switzernet.com/public/091029-ACDstat/

http://unappel.ch/public/091029-ACDstat/

http://switzernet.com/public/091217-doc-acd-routing/

http://en.wikipedia.org/wiki/Least-cost_routing

Emergency numbers [157]:

http://unappel.ch/folders/101004-emergency-calls-planning/ (login: ofcom)

Kamailio/OpenSER SIP server/router [158] [159] [160]:

http://www.kamailio.org/

http://sip-router.org/

http://www.iptel.org/ser/

Perl regular expressions [161] [162]:

http://switzernet.com/3/public/101024-regex/

http://perldoc.perl.org/perlre.html

References on syslog file format [163] [164] [165]:

http://www.facetcorp.com/tnotes/facetwin/tn_syslog.html

http://www.syslog.org/

http://lists.rtpproxy.org/pipermail/users/2009-May.txt

References on SIP transactions versus dialogs [166] [167] [168] [169]:

http://www.iptel.org/sip_transaction

http://www.iptel.org/node/20

http://www.ietf.org/rfc/rfc2543.txt

http://www.ietf.org/rfc/rfc3261.txt

10. Glossary

CDR stands for Call Data Records

ACD stands for Average Call Duration

UTC stands for Coordinated Universal Time

CET stands for Central European Time

CEST stands for Central European Summer Time

MVNO stands for Mobile Virtual Network Operator

OLO stands for Other Licensed Operator

PNS stands for Personal Numbering Service

11. Syslog and CDR files

This section groups all files used throughout this research. The list contains files with raw syslog records as well as files showing different statistics. The reference file that contains the call records and is not heavy to open is the output CDR file [101013+6-14'cdr.xls].

Description:	All transactions of answered calls
File:	data1\101013+6-12'answered.zip
Size:	3.69 MB

Description:	Call Data Records in Text format
File:	data1\101013+6-13'calls.txt.zip
Size:	1.29MB

Description:	Calls sharing the same call-id
File:	data1\101013+6-13'calls.xls.zip
Size:	7.21MB

Call Data Records created from the syslog file:

Description:	CDR file created from syslog
File:	data1\101013+6-14'cdr.zip
Size:	1.66MB

Description:	Distribution chart by hours
File:	data1\101013+6-15'chart.zip
Size:	2.18MB

Description:	Vendor and syslog CDR comparison
File:	data1\101013+6-16'slovenia.zip
Size:	12.8MB

Description:	Simultaneous calls per phone
File:	data1\101013+6-17'phones.zip
Size:	1.08MB

12. Formatting particularities of this document

This section is addressed only to persons editing this or similar documents. This section is unrelated to the subject of the document.

12.1. Styles

The following image shows the styles used in this document. Do not add new styles when editing and updating this document.

12.2. File reference style

The [file reference] table style is buggy. When you open the document the font settings are mixed up. In the modify style pane of the [file reference] style you have to re-apply the [Lucida Console] font to the right column of this table style. This will restore all other settings of the style. The procedure must be carried out before printing or saving the document in HTML format.

12.3. Numbering of references

Microsoft field codes are used for auto incremental reference numbers appearing in the document. In order to toggle field codes you have to first remove the hyperlink (Ctrl-K).

To add a new reference you need to copy any of other references and change only the hyperlink. You do not need to care about the numbering. The numbering of all references can be updated in a single step. Select the entire document and in the right-click pop up menu choose [Update Field].

12.4. Deleting the reference number bookmark before printing

Before printing the document, update all fields (as explained in section 12.3) and delete the “iref” bookmark (Alt-I-K). Otherwise, all references will appear under the number of the last reference.

12.5. Conventions on the new versions of the document

The main document file is a numbered index<N>.doc file, where <N> is an incrementing version number of the document. The document must be saved as an index<N>.htm file (accompanied by an automatically generated folder index<N>_files). Every time a new version is released, the index.htm file must be deleted, and the latest index<N>.htm file must be copied and renamed to a new index.htm file. At any moment the index.htm file is a copy of the latest index<N>.htm file. The index.htm file can be erased at any time when a new version is released. You must not have an index.doc file. The folder index_files (corresponding to the index.htm file) must be deleted, as the index.htm file will in any case refer to the files located in the folder index<N>_files. At every update you must add your name to the header of the document and, under the date of the update, a link to the current version of the index<N>.htm file (and not to index.htm) for backtracking.

Data files accompanying your document (not the files generated automatically when saving in HTML format) must be located in data<M> folder, where <M> is an incrementing number and is not necessarily equal to <N>. Do not hesitate to create each time your own data<M> folder, instead of adding pieces in already existing data<M> folder of the previous author.

* * *