Reconstructing the CDR file from syslogs of the Kamailio SIP router – in relation to the fraud involving Slovenia mobiles
Emin Gabrielyan
Switzernet.com
2. Extracting the sufficient subset of records for building the CDR of answered calls
2.2. Extraction of sufficient subset
3. Reading the fields of INVITE and BYE transactions
3.1. The greediness options of regex quantifiers
5. Processing the multiple call-id cases
6.1. Statistics per country being called
6.2. Statistics over the from user field being used
6.3. Statistics over the phone number being called
8. Calls to Slovenia-Mobile-Kosovo Ipkonet
8.1. Comparison of syslog and vendor records
8.2. The number of simultaneous calls to Slovenia mobiles
12. Formatting particularities of this document
12.4. Deleting the reference number bookmark before printing
12.5. Conventions on the new versions of the document
Our experimental SIP server, installed for testing and development of ACD quality routing [101] [102] [103] [104] [105] [106] [107] [108] [109] and of the system designed for routing emergency calls [110], was hacked in October 2010. The first call logged via the hacked server is dated October 13th. A significant volume was registered during the weekend from 2010-10-15 through 2010-10-17 [111] [112] [113] [114] [115]. The fraudulent traffic was terminated via the hacked server to several destinations. The server was not integrated into the main billing system, and the calls were not accounted for in the central CDR database. Only the syslog files [116] [117] [118] of the hacked UNIX server were used to recover the traffic details.
The Kamailio server [119] [120] [121] logged the SIP transactions via the UNIX syslog service. This document presents the construction of a CDR file from the syslog file (see sections 2 to 5), provides the output CDR file of fraudulent calls (section 6.4), different statistics on the number of simultaneous calls and destinations dialed (sections 6.1, 6.2, 6.3, 7, and 8.2), and several hypotheses on the possible motives of the fraud (section 8.2).
This document has three target audiences. The first is training: processing of text log files and testing of skills such as the ability to form dialog (call) records from transaction logs (SIP methods and responses) [122] [123] [124] [125]. The logs not belonging to traffic generated by legitimate users will be used publicly for pre-recruitment tests and for internal staff training. Second, the authorities processing the complaints related to the fraudulent calls to Slovenia mobiles can find additional statistics on the traffic. Finally, this document publicly provides the data to other operators for comparison of patterns and for the prevention of fraud in their own networks.
Our objective is to select the transactions that determine call durations and to assemble them into phone calls.
The syslog file [126] [127] [128] contains records of completed (answered) SIP transactions. A SIP transaction is often an exchange of two SIP packets, a method followed by a response [129] [130] [131] [132]. The transaction is sent to syslog upon reception of the response concluding the transaction and is recorded as a single syslog line: the Kamailio server sends to syslog a single string of semicolon-separated fields. The following is an example of such a record, where for visual clarity each field is shown on a separate line. The newlines shown here are not present in the syslog file.
101014-syslog.txt:Oct 13 13:13:08 ks301129 ser[9108]: NOTICE: acc [acc.c:275]: ACC: transaction answered:
timestamp=1286968388;
method=INVITE;
from_tag=7e6d1b44;
to_tag=65212929765820101013131233;
call_id=ae5d7f4125027e66;
code=183;
reason=Session Progress;
src_user=101;
src_domain=91.121.73.130;
dst_ouser=00972599870738;
dst_user=+972599870738;
dst_domain=212.249.15.9
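As an illustration of the record layout described above, the semicolon-separated payload can be split into a dictionary of field names and values. This is only a minimal Python sketch, not part of the original tool chain; the function name `parse_acc_record` is hypothetical, and it assumes the payload always follows the text "transaction answered:" as in the example record.

```python
def parse_acc_record(line):
    """Split the 'name=value;name=value' payload of one accounting line into a dict."""
    # Assumption: the payload follows the marker used in the example record.
    marker = "transaction answered:"
    payload = line.split(marker, 1)[1].strip()
    fields = {}
    for item in payload.split(";"):
        if "=" in item:
            # Split on the first '=' only, so values may contain '='.
            name, value = item.split("=", 1)
            fields[name.strip()] = value.strip()
    return fields


# The example record of this section, written as the single line found in syslog.
sample = (
    "101014-syslog.txt:Oct 13 13:13:08 ks301129 ser[9108]: NOTICE: acc [acc.c:275]: "
    "ACC: transaction answered: timestamp=1286968388;method=INVITE;from_tag=7e6d1b44;"
    "to_tag=65212929765820101013131233;call_id=ae5d7f4125027e66;code=183;"
    "reason=Session Progress;src_user=101;src_domain=91.121.73.130;"
    "dst_ouser=00972599870738;dst_user=+972599870738;dst_domain=212.249.15.9"
)
record = parse_acc_record(sample)
print(record["method"], record["code"], record["call_id"])
```

All values are kept as strings; converting the timestamp to an integer is left to the caller.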
All lines of the syslog file containing Kamailio transaction records in which the “call_id” field is present are extracted into a separate log file, which serves as the main input for building the CDR file.
Description: | All transactions with call_id field |
File: | |
Size: | 6.44MB |
We limit our study to answered calls only; the output CDR contains only answered calls. You can skip the processing of the file and go directly to section 6.4, which contains a link to the output CDR file. Sections 2.2, 3, 4, and 5 are provided for training purposes only and are not needed for the administrative handling of the fraud complaints.
The transaction record appears in the syslog file [133] [134] [135] when the reply to a method is received [136] [137] [138] [139]. At that point, both the key data of the method’s request and of the reply are logged into a single syslog line. To obtain phone call data, we need the records of all INVITE methods having 200 OK replies and of all BYE methods.
The syslog file contains slightly more BYE transactions than INVITE transactions with 200 success replies. The excess of BYE transactions can be due to losses and retransmissions. Additionally, as shown below, the transaction records are duplicated. The reason for the duplicates is unknown (possibly the multi-process nature of the Kamailio server).
$ grep ";method=INVITE;.*;code=200;" 101013+6-callid.txt | wc -l
137536
$ grep ";method=BYE;" 101013+6-callid.txt | wc -l
144532
$ grep ";method=INVITE;.*;code=200;" 101013+6-callid.txt | sort | uniq | wc -l
68766
$ grep ";method=BYE;" 101013+6-callid.txt | sort | uniq | wc -l
72249
$
We eliminate the duplicates and save the successful INVITE methods as well as all the BYE methods into a new text file.
$ egrep "(;method=INVITE;.*;code=200;|;method=BYE;)" 101013+6-callid.txt | sort | uniq | wc -l
141015
$ expr 68766 + 72249
141015
$ egrep "(;method=INVITE;.*;code=200;|;method=BYE;)" 101013+6-callid.txt | sort | uniq > 101013+6-answered.txt
$ u2d 101013+6-answered.txt
101013+6-answered.txt:
$ wc -l 101013+6-answered.txt
141015 101013+6-answered.txt
$
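The selection and de-duplication step above (egrep, sort, uniq) can also be sketched in Python. The pattern below is the same as the one passed to egrep; the function name `answered_subset` and the four demo lines are hypothetical and serve only to show the behavior on a 183 reply, a duplicated 200 OK INVITE, and a BYE.

```python
import re

# Same selection as the egrep pattern: INVITE with a 200 reply, or any BYE.
ANSWERED = re.compile(r";method=INVITE;.*;code=200;|;method=BYE;")


def answered_subset(lines):
    """Keep 200-OK INVITE and BYE records, drop duplicates, return them sorted."""
    return sorted({line.rstrip("\r\n") for line in lines if ANSWERED.search(line)})


# Made-up, shortened records for illustration only.
demo = [
    "x: timestamp=1;method=INVITE;from_tag=a;call_id=c1;code=183;reason=Session Progress;src_user=101",
    "x: timestamp=2;method=INVITE;from_tag=a;call_id=c1;code=200;reason=OK;src_user=101",
    "x: timestamp=2;method=INVITE;from_tag=a;call_id=c1;code=200;reason=OK;src_user=101",
    "x: timestamp=7;method=BYE;from_tag=a;call_id=c1;code=200;reason=OK;src_user=101",
]
subset = answered_subset(demo)
print(len(subset))
```

Using a set removes the duplicated transaction records in one pass, which plays the role of `sort | uniq` in the shell session.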
Description: | All transactions of answered calls |
File: | |
Size: | 3.69 MB |
The Kamailio service sends to syslog a string of semicolon-separated fields containing both the field name and the value. This section presents the script used for extracting the fields we are interested in. In our regex [140] [141] the quantifiers are followed by the less commonly used greediness options. If you wish to understand the regex used in our script, read the next subsection 3.1, otherwise skip it and continue with subsection 3.2.
The behavior of the commonly used quantifiers “*”, “?”, “+”, “{n,m}” can be tuned by greediness options. If the quantifier (such as “*”) is followed by “?”, the quantified subpattern matches the minimum number of times. If the quantifier is followed by “+”, it matches the maximum number of times and gives nothing back.
By default a quantified subpattern “+” or “*” is greedy. In the following example, the character “a” is matched the maximum possible number of times by the expression “a+” or “a*” while still allowing the rest of the pattern (the last “a”) to match:
$ echo aaaa | perl -ne '/a+a/; print $&'
aaaa
$ echo aaaa | perl -ne '/a*a/; print $&'
aaaa
$
If you want it to match the minimum number of times possible, follow the quantifier with a “?”. In the following example “a+” is matched only once and “a*” zero times (the respective minimums).
$ echo aaaa | perl -ne '/a+?a/; print $&'
aa
$ echo aaaa | perl -ne '/a*?a/; print $&'
a
$
Perl also provides the “possessive” quantifier form: follow the quantifier with “+”. The possessive option matches as much as possible and never gives characters back, whether or not the rest of the pattern can then match.
The example below is without the possessive option “+” (after “a+”), and we see that both halves of the regular expression match: “a+” and the following “a”.
$ echo aaaa | perl -ne '/(a+)a/; print $1'
aaa
$
However, when the possessive option is added, the first half of the regular expression eats up all “a”s, without leaving any character for the rest of the regex.
$ echo aaaa | perl -ne '/(a++)a/; print $1'
$
The regex with the possessive quantifier matches only without the second half.
$ echo aaaa | perl -ne '/(a++)/; print $1'
aaaa
$
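The same three behaviors can be reproduced outside Perl. The following Python sketch is an illustration only: the greedy and non-greedy forms use the same syntax as Perl, while the possessive form is emulated with a lookahead and a backreference, because native `a++` syntax is available in Python's `re` module only from version 3.11. The variable names are hypothetical.

```python
import re

text = "aaaa"

# Greedy: a+ takes as much as possible but gives one "a" back for the rest.
greedy = re.search(r"a+a", text).group(0)

# Non-greedy: a+? matches once, a*? matches zero times.
lazy_plus = re.search(r"a+?a", text).group(0)
lazy_star = re.search(r"a*?a", text).group(0)

# Possessive emulation: the lookahead captures the longest run of "a",
# the backreference consumes it, and the engine cannot backtrack into
# the lookahead, so nothing is left for the trailing "a".
possessive = re.search(r"(?=(a+))\1a", text)
possessive_alone = re.search(r"(?=(a+))\1", text).group(1)

print(greedy, lazy_plus, lazy_star, possessive, possessive_alone)
```

As in the Perl examples, the possessive pattern followed by “a” does not match at all, while the possessive pattern alone captures the whole string.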
The following command-line script retrieves the unix timestamp (seconds counted since 1970-01-01), the SIP method (INVITE or BYE), the unique call-id, the source user (SIP From field), the destination user (SIP To field before translation), and finally the next-hop SIP server (i.e. our vendor). The non-greedy quantifier stops the match at the first occurrence of the semicolon separator, and the possessive quantifier ensures the full capture of the IP address (normally redundant, as the quantifier is greedy by default).
$ head 101013+6-answered.txt | perl -ne '/ timestamp=(\d+);method=(\w+);.*;call_id=(.*?);.*;src_user=(.*?);.*;dst_ouser=(.*?);.*;dst_domain=([^\r\n]*+)/; printf "%10s %10s %20s %20s %20s %20s\n",$1,$2,$3,$4,$5,$6'
1286968402 INVITE ae5d7f4125027e66 101 00972599870738 212.249.15.9
1286968407 BYE ae5d7f4125027e66 +972599870738 101 188.161.231.133
1286968438 INVITE 9a428e1758434f2e 101 00972599870738 212.249.15.9
1286968510 INVITE f0073042b0657a13 101 00972597516161 212.249.15.9
1286968532 INVITE ab071445d631b666 101 00972597516161 212.249.15.9
1286968590 INVITE b8009e4b3178c70a 101 00972599870738 212.249.15.9
1286968602 BYE b8009e4b3178c70a 101 +972599870738 213.71.2.208
1286968606 BYE ab071445d631b666 101 +972597516161 213.71.2.208
1286968607 BYE f0073042b0657a13 101 +972597516161 213.71.2.208
1286968611 BYE 9a428e1758434f2e 101 +972599870738 213.71.2.208
As expected, the number of extracted lines matches the total number of records selected in section 2.2.
$ cat 101013+6-answered.txt | perl -ne '/ timestamp=(\d+);method=(\w+);.*;call_id=(.*?);.*;src_user=(.*?);.*;dst_ouser=(.*?);.*;dst_domain=([^\r\n]*+)/; printf "%10s %10s %20s %20s %20s %20s\n",$1,$2,$3,$4,$5,$6' | wc -l
141015
$ wc -l 101013+6-answered.txt
141015 101013+6-answered.txt
$
The fields are now displayed in comma-separated format with the call-id field first (to be used as the matching key between INVITE and BYE records). The symbol “A” in the second field marks the beginning of the call charge and “B” the end of the call.
$ cat 101013+6-answered.txt | perl -ne '/ timestamp=(\d+);method=(\w+);.*;call_id=(.*?);.*;src_user=(.*?);.*;dst_ouser=(.*?);.*;dst_domain=([^\r\n]*+)/; printf "%s,%s,%s,%s,%s,%s\n",$3,$2 eq "BYE"?"B":"A",$1,$4,$5,$6' | head
ae5d7f4125027e66,A,1286968402,101,00972599870738,212.249.15.9
ae5d7f4125027e66,B,1286968407,+972599870738,101,188.161.231.133
9a428e1758434f2e,A,1286968438,101,00972599870738,212.249.15.9
f0073042b0657a13,A,1286968510,101,00972597516161,212.249.15.9
ab071445d631b666,A,1286968532,101,00972597516161,212.249.15.9
b8009e4b3178c70a,A,1286968590,101,00972599870738,212.249.15.9
b8009e4b3178c70a,B,1286968602,101,+972599870738,213.71.2.208
ab071445d631b666,B,1286968606,101,+972597516161,213.71.2.208
f0073042b0657a13,B,1286968607,101,+972597516161,213.71.2.208
9a428e1758434f2e,B,1286968611,101,+972599870738,213.71.2.208
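The Perl one-liner above can be mirrored in Python as a sketch. The regular expression is the one used in the script, except that the possessive `*+` is written as a plain greedy `*` (the text notes it is redundant, and this keeps the sketch valid on Python versions before 3.11). The function name `to_key_line` is hypothetical.

```python
import re

# Same field selection as the Perl script of this section.
FIELDS = re.compile(
    r" timestamp=(\d+);method=(\w+);.*;call_id=(.*?);.*;src_user=(.*?);"
    r".*;dst_ouser=(.*?);.*;dst_domain=([^\r\n]*)"
)


def to_key_line(line):
    """Return 'call_id,A|B,timestamp,src_user,dst_ouser,dst_domain' or None."""
    match = FIELDS.search(line)
    if match is None:
        return None
    timestamp, method, call_id, src_user, dst_ouser, dst_domain = match.groups()
    mark = "B" if method == "BYE" else "A"  # A = start of charge, B = end of call
    return ",".join([call_id, mark, timestamp, src_user, dst_ouser, dst_domain])


# First record of the file of answered transactions.
first = (
    "101014-syslog.txt:Oct 13 13:13:22 ks301129 ser[9109]: NOTICE: acc [acc.c:275]: "
    "ACC: transaction answered: timestamp=1286968402;method=INVITE;from_tag=7e6d1b44;"
    "to_tag=65212929765820101013131233;call_id=ae5d7f4125027e66;code=200;reason=OK;"
    "src_user=101;src_domain=91.121.73.130;dst_ouser=00972599870738;"
    "dst_user=+972599870738;dst_domain=212.249.15.9"
)
print(to_key_line(first))
```

Lines that do not match the pattern yield `None` here, whereas the Perl one-liner prints a line for every input line; the caller decides how to treat such lines.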
Further processing of the INVITE records of the output shows that calls were routed via only two outgoing vendors. The sum of the two counts matches the INVITE total of section 2.2.
$ cat 101013+6-answered.txt | perl -ne '/ timestamp=(\d+);method=(\w+);.*;call_id=(.*?);.*;src_user=(.*?);.*;dst_ouser=(.*?);.*;dst_domain=([^\r\n]*+)/; printf "%s,%s,%s,%s,%s,%s\n",$3,$2 eq "BYE"?"B":"A",$1,$4,$5,$6' | awk -F, '$2=="A" {print $6}' | sort | uniq -c
51073 212.249.15.9
17693 217.168.45.4
$ expr 51073 + 17693
68766
$
In the following script we are merging together the records having the same call-id. This is achieved by sorting the output lines where the first field is the call-id, and the second field is “A” if the line represents the beginning of the conversation and “B” if it is the end of the call. As a result of merging we will have a set of fields representing the beginning of the call followed by a set of fields representing the end of the call. Normally we will have only pairs of lines with one “A” and one “B” record sharing the same call id. However, more than two A/B records can be merged together in exceptional cases, if multiple INVITEs and BYEs are registered under the same call-id.
$ cat 101013+6-answered.txt | perl -ne '/ timestamp=(\d+);method=(\w+);.*;call_id=(.*?);.*;src_user=(.*?);.*;dst_ouser=(.*?);.*;dst_domain=([^\r\n]*+)/; printf "%s,%s,%s,%s,%s,%s\n",$3,$2 eq "BYE"?"B":"A",$1,$4,$5,$6' | sort | awk -F, '$1!=id{printf "\n"} {printf "%s,",$0} {id=$1}' | head
0000221ba3287435,A,1287334561,684168,002522168768,217.168.45.4,0000221ba3287435,B,1287334580,002522168768,684168,41.206.153.251,
0000ae1a6842d52c,A,1287327409,000000000000,002522200370,217.168.45.4,0000ae1a6842d52c,B,1287327417,000000000000,002522200370,217.168.45.4,
0000f6798a5f4f6f,A,1287217382,101,0023224006762,212.249.15.9,0000f6798a5f4f6f,B,1287218431,101,+23224006762,213.71.2.208,
00024779b134e721,A,1287187156,101,0038643281242,212.249.15.9,00024779b134e721,B,1287187704,101,+38643281242,213.71.2.208,
0005c56a98078d50,A,1287167799,0000,0038643281289,212.249.15.9,0005c56a98078d50,B,1287167828,+38643281289,0000,41.206.158.7,
0005f5698e073461,A,1287313970,000000000000,002522200185,217.168.45.4,0005f5698e073461,B,1287313976,000000000000,002522200185,217.168.45.4,
000678011129b805,A,1287340940,000000000000,002522168855,217.168.45.4,000678011129b805,B,1287340947,000000000000,002522168855,217.168.45.4,
0006dd42d51dad7b,A,1287290372,000000000000,0022479910595,217.168.45.4,0006dd42d51dad7b,B,1287290382,0022479910595,000000000000,109.253.170.238,
000c222ec747ab3f,A,1287317096,888,0023222291847,217.168.45.4,000c222ec747ab3f,B,1287317219,888,0023222291847,217.168.45.4,
$
If we now sort the merged lines by time (instead of by call-id), the chronologically first call record in the resulting list corresponds to the first INVITE in the file of transactions being processed.
We can now say that the first fraudulent call was placed at unix timestamp 1286968402, corresponding to October 13th, 13:13:22 (CEST), to +972599870738. This call was issued from the IP address 188.161.231.133 and lasted 1286968407 - 1286968402 = 5 seconds.
$ head -1 101013+6-answered.txt
101014-syslog.txt:Oct 13 13:13:22 ks301129 ser[9109]: NOTICE: acc [acc.c:275]: ACC: transaction answered: timestamp=1286968402;method=INVITE;from_tag=7e6d1b44;to_tag=65212929765820101013131233;call_id=ae5d7f4125027e66;code=200;reason=OK;src_user=101;src_domain=91.121.73.130;dst_ouser=00972599870738;dst_user=+972599870738;dst_domain=212.249.15.9
$ cat 101013+6-answered.txt | perl -ne '/ timestamp=(\d+);method=(\w+);.*;call_id=(.*?);.*;src_user=(.*?);.*;dst_ouser=(.*?);.*;dst_domain=([^\r\n]*+)/; printf "%s,%s,%s,%s,%s,%s\n",$3,$2 eq "BYE"?"B":"A",$1,$4,$5,$6' | sort | awk -F, '$1!=id{printf "\n"} {printf "%s,",$0} {id=$1}' | sort -n -t, -k 3 | head
ae5d7f4125027e66,A,1286968402,101,00972599870738,212.249.15.9,ae5d7f4125027e66,B,1286968407,+972599870738,101,188.161.231.133,
9a428e1758434f2e,A,1286968438,101,00972599870738,212.249.15.9,9a428e1758434f2e,B,1286968611,101,+972599870738,213.71.2.208,
f0073042b0657a13,A,1286968510,101,00972597516161,212.249.15.9,f0073042b0657a13,B,1286968607,101,+972597516161,213.71.2.208,
ab071445d631b666,A,1286968532,101,00972597516161,212.249.15.9,ab071445d631b666,B,1286968606,101,+972597516161,213.71.2.208,
b8009e4b3178c70a,A,1286968590,101,00972599870738,212.249.15.9,b8009e4b3178c70a,B,1286968602,101,+972599870738,213.71.2.208,
bd20b03dd73c1c1b,A,1286998646,133,38643322585,212.249.15.9,bd20b03dd73c1c1b,B,1286998649,133,+38643322585,213.71.2.208,
d2228418041d064f,A,1286998656,133,38643322585,212.249.15.9,d2228418041d064f,B,1286998698,133,+38643322585,213.71.2.208,
174abf3e07248517,A,1287092708,0000000000,0038643327405,212.249.15.9,174abf3e07248517,B,1287092787,0000000000,+38643327405,213.71.2.208,
44577f70d5141a07,A,1287092720,0000000000,0038643327405,212.249.15.9,44577f70d5141a07,B,1287092786,0000000000,+38643327405,213.71.2.208,
$
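The merge performed by `sort | awk` followed by the chronological sort can be sketched in Python as follows. This is an illustration under the assumption that the input lines already have the comma-separated form shown in section 3; the function name `merge_by_call_id` is hypothetical. The four input lines are taken from the output listed earlier.

```python
from itertools import groupby


def merge_by_call_id(key_lines):
    """One merged row per call-id (A records before B), ordered by first timestamp."""
    rows = []
    # Sorting places equal call-ids next to each other, with "A" before "B".
    for _call_id, group in groupby(sorted(key_lines), key=lambda s: s.split(",")[0]):
        rows.append(",".join(group) + ",")
    # Chronological order on the third comma-separated field.
    rows.sort(key=lambda row: int(row.split(",")[2]))
    return rows


key_lines = [
    "ae5d7f4125027e66,A,1286968402,101,00972599870738,212.249.15.9",
    "ae5d7f4125027e66,B,1286968407,+972599870738,101,188.161.231.133",
    "9a428e1758434f2e,A,1286968438,101,00972599870738,212.249.15.9",
    "9a428e1758434f2e,B,1286968611,101,+972599870738,213.71.2.208",
]
rows = merge_by_call_id(key_lines)
fields = rows[0].split(",")
first_call_duration = int(fields[8]) - int(fields[2])  # stop minus start, in seconds
print(rows[0])
print(first_call_duration)
```

For a row with exactly one “A” and one “B” record, the start timestamp is the third field and the stop timestamp the ninth, which gives the 5 seconds of the first call.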
Before generating the call records, let us count the distinct call-ids in the file of all transactions and in the file of transactions corresponding to answered calls only. As expected, fewer call-ids appear in the second file.
$ cat 101013+6-callid.txt | perl -ne '{/;call_id=.*?;/; print $&."\n"}' | head
;call_id=ae5d7f4125027e66;
;call_id=ae5d7f4125027e66;
;call_id=ae5d7f4125027e66;
;call_id=ae5d7f4125027e66;
;call_id=ae5d7f4125027e66;
;call_id=ae5d7f4125027e66;
;call_id=ae5d7f4125027e66;
;call_id=ae5d7f4125027e66;
;call_id=9a428e1758434f2e;
;call_id=9a428e1758434f2e;
$ cat 101013+6-callid.txt | perl -ne '{/;call_id=.*?;/; print $&."\n"}' | sort | uniq | wc -l
56415
$ cat 101013+6-answered.txt | perl -ne '{/;call_id=.*?;/; print $&."\n"}' | sort | uniq | wc -l
48004
$
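Now the output file “101013+6-calls.txt” is generated. Its line count (48005) exceeds the number of distinct call-ids in the file “101013+6-answered.txt” of answered-call transactions (48004) by exactly one: the awk script prints a line break before every new call-id, including the first, which leaves one empty line in the output.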
$ cat 101013+6-answered.txt | perl -ne '/ timestamp=(\d+);method=(\w+);.*;call_id=(.*?);.*;src_user=(.*?);.*;dst_ouser=(.*?);.*;dst_domain=([^\r\n]*+)/; printf "%s,%s,%s,U %s,U %s,%s\n",$3,$2 eq "BYE"?"B":"A",$1,$4,$5,$6' | sort | awk -F, '$1!=id{printf "\r\n"} {printf "%s,",$0} {id=$1}' | sort -n -t, -k 3 > 101013+6-calls.txt
$ wc -l 101013+6-calls.txt
48005 101013+6-calls.txt
$
The most frequently repeated call-ids occupy the largest number of columns in the output file. They probably correspond to concurrent calls issued under the same call-id (by error or deliberately).
$ cat 101013+6-answered.txt | perl -ne '/ timestamp=(\d+);method=(\w+);.*;call_id=(.*?);.*;src_user=(.*?);.*;dst_ouser=(.*?);.*;dst_domain=([^\r\n]*+)/; printf "%s,%s,%s,U %s,U %s,%s\n",$3,$2 eq "BYE"?"B":"A",$1,$4,$5,$6' | sort | cut -d, -f1 | sort | uniq -c | sort -n -r | head -2
22 xr3808484263002c1834218e1921681f@188.161.237.156
22 xr11800770335646c9615328e192168f@188.161.229.236
20 xr94449288617869c3113484e192168f@188.161.237.156
$
Description: | Call Data Records in Text format |
File: | |
Size: | 1.29MB |
In the next section we import the text records into an Excel file and we compute the amount of minutes and calls for dialogs sharing the same call-id.
The multiple call-id cases are processed in an Excel sheet described in this section. Rows of the Excel file contain merged field sets with call-start “A” and call-stop “B” fields. We first compute the number of start “A” and stop “B” records.
See the H2 cell for the formula computing the number of call starts:
=COUNTIF($M2:$DR2,"A")
See the cell I2 for the formula computing the number of stops:
=COUNTIF($M2:$DR2,"B")
Because the records were sorted so that in each merged line all start field-sets precede all stop field-sets (see section 4), the average start time is computed by summing the unix timestamps in the first block (the start events) and dividing by the number of starts; the average stop time is computed likewise from the second block (the stop events).
See the formula in cell J2 for the average start time:
=SUM(OFFSET($M2,0,0,1,5*H2))/H2
See the formula in cell K2 for the average stop time. Note that the offset is shifted by 5*H2 positions to skip the start events:
=SUM(OFFSET($M2,0,5*H2,1,5*I2))/I2
The total duration can be computed as follows (see cell D2):
=TIME(0,0,(K2-J2)*F2)
where cell F2 holds the number of simultaneous calls under the same call-id:
=AVERAGE(H2:I2)
The value in column H is normally equal to the value in column I (except in case of non-recoverable packet losses resulting in lost transactions).
Finally the date and time of the call is computed by processing the unix time stamp where we consider also the time shift between UTC and CET (CEST) time zones:
=DATE(1970,1,1)+J2/(3600*24)+2/24
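For readers who prefer a script to spreadsheet formulas, the computations of cells F2, J2, K2, D2 and of the date formula can be sketched in Python. This is a hedged illustration: the function name `call_summary` and its parameters are hypothetical, and the fixed two-hour shift assumes that all calls fall in the CEST period, as in the formula above.

```python
from datetime import datetime, timedelta, timezone


def call_summary(start_times, stop_times, utc_offset_hours=2):
    """Return (local start time as text, total duration in seconds) for one call-id."""
    n_calls = (len(start_times) + len(stop_times)) / 2   # F2 = AVERAGE(H2:I2)
    avg_start = sum(start_times) / len(start_times)      # J2
    avg_stop = sum(stop_times) / len(stop_times)         # K2
    total_seconds = (avg_stop - avg_start) * n_calls     # argument of TIME() in D2
    # Unix time is UTC; shift by the fixed offset to obtain local (CEST) time.
    local = datetime.fromtimestamp(avg_start, tz=timezone.utc) + timedelta(hours=utc_offset_hours)
    return local.strftime("%Y-%m-%d %H:%M:%S"), total_seconds


# Timestamps of the first call found in section 4.
when, seconds = call_summary([1286968402], [1286968407])
print(when, seconds)
```

The result for the first call is consistent with the start time 13:13:22 and the 5-second duration stated for it elsewhere in this document.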
One may have doubts about the way the duration is computed. When we do not know which start event corresponds to which stop (as a single call-id is used for multiple events), it might be unclear why the average of all start times and the average of all stop times is sufficient for computing the correct call duration.
When the same call-id is used for two different calls, there is no way to know how to match the starts with the stops. We have a set of starts on one side and a set of stops on the other.
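The following drawing visualizes two starts “A” and two stops “B”. Denote the start times by $A_1$, $A_2$ and the stop times by $B_1$, $B_2$.
Two combinations are possible.
The first possibility: $d_1 = B_1 - A_1$ and $d_2 = B_2 - A_2$.
The second possibility: $d_1 = B_2 - A_1$ and $d_2 = B_1 - A_2$.
However, in both cases the sum of the durations of the two calls is the same: $d_1 + d_2 = (B_1 + B_2) - (A_1 + A_2)$.
In the general case of $n$ calls sharing one call-id also: $\sum_{i=1}^{n} d_i = \sum_{i=1}^{n} B_i - \sum_{i=1}^{n} A_i = n\,(\bar{B} - \bar{A})$, where $\bar{A}$ and $\bar{B}$ are the average start and stop times.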
Therefore the total duration of all calls sharing the same call-id is the difference between the average stop time and the average start time, multiplied by the number of simultaneous calls.
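The argument can be checked numerically. The sketch below enumerates every possible pairing of starts with stops and compares the summed durations with the value obtained from the averages. The start and stop times are made-up numbers chosen for illustration, and the function name `total_duration` is hypothetical.

```python
from itertools import permutations


def total_duration(starts, stops):
    """(average stop - average start) multiplied by the number of calls."""
    n = len(starts)
    return (sum(stops) / n - sum(starts) / n) * n


# Hypothetical event times (seconds) for two calls sharing one call-id.
starts = [100, 130]
stops = [160, 190]

# Sum of durations for every way of matching a stop to each start.
pairing_totals = {
    sum(stop - start for start, stop in zip(starts, perm))
    for perm in permutations(stops)
}
print(pairing_totals, total_duration(starts, stops))
```

Whatever the pairing, the set of totals contains a single value, equal to the result computed from the averages.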
Description: | Calls sharing the same call-id |
File: | |
Size: | 7.21MB |
The Excel file of this section contains many formulas and is heavy to open. In the next section we present the compact version of the CDR file containing only values resulting from the computation of the number and duration of simultaneous calls.
The CDR file containing the output values can be downloaded in section 6.4. Next sections show several statistics resulting from the CDR file. The statistics per country, per from-user field, and per destination number are shown in sections 6.1, 6.2, and 6.3 (also available in the Excel file of section 6.4).
Country | Code | Calls | Minutes | ACD (min) |
Slovenia | 386 | 25'594 | 153'239.7 | 6.0 |
Sierra Leone | 232 | 6'553 | 32'224.2 | 4.9 |
Somalia Republic | 252 | 10'801 | 24'779.6 | 2.3 |
Guinea | 224 | 5'021 | 12'659.3 | 2.5 |
Israel | 972 | 28 | 118.4 | 4.2 |
Macedonia | 389 | 6 | 4.7 | 0.8 |
Zimbabwe | 263 | 1 | 0.2 | 0.2 |
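The ACD column of the per-country statistics is the number of minutes divided by the number of calls. The following Python sketch recomputes it from the Calls and Minutes values listed above; the variable and function names are hypothetical. As a side check, the calls add up to 48'004, the number of distinct call-ids counted in section 4.

```python
# (country, calls, minutes) as listed in the per-country statistics.
country_rows = [
    ("Slovenia", 25594, 153239.7),
    ("Sierra Leone", 6553, 32224.2),
    ("Somalia Republic", 10801, 24779.6),
    ("Guinea", 5021, 12659.3),
    ("Israel", 28, 118.4),
    ("Macedonia", 6, 4.7),
    ("Zimbabwe", 1, 0.2),
]


def acd(calls, minutes):
    """Average call duration in minutes, rounded to one decimal."""
    return round(minutes / calls, 1)


acd_by_country = {name: acd(calls, minutes) for name, calls, minutes in country_rows}
total_calls = sum(calls for _name, calls, _minutes in country_rows)
print(acd_by_country)
print(total_calls)
```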
From user | Calls | Minutes | ACD (min) |
101 | 19'765 | 76'851.3 | 3.9 |
asd300 | 1'778 | 22'211.1 | 12.5 |
0000 | 1'648 | 19'234.2 | 11.7 |
000000000000 | 8'534 | 18'356.5 | 2.2 |
0000000 | 2'008 | 16'017.6 | 8.0 |
kalnas600 | 4'451 | 13'818.2 | 3.1 |
dehka | 4'272 | 10'121.5 | 2.4 |
000000 | 868 | 9'974.5 | 11.5 |
foaad_saa | 147 | 6'993.3 | 47.6 |
7777 | 605 | 6'424.5 | 10.6 |
55202033 | 324 | 4'294.3 | 13.3 |
kalnas500 | 271 | 4'036.9 | 14.9 |
1111 | 204 | 2'923.7 | 14.3 |
888888888 | 350 | 2'374.3 | 6.8 |
888 | 110 | 1'745.6 | 15.9 |
123 | 138 | 1'362.1 | 9.9 |
hisham1970 | 813 | 1'339.9 | 1.6 |
0000000000 | 468 | 1'224.7 | 2.6 |
684168 | 779 | 863.6 | 1.1 |
111111111111 | 85 | 583.6 | 6.9 |
11 | 71 | 550.9 | 7.8 |
asdf500 | 98 | 365.9 | 3.7 |
shikso | 37 | 335.2 | 9.1 |
marryaina123-1001 | 17 | 282.4 | 16.6 |
999 | 13 | 168.4 | 13.0 |
karam155 | 8 | 146.4 | 18.3 |
RAMY250 | 10 | 117.2 | 11.7 |
00000000 | 58 | 117.0 | 2.0 |
5555555 | 6 | 101.0 | 16.8 |
133 | 21 | 48.2 | 2.3 |
10 | 19 | 32.2 | 1.7 |
1001 | 24 | 7.2 | 0.3 |
anonymous | 2 | 2.0 | 1.0 |
441932376101 | 1 | 0.6 | 0.6 |
250 | 1 | 0.1 | 0.1 |
The table of phone numbers being dialed shows only the top used numbers. The full list is available in the CDR Excel file (section 6.4).
To | Calls | Minutes | ACD (min) |
0038643281239 | 1'178 | 19'643.7 | 16.7 |
0038643281242 | 2'521 | 11'744.3 | 4.7 |
0038643281460 | 5'267 | 11'048.3 | 2.1 |
0038643281244 | 3'583 | 8'592.0 | 2.4 |
0023224000936 | 276 | 8'249.4 | 29.9 |
0023224000935 | 1'361 | 7'005.9 | 5.1 |
0038643281094 | 490 | 6'757.5 | 13.8 |
002522200377 | 1'613 | 6'564.5 | 4.1 |
0038643281081 | 356 | 5'629.6 | 15.8 |
0023224006762 | 649 | 5'587.6 | 8.6 |
0038643281286 | 1'711 | 5'524.6 | 3.2 |
0038643281494 | 557 | 5'304.5 | 9.5 |
0038643281461 | 2'259 | 5'176.2 | 2.3 |
0038643281287 | 366 | 4'794.5 | 13.1 |
0038643281463 | 695 | 4'732.7 | 6.8 |
0038643281289 | 330 | 4'618.0 | 14.0 |
0038643281098 | 779 | 4'604.2 | 5.9 |
0023224000938 | 2'966 | 4'385.7 | 1.5 |
0038643281234 | 358 | 4'065.5 | 11.4 |
0038643281498 | 370 | 3'840.6 | 10.4 |
0038643281288 | 271 | 3'811.6 | 14.1 |
0038643281230 | 291 | 3'742.0 | 12.9 |
0038643281233 | 321 | 3'741.3 | 11.7 |
0038643281499 | 319 | 3'711.6 | 11.6 |
0038643281238 | 462 | 3'619.9 | 7.8 |
0038643281231 | 316 | 3'566.2 | 11.3 |
002522200378 | 1'387 | 3'554.2 | 2.6 |
0023224006772 | 919 | 3'546.2 | 3.9 |
0038643281232 | 292 | 3'472.5 | 11.9 |
0038643281497 | 243 | 3'391.4 | 14.0 |
0038643281465 | 505 | 2'886.7 | 5.7 |
0038643281496 | 272 | 2'866.9 | 10.5 |
0038643281466 | 374 | 2'706.8 | 7.2 |
002522168653 | 592 | 2'477.0 | 4.2 |
0038643281241 | 166 | 2'323.9 | 14.0 |
0038643281476 | 155 | 2'292.1 | 14.8 |
002522168765 | 195 | 1'884.5 | 9.7 |
002522168898 | 403 | 1'867.7 | 4.6 |
002522168652 | 878 | 1'840.3 | 2.1 |
0022479910583 | 134 | 1'714.2 | 12.8 |
0022479910596 | 493 | 1'678.0 | 3.4 |
0038643281080 | 218 | 1'654.7 | 7.6 |
0022479910594 | 326 | 1'484.4 | 4.6 |
0022479910584 | 90 | 1'214.6 | 13.5 |
0022479910595 | 413 | 1'198.1 | 2.9 |
0022479910597 | 396 | 1'171.2 | 3.0 |
0022479910598 | 376 | 1'125.4 | 3.0 |
0023224001570 | 92 | 1'024.9 | 11.1 |
0038643281190 | 240 | 949.9 | 4.0 |
0023222291848 | 29 | 931.9 | 32.1 |
0022479910585 | 117 | 835.0 | 7.1 |
0038643281464 | 110 | 791.9 | 7.2 |
0022479910589 | 191 | 776.4 | 4.1 |
Description: | CDR file created from syslog |
File: | |
Size: | 1.66MB |
The following chart shows the evolution of the distribution of the traffic by country. The data is presented in hourly intervals; the values represent the number of concurrent calls during a given hour. At first the fraudulent traffic used the connections of Verizon. When Verizon detected the fraud and suspended the calls, the flow was interrupted for a couple of hours and then restarted, this time using the routes of Colt.
The following two records show the first and last calls routed via Verizon:
Time: 2010-10-13 13:13:22
From: 101
To: Israel
Phone: 00972599870738
Duration: 00:00:05
Via: verizonbusiness.com
Time: 2010-10-16 18:09:12
From: dehka
To: Sierra Leone
Phone: 0023224000938
Duration: 00:00:46
Via: verizonbusiness.com
The fraudulent traffic was interrupted when Verizon detected the fraud and decided to block the calls. A couple of hours later the fraudulent traffic resumed, this time via Colt. Colt detected the fraud on Sunday and blocked the calls. The following two records show the first and last calls routed via Colt.
Time: 2010-10-16 20:55:18
From: 250
To: Israel
Phone: 00972599916699
Duration: 00:00:07
Via: colt.net
Time: 2010-10-17 23:07:44
From: 000000000000
To: Somalia Republic
Phone: 002522168598
Duration: 00:00:19
Via: colt.net
Description: | Distribution chart by hours |
File: | |
Size: | 2.18MB |
When Verizon’s fraud department detected the pattern, it sent us the records of the suspected calls to Slovenia.
The CDR that we generated from the syslog files was compared with the CDR of Verizon containing the calls to Slovenia mobiles. The calls of both CDR files matched accurately most of the time. The records in the two files were often identical except for a time shift of 32 to 34 seconds, due to an inaccurate clock on one of the sides.
Description: | Vendor and syslog CDR comparison |
File: | |
Size: | 12.8MB |
The following records represent the first and last calls appearing in the fraud report of Verizon for calls to Slovenia mobiles:
Time: 2010-10-15 01:48:46
To: 38643281227
Duration: 191 seconds
Time: 2010-10-16 18:08:58
To: 38643281463
Duration: 16 seconds
The entire traffic of 7'554'889 seconds, or 125'914.8 minutes, representing a charge of CHF 38'035.47 (without VAT), was sent to only 32 phone numbers. Except for businesses handling simultaneous hotline calls, multiple simultaneous answers from the same phone number suggest fraud. The following table shows the number of parallel calls to each individual mobile phone number. The first row of the table contains the 32 mobile phone numbers in question. The rows that follow represent one-hour intervals. The value under each phone number is the average number of concurrent calls to that phone over the one-hour interval.
The table shows, for example, that during the entire hour from 04:00 to 04:59 on 2010-10-16 there were on average as many as 34 simultaneous calls to the single phone number +38643281239, generating a total duration of 2'057.65 minutes during this single hour and corresponding to a cost of CHF 621.56 (for one hour and one phone number). The number of simultaneous calls to a single phone number reached as high as 91, and the total number of simultaneous calls to Slovenia mobiles reached as high as 180 (the capacity of 6 full E1 lines).
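The average number of concurrent calls during an hour equals the total call time falling inside that hour divided by 3600 seconds. The Python sketch below shows one way of distributing call durations over hourly intervals; it is an illustration, not the method used for the Excel tables. The function name `hourly_concurrency` and the two demo calls are hypothetical. The last lines recheck the figures quoted above: 2'057.65 minutes in one hour correspond to about 34 concurrent calls, and both cost figures imply the same rate per minute.

```python
def hourly_concurrency(calls):
    """calls: iterable of (start_unix, duration_seconds).

    Returns {hour_start_unix: average number of concurrent calls in that hour}.
    """
    seconds_per_hour = {}
    for start, duration in calls:
        end = start + duration
        t = start
        while t < end:
            hour_start = t - t % 3600
            # Portion of the call that falls inside the current hour.
            chunk_end = min(end, hour_start + 3600)
            seconds_per_hour[hour_start] = seconds_per_hour.get(hour_start, 0) + (chunk_end - t)
            t = chunk_end
    return {hour: seconds / 3600 for hour, seconds in seconds_per_hour.items()}


# Two made-up one-hour calls, the second starting 30 minutes after the first.
demo = hourly_concurrency([(0, 3600), (1800, 3600)])
print(demo)

# Cross-check of the figures quoted in the text.
average_calls = 2057.65 / 60            # minutes in one hour -> concurrent calls
rate_total = 38035.47 / 125914.8        # CHF per minute over the whole traffic
rate_hour = 621.56 / 2057.65            # CHF per minute for the single hour
print(round(average_calls), round(rate_total, 4), round(rate_hour, 4))
```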
In the case of real mobile phone subscribers, we see neither a technical possibility nor an economic benefit in sending 126'000 minutes to 32 mobile phones in about one day. It is possible that a vendor of Verizon, or a vendor of its vendor, provided false answer supervision for all calls to Slovenia mobiles. Such an intermediary fake vendor would benefit from the traffic and could therefore be at the origin of the fraudulent calls. The final owner of the number range in the destination country (such as a small MVNO, OLO, or PNS) can also benefit from the incoming traffic and is therefore also a hypothetical suspect for the origin of the fraudulent traffic.
The following chart is the graphical version of the previous table. The horizontal position of each bar represents the hour. The total height of the bars at a given hour is the number of simultaneous calls to Slovenia mobiles. Each color represents one of the 32 individual mobile phone numbers; the height of a segment of a given color is the number of simultaneous calls to the corresponding number. For example, the chart shows that starting from 6 o’clock in the morning of October 16th, during one hour, there were 91 simultaneous calls to a single mobile phone subscriber, +38643281239.
Description: | Simultaneous calls per phone |
File: | |
Size: | 1.08MB |
Fraud reports [142] [143] [144] [145] [146] [147]:
http://switzernet.com/3/public/101028-fraud-slovenia/ (this page)
http://switzernet.com/public/060801-web/news_detail.php?id=167
http://switzernet.com/public/060801-web/news_detail.php?id=166
http://switzernet.com/3/folders/101018-fraud-slovenia/ (login: fraud)
http://mirror2.switzernet.com/3/folders/101018-fraud-slovenia/ (login: fraud)
http://www.fedpol.admin.ch/content/fedpol/fr/misc/conform.html
ACD quality routing [148] [149] [150] [151] [152] [153] [154] [155] [156]:
http://switzernet.com/public/091020-acd-routing/
http://www.unappel.ch/2/public/091020-acd-routing/
http://unappel.ch/public/091020-acd-routing/
http://intarnet.com/2/public/091020-acd-routing/
http://parinternet.ch/2/public/091020-acd-routing/
http://switzernet.com/public/091029-ACDstat/
http://unappel.ch/public/091029-ACDstat/
http://switzernet.com/public/091217-doc-acd-routing/
http://en.wikipedia.org/wiki/Least-cost_routing
Emergency numbers [157]:
http://unappel.ch/folders/101004-emergency-calls-planning/ (login: ofcom)
Kamailio/OpenSER SIP server/router [158] [159] [160]:
Perl regular expressions [161] [162]:
http://switzernet.com/3/public/101024-regex/
http://perldoc.perl.org/perlre.html
References on syslog file format [163] [164] [165]:
http://www.facetcorp.com/tnotes/facetwin/tn_syslog.html
http://lists.rtpproxy.org/pipermail/users/2009-May.txt
References on SIP transactions versus dialogs [166] [167] [168] [169]:
http://www.iptel.org/sip_transaction
http://www.ietf.org/rfc/rfc2543.txt
http://www.ietf.org/rfc/rfc3261.txt
CDR stands for Call Data Records
ACD stands for Average Call Duration
UTC stands for Coordinated Universal Time
CET stands for Central European Time
CEST stands for Central European Summer Time
MVNO stands for Mobile Virtual Network Operator
OLO stands for Other Licensed Operator
PNS stands for Personal Numbering Service
This section groups all files used throughout this study. The list contains files with raw syslog records as well as files showing different statistics. The reference file that contains the call records and is light to open is the output CDR file [101013+6-14'cdr.xls].
Description: | All transactions of answered calls |
File: | |
Size: | 3.69 MB |
Description: | Call Data Records in Text format |
File: | |
Size: | 1.29MB |
Description: | Calls sharing the same call-id |
File: | |
Size: | 7.21MB |
Call Data Records created from the syslog file:
Description: | CDR file created from syslog |
File: | |
Size: | 1.66MB |
Description: | Distribution chart by hours |
File: | |
Size: | 2.18MB |
Description: | Vendor and syslog CDR comparison |
File: | |
Size: | 12.8MB |
Description: | Simultaneous calls per phone |
File: | |
Size: | 1.08MB |
This section is addressed only to persons editing this or similar documents. This section is unrelated to the subject of the document.
The following image shows the styles used in this document. Do not add new styles when editing and updating this document.
The [file reference] table style is buggy. When you open the document, the font settings are mixed up. In the modify style pane of the [file reference] style you have to re-apply the [Lucida Console] font to the right column of this table style. This will restore all other settings of the style. The procedure must be carried out before printing or saving the document in HTML format.
Microsoft field codes are used for auto incremental reference numbers appearing in the document. In order to toggle field codes you have to first remove the hyperlink (Ctrl-K).
To add a new reference you need to copy any of other references and change only the hyperlink. You do not need to care about the numbering. The numbering of all references can be updated in a single step. Select the entire document and in the right-click pop up menu choose [Update Field].
Before printing the document, update all fields (as explained in section 12.3) and delete the “iref” bookmark (Alt-I-K). Otherwise, all references will appear under the number of the last reference.
The main document file is a numbered index<N>.doc file, where <N> is an incrementing version number of the document. The document must be saved as an index<N>.htm file (accompanied by an automatically generated folder index<N>_files). Every time a new version is released, the index.htm file must be deleted, and the latest index<N>.htm file must be copied and renamed to a new index.htm file. At any moment the index.htm file is a copy of the latest index<N>.htm file, so it can be erased whenever a new version is released. You must not have an index.doc file. The folder index_files (corresponding to the index.htm file) must be deleted, as the index.htm file will refer to the files located in the folder index<N>_files anyway. At every update you must add your name to the header of the document and, under the date of the update, a link to the current version of the index<N>.htm file (and not to index.htm) for backtracking.
Data files accompanying your document (not the files generated automatically when saving in HTML format) must be located in data<M> folder, where <M> is an incrementing number and is not necessarily equal to <N>. Do not hesitate to create each time your own data<M> folder, instead of adding pieces in already existing data<M> folder of the previous author.
* * *
Copyright © 2010 by Switzernet